We are currently migrating Bugzilla to GitHub issues.
Any changes made to the bug tracker now will be lost, so please do not post new bugs or make changes to them.
When we're done, all bug URLs will redirect to their equivalent location on the new bug tracker.

Bug 2513

Summary: stack overflow in X11_CreateWindow
Product: SDL Reporter: Brad Smith <brad>
Component: videoAssignee: Ryan C. Gordon <icculus>
Status: RESOLVED FIXED QA Contact: Sam Lantinga <slouken>
Severity: major    
Priority: P2 CC: icculus, r
Version: 2.0.2   
Hardware: All   
OS: OpenBSD   

Description Brad Smith 2014-04-26 08:40:30 UTC
Trying to switch QEMU 2.0.0 over from SDL 1 to SDL 2 and I ran into a crash when
simply starting up QEMU with "qemu-system-i386 -display sdl". This works fine for SDL 1. I am filing it here as this looks more like a bug within SDL.

#0  0x00001167d3b2ff5a in kill () at <stdin>:2
2       <stdin>: No such file or directory.
        in <stdin>
(gdb) bt full
#0  0x00001167d3b2ff5a in kill () at <stdin>:2
No locals.
#1  0x00001167d3b68fac in __stack_smash_handler (func=0x1167e0c65080 "X11_CreateWindow", damaged=Variable "damaged" is not available.
) at /usr/src/lib/libc/sys/stack_protector.c:61
        sdata = {log_file = -1, connected = 0, opened = 1, log_stat = 0, log_tag = 0x1165d3d34860 "qemu-system-i386", log_fac = 8, log_mask = 255}
        message = "stack overflow in function %s"
        sa = {__sigaction_u = {__sa_handler = 0, __sa_sigaction = 0}, sa_mask = 0, sa_flags = 0}
        mask = 4294967263
#2  0x00001167e0b54ce7 in X11_CreateWindow (_this=0x1167dc774800, window=0x1167de05b600) at /home/ports/pobj/sdl2-2.0.2/SDL2-2.0.2/src/video/x11/SDL_x11window.c:609
        colorcells = Variable "colorcells" is not available.
Comment 1 Sam Lantinga 2014-04-26 19:36:22 UTC
I've never seen this before. Can you reproduce it with the test programs?
Comment 2 Brad Smith 2014-04-27 03:52:55 UTC
Yes, using the tests I see the same crash with for example testgles / testgles2.
Comment 3 Brad Smith 2014-05-04 03:38:45 UTC
Bump.
Comment 4 Ryan C. Gordon 2014-05-10 20:38:32 UTC
I don't see anything obviously wrong in that function, but it's complex and calls a lot of Xlib things, too.

Is there a way you can isolate what part smashes the stack?

- Is Valgrind available?
- Is AddressSanitizer (in the latest GCC, or Clang) available?
- Can you coerce gdb's "watch" command to watch the canary on the stack to see if something overwrites it?

It's not clear to me what is smashing the stack (or even if it's a false-positive), so it's hard to fix it with the information we've currently got.

--ryan.
Comment 5 Ryan C. Gordon 2014-05-11 01:58:05 UTC
(In reply to Ryan C. Gordon from comment #4)
> I don't see anything obviously wrong in that function, but it's complex and
> calls a lot of Xlib things, too.
> 
> Is there a way you can isolate what part smashes the stack?
> 
> - Is Valgrind available?
> - Is AddressSanitizer (in the latest GCC, or Clang) available?
> - Can you coerce gdb's "watch" command to watch the canary on the stack to
> see if something overwrites it?

And desperation: start adding "return" calls to the function, and exit() immediately after the call if _stack_smash_handler() didn't trigger, so we can isolate it to the line of code in question?

--ryan.
Comment 6 rapha 2014-05-22 13:45:55 UTC
This diff seems to fix the issue, I'm not sure why..
Any thoughts?

--- ./video/x11/SDL_x11window.c.orig	Sat Mar  8 05:36:50 2014
+++ ./video/x11/SDL_x11window.c	Tue May 20 14:01:34 2014
@@ -356,7 +356,7 @@ X11_CreateWindow(_THIS, SDL_Window * window)
     XSizeHints *sizehints;
     XWMHints *wmhints;
     XClassHint *classhints;
-    const long _NET_WM_BYPASS_COMPOSITOR_HINT_ON = 1;
+    unsigned char _NET_WM_BYPASS_COMPOSITOR_HINT_ON = 1;
     Atom _NET_WM_BYPASS_COMPOSITOR;
     Atom _NET_WM_WINDOW_TYPE;
     Atom _NET_WM_WINDOW_TYPE_NORMAL;
@@ -540,7 +540,7 @@ X11_CreateWindow(_THIS, SDL_Window * window)
     _NET_WM_BYPASS_COMPOSITOR = X11_XInternAtom(display, "_NET_WM_BYPASS_COMPOSITOR", False);
     X11_XChangeProperty(display, w, _NET_WM_BYPASS_COMPOSITOR, XA_CARDINAL, 32,
                     PropModeReplace,
-                    (unsigned char *)&_NET_WM_BYPASS_COMPOSITOR_HINT_ON, 1);
+                    &_NET_WM_BYPASS_COMPOSITOR_HINT_ON, 1);
 
     {
         Atom protocols[] = {
Comment 7 rapha 2014-05-24 12:14:03 UTC
Sorry, please ignore my previous diff, it's obviously wrong.
Hopefully the following makes more sense: 

--- src/video/x11/SDL_x11window.c.orig  Sat Mar  8 05:36:50 2014
+++ src/video/x11/SDL_x11window.c       Sat May 24 13:47:35 2014
@@ -362,7 +362,7 @@ X11_CreateWindow(_THIS, SDL_Window * window)
     Atom _NET_WM_WINDOW_TYPE_NORMAL;
     Atom _NET_WM_PID;
     Atom XdndAware, xdnd_version = 5;
-    Uint32 fevent = 0;
+    long fevent = 0;

 #if SDL_VIDEO_OPENGL_GLX || SDL_VIDEO_OPENGL_EGL
     if ((window->flags & SDL_WINDOW_OPENGL) &&


(A long value will be written to &fevent by XGetICValues)
Comment 8 Brad Smith 2014-05-24 19:12:09 UTC
The last patch works for me. Looking at where XGetICValues is used in the 1.2 tree, from SDL_x11video.c..


   558 				unsigned long mask = 0;
   559 				char *ret = pXGetICValues(SDL_IC, XNFilterEvents, &mask, NULL);
   560 				if (ret != NULL) {
   561 					XUnsetICFocus(SDL_IC);
   562 					XDestroyIC(SDL_IC);
   563 					SDL_IC = NULL;
   564 					SDL_SetError("no input context could be created");
   565 					XCloseIM(SDL_IM);
   566 					SDL_IM = NULL;
   567 				} else {
   568 					XSelectInput(SDL_Display, WMwindow, app_event_mask | mask);
   569 					XSetICFocus(SDL_IC);
   570 				}

The code uses an unsigned long and the input mask for XSelectInput() is a long.
Comment 9 Brad Smith 2014-05-24 19:14:39 UTC
Also I think this just started to show up when OpenBSD switched from regular stack protector by default to stack protector strong.
Comment 10 Ryan C. Gordon 2014-05-25 01:02:56 UTC
(In reply to rapha from comment #7)
> (A long value will be written to &fevent by XGetICValues)

Ah, good catch, that's definitely the problem.

That's weird that the compiler didn't catch that, though. The OpenBSD buildbot compiles that with -Wall without a peep.
Comment 11 Ryan C. Gordon 2014-05-25 01:04:38 UTC
(In reply to Ryan C. Gordon from comment #10)
> That's weird that the compiler didn't catch that, though. The OpenBSD
> buildbot compiles that with -Wall without a peep.

Nevermind, it's because XGetICValues() looks like this:

    char * XGetICValues(XIC ic, ...);

...so the compiler doesn't know it requires a long in this case.

--ryan.
Comment 12 Ryan C. Gordon 2014-05-25 01:07:26 UTC
This is now fixed in https://hg.libsdl.org/SDL/rev/b74a32894c02, thank you everyone!

--ryan.
Comment 13 Sam Lantinga 2014-05-25 04:48:14 UTC
Nice work, everyone!