We are currently migrating Bugzilla to GitHub issues.
Any changes made to the bug tracker now will be lost, so please do not post new bugs or make changes to them.
When we're done, all bug URLs will redirect to their equivalent location on the new bug tracker.

Bug 2330

Summary: Debian bug report: SDL2 X11 driver buffer overflow with large X11 file descriptor
Product: SDL Reporter: manuel.montezelo
Component: *don't know*Assignee: Ryan C. Gordon <icculus>
Status: RESOLVED FIXED QA Contact: Sam Lantinga <slouken>
Severity: normal    
Priority: P2 CC: sven+sdl
Version: 2.0.0   
Hardware: x86_64   
OS: Linux   
Attachments: buffer_overflow_fdset.patch
testkeys.c
normal run with SDL2 2.0.1
backtrace when running SDL2 2.0.1 with modified testkey
testcase.c
0001-Replace-select-for-X11-messages-with-poll-in-X11_Pen.patch

Description manuel.montezelo 2013-12-28 13:27:52 UTC
Created attachment 1502 [details]
buffer_overflow_fdset.patch

Original bug report (note that it was against 2.0.0, it might have been fixed in between):  http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=733015

--------------------------------------------------------
Package: libsdl2-2.0-0
Version: 2.0.0+dfsg1-3
Severity: normal
Tags: patch

I have occasional crashes here caused by the X11 backend of SDL2. It seems to
be caused by the X11_Pending function trying to add a high number (> 1024)
file descriptor to a fd_set before doing a select on it to avoid busy waiting
on X11 events. This causes a buffer overflow because the file descriptor is
larger (or equal) than the limit FD_SETSIZE.

Attached is a possible workaround patch.

Please also keep in mind that fd_set are also used in following files which
may have similar problems.

src/audio/bsd/SDL_bsdaudio.c
src/audio/paudio/SDL_paudio.c
src/audio/qsa/SDL_qsa_audio.c
src/audio/sun/SDL_sunaudio.c
src/joystick/linux/SDL_sysjoystick.c


--------------------------------------------------------

On Tuesday 24 December 2013 00:43:13 Sven Eckelmann wrote:
> I have occasional crashes here caused by the X11 backend of SDL2. It seems
> to be caused by the X11_Pending function trying to add a high number (>
> 1024) file descriptor to a fd_set before doing a select on it to avoid busy
> waiting on X11 events. This causes a buffer overflow because the file
> descriptor is larger (or equal) than the limit FD_SETSIZE.

I personally experienced this problem while hacking on the python bindings 
package for SDL2 [1] (while doing make runtest). But it easier to reproduce in 
a smaller, synthetic testcase.

It can be build + tested with:

$ gcc `sdl2-config --cflags` testkeys.c `sdl2-config --libs` -o testkeys
$ ./testkeys


--------------------------------------------------------
Comment 1 manuel.montezelo 2013-12-28 13:29:07 UTC
Created attachment 1503 [details]
testkeys.c
Comment 2 Sven Eckelmann 2013-12-28 21:49:26 UTC
Created attachment 1504 [details]
normal run with SDL2 2.0.1

Output when running on Debian with X11 video backend (build with buffer overflow checks like fortify and co as seen at https://buildd.debian.org/status/fetch.php?pkg=libsdl2&arch=i386&ver=2.0.1%2Bdfsg1-1&stamp=1388235943 )

You will not notice this problem when running without buffer overflow checks using the GCC/glibc functionality or valgrind. It will most likely just do something weird because it writes at the wrong position in memory.
Comment 3 Sven Eckelmann 2013-12-28 22:03:21 UTC
Created attachment 1505 [details]
backtrace when running SDL2 2.0.1 with modified testkey

The problem can be seen when checking the variables on stack at

#6  0x00007ffff7b9d460 in X11_Pending (display=0x607dc0) at /tmp/buildd/libsdl2-2.0.1+dfsg1/src/video/x11/SDL_x11events.c:945
        __d = 2055
        x11_fd = 2055
        fdset = {__fds_bits = {0 <repeats 16 times>}}
        zero_time = {tv_sec = 0, tv_usec = 0}

FD_SET has to enable the the bit for x11_fd (value 2055). the fdset just contains 16x 64 bit integers. These are enough to store 1024 bit. But the x11_fd is outside this range [0, 1023] and therefore the write to enable the corresponding bit is outside of fdset -> a bit somewhere else on the stack is changed to 1.

This example was run with the 2.0.1 SDL2 build from Debian. So it is not a SDL2 2.0.0-only problem.

There are other APIs (platform dependent like epoll or kqueue) which support event wait calls for high-value FDs but every fdset based one only supports FDs up to FD_SETSIZE-1.

The attached patch is just to avoid this blocking wait and immediately return with a "not positive" result
Comment 4 Sven Eckelmann 2013-12-29 09:24:01 UTC
Before I forget it. Please check your ulimit settings because a relative low setting for number of file descriptors (lower than ~2060) will result in following early abort of the test:

# ./testkeys    
Couldn't initialize SDL: No available video device
Comment 5 Sven Eckelmann 2013-12-29 09:33:42 UTC
Created attachment 1506 [details]
testcase.c

Modified version of testkeys.c which opens FD_SETSIZE * 2 filedescriptors (and checks that they are really opened) to force an buffer overflow in this function when using the X11 backend.

The earlier version of this test did not check if the open call was successful and thus didn't warn the user that the testcase setup part failed.
Comment 6 Sam Lantinga 2013-12-29 18:12:20 UTC
Wow, I didn't realize that the system allowed file descriptors above the fdset size. This probably bites a whole bunch of other software as well.

What's the recommended general fix for this? Simply stubbing out select calls will break functionality.
Comment 7 Sven Eckelmann 2013-12-29 18:36:22 UTC
Good question. There are different very good select replacements which are very platform specific. But you are only wanting to poll for some bytes in the fd buffer (immediately return because zero_time is... all zero), right?

So my first suggestion: use poll with a single poolfd entry (and mostly likely polling for POLLIN | POLLPRI). This should be good enough if you really don't want to drop the "select like" behavior.

I am not sure how many systems support the POSIX 1003.1g poll() but I would guess that more systems support select() (but I they are not really important for SDL2 anyway).

Another (weird idea) is: do a recv MSG_PEEK | MSG_DONTWAIT for one byte on the fd. But MSG_DONTWAIT is for example not supported under Windows.
Comment 8 Sven Eckelmann 2013-12-29 18:54:29 UTC
Created attachment 1507 [details]
0001-Replace-select-for-X11-messages-with-poll-in-X11_Pen.patch

Updated patch description:

Replace select() for X11 messages with poll() in X11_Pending

ConnectionNumber in X11_Pending may return a value outside of the range [0,
FD_SETSIZE). This value cannot be stored inside a fd_set and will crash the
program. This buffer overflow problem occasionally happens when a lot of file
descriptors are used.

The poll system call can be used instead. It doesn't use a fixed size bit-array
to store the file descriptors to wait for and instead depends on an array of
descriptors + requested events. This allows file descriptors up to INT_MAX and
automatically handles invalid (negative) file descriptors.