We are currently migrating Bugzilla to GitHub issues.
Any changes made to the bug tracker now will be lost, so please do not post new bugs or make changes to them.
When we're done, all bug URLs will redirect to their equivalent location on the new bug tracker.

Bug 1340

Summary: NULL pointer access in ALSA_GetDeviceBuf results in a crash
Product: SDL Reporter: Ellie <etc0de>
Component: audioAssignee: Ryan C. Gordon <icculus>
Status: RESOLVED FIXED QA Contact: Sam Lantinga <slouken>
Severity: critical    
Priority: P1 Keywords: target-2.0.0
Version: HG 2.0   
Hardware: x86   
OS: Linux   

Description Ellie 2011-12-09 16:13:33 UTC
Apparently when using alsa as sound device on a pulseaudio system (where alsa is emulated by pulseaudio), there is a condition that can lead to the current SDL code failing to open audio (which is pulseaudio's fault) and then subsequently returning NULL for ALSA_GetDeviceBuf and crashing (which is SDL's fault). The result is that the application cannot handle missing sound gracefully, but simply crashes.

I am not sure if critical is the correct severity, but since it is a crash I suppose it is (feel free to adjust if that's not the case).

Valgrind output:

ALSA lib pcm_pulse.c:734:(pulse_prepare) PulseAudio: Unable to create stream: Connection terminated

==6271== Thread 3:
==6271== Invalid read of size 4
==6271==    at 0x80997FA: ALSA_GetDeviceBuf (SDL_alsa_audio.c:336)
==6271==    by 0x805F778: SDL_RunAudio (SDL_audio.c:485)
==6271==    by 0x808DEDE: SDL_RunThread (SDL_thread.c:209)
==6271==    by 0x809CC2E: RunThread (SDL_systhread.c:54)
==6271==    by 0x41E6651D: clone (clone.S:133)
==6271==  Address 0x4 is not stack'd, malloc'd or (recently) free'd
==6271== 
==6271== 
==6271== HEAP SUMMARY:
==6271==     in use at exit: 23,031,767 bytes in 24,109 blocks
==6271==   total heap usage: 101,501 allocs, 77,392 frees, 101,062,741 bytes allocated
==6271== 
==6271== LEAK SUMMARY:
==6271==    definitely lost: 645 bytes in 12 blocks
==6271==    indirectly lost: 0 bytes in 0 blocks
==6271==      possibly lost: 1,043,698 bytes in 21,802 blocks
==6271==    still reachable: 21,987,424 bytes in 2,295 blocks
==6271==         suppressed: 0 bytes in 0 blocks
==6271== Rerun with --leak-check=full to see details of leaked memory
==6271== 
==6271== For counts of detected and suppressed errors, rerun with: -v
==6271== Use --track-origins=yes to see where uninitialised values come from
==6271== ERROR SUMMARY: 11 errors from 5 contexts (suppressed: 0 from 0)
Killed

(Ignore the leaks which are from other unrelated application code written by me that isn't fully leak-proof yet)
Comment 1 Ellie 2011-12-09 16:15:15 UTC
Sorry, it seems that rather this->hidden or the _THIS parameter is null. But you'll figure out yourself if you look at the code line valgrind points out.
Comment 2 Ellie 2012-07-24 05:58:04 UTC
Any additional information I could possibly provide? Is this possibly a pulseaudio bug which SDL cannot fix?
Comment 3 Ryan C. Gordon 2013-07-12 18:52:32 UTC
(Sorry if you get a lot of copies of this email, we're touching dozens of bug reports right now.)

Tagging a bunch of bugs as target-2.0.0, Priority 1.

This means we're in the final stretch for an official SDL 2.0.0 release! These are the bugs we really want to fix before shipping if humanly possible.

That being said, we don't promise to fix them because of this tag, we just want to make sure we don't forget to deal with them before we bless a final 2.0.0 release, and generally be organized about what we're aiming to ship.

Hopefully you'll hear more about this bug soon. If you have more information (including "this got fixed at some point, nevermind"), we would love to have you come add more information to the bug report when you have a moment.

Thanks!
--ryan.
Comment 4 Ryan C. Gordon 2013-07-15 22:35:40 UTC
tl;dr: we already fixed this, unintentionally.

Ok, here's what's going on with this bug.

I accidentally fixed this here...

    http://hg.libsdl.org/SDL/rev/7f22b9ba218f#l2.84

...where I changed OpenDevice to return -1 on failure instead of 0. You'll note that one I changed there was already returning -1, though, which the higher level code would have treated as success for being non-zero.

Compliments of the Steam Runtime SDK's debug builds, I could set a breakpoint on that failing pulse_prepare() to see where it gets called, and sure enough, it's on the stack above where that wrong return value gets used. It doesn't fail on my machine (but the first call to it fails correctly without killing the connection, so it's probably something they fixed in libasound), but let's pretend it still does fail epically.

Having failed, our ALSA code has cleaned itself up, including freeing/NULLing this->hidden, and returned an incorrect -1 instead of the zero it should have.

Amazingly, we get as far as spinning a thread, and that background thread is the first thing that calls into the ALSA code again: GetDeviceBuf. It then dereferences a NULL pointer and blows up.

Since we corrected the return value, the ALSA code reports the failure properly, and the SDL audio code cleans up and reports the OpenDevice failure to the app as it should, without crashing.

So we're good here!   :)

--ryan.