| Summary: | SDL audio crashes on multicore Windows XP machines | ||
|---|---|---|---|
| Product: | SDL | Reporter: | James Haley <haleyjd> |
| Component: | audio | Assignee: | Ryan C. Gordon <icculus> |
| Status: | RESOLVED ABANDONED | QA Contact: | Sam Lantinga <slouken> |
| Severity: | blocker | ||
| Priority: | P2 | CC: | fraggle+libsdl, sezeroz |
| Version: | HG 2.0 | Keywords: | target-2.0.0 |
| Hardware: | x86 | ||
| OS: | Windows (XP) | ||
| URL: | http://doomworld.com/eternity/ | ||
|
Description
James Haley
2008-09-22 17:05:15 UTC
Thanks for the info. I don't suppose you have a good test case? :) Bumping priority on a set of bugs. --ryan. This is a follow-up to verify that this bug is still occurring in SDL 1.2.14 with the latest SDL_mixer version under Windows 7 on a quad core machine. I recently built this machine and so this is the first opportunity I have had to capture this crash myself - all previous data was obtained from end users. The crash as I was able to capture it occurred in ntdll.dll at 0x773f8c39 as called from winmm.dll - evidently this bug is related to MIDI and not digital sound, and evidence from other users (Simon Howard of Chocolate Doom and Andrey Budko of PrBoom-Plus) now agrees with this. The code accessed a NULL pointer as a structure, and thus crashed trying to write to 0x00000014. I believe there are possible race conditions in the Win32 native MIDI code relating to the callback which is passed to MCI and is used to fill the buffer with MIDI events. This callback is dispatched asynchronously from deep inside the winmm system each time it wants more events, so any data it touches needs protection from concurrent modification. Just to clarify, because this bug seems to be poorly named; it doesn't seem to be a Windows-only thing. Under Linux I get lockups on multicore machines if the thread affinity workaround is not used. (In reply to comment #4) > Just to clarify, because this bug seems to be poorly named; it doesn't seem to > be a Windows-only thing. Under Linux I get lockups on multicore machines if > the thread affinity workaround is not used. Are you using native MIDI on Linux, or the Timidity codepath? --ryan. Timidity. (In reply to comment #6) > Timidity. Intriguing! Can someone get me a stacktrace? Windows, Linux, doesn't matter. It would be useful to know exactly where it crashes, and who called the crashing function. Thanks, --ryan. I've tried in the past to attach a debugger to get a stack trace, but each time, I was unable to get a meaningful trace - the stack just seemed to be junk. I'm not sure why. In the Linux case, the program just hard locks-up and stops responding to any input. I'm assuming that the Windows situation is the same. Anyway, right now I'm currently running Chocolate Doom on my EeePC with the affinity workaround disabled. The CPU is an Intel Atom with hyperthreading, and I managed to reproduce the bug on here before. I've set up the level so that two monsters are constantly shooting me (ie. playing sound effects), so hopefully this should trigger it. It has been a while since I last looked into this bug and reproduced it, so it's worth my re-confirming that it's still a problem. I have a stack dump, but since the Windows crash occurs inside one of the winmm threads that is related to the MIDI event pump callback, there's very little information to be obtained from it, thanks to Windows' closed-source nature. It looks like this: 773f8c39 < IP on offending thread is here 773df871 773f8c64 773f8b48 7432f651 (etc for dozens of lines) The first 4 call addresses are inside ntdll.dll - the majority of the remainder are in the winmm (aka MCI) system. The crash was an access violation at 0x00000014, and the instruction involved was moving the contents of an address to a register after having added 0x14 to the address - quite likely a structure field offset, but the structure pointer was unexpectedly NULL. All the addresses I could see on threads other than the app's main thread seemed to be inside system DLLs. It is notable that my application itself was stopped in the middle of the rendering process for the current frame, and was not performing any action to manipulate or change the state of the music playback. If you think it will do any good, I can try to get a better record of the crash, including the full state of all threads. I just have to wait for it to happen again after disabling the process affinity mask. It's been almost a year since the last update. Is this bug still an issue? If so, could you point me to source code of a project known to have these issues, and example files known to have this occur? Also, what are you using to trace the exception on Windows? MSVC debugger, GCD, SetUnhandledExceptionFilter(), Structured Exception Handling, or some method I don't know of? Do Windows and Linux crashes seem to be totally related (IE, the same exact files and circumstances cause a crash)? (In reply to comment #10) > It's been almost a year since the last update. > > Is this bug still an issue? If so, could you point me to source code of a > project known to have these issues, and example files known to have this occur? > > Also, what are you using to trace the exception on Windows? MSVC debugger, GCD, > SetUnhandledExceptionFilter(), Structured Exception Handling, or some method I > don't know of? > > Do Windows and Linux crashes seem to be totally related (IE, the same exact > files and circumstances cause a crash)? To the best of our knowledge the Windows crash is unresolved. Simon and Andrey Budko of the PrBoom-Plus project no longer believe that the Linux crash was related, and have stated that it seems to be resolved. The source code of my project is at http://mancubus.net/svn/hosted/eternity/trunk - you need one or more commercial or shareware DOOM IWAD files in order to run this program, however. They can be placed either in the program's working directory or into the individual corresponding folder for that game under the "base" folder. The source code for Chocolate Doom, another SDL-based DOOM port, can be found on Sourceforge - its project homepage is the first hit for a Google search. The exception which I caught 3 years ago was caught in Visual Studio 6. Our development has moved to VC++ 2008 since then (the program can also be compiled with a CMake system for Linux, but you probably won't find the bug there). To disable the process affinity mask workaround, you must edit base/system.cfg and set the affinity mask setting to 0 (it defaults to 1 due to the fact this problem destabilizes the program if left uncorrected). Ryan, can you take a look at this for the 2.0 release? (Sorry if you get a lot of copies of this email, we're touching dozens of bug reports right now.) Tagging a bunch of bugs as target-2.0.0, Priority 2. This means we're in the final stretch for an official SDL 2.0.0 release! These are the bugs we really want to fix before shipping if humanly possible. That being said, we don't promise to fix them because of this tag, we just want to make sure we don't forget to deal with them before we bless a final 2.0.0 release, and generally be organized about what we're aiming to ship. Hopefully you'll hear more about this bug soon. If you have more information (including "this got fixed at some point, nevermind"), we would love to have you come add more information to the bug report when you have a moment. Thanks! --ryan. Hello, and sorry if you're getting dozens of copies of this message by email. We are closing out bugs that appear to be abandoned in some form. This can happen for lots of reasons: we couldn't reproduce it, conversation faded out, the bug was noted as fixed in a comment but we forgot to mark it resolved, the report is good but the fix is impractical, we fixed it a long time ago without realizing there was an associated report, etc. Individually, any of these bugs might have a better resolution (such as WONTFIX or WORKSFORME or INVALID) but we've added a new resolution of ABANDONED to make this easily searchable and make it clear that it's not necessarily unreasonable to revive a given bug report. So if this bug is still a going concern and you feel it should still be open: please feel free to reopen it! But unless you respond, we'd like to consider these bugs closed, as many of them are several years old and overwhelming our ability to prioritize recent issues. (please note that hundred of bug reports were sorted through here, so we apologize for any human error. Just reopen the bug in that case!) Thanks, --ryan. |