Bug 1128 - SDL Mutex Implementation Subpar
Status: RESOLVED FIXED
Alias: None
Product: SDL
Classification: Unclassified
Component: thread
Version: 2.0.0
Hardware: x86 Windows (All)
Importance: P2 enhancement
Assignee: Sam Lantinga
QA Contact: Sam Lantinga
Reported: 2011-02-16 22:58 UTC by Patrick Baggett
Modified: 2011-02-17 09:27 UTC

Attachments
Use CriticalSection API rather than Mutex API (2.39 KB, patch)
2011-02-17 02:39 UTC, Patrick Baggett

Description Patrick Baggett 2011-02-16 22:58:33 UTC
This enhancement applies to both x86 and x64 Windows.

The SDL implementation of mutexes uses the Win32 API interprocess synchronization primitive called a "Mutex". This implementation is subpar because a Win32 Mutex has much higher overhead than an intraprocess lock. The exact technical details are below, but my tests have shown that under reasonably high contention (10 threads on 4 physical cores), the Win32 Mutex API has 13x higher overhead than the Win32 CriticalSection API.

If this enhancement is accepted, I will write a patch to implement SDL mutexes using the critical section API, which should dramatically reduce overhead and improve scalability.


-- Tech details --
Normally, Win32 Mutexes are used to synchronize separate processes across process boundaries. Locking or unlocking one requires a user->kernel space transition, even in the uncontended case on a single-CPU machine. Win32 CriticalSection objects can only be used within a single process's virtual address space, so locking one does not require a user->kernel space transition in the uncontended case, and a contended lock may additionally spin for a short while before going into a kernel wait. This short spin allows a thread to obtain the lock if it is released shortly after the thread starts spinning, in effect bypassing the user->kernel space transition, which costs more than the spinning itself.
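
To make the difference concrete, here is a minimal sketch of a mutex wrapper built on the CriticalSection API. It is not the attached patch and not SDL's actual code: the names demo_mutex, demo_mutex_create, demo_mutex_lock, demo_mutex_unlock and demo_mutex_destroy are hypothetical, and the spin count of 2000 is an assumed value chosen only for illustration. The non-Windows branch is a plain pthread fallback added so the sketch compiles elsewhere; unlike a CriticalSection, it is not recursive.

```c
#include <stdlib.h>

#ifdef _WIN32
#include <windows.h>
/* Intraprocess lock: no kernel transition in the uncontended case. */
typedef struct demo_mutex { CRITICAL_SECTION cs; } demo_mutex;
#else
#include <pthread.h>
/* Fallback so the sketch builds off Windows (assumption, not part of the bug). */
typedef struct demo_mutex { pthread_mutex_t m; } demo_mutex;
#endif

/* Allocate and initialize a lock; returns NULL on failure. */
demo_mutex *demo_mutex_create(void)
{
    demo_mutex *mtx = (demo_mutex *)malloc(sizeof(*mtx));
    if (mtx == NULL) {
        return NULL;
    }
#ifdef _WIN32
    /* Spin briefly under contention before entering a kernel wait.
       The value 2000 is an assumed, illustrative spin count. */
    if (!InitializeCriticalSectionAndSpinCount(&mtx->cs, 2000)) {
        free(mtx);
        return NULL;
    }
#else
    if (pthread_mutex_init(&mtx->m, NULL) != 0) {
        free(mtx);
        return NULL;
    }
#endif
    return mtx;
}

/* Acquire the lock; returns 0 on success, -1 on error. */
int demo_mutex_lock(demo_mutex *mtx)
{
    if (mtx == NULL) {
        return -1;
    }
#ifdef _WIN32
    EnterCriticalSection(&mtx->cs);
    return 0;
#else
    return pthread_mutex_lock(&mtx->m) == 0 ? 0 : -1;
#endif
}

/* Release the lock; returns 0 on success, -1 on error. */
int demo_mutex_unlock(demo_mutex *mtx)
{
    if (mtx == NULL) {
        return -1;
    }
#ifdef _WIN32
    LeaveCriticalSection(&mtx->cs);
    return 0;
#else
    return pthread_mutex_unlock(&mtx->m) == 0 ? 0 : -1;
#endif
}

/* Destroy the lock and free its memory. */
void demo_mutex_destroy(demo_mutex *mtx)
{
    if (mtx == NULL) {
        return;
    }
#ifdef _WIN32
    DeleteCriticalSection(&mtx->cs);
#else
    pthread_mutex_destroy(&mtx->m);
#endif
    free(mtx);
}
```

A caller would pair demo_mutex_create with demo_mutex_destroy and wrap each critical region in demo_mutex_lock / demo_mutex_unlock. On Windows, a Mutex-based version of the same wrapper would instead go through CreateMutex, WaitForSingleObject and ReleaseMutex, each lock and unlock crossing into the kernel.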
Comment 1 Sam Lantinga 2011-02-17 01:04:41 UTC
Sure, thanks!

Do you give me permission to release your code with SDL 1.3 and future
versions of SDL under both the LGPL and a closed-source commercial
license?
Comment 2 Patrick Baggett 2011-02-17 01:11:45 UTC
(In reply to comment #1)
> Sure, thanks!
> 
> Do you give me permission to release your code with SDL 1.3 and future
> versions of SDL under both the LGPL and a closed-source commercial
> license?

Yep. I'd like to see if there is any other low-hanging fruit (read: easy bugs) that I can fix as well. I'm subscribing to the SDL development mailing list right now. Patch in progress.
Comment 3 Sam Lantinga 2011-02-17 02:29:41 UTC
Thanks, I appreciate it. :)
Comment 4 Patrick Baggett 2011-02-17 02:39:31 UTC
Created attachment 578
Use CriticalSection API rather than Mutex API