We are currently migrating Bugzilla to GitHub issues.
Any changes made to the bug tracker now will be lost, so please do not post new bugs or make changes to them.
When we're done, all bug URLs will redirect to their equivalent location on the new bug tracker.

Bug 1296 - SDL_SetVideoMode crashes because of unaligned MOVAPS instruction
Summary: SDL_SetVideoMode crashes because of unaligned MOVAPS instruction
Status: RESOLVED FIXED
Alias: None
Product: SDL
Classification: Unclassified
Component: video (show other bugs)
Version: HG 2.0
Hardware: x86 Windows 7
: P2 major
Assignee: Sam Lantinga
QA Contact: Sam Lantinga
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2011-08-30 14:22 UTC by t.grundner
Modified: 2012-01-02 20:34 UTC (History)
1 user (show)

See Also:


Attachments

Note You need to log in before you can comment on or make changes to this bug.
Description t.grundner 2011-08-30 14:22:06 UTC
Two days ago I got myself a fresh HG1.3 copy and compiled it on my windows 7 box using the latest mingw/msys package.(20110802, gcc-4.5.2). Straightaway I wrote a little SDL test app that opens a 640x480x32 window. So far so good. I then added a function to my test app called SetPixel. This caused the app to crash with SIGSEGV. At first I thought something is wrong with my SetPixel function but the app crashed before this function got called at all. According to the stacktrace it crashed within the SDL_SetVideoMode call. I started up GDB and stepped through SDL_SetVideoMode and found the application crashes within SDL_FillRect4SSE when MOVAPS is executed to move four fill color values to one of the MMX 128 bit registers. Here is the instruction used:

 movaps -0x38(%ebp),%xmm0

In my case EBP is pointing to 0x28FDD0. This means MOVAPS is trying to move 16 bytes starting from 0x28FD98. This is not possible because for MOVAPS to work the memory address has to be a multiple of 16. The result is the before mentioned segmentation fault.

So I took a look at the source code and found the macro which creates the MOVAPS instruction. It is called SSE_BEGIN and is part of SDL_fillrect.c:

#ifdef _MSC_VER
#define SSE_BEGIN \
    __m128 c128; \
    c128.m128_u32[0] = color; \
    c128.m128_u32[1] = color; \
    c128.m128_u32[2] = color; \
    c128.m128_u32[3] = color;
#else
#define SSE_BEGIN \
    DECLARE_ALIGNED(Uint32, cccc[4], 16); \
    cccc[0] = color; \
    cccc[1] = color; \
    cccc[2] = color; \
    cccc[3] = color; \
    __m128 c128 = *(__m128 *)cccc;
#endif

Within SSE_BEGIN there is another macro called DECLARE_ALIGNED which ensures the four color values cccc[4] are 16 byte aligned:

#if defined(__GNUC__)
#define DECLARE_ALIGNED(t,v,a)  t __attribute__((aligned(a))) v
#elif defined(_MSC_VER)
#define DECLARE_ALIGNED(t,v,a)  __declspec(align(a)) t v
#else
#define DECLARE_ALIGNED(t,v,a)  t v
#endif

I spent some time wondering why this is not working and came to the following conclusion. I think it crashes because cccc[4] is allocated on the stack. I cannot think of a way how the compiler could ensure alignment for such values as it cannot know where EBP and ESP are pointing to at runtime. It will without a doubt work for global or static variables but then again those are not MT safe. IMHO a quick solution is to force the use of MOVUPS instead of MOVAPS.

If you don't mind I could try to build a patch.
Comment 1 t.grundner 2011-08-31 23:46:58 UTC
I did a bit of googling to find more about this problem and came across the following link:

http://stackoverflow.com/questions/841433/gcc-attribute-alignedx-explanation

There is some code showing off how alignment of values on the stack fails. I tried it and actually it worked. I guess things have changed since gcc 4.1.2. I disassembled the code and learned that the compiler is capable of aligning stack values by using a simple AND on the ESP register.

So I took a deeper look at the SDL_FillRect4SSE function but could not find any AND alignments of the stack. I played a bit with the source code and as I changed the following line from

    DECLARE_ALIGNED(Uint32, cccc[4], 16); \

to

    DECLARE_ALIGNED(Uint32, cccc[4], 32); \

the AND appeared:

   0x68178df0 <SDL_FillRect+0>:  push   %ebp
   0x68178df1 <SDL_FillRect+1>:  mov    %esp,%ebp

   0x68178df3 <SDL_FillRect+3>:  and    $0xffffffe0,%esp

   0x68178df6 <SDL_FillRect+6>:  push   %edi
   0x68178df7 <SDL_FillRect+7>:  push   %esi
   0x68178df8 <SDL_FillRect+8>:  push   %ebx
   0x68178df9 <SDL_FillRect+9>:  sub    $0xe4,%esp
   0x68178dff <SDL_FillRect+15>: mov    0x8(%ebp),%esi
   0x68178e02 <SDL_FillRect+18>: mov    0xc(%ebp),%eax
   0x68178e05 <SDL_FillRect+21>: mov    0x10(%ebp),%ebx
=> 0x68178e08 <SDL_FillRect+24>: test   %esi,%esi


On the one hand I am quite happy that my app stopped crashing but on the other hand I am still wondering what is going on. Is GCC making some kind of assumption that the stack is always 16 byte aligned on my system?
Comment 2 t.grundner 2011-09-01 03:59:17 UTC
I figured out what is going on. GCC 4.5.2 assumes the stack is 16 byte aligned by default. Therefore there are no AND alignment corrections necessary if we wish to align a stack variable to a 16 byte boundary. That is bad if your OS ABI is not 16 byte aligned. Windows 32 bit stacks are 4 byte aligned. This results in the above mentioned SIGSEGV. This is also no problem if I compile both SDL.dll and my app with MingW because MinGW/GCC inserts a

    	andl	$-16, %esp

instruction right in the beginning of the main function. So at least the stack of the thread calling the main function is 16 byte aligned. But as soon as I start to use the SDL.dll from an application not compiled by MinGW there is no ANDL safing my app.

However there is a GCC option that can change the default stack alignment:

        -mpreferred-stack-boundary=num

Setting num=2 assumes a the stack is aligned to a 4 byte boundary. This results in GCC inserting the necessary 

    	andl	$-16, %esp

into SDL_FillRect. Rebuilding SDL with

       ./configure "CFLAGS=-mpreferred-stack-boundary=2 -g -O3"

solved the problem.

IMHO this should also be a problem on Solaris.

The following links contain further information:

http://gcc.gnu.org/onlinedocs/gcc-4.5.2/gcc/i386-and-x86_002d64-Options.html#i386-and-x86_002d64-Options

http://www.agner.org/optimize/calling_conventions.pdf
Comment 3 Sam Lantinga 2011-12-29 02:37:15 UTC
Wow, good work!

Can you try out the latest snapshot?  I think I've added a fix:
http://hg.libsdl.org/SDL/rev/66f48d6bf735
Comment 4 t.grundner 2012-01-02 14:01:33 UTC
Thanks for the "Wow". I am glad I could help. Actually it was fun doing some low level assembler stuff again.

The latest snapshot fixed the problem. Thanks a lot!!!
Comment 5 Sam Lantinga 2012-01-02 20:34:42 UTC
Great, thanks!