| Summary: | Classic open source engine Exult (Ultima VII) segfaults on exit with the KMSDRM backend | ||
|---|---|---|---|
| Product: | SDL | Reporter: | Manuel Alfayate <redwindwanderer2> |
| Component: | video | Assignee: | Sam Lantinga <slouken> |
| Status: | RESOLVED FIXED | QA Contact: | Sam Lantinga <slouken> |
| Severity: | normal | ||
| Priority: | P2 | CC: | icculus, sylvain.becker |
| Version: | HG 2.1 | Keywords: | target-2.0.10 |
| Hardware: | All | ||
| OS: | Linux | ||
| Attachments: |
Valgrind log for Exult on SDL2 on KMSDRM
test case |
||
|
Description
Manuel Alfayate
2019-03-01 17:53:44 UTC
Most of the log is video driver and can be ignore (really ?).
The part which is very likely the issue is:
==28799==
==28799== Invalid read of size 8
==28799== at 0x4A52FFA: KMSDRM_WarpMouseGlobal (SDL_kmsdrmmouse.c:425)
==28799== by 0x4A52FB4: KMSDRM_WarpMouse (SDL_kmsdrmmouse.c:415)
==28799== by 0x4888A77: SDL_WarpMouseInWindow_REAL (SDL_mouse.c:695)
==28799== by 0x4941E82: SDL_RestoreMousePosition (SDL_video.c:1177)
==28799== by 0x4942246: SDL_UpdateFullscreenMode (SDL_video.c:1344)
==28799== by 0x4944A5E: SDL_HideWindow_REAL (SDL_video.c:2174)
==28799== by 0x494688C: SDL_DestroyWindow_REAL (SDL_video.c:2696)
==28799== by 0x4946C3A: SDL_VideoQuit_REAL (SDL_video.c:2821)
==28799== by 0x4862166: SDL_QuitSubSystem_REAL (SDL.c:318)
==28799== by 0x48622A4: SDL_Quit_REAL (SDL.c:378)
==28799== by 0x487A5C0: SDL_Quit (SDL_dynapi_procs.h:89)
==28799== by 0x52B142B: __run_exit_handlers (exit.c:108)
==28799== Address 0x83417c8 is 8 bytes inside a block of size 16 free'd
==28799== at 0x483897B: free (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==28799== by 0x48DA2B0: SDL_free_REAL (SDL_malloc.c:5372)
==28799== by 0x4A52F8B: KMSDRM_FreeCursor (SDL_kmsdrmmouse.c:406)
==28799== by 0x4888858: SDL_MouseQuit (SDL_mouse.c:612)
==28799== by 0x4946C0E: SDL_VideoQuit_REAL (SDL_video.c:2813)
==28799== by 0x4862166: SDL_QuitSubSystem_REAL (SDL.c:318)
==28799== by 0x48622A4: SDL_Quit_REAL (SDL.c:378)
==28799== by 0x487A5C0: SDL_Quit (SDL_dynapi_procs.h:89)
==28799== by 0x52B142B: __run_exit_handlers (exit.c:108)
==28799== by 0x52B1559: exit (exit.c:139)
==28799== by 0x52910A1: (below main) (libc-start.c:342)
What happens:
First SDL_MouseQuit() which free mouse
Then SDL_VideoQuit() which call KMSDRM_WarpMouseGlobal and invalid read.
KMSDRM_WarpMouseGlobal, which does:
if (mouse != NULL && mouse->cur_cursor != NULL && mouse->cur_cursor->driverdata != NULL) {
mouse->cur_cursor is freed, but isn't cleared.
Patch: clear "mouse->cur_cursor" in SDL_QuitMouse()
--- a/src/events/SDL_mouse.c Sat Feb 23 09:36:56 2019 +0100
+++ b/src/events/SDL_mouse.c Fri Mar 08 18:18:59 2019 +0100
@@ -607,6 +607,7 @@
cursor = next;
}
mouse->cursors = NULL;
+ mouse->cur_cursor = NULL;
if (mouse->def_cursor && mouse->FreeCursor) {
mouse->FreeCursor(mouse->def_cursor);
Let me know if this works
(In reply to Sylvain from comment #1) > Most of the log is video driver and can be ignore (really ?). > > The part which is very likely the issue is: > > ==28799== > ==28799== Invalid read of size 8 > ==28799== at 0x4A52FFA: KMSDRM_WarpMouseGlobal (SDL_kmsdrmmouse.c:425) > ==28799== by 0x4A52FB4: KMSDRM_WarpMouse (SDL_kmsdrmmouse.c:415) > ==28799== by 0x4888A77: SDL_WarpMouseInWindow_REAL (SDL_mouse.c:695) > ==28799== by 0x4941E82: SDL_RestoreMousePosition (SDL_video.c:1177) > ==28799== by 0x4942246: SDL_UpdateFullscreenMode (SDL_video.c:1344) > ==28799== by 0x4944A5E: SDL_HideWindow_REAL (SDL_video.c:2174) > ==28799== by 0x494688C: SDL_DestroyWindow_REAL (SDL_video.c:2696) > ==28799== by 0x4946C3A: SDL_VideoQuit_REAL (SDL_video.c:2821) > ==28799== by 0x4862166: SDL_QuitSubSystem_REAL (SDL.c:318) > ==28799== by 0x48622A4: SDL_Quit_REAL (SDL.c:378) > ==28799== by 0x487A5C0: SDL_Quit (SDL_dynapi_procs.h:89) > ==28799== by 0x52B142B: __run_exit_handlers (exit.c:108) > ==28799== Address 0x83417c8 is 8 bytes inside a block of size 16 free'd > ==28799== at 0x483897B: free (in > /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so) > ==28799== by 0x48DA2B0: SDL_free_REAL (SDL_malloc.c:5372) > ==28799== by 0x4A52F8B: KMSDRM_FreeCursor (SDL_kmsdrmmouse.c:406) > ==28799== by 0x4888858: SDL_MouseQuit (SDL_mouse.c:612) > ==28799== by 0x4946C0E: SDL_VideoQuit_REAL (SDL_video.c:2813) > ==28799== by 0x4862166: SDL_QuitSubSystem_REAL (SDL.c:318) > ==28799== by 0x48622A4: SDL_Quit_REAL (SDL.c:378) > ==28799== by 0x487A5C0: SDL_Quit (SDL_dynapi_procs.h:89) > ==28799== by 0x52B142B: __run_exit_handlers (exit.c:108) > ==28799== by 0x52B1559: exit (exit.c:139) > ==28799== by 0x52910A1: (below main) (libc-start.c:342) > > What happens: > > First SDL_MouseQuit() which free mouse > Then SDL_VideoQuit() which call KMSDRM_WarpMouseGlobal and invalid read. > > KMSDRM_WarpMouseGlobal, which does: > if (mouse != NULL && mouse->cur_cursor != NULL && > mouse->cur_cursor->driverdata != NULL) { > > mouse->cur_cursor is freed, but isn't cleared. > > Patch: clear "mouse->cur_cursor" in SDL_QuitMouse() > > --- a/src/events/SDL_mouse.c Sat Feb 23 09:36:56 2019 +0100 > +++ b/src/events/SDL_mouse.c Fri Mar 08 18:18:59 2019 +0100 > @@ -607,6 +607,7 @@ > cursor = next; > } > mouse->cursors = NULL; > + mouse->cur_cursor = NULL; > > if (mouse->def_cursor && mouse->FreeCursor) { > mouse->FreeCursor(mouse->def_cursor); > > > Let me know if this works Thanks for your input on this, really. Having the KMSDRM driver in good shape is very important, I believe! :) This worked, yes. No more crashes in KMSDRM_WarpMouseGlobal()! However, there are other crashes. This one seems impossible to debug for me, too, so I would be very thankful if you gave me ideas on this one. It's a segfault on KMSDRM_VideoQuit(), when this call is done: KMSDRM_gbm_device_destroy(vdata->gbm); This seems to be the typical case where I am destroying something already destroyed, but the ONLY other place where vdata->gbm is destroyed is in KMSDRM_VideoInit(), on the "cleaup:" label. There was a missing return before the "cleanup:" label, so I added "return ret;" after the "KMSDRM_InitMouse(_this);" call there. But I am STILL getting the segfault on KMSDRM_VideoQuit(): [code] Thread 1 "exult" received signal SIGSEGV, Segmentation fault. 0x00007ffff71854a8 in malloc_consolidate (av=av@entry=0x7ffff72d9c40 <main_arena>) at malloc.c:4442 4442 malloc.c: No such file or directory. #0 0x00007ffff71854a8 in malloc_consolidate (av=av@entry=0x7ffff72d9c40 <main_arena>) at malloc.c:4442 #1 0x00007ffff718758c in _int_free (av=0x7ffff72d9c40 <main_arena>, p=0x555555bcf8d0, have_lock=<optimized out>) at malloc.c:4348 #2 0x00007ffff355e8c8 in ?? () from /usr/lib/x86_64-linux-gnu/dri/i965_dri.so #3 0x00007ffff3593d19 in ?? () from /usr/lib/x86_64-linux-gnu/dri/i965_dri.so #4 0x00007ffff3525bdf in ?? () from /usr/lib/x86_64-linux-gnu/dri/i965_dri.so #5 0x00007ffff3db5740 in ?? () from /usr/lib/x86_64-linux-gnu/libgbm.so.1 #6 0x00007ffff7f2cb42 in KMSDRM_VideoQuit (_this=0x55555583eb60) at /home/manuel/src/SDL2-2.0.9/src/video/kmsdrm/SDL_kmsdrmvideo.c:536 #7 0x00007ffff7e1ec74 in SDL_VideoQuit_REAL () at /home/manuel/src/SDL2-2.0.9/src/video/SDL_video.c:2823 #8 0x00007ffff7d3a167 in SDL_QuitSubSystem_REAL (flags=62001) at /home/manuel/src/SDL2-2.0.9/src/SDL.c:318 #9 0x00007ffff7d3a2a5 in SDL_Quit_REAL () at /home/manuel/src/SDL2-2.0.9/src/SDL.c:378 #10 0x00007ffff7d525c1 in SDL_Quit () at /home/manuel/src/SDL2-2.0.9/src/dynapi/SDL_dynapi_procs.h:89 #11 0x00007ffff713942c in __run_exit_handlers (status=0, listp=0x7ffff72d9718 <__exit_funcs>, run_list_atexit=run_list_atexit@entry=true, run_dtors=run_dtors@entry=true) at exit.c:108 #12 0x00007ffff713955a in __GI_exit (status=<optimized out>) at exit.c:139 #13 0x00007ffff71190a2 in __libc_start_main (main=0x5555555a6080 <main>, argc=3, argv=0x7fffffffe3d8, init=<optimized out>, fini=<optimized out>, rtld_fini=<optimized out>, stack_end=0x7fffffffe3c8) at ../csu/libc-start.c:342 #14 0x00005555555a8eea in _start () frame 6 #6 0x00007ffff7f2cb42 in KMSDRM_VideoQuit (_this=0x55555583eb60) at /home/manuel/src/SDL2-2.0.9/src/video/kmsdrm/SDL_kmsdrmvideo.c:536 536 KMSDRM_gbm_device_destroy(vdata->gbm); print vdata->gbm $1 = (struct gbm_device *) 0x55555583b5e0 [/code] So this is REALLY the first time KMSDRM_gbm_device_destroy() is called, because of course KMSDRM_VideoInit() returns before the "cleanup:" label. Any ideas on what could be going on here? Just wondering, how do you try kmsdrm ? need a RPY ? Really not sure but some guess in KMSDRM_VideoQuit: - maybe SDL_GL_UnloadLibrary() should be the last thing to do - if (vdata->drm_fd > 0 drm_fp 0 should also be checked no ? it's a valid file descriptor. Only -1 is invalid ? (In KMSDRM_VideoQuit, that means save_crtc isn't restored) - also, it seems to be call by through an at_exit(), maybe it's not the best place and it's always confusing how it gets called. Maybe it should be done by calling SDL_Quit before the main returns. - I would also run Valgrind to see what happens. Let me know if this works Added the missing "return ret;" in VideoInit(); I'm surprised it could even start without this. https://hg.libsdl.org/SDL/rev/8e52484226af SDL_MouseQuit(): clear mouse->cur_cursor in https://hg.libsdl.org/SDL/rev/0010b9a064d3 (In reply to Sylvain from comment #3) > Just wondering, how do you try kmsdrm ? need a RPY ? > > > Really not sure but some guess in KMSDRM_VideoQuit: > > - maybe SDL_GL_UnloadLibrary() should be the last thing to do > > - if (vdata->drm_fd > 0 > > drm_fp 0 should also be checked no ? it's a valid file descriptor. Only -1 > is invalid ? > > (In KMSDRM_VideoQuit, that means save_crtc isn't restored) > > > - also, it seems to be call by through an at_exit(), maybe it's not the best > place and it's always confusing how it gets called. Maybe it should be done > by calling SDL_Quit before the main returns. > > - I would also run Valgrind to see what happens. > > > Let me know if this works Thanks a lot for these ideas, Sylvain! I tried them and things worked (somehow!): In KMSDRM_VideoQuit(): -Putting the SDL_GL_UnloadLibrary() as the last thing to do did not make a difference regarding segfaults, but it seems a good idea to me, too, if you want to move it. -I'm not sure what you mean about the if (vdata->drm_fd > 0): in KMSDRM_VideoQuit(), it's "if (vdata->drm_fd => 0)", because the the file descriptor is an integer that, as far as I know, could be zero too. Anyway, nothing to change there, as far as I can understand. NOW, what did things work, NO MORE SEGFAULTS ON QUIT: I moved SDL_Quit() from the AtExit to the end of the main() function on Exult. And voila!! No more segfaults, no matter how hard I try :D You don't need a Raspberry Pi to test the SDL2 KMSDRM video driver: that's precisely the good thing about KMSDRM, it works on any X86 workstation and on a Raspberry Pi. Just build SDL2 with: ./configure --prefix=/usr --enable-video-kmsdrm --enable-video-opengles make sudo make install (you should end up with SDL2 libs in /usr/lib) Beware you may need to manually remove your distro's libSDL2 from /usr/lib/x86_64-linux-gnu, or your programs will keep trying to use that instead of your freshly built KMSDRM-enabled version. And you need to be in a TTY console ouside the Xorg server to use it! CTRL+ALT+F2 should give you one. If you run a KMSDRM program under GDB and it segfaults, GDB will be waiting for your input while the program is frozen in fullscreen, so debug via SSH if you try this. So, two questions: 1) What do you think about the solution of moving SDL_Quit() to the end of main()? If it's a good idea (and it seems to be) then why doesn't it crash when using the SDL2 X11 backend? 2) Could you *please* give it a try under Valgrind and give me ideas and opinions? Feel free to make any fixes, of course! (Now that you know that you can test this on your very computer, no Pi needed!) I gave a try and got a non initialized memory warning for ioctcl. Fixed at https://hg.libsdl.org/SDL/rev/141e8355078d Otherwise: - I can run a KMSDRM helloworld or my apps. But I have no crash in the end. Using atexit(SDL_Quit) or not. Not sure what is exactly the difference when it's called from atexit. - For me, mouse and keyboards doesn't work. But joysticks does. - Exult didn't work. It starts but remain white screen. About KMSDRM_VideoQuit() and "if(vdata->drm_fd > 0" - Sorry I try to explain: drm_fd -1 is invalid and drm_fd >= 0 are correct fd so checking "vdata->drm_fd > 0" is at first strange: you want a valid fd, but not the one which has value 0. (maybe it would make sense) (In reply to Sylvain from comment #7) > I gave a try and got a non initialized memory warning for ioctcl. Fixed at > https://hg.libsdl.org/SDL/rev/141e8355078d > > Otherwise: > - I can run a KMSDRM helloworld or my apps. But I have no crash in the end. > Using atexit(SDL_Quit) or not. > Not sure what is exactly the difference when it's called from atexit. > > - For me, mouse and keyboards doesn't work. But joysticks does. > > - Exult didn't work. It starts but remain white screen. > > About KMSDRM_VideoQuit() and "if(vdata->drm_fd > 0" > - Sorry I try to explain: > drm_fd -1 is invalid and drm_fd >= 0 are correct fd > so checking "vdata->drm_fd > 0" is at first strange: you want a valid fd, > but not the one which has value 0. (maybe it would make sense) Thanks A LOT about that! :) Exult does work here. White screen? How strange... Never saw that! What video chip does your test computer have? I am on Intel here ("Intel Corporation HD Graphics 620" according to lspci) and it works fine. It also works on the VC4 (Raspberry Pi). Do other SDL2 games work in KMSDRM mode for you? Like SDLPop (https://github.com/NagyD/SDLPoP), for example. About the keyboard and mouse not working: keep in mind this X-less solution uses udev for input reading. So put yourself on the input group, or try this: https://docs.libretro.com/guides/input-joypad-drivers/#setting-up-udev-permissions That should make mouse and keyboard work. Now, about the (vdata->drm_fd > 0) comparision, sorry, now I understand. And yes, it makes sense to do vdata->drm_fd >= 0) instead. Since you can do commits directly, feel free to make this one :) I haven't yet checked more precisely on kmsdrm, but on x11 for instance, it displays a splash screen asking to provide the game static data. Not sure how to get them. btw, you're the author of the kmsdrm code ? (from bug 3690), so you're more aware of the modification that should me done on this side ? Can you provide test patch ? (In reply to Sylvain from comment #9) > I haven't yet checked more precisely on kmsdrm, but on x11 for instance, it > displays a splash screen asking to provide the game static data. Not sure > how to get them. > > btw, you're the author of the kmsdrm code ? (from bug 3690), so you're more > aware of the modification that should me done on this side ? Can you provide > test patch ? I'm not the only author of this: I just polished the work done by Sigmaris (Hugh Cole-Baker), who did the bulk of the driver, so there are things outside my understanding. As for the change, I just meant that you change: if(vdata->drm_fd > 0 && vdata->saved_conn_id > 0) { for: if(vdata->drm_fd >= 0 && vdata->saved_conn_id > 0) { in KMSDRM_VideoQuit(). That's according to your previous comments, where I think you are right about the right comparision being >=. Ok. So I fixed this in https://hg.libsdl.org/SDL/rev/958835bd12c2 at two places. Please double-check it! (In reply to Sylvain from comment #11) > Ok. So I fixed this in https://hg.libsdl.org/SDL/rev/958835bd12c2 > at two places. Please double-check it! Thanks! It seems good! Did you get Exult to work in KMSDRM mode? Other games like SDLPop work for you in KMSDRM mode? Thanks to your link, I got the mouse and keyboard working. I tried my apps and all is correct. I also try to use atexit(SDL_Quit). And it also works fine. -> Please double if you still have the issue ? run with valgrind gdb. I can't run exult because it requires the mods ? / eg games datas, which I haven't. I think it's the same with SDLPop. (But do you have the atexit issue with it?) @Sylvain: I don't have any problems with SDLPop, since it does not use atexit(), but it calls SDL_Quit() properly instead. I just wanted to know if SDLPop worked for you in KMSDRM mode, since it works great here. I have uploaded the datas for both SDLPop and Exult, so you can try them: https://www.dropbox.com/s/bfr8wy005lck42b/sdlpop_datas.zip?dl=0 https://www.dropbox.com/s/mvo6j3w0filsixv/exult_datas.zip?dl=0 In Exult, with latest SDL2 revision, I still have the segfault on quit but only if I let it use "atexit()" to call SDL_Quit(). If I move SDL_Quit() to the end of the main function (in exult.cc), there are no problems at all. I have ran exult (with the original atexit(SDL_Quit), which causes the segfault on quit) with both valgrind and gdb: here are the results. file:///home/manuel/ultima7/exult_valgrind_log002.txt file:///home/manuel/ultima7/exult_gdb_log002.txt What could be causing this, is beyond my understanding. Thanks, Can you upload the valgrind files ? (In reply to Sylvain from comment #15) > Thanks, > Can you upload the valgrind files ? What valgrind files? I copied all valgrind output to this file: file:///home/manuel/ultima7/exult_valgrind_log002.txt (In reply to Manuel Alfayate from comment #16) > (In reply to Sylvain from comment #15) > > Thanks, > > Can you upload the valgrind files ? > > What valgrind files? I copied all valgrind output to this file: > file:///home/manuel/ultima7/exult_valgrind_log002.txt Oh, sorry!!! How stupid on my part... Here they are! https://www.dropbox.com/s/ges504rcz24vwiv/exult_valgrind_log002.txt?dl=0 https://www.dropbox.com/s/5twgfltscfjwriq/exult_gdb_log002.txt?dl=0 Sorry! :D Thanks, I got exult running. It also crashes at the end. But I can't run gdb. As you said, I should connect with ssh and I have nothing setup like that. Last message is "SDL restoring keyboard". After putting some printf, it seems to happens in: src/core/linux/SDL_evdev_kbd.c in the "raise(sig);" and if I comment this is even after. Maybe there are threading issues, and this is not related. Can you try to add an "atexit(SDL_Quit)" into SDLPop and see if it crashes the same way ? (In reply to Sylvain from comment #18) > Thanks, I got exult running. It also crashes at the end. But I can't run > gdb. As you said, I should connect with ssh and I have nothing setup like > that. > > Last message is "SDL restoring keyboard". > After putting some printf, it seems to happens in: > src/core/linux/SDL_evdev_kbd.c > in the "raise(sig);" > and if I comment this is even after. Maybe there are threading issues, and > this is not related. > > > Can you try to add an "atexit(SDL_Quit)" into SDLPop and see if it crashes > the same way ? I have tried moving the SDL_Quit() call in SDLPop to atexit(SDL_Main), and IT DOES CRASH IN THE SAME WAY. So... This is what was going on. Confirmed. But why? and why ONLY on KMSDRM and not on X11? (In reply to Manuel Alfayate from comment #19) > (In reply to Sylvain from comment #18) > > Thanks, I got exult running. It also crashes at the end. But I can't run > > gdb. As you said, I should connect with ssh and I have nothing setup like > > that. > > > > Last message is "SDL restoring keyboard". > > After putting some printf, it seems to happens in: > > src/core/linux/SDL_evdev_kbd.c > > in the "raise(sig);" > > and if I comment this is even after. Maybe there are threading issues, and > > this is not related. > > > > > > Can you try to add an "atexit(SDL_Quit)" into SDLPop and see if it crashes > > the same way ? > > I have tried moving the SDL_Quit() call in SDLPop to atexit(SDL_Main), and > IT DOES CRASH IN THE SAME WAY. > So... This is what was going on. Confirmed. But why? and why ONLY on KMSDRM > and not on X11? Also, the "restoring keyboard" message is not related to the problem. It's what modern SDL2 does when it segfaults (if it wouldn't do it, keyboard in X-less mode would be blocked after any segfault, so it's a good thing). Created attachment 3698 [details]
test case
Here's a very small testcase. I'm not sure of what happens.
but to reproduce the issue, it seems to need:
1) a user atexit before CreateWindow, or only before SDL_init.
2) that the user atexit does the SDL_DestroyWindow() (via SDL_Quit), but not the SDL_DestroyRenderer()
because 1), maybe it conflicts with the other atexit: kbd_cleanup_atexit
in src/core/linux/SDL_evdev_kbd.c which is probably installed in the CreateWindow
when calling: CreateWindow -> KMSDRM_VideoInit -> SDL_EVDEV_Init > SDL_EVDEV_kbd_init
I looked again, and it doesn't seem to conflict with kbd_cleanup_atexit. (If I comment it out, it still crash). I think it's more like: atexit(SDL_Quit) is called before/after shared libraries destructor, whether it's called before/after those shared libaries are loaded. If so, solution would just to call atexit after SDL_CreateWindow (maybe even after SDL_CreateRenderer). Maybe you can try to move the atexit after SDL_CreateWindow/Renderer in exult and sdlpop ? Valgrind log is: my_at_exist() is doing an invalid write, because egl context was freed first, by exit/__run_exit_handlers (libEGL_mesage). And it was allocated first, KMSDRM_GLES_LoadLibrar / SDL_GL_LoadLibrary INFO: my_at_exit ==7337== Invalid write of size 8 ==7337== at 0x12520E05: ??? (in /usr/lib/x86_64-linux-gnu/libEGL_mesa.so.0.0.0) ==7337== by 0x12521524: eglMakeCurrent (in /usr/lib/x86_64-linux-gnu/libEGL_mesa.so.0.0.0) ==7337== by 0x52AFA1B: ??? (in /usr/lib/x86_64-linux-gnu/libEGL.so.1.1.0) ==7337== by 0x49737BF: SDL_EGL_MakeCurrent (SDL_egl.c:800) ==7337== by 0x4A7C1F0: KMSDRM_GLES_MakeCurrent (SDL_kmsdrmopengles.c:185) ==7337== by 0x4987A63: SDL_GL_MakeCurrent_REAL (SDL_video.c:3509) ==7337== by 0x4988FB7: SDL_GL_DeleteContext_REAL (SDL_video.c:3604) ==7337== by 0x48D28BE: GL_DestroyRenderer (SDL_render_gl.c:1452) ==7337== by 0x48C93FF: SDL_DestroyRenderer_REAL (SDL_render.c:3216) ==7337== by 0x4892F83: SDL_DestroyRenderer (SDL_dynapi_procs.h:379) ==7337== by 0x4011E8: my_at_exit (main_bug_kmsdrm_2.c:12) ==7337== by 0x531A42B: __run_exit_handlers (exit.c:108) ==7337== Address 0x122fc700 is 32 bytes inside a block of size 48 free'd ==7337== at 0x483897B: free (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so) ==7337== by 0x1252B3F1: ??? (in /usr/lib/x86_64-linux-gnu/libEGL_mesa.so.0.0.0) ==7337== by 0x1252C8A3: ??? (in /usr/lib/x86_64-linux-gnu/libEGL_mesa.so.0.0.0) ==7337== by 0x531A42B: __run_exit_handlers (exit.c:108) ==7337== by 0x531A559: exit (exit.c:139) ==7337== by 0x52FA0A1: (below main) (libc-start.c:342) ==7337== Block was alloc'd at ==7337== at 0x4839775: calloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so) ==7337== by 0x1252B41F: ??? (in /usr/lib/x86_64-linux-gnu/libEGL_mesa.so.0.0.0) ==7337== by 0x12520DF8: ??? (in /usr/lib/x86_64-linux-gnu/libEGL_mesa.so.0.0.0) ==7337== by 0x1252113E: eglGetProcAddress (in /usr/lib/x86_64-linux-gnu/libEGL_mesa.so.0.0.0) ==7337== by 0x52B33CB: ??? (in /usr/lib/x86_64-linux-gnu/libEGL.so.1.1.0) ==7337== by 0x52B39D9: ??? (in /usr/lib/x86_64-linux-gnu/libEGL.so.1.1.0) ==7337== by 0x64990C6: __pthread_once_slow (pthread_once.c:116) ==7337== by 0x52B3B8E: ??? (in /usr/lib/x86_64-linux-gnu/libEGL.so.1.1.0) ==7337== by 0x52AF7D0: ??? (in /usr/lib/x86_64-linux-gnu/libEGL.so.1.1.0) ==7337== by 0x4972691: SDL_EGL_LoadLibrary (SDL_egl.c:421) ==7337== by 0x4A7BBA0: KMSDRM_GLES_LoadLibrary (SDL_kmsdrmopengles.c:40) ==7337== by 0x4981D8D: SDL_GL_LoadLibrary_REAL (SDL_video.c:2867) @Sylvain: In both SDLPop and Exult: I have moved atexit(SDL_Quit) after SDL_CreateWindow() and it still segfaults. BUT moving it after SDL_CreateRenderer(), it does NOT segfault! Does it help to solve the puzzle on what's going on here? I am not sure how it should be fixed ... - ask exult/sdlpop to move their atexit but it isn't a fix. - maybe there is an API to see if mesa has already called its atexit handlers .. or to see if egl drivers are already terminated ... See: https://gitlab.freedesktop.org/mesa/mesa/blob/master/src/egl/main/eglglobals.c https://gitlab.freedesktop.org/mesa/mesa/blob/master/src/egl/main/egldisplay.c#L184 https://gitlab.freedesktop.org/mesa/mesa/blob/master/src/egl/main/egldevice.c#L53 https://gitlab.freedesktop.org/mesa/mesa/blob/master/src/egl/main/egldriver.c#L121 - or maybe could do some workaround and add immediatly an atexit in SDL_Init(), as soon as it starts. So we can first the exit ... I'd say ask some help on the mesa mailing list :/ (In reply to Sylvain from comment #24) > I am not sure how it should be fixed ... > > - ask exult/sdlpop to move their atexit but it isn't a fix. > > - maybe there is an API to see if mesa has already called its atexit > handlers .. > or to see if egl drivers are already terminated ... > > See: > https://gitlab.freedesktop.org/mesa/mesa/blob/master/src/egl/main/eglglobals. > c > > https://gitlab.freedesktop.org/mesa/mesa/blob/master/src/egl/main/egldisplay. > c#L184 > https://gitlab.freedesktop.org/mesa/mesa/blob/master/src/egl/main/egldevice. > c#L53 > https://gitlab.freedesktop.org/mesa/mesa/blob/master/src/egl/main/egldriver. > c#L121 > > > - or maybe could do some workaround and add immediatly an atexit in > SDL_Init(), as soon as it starts. So we can first the exit ... > > I'd say ask some help on the mesa mailing list :/ So it's not KMSDRM's SDL2 fault, is it? It's a MESA problem as I understand your comments. Not really a mesa problem, but an issue that comes up using both atexit(SDL_Quit) + mesa which also has its atexit(). According to my previous valgrind log: mesa -through an atexit- cleans up what wasn't cleaned up when the application exits, which sounds ok. But, what happens is that exult also wants to clean up those things with an atexit() (the first thing is egl_display, allocated by SDL_EGL_LoadLibrary (SDL_egl.c:421), but what crashes it actually when re-destroying gbm). if SDL2 cleans up first, then mesa knows it and won't try to clean up it again. But if mesa cleans up first, SDL2 doesn't know it, and it tries to release it a second time, and that start making invalid writes.
> if SDL2 cleans up first, then mesa knows it and won't try to clean up it
> again.
> But if mesa cleans up first, SDL2 doesn't know it, and it tries to release
> it a second time, and that start making invalid writes.
This isn't an immediate solution, but perhaps Mesa shouldn't be using atexit() but rather a function marked __attribute__((destructor)) so it runs on library unload instead of process termination?
--ryan.
Yes, it sounds also a possibility. I notice there is a Mesa function: _eglCheckDisplayHandle(EGLDisplay dpy); but I am not sure if we can access it since SDL is a generic layer. I'll write a message to the mailing list to have some advice Tagging a bunch of bugs with "target-2.0.10" so we have a clear list of things to address before a 2.0.10 release. Please note that "addressing" one of these bugs might mean deciding to defer on it until after 2.0.10, or resolving it as WONTFIX, etc. This is just here to tell us we should look at it carefully, and soon. If you have new information or feedback on this issue, this is a good time to add it to the conversation, as we're likely to be paying attention to this specific report in the next few days/weeks. Thanks! --ryan. Answer I got from mesa mailing list was simply not to use atexit(). I think this is a bad response from Mesa, but I also don’t think this is SDL’s bug: at this point, it’s either Mesa’s or Exult’s, depending on where you are standing. Should we close this? Yes, I think so, I'm closing it: Exult issue is here: https://github.com/exult/exult/issues/30 Mesa message and reply here: https://lists.freedesktop.org/archives/mesa-users/2019-March/001519.html |