| Summary: | kmsdrm render reorders events? | ||
|---|---|---|---|
| Product: | SDL | Reporter: | Stas Sergeev <stsp2> |
| Component: | render | Assignee: | Manuel Alfayate Corchete <redwindwanderer> |
| Status: | RESOLVED ENDOFLIFE | QA Contact: | Sam Lantinga <slouken> |
| Severity: | normal | ||
| Priority: | P2 | ||
| Version: | 2.0.12 | ||
| Hardware: | x86_64 | ||
| OS: | Linux | ||
|
Description
Stas Sergeev
2020-10-09 20:52:23 UTC
In fact, the situation described happens with the software renderer. Now I tried to create a hw renderer, and the situation is much worse, because then, if I call SDL_RenderPresent() periodically (which helped in the case of a software renderer), I get a continuous flipping between the current and previous frame. So in the case of a hw renderer, no work-around has been found.

@Stas Sergeev Thanks for this report. I am determined to fix this, but I don't have much time for it. I don't understand the problem very well; when you do a SDL_RenderPresent() call, in the KMSDRM backend the function KMSDRM_GLES_SwapWindowFenced() is called in the end, which simply requests a pageflip:
https://hg.libsdl.org/SDL/file/7a3c36e0f598/src/video/kmsdrm/SDL_kmsdrmopengles.c#l110
How could this reorder events? What events? I just don't get it; any help to understand the problem is really welcome. Such a problem can't be allowed in the backend.

@Stas Sergeev: One thing you can do is export the environment variable SDL_VIDEO_DOUBLE_BUFFER=1 (or simply do "SDL_VIDEO_DOUBLE_BUFFER=1 <your_program>") and tell us if it's still happening. Exporting the SDL_VIDEO_DOUBLE_BUFFER env variable causes KMSDRM_GLES_SwapWindowDoubleBuffered() to be called instead of KMSDRM_GLES_SwapWindowFenced() on pageflip. Since KMSDRM_GLES_SwapWindowFenced() is more complicated, we could at least know if KMSDRM_GLES_SwapWindowFenced() is the problem.

An additional problem here is that when I switch
too much between SDL/drm app and X to test what
you say, I eventually get the full DRI hangup,
and need to reboot machine. :)
But this of course is not an SDL fault, so I
tested
SDL_VIDEO_DOUBLE_BUFFER=1 <my_program>
and found no observable differences with the
SW renderer.
> How could this reorder events? What events?
Any SDL_RenderPresent() is preceded by 1 or
more SDL_RenderClear()/SDL_RenderCopy() calls.
My guess is that these calls are "batched", so
SDL_RenderPresent() can flip to the not-yet-ready
frame. Which is the only explanation I can think
of, when seeing the switch to an old frame!
Under normal circumstances I call SDL_RenderPresent()
only after a SDL_RenderClear()/SDL_RenderCopy()
sequence. But to work around the problem, I
tried to call SDL_RenderPresent() unconditionally,
periodically. And for SW renderer this indeed
worked around the problem. But such work-around
breaks HW renderer even in X. Which leads to a
question: shouldn't SDL_RenderPresent() check
if renderer was updated, and not switch to the
"outdated" content, but return an error instead?
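To make the "switch to an old frame" symptom concrete, here is a toy model in plain C of a two-buffer swap chain. It is only a sketch under assumptions: the names `ToySwapChain`, `toy_chain_init`, `toy_chain_draw` and `toy_chain_present` are hypothetical, nothing here is SDL or KMSDRM code, and a real driver may use more than two buffers. It shows that presenting again without drawing a new frame alternates between the newest frame and the one before it.

```c
#include <assert.h>

/* Toy model only: hypothetical names, not SDL or KMSDRM code.
 * Each "buffer" stores just the id of the frame last drawn into it. */
typedef struct {
    int buffers[2]; /* -1 means nothing has been drawn yet */
    int front;      /* index of the buffer currently on screen */
} ToySwapChain;

static void toy_chain_init(ToySwapChain *sc)
{
    sc->buffers[0] = -1;
    sc->buffers[1] = -1;
    sc->front = 0;
}

/* Draw a complete frame into the back buffer (the one not on screen). */
static void toy_chain_draw(ToySwapChain *sc, int frame_id)
{
    assert(frame_id >= 0);
    sc->buffers[1 - sc->front] = frame_id;
}

/* Flip: the back buffer becomes visible, the old front becomes the back.
 * Returns the id of the frame that is now on screen. */
static int toy_chain_present(ToySwapChain *sc)
{
    sc->front = 1 - sc->front;
    return sc->buffers[sc->front];
}
```

In this model, after frames 1 and 2 have each been drawn and presented, two further calls to `toy_chain_present()` with no drawing in between show frame 1 and then frame 2 again, which resembles the flipping reported for the hardware renderer.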
@Stas Sergeev: What I understand is that you want to do several SDL_RenderClear()/SDL_RenderCopy() calls, and then ONE SDL_RenderPresent() call, and that should work but it does not, right? Can you give me a small code example where I can see it fail so I can understand better? The smaller the example, the better. Something that tries to simply render a triangle, or a pixel... something very simple that I can see fail in KMSDRM and succeed in X11.

> What I understand is that you want to do several
> SDL_RenderClear()/SDL_RenderCopy()
I tried serializing them, so that SDL_RenderPresent() is called after every SDL_RenderClear()/SDL_RenderCopy() pair. No change. So I think render operations get delayed, except for SDL_RenderPresent(), which goes through immediately.
> and that should work but it does not, right?
It works, but I have artefacts. Like old frames sometimes.
> Can you give me a small code example where I can see it fail
There is no fail, just artefacts. :) Also the pipeline is not all that small:
SDL_LockSurface()
SDL_UpdateTexture() (multiple times)
SDL_UnlockSurface()
SDL_RenderClear()
SDL_RenderCopy()
SDL_RenderPresent()
IIRC the SDL sources had many tests. Are there any tests that do something similar to the above pipeline? If so, I can take such a test as a starting point, and see if I can "break" it for you (or fix my code from it).

diff -r 83c96b1d973c test/testoverlay2.c
--- a/test/testoverlay2.c Fri Oct 09 04:28:00 2020 +0300
+++ b/test/testoverlay2.c Sat Oct 10 18:06:27 2020 +0300
@@ -341,7 +341,7 @@
         quit(4);
     }
-    renderer = SDL_CreateRenderer(window, -1, 0);
+    renderer = SDL_CreateRenderer(window, -1, SDL_RENDERER_SOFTWARE);
     if (!renderer) {
         SDL_LogError(SDL_LOG_CATEGORY_APPLICATION, "Couldn't set create renderer: %s\n", SDL_GetError());
         SDL_free(RawMooseData);
With this change you can see the problem.
The motion is sometimes jerky.
I have to admit this is not the best test
to see the problem, but it's still visible.
Or, if you run testoverlay2 without the
above patch, then you will probably get
the HW rendering problem, which is quite
different, but I have it too. Namely,
after some time the motion will become
very slow and you will see the black bar
floating vertically around the screen.
I haven't mentioned that problem in this
report because it may be specific to my
amdgpu driver. See if you can reproduce it.
@Stas Sergeev: I have tried the testoverlay2 test while looking closely at the angry moose for several minutes, without being able to notice anything different in KMSDRM from what I see in X11. And I tried really hard. I have to say that I have a good eye for these things and live in perpetual obsession with perfect screen refresh with no "micro-stalls" or stutters: I can tell a single lost frame in a refresh sequence, but for the life of me I can't see anything in testoverlay2 with SDL_RENDERER_SOFTWARE or without it. Admittedly the animation doesn't run at 60fps (it has few animation frames), but I can't see anything strange besides that: it shows the same frames in X11 as it shows in KMSDRM.
If you have anything else that I can reproduce, I will investigate as much as my time allows, so I will leave this open and wait for more input, but as things are just now, I can't imagine what the problem could be and I can't even see it.

I should add that I have amdgpu here (according to lspci: 00:01.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Stoney [Radeon R2/R3/R4/R5 Graphics] (rev e2)).
The amdgpu driver stinks. It's being fixed by the awesome MESA devs in #dri-devel, or so it was this summer when I fried them with questions about the atomic drm interface. But currently I have some workarounds specifically put in place so that kmsdrm does not blackscreen on quit, etc. Other than that (and that is working too), it's working fine here, as well as on VC4 (Raspberry Pi) and Intel.
Maybe there's another problem in your system? Can you please try on a different computer, even if it has amdgpu too?

More information: since KMS/DRM are mostly a kernel thing (well, there's libDRM too), here are some relevant version numbers about what my system is running:
manuel@hp15db0:~/src/SDL-RDY/test$ uname -r
5.4.0-48-generic
manuel@hp15db0:~/src/SDL-RDY/test$ apt list mesa-common-dev
Listing... Done
mesa-common-dev/focal-updates,now 20.0.8-0ubuntu1~20.04.1 amd64 [installed]
manuel@hp15db0:~/src/SDL-RDY/test$ apt list libdrm2
Listing... Done
libdrm2/focal,now 2.4.101-2 amd64 [installed]
Also, anybody who is able to see this problem is welcome to post.

You applied the patch I showed in comment #7, didn't you?

(In reply to Stas Sergeev from comment #11)
> You applied the patch I showed
> in comment #7, didn't you?
Yes, I did. As I said, I tried testoverlay2 with and without the patch (which simply forces the creation of a software renderer).

OK, thanks. I'll try to collect the info from other machines, and will either post it here, or close this bug if no info can be collected (not that I have many other PCs around). In the meantime, of course it would be good if people here could also patch the test as in comment #7, and see how it goes.

Hi Stas, we do batch render calls and execute them all at once before the present. You should assume that the back buffer is completely randomized at the end of the present, and that you need to write the entire screen between the time of the last present and the current frame's present. This isn't true on all drivers, but needs to be done for correct rendering in all cases.
For situations where you only need to redraw small portions of the screen as things change, people often draw to a target texture and then copy that to the screen each frame. Keep in mind that in some cases you can lose the contents of the target texture (D3D reset, alt-tab, etc.) and will occasionally have to redraw the entire thing.
Does that help?

Yes, this is exactly what I do: update the texture by parts, and then copy it to the renderer entirely. Then do RenderPresent(). The problem is that I can see the glitches even with your testoverlay2 (see comment #7 to see how I switch it to the software renderer). So the bug is definitely not on my side. But maybe something is wrong with my amdgpu driver, so testing from others might be good.
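The advice above (treat the back buffer as undefined after a present, keep the persistent image in a separate target, and copy all of it every frame) can be sketched with a toy model in plain C. Assumptions: `ToyDisplay` and the `toy_*` functions are hypothetical names, the 0xAA fill merely stands in for "undefined contents", and no SDL call is involved. In SDL the canvas would roughly correspond to a target texture and the full copy to SDL_RenderClear() followed by SDL_RenderCopy().

```c
#include <assert.h>
#include <string.h>

#define TOY_W 8 /* width of the one-row toy "screen" */

/* Toy model only: hypothetical names, not SDL code. */
typedef struct {
    unsigned char canvas[TOY_W]; /* persistent image owned by the app */
    unsigned char back[TOY_W];   /* back buffer, undefined after present */
    unsigned char screen[TOY_W]; /* what the user currently sees */
} ToyDisplay;

static void toy_display_init(ToyDisplay *d)
{
    memset(d, 0, sizeof *d);
}

/* Partial update: only the persistent canvas is touched. */
static void toy_update_canvas(ToyDisplay *d, int x, unsigned char value)
{
    assert(x >= 0 && x < TOY_W);
    d->canvas[x] = value;
}

/* Full-frame copy of the canvas into the back buffer. */
static void toy_copy_canvas(ToyDisplay *d)
{
    memcpy(d->back, d->canvas, TOY_W);
}

/* Present: show the back buffer, then scramble it to model the
 * "contents are undefined after present" rule. */
static void toy_display_present(ToyDisplay *d)
{
    memcpy(d->screen, d->back, TOY_W);
    memset(d->back, 0xAA, TOY_W);
}
```

As long as every present is preceded by `toy_copy_canvas()`, the screen always matches the canvas, however small the partial updates were. Presenting without the copy shows whatever the scrambled back buffer happens to contain.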
OK, so this is specific to 2.0.12. I don't have that with the hg code. Sorry for the noise!

Ok, don't worry :) |