Windows: main thread is blocked when user resizes or moves a window #1059

Closed
SDLBugzilla opened this issue Feb 10, 2021 · 63 comments
Labels: enhancement, wontfix

@SDLBugzilla
Collaborator

SDLBugzilla commented Feb 10, 2021

This bug report was migrated from our old Bugzilla tracker.

These attachments are available in the static archive:

Reported in version: HG 2.1
Reported for operating system, platform: Windows (All), All

Comments on the original bug report:

On 2013-08-30 01:00:19 +0000, wrote:

When the user clicks on the window title or border, the system generates a WM_NCLBUTTONDOWN message. When DispatchMessage receives this message, it handles the window resizing or moving itself and doesn't return until the user releases the mouse button. It also sends WM_WINDOWPOSCHANGING to the window procedure. When using WinAPI directly, it is no big problem that DispatchMessage blocks, because the application can handle the WM_WINDOWPOSCHANGING message (or use WM_TIMER) to perform actions that must run regularly. With SDL, however, it seems to be impossible to do anything in the main thread while the user moves or resizes a window.

On 2014-01-20 05:04:25 +0000, Nathaniel Fries wrote:

I actually spent my weekend fixing this.

I'm not sure when I'll have time again to work on something, but I did upload my code for it, so someone should be able to whip up a patch fairly easily.

For an SDL-specific patch, I wouldn't bother with using a thread-local (SDL doesn't enable the creation of new GUI threads), sending WM_SIZING, or with MINMAXINFO at all.

It even includes (untested) code for child windows, so it should hopefully work in cases where SDL is used from a widget provided by Qt or some other toolkit.

It's this SourceForge project: https://sourceforge.net/projects/win32loopl/

On 2014-01-20 16:48:16 +0000, Nathaniel Fries wrote:

*** Bug 2316 has been marked as a duplicate of this bug. ***

On 2014-01-20 16:54:40 +0000, Nathaniel Fries wrote:

Just a heads up, the above fix for this probably shouldn't be default behavior because it can cause resizing and moving to become choppy to the user if rendering or other main loop code takes too long. It could cause new bug reports from developers of pre-existing SDL2 applications who are simply passing on bug reports from users who updated their SDL2 dll. I'd recommend it as a feature that can be turned on or off by the programmer, and defaults to off.

On 2014-02-06 05:21:48 +0000, Nathaniel Fries wrote:

Created attachment 1549
patch

Finally found time to get around to making a proper patch. Code is mostly the same as I wrote before, but adapted for what SDL looks like internally. Doesn't make modeless behavior optional, though.

On 2014-02-09 10:08:44 +0000, Sam Lantinga wrote:

Thanks! We'll take a look at this after the 2.0.2 release.
This also potentially fixes issues with dragging the titlebar when the cursor is grabbed?

On 2014-02-09 12:43:34 +0000, Nathaniel Fries wrote:

"This also potentially fixes issues with dragging the titlebar when the cursor is grabbed?"

Not sure what you mean by this. This code has to acquire mouse focus in order to receive all necessary mouse movements.

On 2014-02-09 20:36:01 +0000, Sam Lantinga wrote:

Yes, but we're in control of the movement process so we can account for our own grab state. It's not a fix, it just makes it possible to fix. :)

On 2014-02-09 23:31:57 +0000, Nathaniel Fries wrote:

I suppose that if mouse focus is lost, we should take it back. Might be a possible bug in this code. MSDN says not to call SetCapture when processing WM_CAPTURECHANGED, so I can see how this could have been a difficult issue previously.
Now, of course, we can add a simple fix in SDL_PumpEvents or elsewhere, but we'd still have a chance of losing some events that way. A better fix would simply be to use the cursor pos attached to a windows message (__tagMSG::pt), compare it to the capture position, and work from there instead of WM_MOUSEMOVE. Then we won't even need to worry about mouse capture. Just a thought and haven't had a chance to test it, though.

Also, there was a (quite obvious once I noticed it) bug in my last patch. This is what I get for not testing thoroughly. When handling WM_MOUSEMOVE, I use lParam instead of the result from GetMessagePos. lParam is relative to the client area, which means it can be negative; GetMessagePos is in screen coordinates. This is what you get for not taking your time. :)
Here's a hand-written patch for my patch to correct this:

             if(data->in_modeless_resize)
             {
                 POINT ptPos;
+                DWORD dwPos = GetMessagePos();
-                ptPos.x = GET_X_LPARAM(lParam);
-                ptPos.y = GET_Y_LPARAM(lParam);
+                ptPos.x = GET_X_LPARAM(dwPos);
+                ptPos.y = GET_Y_LPARAM(dwPos);
                 WIN_DoResize(hwnd, data, ptPos, SDL_FALSE);
             }
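Why the signed extraction macros matter can be seen in a small portable sketch. It reimplements the arithmetic of GET_X_LPARAM/GET_Y_LPARAM (two signed 16-bit values packed into one 32-bit value) without needing <windows.h>. The function names `unpack_x`, `unpack_y`, `pack_xy` and `unpack_x_unsigned` are made up for this illustration and are not part of the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Pack two coordinates the way Windows packs them into lParam or the
 * GetMessagePos() result: x in the low 16 bits, y in the high 16 bits. */
static uint32_t pack_xy(int x, int y)
{
    assert(x >= -32768 && x <= 32767 && y >= -32768 && y <= 32767);
    return ((uint32_t)(uint16_t)y << 16) | (uint32_t)(uint16_t)x;
}

/* Same arithmetic as GET_X_LPARAM / GET_Y_LPARAM: go through a signed
 * 16-bit type so that negative coordinates survive the round trip.
 * (The narrowing conversion is implementation-defined in ISO C, but it
 * behaves as expected on two's-complement targets.) */
static int unpack_x(uint32_t packed)
{
    return (int)(int16_t)(uint16_t)(packed & 0xFFFFu);
}

static int unpack_y(uint32_t packed)
{
    return (int)(int16_t)(uint16_t)((packed >> 16) & 0xFFFFu);
}

/* LOWORD-style extraction without the signed cast: a negative x
 * comes back as a large positive number. */
static int unpack_x_unsigned(uint32_t packed)
{
    return (int)(packed & 0xFFFFu);
}
```

With `pack_xy(-5, 20)`, the signed helpers return -5 and 20, while the unsigned variant returns 65531 for x. Both client-relative and screen coordinates can be negative, so the signed form is the safe one in either case; the patch above changes which coordinate space is used, not how the values are unpacked.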

On 2014-02-20 21:05:23 +0000, Nathaniel Fries wrote:

Created attachment 1568
better patch

Attached is a much better patch. I wasn't sure whether the values returned by SDL_GetWindow[Min/Max]imumSize were client size or window size, so that may need to be corrected inside WIN_DoResize.

Still doesn't add a new window flag for modeless behavior.

When mouse capture is lost, the modeless resize/movement operation is finalized. This is because:

  1. Attempting to reclaim mouse capture in handling WM_CAPTURECHANGED actually crashes the program.
    Without mouse capture, we won't get messages for mouse movement outside the current boundaries of the window.
  2. MSDN states that only the Foreground Window can capture the mouse, and presumably there's a good reason for an app to claim Foreground Window status (either in response to user input, or an application had to alert the user of something).
    The user will probably interact with this foreground window, even if just to remove its foreground window status, so it seems silly for SDL to continue acting as if the user is interactively resizing a Window.

On 2014-02-25 12:53:59 +0000, Sam Lantinga wrote:

Reviewing the code, it looks pretty good. I'm looking forward to trying it out after 2.0.2 is released.

The values returned by SDL_GetWindow[Min/Max]imumSize are client size.

On 2014-03-05 11:12:47 +0000, Andreas Ertelt wrote:

There's one tiny issue I experience with this code - in multi-monitor setups the window always jumps back to the primary monitor when being picked up.

Also, since this patch was submitted, handling for WM_NCLBUTTONDOWN was added - the current code would have to be added to the default-case of this patch and the return statement should be removed.

On 2014-03-06 10:06:56 +0000, Andreas Ertelt wrote:

Another minor issue is that I am receiving SDL_MOUSEMOTION events when moving or resizing the window in a way that makes the mouse cursor temporarily hover over the window.
I also receive two of those events when just clicking and holding the window on the title or border as well as another when releasing.

On 2014-03-09 01:08:36 +0000, Nathaniel Fries wrote:

"There's one tiny issue I experience with this code - in multi-monitor setups the window always jumps back to the primary monitor when being picked up."
something, isn't it?
I don't know what would cause this and I don't have a multi-monitor setup to test on, so I'm afraid I'll have to leave that fix to someone else.

I've identified the likely cause of that minor issue (double SDL_MOUSEMOTION events). When WIN_DoResize is called in response to WM_MOUSEMOVE, the last argument should be SDL_FALSE instead of SDL_TRUE (SDL_TRUE indicates that it should "force" the cursor position to a correct value after resizing).

On 2014-03-11 07:11:10 +0000, Andreas Ertelt wrote:

Didn't have much time to test this, but the change you suggested caused the application to crash (quite literally).

The multi-monitor issue is related to the GetSystemMetrics() call, which is fed with SM_CXSCREEN/SM_CYSCREEN and therefore limits the routine to the primary monitor. Instead you would have to use MonitorFromRect() to find the current/nearest monitor (using mouse coordinates and MONITOR_DEFAULTTONEAREST) and then retrieve its size/coordinates using GetMonitorInfo().
Alternatively using SM_CXVIRTUALSCREEN/SM_CYVIRTUALSCREEN would be a quick fix, but that would make that part a bit pointless for setups where monitors don't use the same resolutions and/or aren't properly aligned.
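The idea of "find the nearest monitor, then use its rectangle" can be modelled in portable C. This is only a sketch of the selection logic: the `Rect` type, `rect_dist_sq` and `nearest_monitor` are hypothetical names, and the squared-distance metric is an assumption, not a statement of how Windows resolves MONITOR_DEFAULTTONEAREST. Real code would call the monitor functions named above rather than keep its own list.

```c
#include <assert.h>
#include <stddef.h>

/* Monitor rectangle in virtual-screen coordinates; right and bottom are
 * exclusive, following the Win32 RECT convention. */
typedef struct { long left, top, right, bottom; } Rect;

/* Squared distance from a point to a rectangle, 0 if the point is inside. */
static long long rect_dist_sq(const Rect *r, long x, long y)
{
    long dx = 0, dy = 0;
    if (x < r->left)         dx = r->left - x;
    else if (x >= r->right)  dx = x - (r->right - 1);
    if (y < r->top)          dy = r->top - y;
    else if (y >= r->bottom) dy = y - (r->bottom - 1);
    return (long long)dx * dx + (long long)dy * dy;
}

/* Index of the monitor containing (x, y), or the closest one otherwise. */
static size_t nearest_monitor(const Rect *monitors, size_t count, long x, long y)
{
    assert(monitors != NULL && count > 0);
    size_t best = 0;
    long long best_d = rect_dist_sq(&monitors[0], x, y);
    for (size_t i = 1; i < count; i++) {
        long long d = rect_dist_sq(&monitors[i], x, y);
        if (d < best_d) {
            best_d = d;
            best = i;
        }
    }
    return best;
}
```

For a 1920x1080 primary monitor with a 1280x1024 monitor to its right, a cursor at (2000, 500) selects the second monitor, so size limits would be taken from that rectangle instead of always from the primary one.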

Another side effect I found is that the Aero features Snap and Shake stop working. Not quite sure how to emulate those correctly (especially since Snap delivers visual feedback as well).

[HKEY_CURRENT_USER\Software\Policies\Microsoft\Windows\Explorer] "NoWindowMinimizingShortcuts" defines the state of shake.

[HKEY_CURRENT_USER\Control Panel\Desktop] "WindowArrangementActive" defines the state of snap.

On 2014-03-12 19:09:40 +0000, Nathaniel Fries wrote:

Actually MSDN makes it seem like the default maximum window tracking dimension is GetSystemMetrics(SM_C[X/Y]MAXTRACK) regardless of the monitor the window is on.

http://msdn.microsoft.com/en-us/library/windows/desktop/ms724385%28v=vs.85%29.aspx

I never knew of the shake and snap features. I've played around with them briefly, and shake appears to be relatively simple to implement using documented User32 calls (however, because I'm using documented User32 calls to change window position, they require a full redraw which appears to take longer than whatever User32 does internally - this may make the shake feature appear laggy as well as require we take some liberty with the timing). Snap would require I somehow cover the entire screen with a blue highlight, and I'm not sure how to do that without creating a window the size of the desktop (and we return to mouse focus issues).

On 2014-03-12 20:40:33 +0000, Nathaniel Fries wrote:

Actually, by playing around I think I've found a relatively simple way to highlight the entire screen, but it will require different code for windows on extra monitors so I can only guarantee something that would work on single-monitor systems. Might be a while before I can materialize a complete fix, though.

On 2014-03-16 09:42:03 +0000, Nathaniel Fries wrote:

Believe it or not, I'm finding it harder to get shake just right than snap. I have basically functional versions of both in my little project on sourceforge now.

I won't be making another patch for SDL until I've got all the little quirks worked out though. Might be some time.

On 2019-12-07 17:00:27 +0000, Jake Del Mastro wrote:

Has there been any progress on this bug? I'm noticing this still seems to be an issue in SDL 2.0.10

On 2020-03-24 21:13:36 +0000, Ryan C. Gordon wrote:

(In reply to Jake Del Mastro from comment # 18)

"Has there been any progress on this bug? I'm noticing this still seems to be an issue in SDL 2.0.10"

Reading through all these comments, is this something we really want? It sounds like something that we're going to have to maintain every time Microsoft adds/changes a UI mechanic, and never get quite right, and introduce a bunch of risky behaviors, just to be more responsive when someone drags the window.

I'd be inclined to mark this WONTFIX, but I'll let Sam make that decision if he wants.

--ryan.

On 2020-04-16 16:50:42 +0000, Ron Aaron wrote:

It's not just an issue on Windows. macOS has the same problem (don't know if it's for a similar reason)

On 2020-04-16 19:19:48 +0000, Andreas Ertelt wrote:

Ryan is correct, the way this patch approaches the issue would require changes over time to stay consistent with Windows behavior and there are too many corner cases to consider.

But a problem should definitely not be marked WONTFIX just because a suggested solution is inadequate.

While I'm fairly sure there is no feasible solution that fully fixes the issue as it was reported here, the likely prime issue most people are concerned with is not being able to perform drawing operations / simulation anymore.

This could be addressed on Windows by allowing developers to register a callback (per window) to be performed on its WM_SIZING(!), WM_PAINT and likely also WM_ERASEBKGND events. If this feature is used, the message loop would also have to call InvalidateRect on the window whenever no more messages are in the queue, and a ValidateRect would have to be issued on the window upon completion of the callback (this is to make sure WM_PAINT events keep getting issued when nothing else is happening).

I'm confident most other platforms could be handled in a similar fashion.

This approach wouldn't affect existing programs in any way and provide developers who care about not being interrupted for an unreasonable amount of time with the means to address the issue with minimal changes and without having to hijack the window's message handler.

On 2020-04-18 13:08:58 +0000, Andreas Ertelt wrote:

I just checked my engine code and there are three more corner cases to be considered on Windows that I didn't think of anymore.

One is system/context menus, another is when a modal window is opened (e.g. a message box), and the last is picking the window up without moving it (which can also be the case when moving isn't configured to redraw the window in Windows' performance options).

I worked around all of this by starting a timer on the window that triggers the redraws. This timer is started under the following conditions:

  1. WM_SYSCOMMAND is received with (wparam & 0xfff0) == SC_MOVE (this also happens when the regular window menu is opened).
  2. WM_ENTERMENULOOP is received (starting the timer here stops context menus from interrupting the program).
  3. WM_ENABLE is received with a wparam of 0.

The only slight annoyance I could notice at this point is when you hold down the caption bar with the mouse, it takes a second to actually call the first timer-event. This can be slightly alleviated by allowing WM_GETICON to trigger a draw while the timer is active. The WM_GETICON-behavior has likely been introduced with Vista - I currently have no older machine to verify this on.

This redraw timer can then be deleted on the next proper WM_PAINT message received while the window is active again (WM_ENABLE).

In my program I trigger this timer at the refresh rate and make sure there is no more message like it in the queue before issuing the draw call (to avoid clogging the message queue).

I can't think of an alternative to using a timer here, since control over the message loop is temporarily diverted and the only event that is reliably triggered is WM_GETICON, at a 1 Hz frequency. At least I couldn't find any other way to introduce events under these conditions.
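The three start conditions listed above can be written as a single predicate. The sketch below is portable C so it can be read without <windows.h>: the numeric constants are copied from winuser.h (worth double-checking against your SDK headers), and `should_start_redraw_timer` is a hypothetical helper name. It covers only when to start the timer; creating it with SetTimer, deleting it again, and the WM_GETICON shortcut are left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Local copies of the Win32 values so the sketch builds anywhere. */
enum {
    MSG_ENABLE        = 0x000A, /* WM_ENABLE */
    MSG_SYSCOMMAND    = 0x0112, /* WM_SYSCOMMAND */
    MSG_ENTERMENULOOP = 0x0211, /* WM_ENTERMENULOOP */
    CMD_MOVE          = 0xF010  /* SC_MOVE */
};

/* Returns true if this message means the main loop is about to be
 * blocked and a redraw timer should be started. */
static bool should_start_redraw_timer(unsigned msg, uintptr_t wparam)
{
    assert(msg <= 0xFFFFu); /* window messages fit in 16 bits */
    switch (msg) {
    case MSG_SYSCOMMAND:
        /* The low four bits of wparam are used internally by the
         * system, hence the 0xFFF0 mask before comparing. */
        return (wparam & (uintptr_t)0xFFF0u) == (uintptr_t)CMD_MOVE;
    case MSG_ENTERMENULOOP:
        return true;
    case MSG_ENABLE:
        return wparam == 0; /* window is being disabled, e.g. by a modal dialog */
    default:
        return false;
    }
}
```

A window procedure would call this at the top of its message switch and create the timer when it returns true; the timer's own handler would then perform the redraw.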

On 2020-07-12 09:53:51 +0000, Jack C wrote:

Any updates to this bug? I like Andreas Ertelt's idea of introducing optional callbacks for those events. I know Blender handles drawing while resizing the window in its WM_SIZE/WM_SIZING handling. There is an event dispatch call under "case WM_SIZE:" that will lead to a draw call.

You can find the code I am referring to here.

https://github.com/blender/blender/blob/404486e66c6a4ebebb085700d58b396597146add/intern/ghost/intern/GHOST_SystemWin32.cpp#L1659

SDLBugzilla added the bug and waiting labels on Feb 10, 2021
@FluffyFoxBunny

Fix it already, it's been 8 years!

icculus added the enhancement and wontfix labels and removed the bug and waiting labels on Feb 17, 2021
@icculus
Collaborator

icculus commented Feb 17, 2021

Making the executive decision to close this bug as wontfix; this isn't worth all the known problems and unknown risks that fixing it would cause.

icculus closed this as completed on Feb 17, 2021
@FluffyFoxBunny

FluffyFoxBunny commented Feb 17, 2021 via email

@Lokathor
Contributor

So the fundamental issue is that the way SDL gives you events is fundamentally at odds with how Win32 wants your program to handle events. What you're supposed to do (according to Win32) is have a "window procedure" (callback) which runs for each event. SDL provides this callback for you, but the SDL callback just records events into the event queue for you to respond to later.

One of the events that you're supposed to respond to during your callback is a repaint event. SDL can't repaint for you, but usually this isn't an issue because shortly after SDL puts all the events in the queue, you grab events from the queue and repaint yourself.

The problem is that while the user is holding the mouse button down during a resize of the window, control never returns to the main program. The user32 event loop will just hold on to your program's control flow and continually call your window procedure, giving you resize-related events and paint events.

If you respond to the paint events by painting immediately within the window procedure then you'll get a program that behaves "properly" during a resizing. However, this runs totally counter to SDL's event queue system.

The only way to fix this is to entirely replace one of SDL's core components.

In other words, the wontfix assessment is fair.
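The mismatch described in this comment can be shown with a toy model in portable C. Nothing here is SDL or Win32 code: `ToyWindow`, `toy_window_proc`, `toy_modal_resize` and `count_paint` are invented for the illustration. The window procedure always records messages in a queue (the SDL-style path) and, only if a paint callback is installed, also repaints immediately (the Win32-style path).

```c
#include <assert.h>
#include <stddef.h>

enum { MSG_SIZE = 1, MSG_PAINT = 2 };
#define QUEUE_CAP 64

typedef void (*PaintFn)(void *user);

typedef struct {
    int     queue[QUEUE_CAP]; /* events recorded for the app to read later */
    size_t  queued;
    PaintFn on_paint;         /* optional: respond inside the window procedure */
    void   *user;
} ToyWindow;

/* Called once per message, like a window procedure. */
static void toy_window_proc(ToyWindow *w, int msg)
{
    assert(w != NULL);
    if (w->queued < QUEUE_CAP)
        w->queue[w->queued++] = msg;   /* queue style: record now, handle later */
    if (msg == MSG_PAINT && w->on_paint)
        w->on_paint(w->user);          /* callback style: repaint right away */
}

/* Stand-in for the modal loop that runs while the mouse button is held.
 * It keeps calling the window procedure and only returns to the caller
 * once the drag is over. */
static void toy_modal_resize(ToyWindow *w, int steps)
{
    for (int i = 0; i < steps; i++) {
        toy_window_proc(w, MSG_SIZE);
        toy_window_proc(w, MSG_PAINT);
    }
}

/* Example paint callback: counts how many repaints happened. */
static void count_paint(void *user)
{
    ++*(int *)user;
}
```

Without a callback, five resize steps leave ten messages in the queue and zero repaints until `toy_modal_resize` returns; with `count_paint` installed, five repaints happen while the loop is still running. That second behaviour is what a callback-based design would provide and what a pure event queue cannot.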

@icculus
Collaborator

icculus commented Feb 17, 2021

"How is there risk to making it so you can drag a window without it pausing a program?"

This thread lists multiple problems and potential future incompatibilities.

@slouken
Collaborator

slouken commented Feb 18, 2021

However, you're welcome to use the attached patch in your code, if you're comfortable with the drawbacks.

@FluffyFoxBunny

FluffyFoxBunny commented Feb 18, 2021 via email

@StrikerMan780

StrikerMan780 commented Mar 8, 2021

There needs to be some kind of SDL hint, or something along those lines to fix this behavior, because this is making SDL2 borderline unusable for games that have lockstep netcode. (one person decides to drag or resize their window, and the whole session dies, pissing off all of the players trying to play. YAY!)

One shouldn't need to rely on a patch that is old and insanely hard to find (I've been searching for such a thing for weeks, only found this now.) just to get past such an obvious and horrid issue. Not to mention said patch likely can't even be merged with current SDL2 anymore due to its age.

Been struggling with this problem for years over multiple projects, and I'm tired, frustrated, and fucking desperate for something, anything, that can remedy it.

@icculus
Collaborator

icculus commented Mar 8, 2021

"one person decides to drag or resize their window, and the whole session dies, pissing off all of the players trying to play."

What happens to the other players when someone unplugs their network cable in this scenario?

@npip99

npip99 commented Mar 8, 2021

@icculus Unplugging your network cable as the host of a multiplayer game would indeed disconnect everyone else playing the game (If you're not the host, as with a client/server model, it would at minimum disconnect yourself). But that's completely expected by the user who unplugged their network cable, both the client/server and the p2p results of that action are well-understood by the user and by the game dev, and as game devs we can add a message like "The host has disconnected" which the users would be able to figure out in a crystal clear manner that because Robert unplugged his network cable, and Robert was (presumably) the host in the p2p game, everyone got disconnected. E.g., No bug tickets for us the dev team, because the users fully understood exactly what happened.

Having everyone (or even just yourself) disconnect just because you dragged a window is very subtle and frustrating, and it would be difficult for the user to even realize that it was the dragging that caused the issue, as opposed to just thinking your application is sucky. It took me as the dev countless hours of debugging to realize that the reason why my application client was disconnecting from the server every once in a while was because I was dragging the window. Dragging the window just isn't something that I interpret as an action that could affect my application; it's just a subconscious thing I do to ensure that things are placed well. Additionally, for me, dragging the window only disconnected client from server like 1/4th of the time, which makes it even harder to make that association. It just looked like a completely random bug that we couldn't figure out how to reproduce consistently for the longest time, but it made the application somewhat annoying to use for long periods of time. And it's not like our users ever reported that they were dragging the window when it happened; they had no idea how to replicate it either, it just happened randomly from their point of view. Once we figured out the association it wasn't hard to find this GitHub issue, but something better can be done here.

vvvvvvvvvvvvvvvvvvvvvvv

Imo, at the absolute minimum, the documentation of SDL_PollEvent desperately needs to say that it will block if the user drags or resizes the window on the Windows OS. Then at least developers can work around the issue and maintain network connections on another thread without it being an unnecessarily large refactor after the fact [as it was for us].

^^^^^^^^^^^^^^^^^^^^

@StrikerMan780

StrikerMan780 commented Mar 8, 2021

^ This. Very much this.

One player (not even the host) was dragging their window in a match I had and everyone was confused as to why everyone was suddenly lagging. (To be specific, I'm working on a netplay-centric port of Duke Nukem 3D, which uses a master/slave lockstep form of networking, so if ANYONE so much as sneezes on their window, it'll hang the whole match until the operation is done, and add a bunch of persistent lag over the next minute or two as the input lag buffer gets inflated to hell and back to compensate.)

Wouldn't be the first time this has happened, either. Adding to my frustration and abrasive demeanour right now is getting blamed for it and/or being told my port sucks because of something out of my control.

"What happens to the other players when someone unplugs their network cable in this scenario?"

The entire game hangs for everyone, and they have to quit. Doesn't matter if it's the host or a client. (The unfortunate downside to lockstep netcode)

@TerminX

TerminX commented Mar 10, 2021

Would it be possible to set a custom WindowProc function on the window that receives the WM_MOVE, etc. and handles whatever updates need to be done application side before passing the events off to SDL's WindowProc?

@Xeverous

@TerminX See https://stackoverflow.com/questions/32294913/getting-contiunous-window-resize-event-in-sdl-2 for something of this sort which uses SDL_AddEventWatch.

@icculus
Collaborator

icculus commented Mar 24, 2021

"is getting blamed for it and/or being told my port sucks because of something out of my control."

I wrote one of the first UDP implementations for Duke3D back in the day, so I totally get this. But the fragility of Duke's system is going to bite you sooner or later, window dragging or not. The extremely non-trivial but correct approach would be to replace that netcode with something more robust...but dear lord, that would be a painful effort.

Some other approaches to try:

  • Abuse the hit test API:

    /**
     * Callback used for hit-testing.
     *
     * \param win the SDL_Window where hit-testing was set on
     * \param area an SDL_Point which should be hit-tested
     * \param data what was passed as `callback_data` to SDL_SetWindowHitTest()
     * \return an SDL_HitTestResult value.
     *
     * \sa SDL_SetWindowHitTest
     */
    typedef SDL_HitTestResult (SDLCALL *SDL_HitTest)(SDL_Window *win,
                                                     const SDL_Point *area,
                                                     void *data);
    
    /**
     * Provide a callback that decides if a window region has special properties.
     *
     * Normally windows are dragged and resized by decorations provided by the
     * system window manager (a title bar, borders, etc), but for some apps, it
     * makes sense to drag them from somewhere else inside the window itself; for
     * example, one might have a borderless window that wants to be draggable from
     * any part, or simulate its own title bar, etc.
     *
     * This function lets the app provide a callback that designates pieces of a
     * given window as special. This callback is run during event processing if we
     * need to tell the OS to treat a region of the window specially; the use of
     * this callback is known as "hit testing."
     *
     * Mouse input may not be delivered to your application if it is within a
     * special area; the OS will often apply that input to moving the window or
     * resizing the window and not deliver it to the application.
     *
     * Specifying NULL for a callback disables hit-testing. Hit-testing is
     * disabled by default.
     *
     * Platforms that don't support this functionality will return -1
     * unconditionally, even if you're attempting to disable hit-testing.
     *
     * Your callback may fire at any time, and its firing does not indicate any
     * specific behavior (for example, on Windows, this certainly might fire when
     * the OS is deciding whether to drag your window, but it fires for lots of
     * other reasons, too, some unrelated to anything you probably care about _and
     * when the mouse isn't actually at the location it is testing_). Since this
     * can fire at any time, you should try to keep your callback efficient,
     * devoid of allocations, etc.
     *
     * \param window the window to set hit-testing on
     * \param callback the function to call when doing a hit-test
     * \param callback_data an app-defined void pointer passed to **callback**
     * \returns 0 on success or -1 on error (including unsupported); call
     *          SDL_GetError() for more information.
     *
     * \since This function is available since SDL 2.0.4.
     */
    extern DECLSPEC int SDLCALL SDL_SetWindowHitTest(SDL_Window * window,
                                                     SDL_HitTest callback,
                                                     void *callback_data);

    ...which will call a function that you specify constantly while the mouse is dragging; it's meant to be used to say "treat this coordinate as part of the title bar, etc" so you can do things like drag a window from the middle, but you could also use it to update state, send a non-blocking packet if it's time to do so, etc, as long as you do it fast in general and return right away if it's not time to do anything yet. This would avoid adding any Windows-specific code to your app. This would be SDL_SetWindowHitTest(), and your callback would just always return SDL_HITTEST_NORMAL.

  • If you don't mind poking at win32, you can try SDL_WindowsMessageHook:

    typedef void (SDLCALL * SDL_WindowsMessageHook)(void *userdata, void *hWnd, unsigned int message, Uint64 wParam, Sint64 lParam);
    
    /**
     * Set a callback for every Windows message, run before TranslateMessage().
     *
     * \param callback The SDL_WindowsMessageHook function to call.
     * \param userdata a pointer to pass to every iteration of `callback`
     */
    extern DECLSPEC void SDLCALL SDL_SetWindowsMessageHook(SDL_WindowsMessageHook callback, void *userdata);

    ...which literally just gives you first shot at win32-level events, before SDL does anything with them, and this might be enough.

  • SDL_AddEventWatch is similar, but you only see SDL-level events, and you only see them when pumping the event queue, which may or may not be enough.
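Whichever of these hooks is used, the callback should "return right away if it's not time to do anything yet". A minimal sketch of that rate limiting in portable C follows. `Heartbeat` and `heartbeat_poll` are hypothetical names, and the tick counter stands in for real work such as sending a non-blocking packet. The wiring into SDL_SetWindowHitTest() is shown only as a comment, based on the declarations quoted above, and is untested.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef struct {
    uint32_t interval_ms; /* minimum time between two ticks */
    uint32_t last_ms;     /* timestamp of the most recent tick */
    unsigned ticks;       /* stand-in for real work, e.g. a keepalive packet */
} Heartbeat;

/* Cheap enough to call from a callback that may fire very often.
 * Returns true only when a tick was actually performed. */
static bool heartbeat_poll(Heartbeat *hb, uint32_t now_ms)
{
    assert(hb != NULL);
    /* Unsigned subtraction stays correct when the millisecond
     * counter wraps around. */
    if ((uint32_t)(now_ms - hb->last_ms) < hb->interval_ms)
        return false;
    hb->last_ms = now_ms;
    hb->ticks++;
    return true;
}

/* Possible wiring (not compiled here, requires SDL 2.0.4 or later):
 *
 *   static SDL_HitTestResult SDLCALL
 *   hit_test(SDL_Window *win, const SDL_Point *area, void *data)
 *   {
 *       (void)win; (void)area;
 *       heartbeat_poll((Heartbeat *)data, SDL_GetTicks());
 *       return SDL_HITTEST_NORMAL;
 *   }
 *
 *   ...
 *   SDL_SetWindowHitTest(window, hit_test, &heartbeat);
 */
```

The same `heartbeat_poll` call would also be made from the normal main loop, so the tick rate stays roughly constant whether or not a drag is in progress. How often the hit-test callback actually fires during a drag is up to the OS, so this is a mitigation rather than a guarantee.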

@StrikerMan780

StrikerMan780 commented Mar 26, 2021

"I wrote one of the first UDP implementations for Duke3D back in the day, so I totally get this. But the fragility of Duke's system is going to bite you sooner or later, window dragging or not."

Thankfully I've spent a few years at this point refactoring the whole thing, it's in a much better state than the old days. Basically impossible to go out of sync now unless someone makes a mod with faulty behaviour like making RNG calls during display events.

If the network is suffering packet loss, or extreme latency, it just waits before advancing (however, if there's a full connection loss, it'll stay waiting forever, but menus and stuff still work. This is the case right now if someone drags their window for too long), unlike DOS Duke which often would just have a massive hernia and then continue while remaining out of sync, fully locking up once you attempt to quit or start a new game.

Prediction code is also in the process of being completely overhauled. The plan is to implement a full rollback system and in-game joining at some point, as well. Failing that, I do have a WIP client/server branch which is partially functional, but buggy as shit simply due to how Duke3D was designed.

Just, the only major problem I'm suffering with now is window events. Hoping perhaps with these functions listed, I can figure something out. Thanks.

@ell1e
Contributor

ell1e commented Aug 9, 2021

@slouken I read above discussions & the patch. My apologies if I got it wrong, but here are my takeaways:

Why the patch looks not too terribly useful: from what I can tell from the comments, the patch completely replaces the regular resizing with a "manual" one that breaks default desktop handling like window snapping. (Is that correct?) To me, that sounds like a fundamentally not useful approach at all. I also think that the redraw issue really is the secondary problem here, so I don't see the point in getting stuck on that one if it's so hard, so the patch seems like a dead end.

What I would suggest instead: why can't we have a "let me do non-UI app processing" callback that is guaranteed to still be on the main thread, but is banned from calling any SDL2 event/draw functions? This way one can do a nested call to e.g. netcode or audio or physics updates to keep things running while just skipping drawing & input processing. I think this would fix the pressing issue of total functionality drop-outs like netcode desync, internet connection losses, complete cutscene audio desync, ... while hopefully being way more feasible for SDL2 to provide? The original issue title talks about the blocked main thread after all, and I agree that's the way bigger problem, especially for multiplayer.

Edit: additional note: it would also most likely be way, way easier for many code bases to make use of such a callback if it is still on the main thread, than try to make their entire gameplay happen on a separate thread. It's just a different magnitude of headaches. So while it might seem like not much to work with, it could really help this situation massively.

Edit2: #1059 (comment) this also sounds very similar to what I am suggesting. I'd just prefer a proper, documented solution. It can still be marked as experimental. What about SDL_SetWindowsResizeProcessingHook or something similar as a name? The frequency in which it is called really wouldn't matter much, as long as it is "multiple times a second or more." Most proper code will know how to deal with game loop time fluctuations, after all.

In conclusion, I don't see much value in testing the patch. But is such a callback maybe more feasible? If yes, could this issue be reopened to reconsider that? It won't fix the redraw, but I really think the discussion got too sidetracked on that.

@slouken
Collaborator

slouken commented Aug 9, 2021

You're welcome to create a callback approach, but please create a new issue and/or pull request for that, since it's fundamentally different from this one.

@ell1e
Contributor

ell1e commented Aug 9, 2021

@slouken would it make sense to reopen #4614 then? However, I think reopening this one (instead) would also be useful, since I don't see that it started with this drawing-focused fix. That just happened later in the discussion; it wasn't the initial "opener" as far as I can see.

@slime73
Contributor

slime73 commented Aug 9, 2021

Are you worried about other platforms? This issue only deals with Windows, but similar things can happen for other platforms. For example on macOS if you click-and-hold on the close, minimize, or maximize window buttons, or open any of the app's menu bar tabs, the OS won't return from its event poll until that's done.

I don't know what a cross-platform 'solution' to event-thread-blocking would be (if one even exists) aside from restructuring your code to not have timing-critical things run on the only thread that has arbitrary blocking due to user and OS interaction, but if one exists I think it'd make more sense to discuss it in a cross-platform context rather than in a Windows issue thread.

@ell1e
Contributor

ell1e commented Aug 9, 2021

@slime73 I was simply unaware of that, since Linux doesn't seem to have any comparable issues, and I only have test environments for Windows and Linux. However:

don't know what a cross-platform 'solution' to event-thread-blocking would be (if one even exists)

I think from the SDL2 API side this is trivial. Just name it SDL_SetOSBlockingWindowOperationsProcessingHook or something. I mean that's a terrible name, but you get the idea. Now whether macOS's window management API even allows implementing that I wouldn't know. I personally usually don't port my apps to macOS, for various reasons. (I actually also don't know if WinAPI allows it, I just read some comments above that suggested it does. I do use quite a bit of WinAPI directly, but not the windowing-related parts.)

aside from restructuring your code to not have timing-critical things run on the only thread that has arbitrary blocking due to user and OS interaction

In my opinion this is not necessarily as "brilliant" a design as some make it out to be, so let's just agree to disagree here. I think many others would see it the way I do. And this can often be solved too, by sticking with libraries that respect this problem better, instead of just hand-waving with "uh, throw threads at it or something." (Granted, SDL2 usually does respect this well outside of these few corner cases.) I could discuss this for a long time, but can we maybe just work under the premise that it's useful if people aren't forced to work around this with threads?

@icculus
Collaborator

icculus commented Aug 10, 2021

(I just want to reiterate that any program that can't deal with the process being starved of CPU time is fundamentally broken no matter what we do or do not do with window resizing. If you replace "user is resizing the window" with "daily virus scanner started running and nothing is moving quickly now" or "system ran out of memory and started swapping heavily to disk" you still have a bug in your program if the audio goes out of sync or network connections drop, etc.)

@ell1e
Contributor

ell1e commented Aug 10, 2021

@icculus I don't understand. At face value your comment just seems irrelevant to me. Any networked action game will fundamentally drop out of the session if the entire PC hangs... so, huh?

I am really surprised I even need to go into this, since SDL2 seems to encourage a less-threads-is-better design in general, so why is my request apparently so weird? How in particular is it strange to want to not make the game misbehave and drop out just when I resize the window?

Yes, disk I/O should be loading screen only, or in threads. (Or non-blocking I/O! Threads are not always the only answer.) And yes, you can thread game logic and netcode, too. Should you? Should you just to make resizing not break everything massively? How is this scenario so contentious all of a sudden? I'm legit stumped.

So to get back to the issue, would it be possible to add a "let me do non-UI things on the main thread while the OS blocks the window" to SDL2? I find it really hard to believe it's just me finding that useful, even if I just scroll to previous comments. I don't understand this discussion. I don't understand either why "you HAVE to use threads" is an acceptable answer.

@iactix

iactix commented Dec 7, 2022

It seems very odd that this kind of industry-leading library is unable to let my code run when the window is dragged. For like a decade, from what I'm reading here? Just don't render anything, let all SDL code fail horribly, anything, but for FSM's sake, don't block my code!

@playmer
Contributor

playmer commented Dec 7, 2022

SDL doesn't know ahead of time that you're entering the message pump to be dragged for an indeterminate amount of time. This is a limitation of the design of the SDL_Event loop interacting with the Windows event loop. There are many workarounds, but they'd need to be implemented and all require changes on the part of the app.

Perhaps the design could be revisited in SDL3 to not interact poorly, but I'm not sure what it would look like.

@iactix

iactix commented Dec 7, 2022

I am sure the technicalities of why this is a problem are sound, and I am sure it's the usual Microsoft thing that's causing it. However, the consequences are absolutely terrible. All I'm saying, all my criticism is regarding priorities. I'm sure music visualizations and things like that are loving it. If they are smart, they'll probably let the music go bwbwbwbwbw until you let the window go. Heck, I'm going mad just having my avg time measures f'ed up for seconds when I have to move the window out of the way of the console after starting all the time.

Btw., all of that needs to be combined with everything I've read about how you must not move the rendering or the event loop to different threads. All of this appears pretty extreme to me. Which is of course measured by the standing SDL seems to have. It's not like I would complain about some dude's engine that way. Anyway, cheerio everyone. Just felt that this whole thing needed quite the kick in the behind.

@joncampbell123

There is a reason dragging/resizing the window or using the menus blocks execution of your SDL application.

The way Windows handles those interactions, and always has handled them, going back even to Windows 1.x, is that DefWindowProc() goes into its own event handling loop to handle that action. This of course blocks the SDL event handling loop.

The way DOSBox-X handles it is by modding the SDL library to maintain both a parent top level window and a child window inside, and then a separate thread handles message handling. If DefWindowProc() blocks for window size/move and menu interaction, then that thread is blocked while the main SDL application continues to run unimpeded.

Perhaps official SDL development can handle it differently or possibly more cleanly, but that's how you can avoid the blocking issue entirely.

@iactix

iactix commented Dec 7, 2022

Appreciate the help, but really I'm not using a cross-platform thingy that "is mainly used to handle cross platform window management" to work around window management tailored to specific platforms. The only solution that works is that SDL is just able to move a running program across the screen, even on an outlandish platform like Windows.

@icculus
Collaborator

icculus commented Dec 7, 2022

We're still discussing what the appropriate way to work around this Windows limitation should be for SDL3, which is why this issue is still open.

While we discuss that, I'm going to lock this thread, as I think we have enough feedback telling us that people feel strongly about finding a resolution.

@slouken
Collaborator

slouken commented Nov 8, 2023

I've added a solution that dovetails nicely with the new main callbacks in SDL 3.0, and if you're not using those, you can set an event watcher to handle expose events and draw then.

Thanks for all the feedback!

@libsdl-org libsdl-org unlocked this conversation Nov 8, 2023
slouken added a commit that referenced this issue Nov 8, 2023
… loop

SDL will send an SDL_EVENT_WINDOW_EXPOSED event for your window during the modal interaction and you can use an event watcher to redraw your window directly from the callback.

Fixes #1059
Closes #4836
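
The mechanism the commit message describes can be modelled in plain C as follows. This is a toy model under stated assumptions, not SDL's implementation: `event_system`, `push_event`, and `redraw_on_expose` are hypothetical names. In SDL itself the commit message refers to an event watcher and to `SDL_EVENT_WINDOW_EXPOSED`; the registration call there is `SDL_AddEventWatch`, whose exact callback signature should be checked against the SDL version in use. The key property shown is that a watcher runs at the moment an event is pushed, so it can redraw even while the code that normally polls the queue is stuck inside a modal loop.

```c
#include <assert.h>
#include <stddef.h>

/* Toy event types; only the expose event matters for this sketch. */
typedef enum { EV_NONE, EV_WINDOW_EXPOSED, EV_KEY } event_type;

typedef struct {
    event_type type;
} event;

/* Hypothetical watcher signature, loosely modelled on an event filter. */
typedef void (*event_watch_fn)(void *userdata, const event *ev);

#define QUEUE_CAP 16

typedef struct {
    event queue[QUEUE_CAP];  /* events waiting for the normal poll loop */
    size_t count;
    event_watch_fn watch;    /* called synchronously on every push */
    void *watch_userdata;
} event_system;

/* The watcher sees the event immediately; the event is then queued for
 * later polling. Returns 1 if queued, 0 if the queue was full. */
int push_event(event_system *sys, event ev)
{
    assert(sys != NULL);
    if (sys->watch != NULL) {
        sys->watch(sys->watch_userdata, &ev);
    }
    if (sys->count == QUEUE_CAP) {
        return 0;
    }
    sys->queue[sys->count++] = ev;
    return 1;
}

typedef struct {
    int redraws;  /* how many times the window was repainted */
} renderer;

/* Watcher body: repaint on expose events, ignore everything else. */
void redraw_on_expose(void *userdata, const event *ev)
{
    renderer *r = userdata;
    if (ev->type == EV_WINDOW_EXPOSED) {
        r->redraws++;  /* a real app would render and present here */
    }
}
```

Pushing two expose events and one key event through this model triggers two redraws from inside `push_event`, while all three events still sit in the queue for the regular loop to consume once it is running again.
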
lao-wen pushed a commit to lao-wen/scrcpy that referenced this issue Nov 16, 2023
It turns out that the workaround only worked for MacOS.

Refs Genymobile#3458 <Genymobile#3458>
Refs SDL/Genymobile#1059 <libsdl-org/SDL#1059>
@RT2Code
Contributor

RT2Code commented Dec 5, 2023

Being angry because you are completely ignorant of a topic doesn't help you or anyone else. This issue is indeed related to Win32 modal loops, and also applies if you use Win32 directly. A little search on the internet, or just reading this conversation, would have told you about that. Moreover, this problem is solved now, so I don't see the point of your intervention.

Anyway, props to the SDL team for your amazing work on this library, you don't deserve such rude comments.

@clseibold

clseibold commented Dec 6, 2023

@RT2Code It should be the SDL team that apologizes for the rude comments themselves. If you make a multiplatform library like this, it is your responsibility to make sure that your event loop interacts correctly with ALL of the platforms. SDL had this issue for several years and they kept brushing it off, as many of the other people in this conversation have pointed out. @icculus's comments above, berating people because they expect this library to not block on window resizes (or even holding down one of the window buttons on macOS), and even trying to compare this to someone unplugging an ethernet cable, was extremely disappointing and completely uncalled for.

And no, the problem isn't solved. It still exists in SDL2. We still have to use a workaround for SDL2.
Your "example" StackOverflow link (as if StackOverflow is the best place to get programming advice, lmao) isn't using the Windows API correctly. I think you will find that the Win32 API documents what is expected and what isn't, unlike SDL.
This is well explained in one of the StackOverflow answers:

When DefWindowProc handles WM_SYSCOMMAND with either SC_MOVE or SC_SIZE in the wParam, it enters a loop until the user stops it by releasing the mouse button, or pressing either enter or escape. It does this because it allows the program to render both the client area (where your widgets or game or whatever is drawn) and the borders and caption area by handling WM_PAINT and WM_NCPAINT messages (you should still receive these events in your Window Procedure).

It works fine for normal Windows apps, which do most of their processing inside of their Window Procedure as a result of receiving messages. It only affects programs which do processing outside of the Window Procedure, such as games (which are usually fullscreen and not affected anyway).

Many people have solved this problem easily. It only took the SDL team many years to fix it.

@icculus
Collaborator

icculus commented Dec 6, 2023

icculus's comments above, berating people because they expect this library to not block

I didn't berate people, I offered several possible technical workarounds, and I locked this thread because it keeps generating unhelpful commentary like this, which is also why I'm locking it again now.

@libsdl-org libsdl-org locked and limited conversation to collaborators Dec 6, 2023
@slouken
Collaborator

slouken commented Dec 6, 2023

This is fixed for the SDL 2.30 release, in 509c70c
