Bug 5197

Summary: Metal renderer seems to be inefficient and is blurry on Retina displays without High-DPI flag
Product: SDL
Reporter: Konrad <iryont>
Component: render
Assignee: Alex Szpakowski <amaranth72>
Status: RESOLVED INVALID
QA Contact: Sam Lantinga <slouken>
Severity: normal
Priority: P2
Keywords: target-2.0.14
Version: 2.0.13
Hardware: x86_64
OS: macOS 10.15

Description Konrad 2020-06-16 21:01:34 UTC
If you compare the OpenGL and Metal SDL2 renderer backends on macOS, you will notice that the difference is smaller than you would expect from a command-buffer-style API.

I wrote a Metal backend for my game (I use the SDL2 renderer as a fallback) and noticed that in a simple scene of about 20 draw calls I could not push more than 1k fps or so. Alongside Metal I have an OpenGL 3.3 core renderer, which pushed over 4k fps in the same scene, and it made me wonder what was going on. Metal is a command-buffer-style API: you encode all of your requests for the GPU and submit them at once for asynchronous execution, so it should at least match the performance of OpenGL 3.3 core. It did not.

At first I did the same as SDL2 does: I created an NSView and attached a CAMetalLayer to it. That turned out to be rather slow, which led me to profile the game. I noticed that the whole rendering process blocks on nextDrawable, usually for about 800-900 us (even though the number of drawables was set to 3, which is the maximum).
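
As a rough sanity check on those numbers: if every frame blocks for the measured time in nextDrawable, that stall alone caps the frame rate. The sketch below computes that ceiling, assuming exactly one stall per frame and no other per-frame work. The helper name fps_ceiling_from_stall_us is hypothetical and is not an SDL or Metal API.

```c
#include <assert.h>

/* Hypothetical helper, not part of SDL or Metal: upper bound on frames
 * per second when each frame blocks for stall_us microseconds and does
 * nothing else. */
double fps_ceiling_from_stall_us(double stall_us)
{
    assert(stall_us > 0.0);
    return 1000000.0 / stall_us;
}
```

For stalls of 800 and 900 us this gives 1250 and roughly 1111 fps, which is in line with the "1k fps or so" reported above.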

Later, frustrated by the performance, I switched to MTKView out of curiosity to try its currentDrawable and currentRenderPassDescriptor properties. I changed nothing else, and the same scene now ran at 7-8k fps. I am pretty much unfamiliar with Obj-C and the Apple APIs, having barely scratched the surface by writing a simple 2D renderer, so I cannot explain the difference. But the difference is there, and it may be worth investigating for the SDL2 renderer.

Also, both the software and Metal renderers are blurry on Retina displays when the High-DPI flag is not used. This should not be the case, since the OpenGL backend is crystal clear. The fix is to set the magnification filter of the view's underlying layer:

[layer setMagnificationFilter:kCAFilterNearest];

Comment 1 Alex Szpakowski 2020-06-16 22:16:34 UTC
(In reply to Konrad from comment #0)
> I was not able to push more than 1k fps or so. Right besides
> Metal I have OpenGL 3.3 core renderer which was able to push over 4k fps in
> the same scene

That's a difference of 0.75ms/frame (for reference, a frame will take 16.6ms at 60fps).
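
The arithmetic behind that figure can be sketched in a few lines of C. The helper names below are hypothetical and not part of SDL; they only convert a frame rate into a per-frame time.

```c
#include <assert.h>

/* Hypothetical helper (not part of SDL): milliseconds spent per frame
 * at a given frame rate. */
double frame_time_ms(double fps)
{
    assert(fps > 0.0);
    return 1000.0 / fps;
}

/* Difference in per-frame cost between a slower and a faster frame rate. */
double frame_time_delta_ms(double slow_fps, double fast_fps)
{
    return frame_time_ms(slow_fps) - frame_time_ms(fast_fps);
}
```

At 1000 fps a frame takes 1 ms and at 4000 fps it takes 0.25 ms, so frame_time_delta_ms(1000.0, 4000.0) yields the 0.75 ms quoted above.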

Can you post some simple code that demonstrates the difference? From your post it's hard for me to sort out what might be specific to your own code (for example acquiring or releasing a drawable at a different time than what Apple recommends, which can have that sort of impact on framerate) versus what might be part of SDL_Render.


> Also, both software and Metal renderers are blurry on Retina displays
> without using High-DPI flag. This should not be the case since OpenGL
> backend is crystal clear. The solution to fix this is to set filter of
> underlying layer for the view:
> 
> [layer setMagnificationFilter:kCAFilterNearest];

Nearest-neighbour versus linear upscaling each have distinct tradeoffs. Nearest neighbour will have much more noticeable pixellation than linear, which might be good for an already-upscaled pixel art game and might be bad for other art styles.
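
To illustrate the tradeoff, here is a minimal sketch of both filters applied to a single row of 8-bit pixels at 2x scale. It is only an illustration under simple assumptions (pixel-centre sampling, clamped edges, rounded integer math) and does not claim to match what Core Animation does internally. The function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Nearest-neighbour 2x upscale of one row: every source pixel is
 * duplicated, so hard edges stay hard. dst must hold 2 * n pixels. */
void upscale2x_nearest(const unsigned char *src, size_t n, unsigned char *dst)
{
    for (size_t i = 0; i < 2 * n; i++) {
        dst[i] = src[i / 2];
    }
}

/* Linear 2x upscale of one row. Each destination pixel centre lies a
 * quarter pixel away from its closest source pixel, so the weights are
 * 3/4 (closest) and 1/4 (neighbour on the other side), clamped at the
 * row edges. dst must hold 2 * n pixels. */
void upscale2x_linear(const unsigned char *src, size_t n, unsigned char *dst)
{
    assert(n > 0);
    for (size_t i = 0; i < 2 * n; i++) {
        size_t closest = i / 2;
        size_t other;
        if (i % 2 == 0) {
            other = closest > 0 ? closest - 1 : closest;
        } else {
            other = closest + 1 < n ? closest + 1 : closest;
        }
        /* +2 rounds to nearest before the integer division by 4. */
        dst[i] = (unsigned char)((3 * src[closest] + src[other] + 2) / 4);
    }
}
```

For a black-to-white edge {0, 255}, the nearest-neighbour version produces {0, 0, 255, 255}, while the linear version produces {0, 64, 191, 255}: the edge is softened, which reads as blur on sharp content and as smoothing on other art styles.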

Apple defaults to linear for layer-backed views (Metal on macOS, OpenGL ES on iOS, Metal on iOS) and defaults to nearest neighbour for the legacy NSOpenGLView APIs on macOS - so OpenGL on macOS is the odd one out across all Apple platforms, right now.

I think making them consistent is probably a good idea, but I don't necessarily think choosing one or the other for all games because your specific game looks good with one is the right approach. Maybe an SDL hint would make sense.

Comment 2 Konrad 2020-06-17 06:27:14 UTC
It's not about pushing a ridiculous number of frames per second, since we know that at such values the time difference isn't really huge, but rather the fact that OpenGL was able to pump out so many while Metal was not.

Anyway, I'm afraid I cannot post any code. My game code is far more involved, and writing a test case that demonstrates the difference would take some time. That is why I said you are free to investigate it; I'm just sharing what I have noticed.

Regarding the filtering, choosing nearest should actually be the right way to go. There is no question that Apple wanted non-HiDPI-aware applications to work the same as they did on non-Retina displays. That is precisely why the pixel density on such MacBooks is 4x (2x in width and 2x in height): we can use integer scaling to match it without blurring in the process.

Comment 3 Konrad 2020-06-21 15:04:11 UTC
I had some time to look at it. I'm not a macOS or Metal expert, but I believe I can push more fps because MTKView's currentRenderPassDescriptor / currentDrawable return even if the next drawable is not available, so we are likely rendering to the same drawable.

It seems the preferred way to use MTKView is to render each frame in its delegate's drawInMTKView: method. Personally I don't do that; I just push as many frames as possible to see how many draw calls I can push. While this technique works fine on newer MacBooks such as mine and those of a few other users I tested with (2019 models), it appears to cause tearing and artifacts on older Macs (e.g. from 2013), so I can only conclude this is not the right way to render.

It is a bit sad that Metal cannot push as many frames as OpenGL 3.3 can, but I guess this is just the way it is. I suppose more drawables in the CAMetalLayer could solve this, since nextDrawable is the stalling call, but for the time being Apple limits it to a maximum of 3. That being the case, perhaps I expected too much from Metal for simple 2D rendering, while it only thrives on hardcore 3D rendering with fps numbers way below 1000. Anyway, I will close this issue, since for the time being I do not recommend this approach, but if anything changes I will let you know.