| Summary: | Blit from SDL_PIXELFORMAT_RGB555 to SDL_PIXELFORMAT_ARGB1555 is slow | ||
|---|---|---|---|
| Product: | SDL | Reporter: | bugmenot_0 <kajema2739> |
| Component: | video | Assignee: | Sam Lantinga <slouken> |
| Status: | NEW --- | QA Contact: | Sam Lantinga <slouken> |
| Severity: | normal | ||
| Priority: | P2 | ||
| Version: | 2.0.9 | ||
| Hardware: | x86 | ||
| OS: | All | ||
Recently, another user reported to me that a blit from an RGB555 renderer texture to an ARGB1555 framebuffer (the window format in the video backend) is very slow in the software renderer. I believe this is a missing optimization in the blitter in the video subsystem. Unfortunately, I wasn't provided with any more precise measurements. They eventually fixed the performance issue in their application by using the same format for both (which is obviously faster).

However, looking at the formats, this shouldn't be a bottleneck, because the formats are largely the same:

```
SDL_PIXELFORMAT_RGB555 =
    SDL_DEFINE_PIXELFORMAT(SDL_PIXELTYPE_PACKED16, SDL_PACKEDORDER_XRGB,
                           SDL_PACKEDLAYOUT_1555, 15, 2),
[...]
SDL_PIXELFORMAT_ARGB1555 =
    SDL_DEFINE_PIXELFORMAT(SDL_PIXELTYPE_PACKED16, SDL_PACKEDORDER_ARGB,
                           SDL_PACKEDLAYOUT_1555, 16, 2),
```

So this should still be a fast copy (it only has to set or clear one bit per pixel during the copy).

The test machine was a Pentium 3 (only has SSE and MMX). I believe their tests were done without compiler optimizations. Our toolchain's `memcpy` implementation is a simple loop that copies individual bytes, so I'd expect trivial conversions like this to perform similarly.

I know that SDL 2.0.10 added a bunch of blitter optimizations, but I don't think this conversion is optimized in any form. I assume 2.0.12 performs similarly.
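
To illustrate what I mean by "set a bit during the copy", here is a minimal sketch of the per-row conversion I would expect a fast path to boil down to. This is not SDL code: `rgb555_to_argb1555_row` is a hypothetical helper name, and it assumes the destination alpha bit should always be set (fully opaque), since the source format carries no alpha.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not part of SDL.
 * Converts one row of RGB555 (XRGB1555) pixels to ARGB1555.
 * Both formats keep R, G and B in the low 15 bits, so the only
 * work is forcing bit 15 (alpha) to 1. Assumption: the result
 * should be fully opaque. */
static void rgb555_to_argb1555_row(const uint16_t *src, uint16_t *dst,
                                   size_t count)
{
    assert(src != NULL && dst != NULL);

    for (size_t i = 0; i < count; ++i) {
        /* Keep the 15 colour bits, set the alpha bit. */
        dst[i] = (uint16_t)(src[i] | 0x8000u);
    }
}
```

A real blitter would call something like this once per row, stepping through source and destination by their respective pitches. Even unoptimized, the loop does one 16-bit load, one OR and one 16-bit store per pixel, which is why I'd expect it to be in the same ballpark as a byte-wise `memcpy`.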