| Summary: | sdlgenblit.pl / SDL_blit_auto.c does expensive tasks in the inner pixel-processing loop | ||
|---|---|---|---|
| Product: | SDL | Reporter: | bugmenot_0 <kajema2739> |
| Component: | video | Assignee: | Sam Lantinga <slouken> |
| Status: | WAITING --- | QA Contact: | Sam Lantinga <slouken> |
| Severity: | normal | ||
| Priority: | P2 | ||
| Version: | 2.0.12 | ||
| Hardware: | All | ||
| OS: | All | ||
Patches are welcome! Please include optimized benchmark timing results for your changes.
The code generator in sdlgenblit.pl moves a lot of the conditional operations into the inner loop which processes pixels. This goes against intuition. While it creates readable code, it is potentially slow code. I feel like the main purpose of the generator is to produce fast code; the generated code doesn't have to be concise, and redundancy is to be expected. The only code quality that matters is that of the generator script itself.

As such, it would probably be better to avoid constructs like this:

```
for (y = 0; y < height; y++) {
    for (x = 0; x < width; x++) {
        switch (flags) {
        case SDL_COPY_BLEND:
            process_pixel_blend();
            break;
        case SDL_COPY_ADD:
            process_pixel_add();
            break;
        case SDL_COPY_MOD:
            process_pixel_mod();
            break;
        case SDL_COPY_MUL:
            process_pixel_mul();
            break;
        }
    }
}
```

Instead, the code generator should probably generate this:

```
switch (flags) {
case SDL_COPY_BLEND:
    for (y = 0; y < height; y++) {
        for (x = 0; x < width; x++) {
            process_pixel_blend();
        }
    }
    break;
case SDL_COPY_ADD:
    for (y = 0; y < height; y++) {
        for (x = 0; x < width; x++) {
            process_pixel_add();
        }
    }
    break;
case SDL_COPY_MOD:
    for (y = 0; y < height; y++) {
        for (x = 0; x < width; x++) {
            process_pixel_mod();
        }
    }
    break;
case SDL_COPY_MUL:
    for (y = 0; y < height; y++) {
        for (x = 0; x < width; x++) {
            process_pixel_mul();
        }
    }
    break;
}
```

This would avoid relying on the optimizing compiler to move the branch out of the loop, and it makes the intention more obvious, which might also help optimizers.
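To make the comparison concrete, here is a minimal, self-contained sketch of both shapes. It works on a single 8-bit channel with two modes only. The names `blend_mode`, `px_add`, `px_mod`, `blit_switch_inside`, and `blit_switch_hoisted` are hypothetical and do not come from SDL_blit_auto.c, and the enum values are not SDL's flag values. Both variants should produce identical output; whether the hoisted one is actually faster depends on the compiler and optimization level, since optimizers may already do this transformation themselves (often called loop unswitching), so it needs to be benchmarked.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the blend flags; not SDL's actual values. */
enum blend_mode { MODE_ADD, MODE_MOD };

/* Saturating add of two 8-bit channel values. */
static uint8_t px_add(uint8_t s, uint8_t d)
{
    unsigned v = (unsigned)s + (unsigned)d;
    return (uint8_t)(v > 255u ? 255u : v);
}

/* Modulate: scale d by s / 255. */
static uint8_t px_mod(uint8_t s, uint8_t d)
{
    return (uint8_t)(((unsigned)s * (unsigned)d) / 255u);
}

/* Variant A: the mode is tested again for every pixel. */
void blit_switch_inside(const uint8_t *src, uint8_t *dst, size_t n,
                        enum blend_mode mode)
{
    for (size_t i = 0; i < n; i++) {
        switch (mode) {
        case MODE_ADD:
            dst[i] = px_add(src[i], dst[i]);
            break;
        case MODE_MOD:
            dst[i] = px_mod(src[i], dst[i]);
            break;
        }
    }
}

/* Variant B: the mode is tested once; each case owns its own loop. */
void blit_switch_hoisted(const uint8_t *src, uint8_t *dst, size_t n,
                         enum blend_mode mode)
{
    switch (mode) {
    case MODE_ADD:
        for (size_t i = 0; i < n; i++) {
            dst[i] = px_add(src[i], dst[i]);
        }
        break;
    case MODE_MOD:
        for (size_t i = 0; i < n; i++) {
            dst[i] = px_mod(src[i], dst[i]);
        }
        break;
    }
}
```

A fair benchmark would run both variants over the same buffers in an optimized build and compare timings, as requested in the note above the description.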