We are currently migrating Bugzilla to GitHub issues.
Any changes made to the bug tracker now will be lost, so please do not post new bugs or make changes to them.
When we're done, all bug URLs will redirect to their equivalent location on the new bug tracker.

Bug 63

Summary: BlitNtoNPixelAlpha, possible fix
Product: SDL Reporter: Ryan C. Gordon <icculus>
Component: videoAssignee: Ryan C. Gordon <icculus>
Status: RESOLVED WONTFIX QA Contact: Sam Lantinga <slouken>
Severity: normal    
Priority: P2 CC: betasoft
Version: HG 1.2   
Hardware: All   
OS: All   
Attachments: Benchmark program for 3 diffrent alpha algorithms

Description Ryan C. Gordon 2006-01-22 03:19:44 UTC
From: Daniel F Moisset <dmoisset@except.com.ar>
To: sdl@libsdl.org
Date: Thu, 12 Jan 2006 18:29:28 -0300
Subject: [SDL] alpha blending bug - possible fix?

Recently I posted about a problem with alpha blending that turned out to
be a problem with the generic blit function when there is no
acceleration, BlitNtoNPixelAlpha. Alex Volkov pointed me to the
following comment:

	/* FIXME: for 8bpp source alpha, this doesn't get opaque values
	   quite right. for <8bpp source alpha, it gets them very wrong
	   (check all macros!)
	   It is unclear whether there is a good general solution that doesn't
	   need a branch (or a divide). */
	   
The problem is a precision bug at the ALPHA_BLEND macro:

#define ALPHA_BLEND(sR, sG, sB, A, dR, dG, dB)	\
do {						\
	dR = (((sR-dR)*(A))>>8)+dR;		\
	dG = (((sG-dG)*(A))>>8)+dG;		\
	dB = (((sB-dB)*(A))>>8)+dB;		\
} while(0)

you can make a slight correction changing this to:

#define ALPHA_BLEND(sR, sG, sB, A, dR, dG, dB)	\
do {						\
    int premultR = (sR-dR)*(A); \
    int premultG = (sG-dG)*(A); \
    int premultB = (sB-dB)*(A); \
    dR += ((premultR>>8)+((A)>>7)+(premultR>>16);	\
    dG += ((premultG>>8)+((A)>>7)+(premultG>>16);	\
    dB += ((premultB>>8)+((A)>>7)+(premultB>>16);	\
} while(0)

That incurs into some extra shifts and adds, but no division or branch.
The correction is not better on the average (it is slightly worse,
although the maximum error in the function is the same), but it gives
equal or much better results at usual alpha values (0, 128, 255)

could this change be introduced? thanks a lot, 

     D.

PS: I tested the formula separately an it works, but have not tried
merging the above into SDL. perhaps something needs to be fixed first

Cheers,
	D.

-- 
Except - Free Software developers for hire - http://except.com.ar


_______________________________________________
SDL mailing list
SDL@libsdl.org
http://www.libsdl.org/mailman/listinfo/sdl
Comment 1 Ryan C. Gordon 2006-01-22 03:23:51 UTC
Date: Sat, 21 Jan 2006 01:56:27 +0100
From: Stephane Marchesin <stephane.marchesin@wanadoo.fr>
To: "A list for developers using the SDL library. \(includes SDL-announce\)" <sdl@libsdl.org>
Subject: Re: [SDL] alpha blending bug - possible fix?

> [...quote snipped by ryan for bugzilla posting...]
> could this change be introduced? thanks a lot, 

Is changing (even slightly) the alpha behaviour for one among many alpha 
blitting functions a good idea ?

Stephane



_______________________________________________
SDL mailing list
SDL@libsdl.org
http://www.libsdl.org/mailman/listinfo/sdl

Comment 2 Ryan C. Gordon 2006-01-22 03:25:45 UTC
Date: Sat, 21 Jan 2006 13:58:38 +0100 (MET)
From: Mattias Karlsson <betasoft@acc.umu.se>
To: "A list for developers using the SDL library. \(includes SDL-announce\)" <sdl@libsdl.org>
Subject: Re: [SDL] alpha blending bug - possible fix?

> [...quotation clipped for bugzilla by ryan...Sam was asking...]
>
> Did you profile your code as opposed to simply dividing by 255?

I have done some quick-and-dirty testing on both UltraSparc3 and 
Xeon by blending two arrays. Some preliminary results:

* gcc 3.4 replaces /255 with a multiply+shift on both processors. 
* The suggested replacement above is on average faster than division, but
   slower than the current shifts.
* On UltraSparc3 there is hardly any difference in speed between the
   suggested replacement and using division, unless the arrays grow realy,
   realy large.
* The difference in time between cache-hit and cache-miss is larger than
   the difference between shift and division; division + cache-hit is
   faster than shift + cache-miss.

Note that this is not tested using SDL blitter code, but a seperate 
implementation using the three different blending algorithms.


More tests are in progress...

_______________________________________________
SDL mailing list
SDL@libsdl.org
http://www.libsdl.org/mailman/listinfo/sdl

Comment 3 Mattias Karlsson 2006-01-22 14:44:05 UTC
My continued testing have revealed two bugs in the suggested algorithm:

For src=255, dest=255 and alpha >= 128 it overflows, giving (modulo 255) the value 0 instead of the expected 255.

For src=0, dest=1 and  0 < alpha < 127 it underflows, giving (modulo 255) the value 255 instead of the expected 0.

Also, "slightly worse" means that it is off-by-one half the time, while shifting is off-by-one only in 1 out of 6 on average.
Comment 4 Mattias Karlsson 2006-01-22 14:45:00 UTC
Created attachment 27 [details]
Benchmark program for 3 diffrent alpha algorithms
Comment 5 Ryan C. Gordon 2006-01-27 11:23:19 UTC
Setting Sam as "QA Contact" on all bugs (even resolved ones) so he'll definitely be in the loop to any further discussion here about SDL.

--ryan.

Comment 6 Sam Lantinga 2006-03-15 10:51:09 UTC
Thanks for the analysis, Mattias.  The fix proposed won't be implemented, although I'm open to suggestion for other fixes that are correct and still speedy. :)