We are currently migrating Bugzilla to GitHub issues.
Any changes made to the bug tracker now will be lost, so please do not post new bugs or make changes to them.
When we're done, all bug URLs will redirect to their equivalent location on the new bug tracker.

Bug 3775

Summary: SDL_audiotypecvt scalar and SSE2 implementations produce different results
Product: SDL Reporter: sfalexrog
Component: audioAssignee: Ryan C. Gordon <icculus>
Status: RESOLVED FIXED QA Contact: Sam Lantinga <slouken>
Severity: blocker    
Priority: P2 CC: programmerjake
Version: HG 2.1Keywords: target-2.0.6
Hardware: x86_64   
OS: Linux   
Attachments: Initial sample
Captured audio with scalar converter (44100hz, s16b, stereo)
Captured audio with SSE2 converter (44100hz, s16b, stereo)

Description sfalexrog 2017-08-24 22:01:26 UTC
After encountering audio glitches on Android when converting 11kHz 8-bit unsigned mono samples to 22050/44100 16-bit stereo, I've tried to replicate the same behavior on an x86 Linux desktop. To my surprise, there were no audible glitches there. Then I've noticed that x86 builds would use SSE2 versions for type conversion, and tried disabling them. This allowed me to replicate said glitches.

Curiously enough, SDL2 builds with SSE2 enabled for Android x86 targets, so the sound is fine there.

The glitches sound like several milliseconds of static.
Comment 1 Ryan C. Gordon 2017-08-24 23:08:02 UTC
To be clear: the non-SSE code is glitching, not the SSE code? That's surprising.

--ryan.
Comment 2 sfalexrog 2017-08-25 08:22:22 UTC
Created attachment 2889 [details]
Initial sample
Comment 3 sfalexrog 2017-08-25 08:23:24 UTC
Created attachment 2890 [details]
Captured audio with scalar converter (44100hz, s16b, stereo)
Comment 4 sfalexrog 2017-08-25 08:23:48 UTC
Created attachment 2891 [details]
Captured audio with SSE2 converter (44100hz, s16b, stereo)
Comment 5 sfalexrog 2017-08-25 08:24:54 UTC
Now that I've looked at the file revisions, I'm surprised as well. Using scalar and SSE2 routines produce different results, but it appears that scalar should be the correct one? Then there's a bug somewhere further down the pipeline as well.

I've tried recording a sample that produces audio glitches (sounds like clipping) with and without SSE2.
Comment 6 Ryan C. Gordon 2017-08-25 20:41:04 UTC
I'm looking at this.

--ryan.
Comment 7 sfalexrog 2017-08-26 07:17:28 UTC
Just some random thoughts:

 - The chain of filters in my case looks like this: U8_to_F32 -> MonoToStereo -> ResampleCVT -> F32_to_S16

 - U8_to_F32 remaps [0..255] to [-1.0f..1.0f+DIVBY127]. This by itself does not cause any overflows, but I'm pretty sure we actually want to map [0..255] to [-1.0f..1.0f]

 - MonoToStereo just doubles the amount of samples, not messing with their values

 - ResampleAudioSimple interpolates the samples, not checking their amplitude. Since the original sample rate is 8000Hz and the destination is 44100Hz, there's a chance some samples will be in the [1.0f..1.0f+DIVBY127] range

 - In scalar variant of F32_to_S16, samples larger than 1.0f will overflow (1.0f + DIVBY127 converts to 33025.007812f and overflows to -32511). This does not happen with SSE variant, because _mm_packs_epi32 uses saturation arithmetic (well, that's my understanding of what's in the docs, at least: https://software.intel.com/sites/landingpage/IntrinsicsGuide/#text=packs_&expand=1649,3808 )

 - Using 
   *dst = ((((float)*src) / 127.5f) - 1.0f);
in U8_to_F32 seems to make the sound not glitch. Not sure if that's a correct "fix", though.
Comment 8 programmerjake 2017-08-29 00:58:14 UTC
(In reply to sfalexrog from comment #7)
>  - Using 
>    *dst = ((((float)*src) / 127.5f) - 1.0f);
> in U8_to_F32 seems to make the sound not glitch. Not sure if that's a
> correct "fix", though.

PulseAudio uses:
*dst = *src * (1.0 / 128.0) - 1.0;

https://github.com/pulseaudio/pulseaudio/blob/master/src/pulsecore/sconv.c#L39
Comment 9 Ryan C. Gordon 2017-08-29 03:42:15 UTC
(In reply to programmerjake from comment #8)
> (In reply to sfalexrog from comment #7)
> >  - Using 
> >    *dst = ((((float)*src) / 127.5f) - 1.0f);
> > in U8_to_F32 seems to make the sound not glitch. Not sure if that's a
> > correct "fix", though.
> 
> PulseAudio uses:
> *dst = *src * (1.0 / 128.0) - 1.0;

Yeah, looking at this, DIVBY127 was wrong for the S8 path, too (should also divide by 128, for the same reason).

--ryan.
Comment 10 Ryan C. Gordon 2017-08-29 03:44:10 UTC
> Yeah, looking at this, DIVBY127 was wrong for the S8 path, too (should also
> divide by 128, for the same reason).

Argh, the 16 and 32 bit DIVBY macros are also like this, actually.  :/

--ryan.
Comment 11 Ryan C. Gordon 2017-08-29 04:46:20 UTC
These should all be fixed now, in https://hg.libsdl.org/SDL/rev/4cdc242e4102

--ryan.