| Summary: | SDL_audiotypecvt scalar and SSE2 implementations produce different results | ||
|---|---|---|---|
| Product: | SDL | Reporter: | sfalexrog |
| Component: | audio | Assignee: | Ryan C. Gordon <icculus> |
| Status: | RESOLVED FIXED | QA Contact: | Sam Lantinga <slouken> |
| Severity: | blocker | ||
| Priority: | P2 | CC: | programmerjake |
| Version: | HG 2.1 | Keywords: | target-2.0.6 |
| Hardware: | x86_64 | ||
| OS: | Linux | ||
| Attachments: |
Initial sample
Captured audio with scalar converter (44100hz, s16b, stereo) Captured audio with SSE2 converter (44100hz, s16b, stereo) |
||
|
Description
sfalexrog
2017-08-24 22:01:26 UTC
To be clear: the non-SSE code is glitching, not the SSE code? That's surprising. --ryan. Created attachment 2889 [details]
Initial sample
Created attachment 2890 [details]
Captured audio with scalar converter (44100hz, s16b, stereo)
Created attachment 2891 [details]
Captured audio with SSE2 converter (44100hz, s16b, stereo)
Now that I've looked at the file revisions, I'm surprised as well. Using scalar and SSE2 routines produce different results, but it appears that scalar should be the correct one? Then there's a bug somewhere further down the pipeline as well. I've tried recording a sample that produces audio glitches (sounds like clipping) with and without SSE2. I'm looking at this. --ryan. Just some random thoughts: - The chain of filters in my case looks like this: U8_to_F32 -> MonoToStereo -> ResampleCVT -> F32_to_S16 - U8_to_F32 remaps [0..255] to [-1.0f..1.0f+DIVBY127]. This by itself does not cause any overflows, but I'm pretty sure we actually want to map [0..255] to [-1.0f..1.0f] - MonoToStereo just doubles the amount of samples, not messing with their values - ResampleAudioSimple interpolates the samples, not checking their amplitude. Since the original sample rate is 8000Hz and the destination is 44100Hz, there's a chance some samples will be in the [1.0f..1.0f+DIVBY127] range - In scalar variant of F32_to_S16, samples larger than 1.0f will overflow (1.0f + DIVBY127 converts to 33025.007812f and overflows to -32511). This does not happen with SSE variant, because _mm_packs_epi32 uses saturation arithmetic (well, that's my understanding of what's in the docs, at least: https://software.intel.com/sites/landingpage/IntrinsicsGuide/#text=packs_&expand=1649,3808 ) - Using *dst = ((((float)*src) / 127.5f) - 1.0f); in U8_to_F32 seems to make the sound not glitch. Not sure if that's a correct "fix", though. (In reply to sfalexrog from comment #7) > - Using > *dst = ((((float)*src) / 127.5f) - 1.0f); > in U8_to_F32 seems to make the sound not glitch. Not sure if that's a > correct "fix", though. PulseAudio uses: *dst = *src * (1.0 / 128.0) - 1.0; https://github.com/pulseaudio/pulseaudio/blob/master/src/pulsecore/sconv.c#L39 (In reply to programmerjake from comment #8) > (In reply to sfalexrog from comment #7) > > - Using > > *dst = ((((float)*src) / 127.5f) - 1.0f); > > in U8_to_F32 seems to make the sound not glitch. Not sure if that's a > > correct "fix", though. > > PulseAudio uses: > *dst = *src * (1.0 / 128.0) - 1.0; Yeah, looking at this, DIVBY127 was wrong for the S8 path, too (should also divide by 128, for the same reason). --ryan.
> Yeah, looking at this, DIVBY127 was wrong for the S8 path, too (should also
> divide by 128, for the same reason).
Argh, the 16 and 32 bit DIVBY macros are also like this, actually. :/
--ryan.
These should all be fixed now, in https://hg.libsdl.org/SDL/rev/4cdc242e4102 --ryan. |