| Summary: | Resampling of certain sounds adds heavy distortion | ||
|---|---|---|---|
| Product: | SDL | Reporter: | Manuel Alfayate Corchete <redwindwanderer> |
| Component: | audio | Assignee: | Ryan C. Gordon <icculus> |
| Status: | RESOLVED FIXED | QA Contact: | Sam Lantinga <slouken> |
| Severity: | major | ||
| Priority: | P2 | CC: | chli.hug, sezeroz |
| Version: | 2.0.6 | ||
| Hardware: | ARM | ||
| OS: | Linux | ||
| Attachments: |
Small program that showcases the heavy sound distortion that happens on ARM with certain samples.
Patch that adds [-1, 1] clamping to the scalar audio type conversions. Alternative (and sligthly faster) patch that adds [-1, 1] clamping to the scalar audio type conversions. |
||
|
Description
Manuel Alfayate Corchete
2017-10-12 16:06:58 UTC
Looking into this. (In reply to Ryan C. Gordon from comment #1) > Looking into this. The bug definitely reproduces on my RPi3 with your test program, but to my surprise, it wasn't SDL's new resampler at fault. This script resamples that wave file to various frequencies and then plays it back, and they all seem to be okay. for freq in 8000 11025 22050 44100 48000 96000 192000 ; do echo $freq ... ./testresample test.wav $freq.wav $freq 1 ./loopwave $freq.wav done It might be SDL_mixser or SDL's streaming code. Still digging in. --ryan. @r(In reply to Ryan C. Gordon from comment #2) > (In reply to Ryan C. Gordon from comment #1) > > Looking into this. > > The bug definitely reproduces on my RPi3 with your test program, but to my > surprise, it wasn't SDL's new resampler at fault. This script resamples that > wave file to various frequencies and then plays it back, and they all seem > to be okay. > > for freq in 8000 11025 22050 44100 48000 96000 192000 ; do > echo $freq ... > ./testresample test.wav $freq.wav $freq 1 > ./loopwave $freq.wav > done > > It might be SDL_mixser or SDL's streaming code. Still digging in. > > --ryan. If you look at my report, I point Mix_LoadWAV_RW() from SDL_Mixer and SDL_RWFromConstMem() from SDL itself as the possible culprits, because that's where the differenciation of the data bewteen X86_64 and ARM happens... (In reply to Manuel from comment #3) > If you look at my report, I point Mix_LoadWAV_RW() from SDL_Mixer and > SDL_RWFromConstMem() from SDL itself as the possible culprits, because > that's where the differenciation of the data bewteen X86_64 and ARM > happens... Oh, whoops, you did say that, sorry for not reading more carefully! --ryan. Created attachment 3003 [details]
Patch that adds [-1, 1] clamping to the scalar audio type conversions.
This may come from the SDL_Convert_F32_to_X_Scalar functions. They don't clamp the float value to [-1, 1] and when they cast it to the target integer it may be too large or too small for the type and get truncated, causing horrible noise.
The attached patch throws clamping in, but I don't know if that's the preferred way to fix this. For x86 (without SSE) the compiler (I tested MSVC) seems to throw a horrible amount of x87 code in it. It's a bit better with SSE, but probably still quite the performance hit. And SSE2 uses a branchless approach with maxss and minss.
@Simon: it works good here, with this patch applied no more horrendous distortion can be heard on the test or games (RAWL, SDLPop). I hope it's included soon, to be ready for 2.0.7. Last two minor versions (2.0.5 and 2.0.6) had sound broken on ARM :( (In reply to Manuel from comment #6) > @Simon: it works good here, with this patch applied no more horrendous > distortion can be heard on the test or games (RAWL, SDLPop). > I hope it's included soon, to be ready for 2.0.7. Last two minor versions > (2.0.5 and 2.0.6) had sound broken on ARM :( Also, with regard to possible NEON optimization of the clamping, there are saturating opcodes on NEON, yes. I posted them on the forum but I leave them here for reference too: VQABS Absolute value, saturate V{Q}ABS and V{Q}NEG VQADD Add, saturate V{Q}ADD, VADDL, VADDW, V{Q}SUB, VSUBL, and VSUBW VQDMLAL, VQDMLSL Saturating Doubling Multiply Accumulate, and Multiply Subtract VQDMULL, VQDMLAL, and VQDMLSL (by vector or by scalar) VQDMUL Saturating Doubling Multiply VQDMULL, VQDMLAL, and VQDMLSL (by vector or by scalar) VQDMULH Saturating Doubling Multiply returning High half VQ{R}DMULH (by vector or by scalar) VQMOV{U}N Saturating Move (register) VMOVL, V{Q}MOVN, VQMOVUN VQNEG Negate, saturate V{Q}ABS and V{Q}NEG VQRDMULH Saturating Doubling Multiply returning High half VQ{R}DMULH (by vector or by scalar) VQRSHL Shift Left, Round, saturate (by signed variable) V{Q}{R}SHL (by signed variable) VQRSHR{U}N Shift Right, Round, saturate (by immediate) VQ{R}SHR{U}N (by immediate) VQSHL Shift Left, saturate (by immediate) VSHL, VQSHL, VQSHLU, and VSHLL (by immediate) VQSHL Shift Left, saturate (by signed variable) V{Q}{R}SHL (by signed variable) VQSHR{U}N Shift Right, saturate (by immediate) VQ{R}SHR{U}N (by immediate) VQSUB Subtract, saturate V{Q}ADD, VADDL, VADDW, V{Q}SUB, VSUBL, and VSUBW How about doing the float or double multiplication first as it has been, running lrint or lrintf on the result, and then clamping the integral value instead? (SDL_Convert_F32_to_S32_Scalar() might be a problem though..) I'm always surprised how much stuff they packed into C99. To my knowledge, SDL still targets C90 (because of MSVC 2010 maybe?). A few of the dependency headers are already C99 so perhaps that's something to consider for SDL 2.1. I agree that clamping the integer may be better, but these functions don't seem to handle overflows the way we'd like them to here. Dealing with overflows is so expensive... how annoying. Well, if someone is interested in getting into writing NEON code, here's the perfect little project. Simple algorithm, short function. :) @Simon: Can your patch go into mainline as it's for now? I am not getting time to study NEON or assembly in general till next year, but I would like to have SDL2 sounding right on ARM: 2.0.6 is broken without this fix. It's for Ryan to decide if this overhead is acceptable or if he wants to clamp in the resampler. Or maybe he has an even better idea! Created attachment 3012 [details]
Alternative (and sligthly faster) patch that adds [-1, 1] clamping to the scalar audio type conversions.
Attaching alternative patch that performs slightly better despite the branching.
The C99 function signbit could make it faster on ARM. Varies heavily depending on compiler flags.
@Simon: This new version of the patch works perfectly well, too. No distorted audio on tests or games on ARM. Simon, your patch is in, thanks! https://hg.libsdl.org/SDL/rev/37aca00967db This is a good fix for now, and it turns out that recent versions of gcc and clang can optimize this to SSE. I haven't checked to make sure these particular functions get optimized that way, but it's good to know when we check the performance. Cheers! |