We are currently migrating Bugzilla to GitHub issues.
Any changes made to the bug tracker now will be lost, so please do not post new bugs or make changes to them.
When we're done, all bug URLs will redirect to their equivalent location on the new bug tracker.

Bug 4186

Summary: New ARM/NEON audio converters cause strange clicking noises
Product: SDL Reporter: Manuel Alfayate Corchete <redwindwanderer>
Component: audioAssignee: Ryan C. Gordon <icculus>
Status: RESOLVED FIXED QA Contact: Sam Lantinga <slouken>
Severity: normal    
Priority: P2 CC: flibitijibibo, sylvain.becker
Version: 2.0.10Keywords: target-2.0.10
Hardware: ARM   
OS: Linux   
Attachments: patch
patch
dump audio conversion datas

Description Manuel Alfayate Corchete 2018-06-05 14:44:20 UTC
Hi there,

While testing latest SVN version, I found out that ARM-specific NEON-optimized audio converters have been added:

https://hg.libsdl.org/SDL/rev/08c415f14810

After building with these new NEON converters active, I have found that some games exhibit audio problems. Most notable case is rawgl:

https://github.com/cyxx/rawgl

You can easily notice there's a clicking noise after each step sound when you run (just start the game and start running, pressing an arrow + SPACE).

I guess these new converters need to be tested against games before going into "production" again: if no one notices these errors, they can get very hard to find and fix in the future.
Comment 1 Manuel Alfayate Corchete 2018-06-15 17:29:52 UTC
Another game that also exhibits clicking effects when these new NEON converters are active is crispy-doom:

https://github.com/fabiangreffrath/crispy-doom.git
Comment 2 Ryan C. Gordon 2018-09-29 20:51:07 UTC
I've disabled the NEON code for now with https://hg.libsdl.org/SDL/rev/dfa5cea01a56 but I will revisit this once 2.0.9 ships.

--ryan.
Comment 3 Ethan Lee 2018-10-03 13:38:02 UTC
Is there a small sample program that demonstrates this with a particular sound file? These converters were recently used in Dust: AET's Switch version and it ran just fine... we only used some of the converters, however (S16->F32 and F32->S16).
Comment 4 Sylvain 2018-12-06 20:11:17 UTC
Created attachment 3534 [details]
patch

Here's a patch that probably fix the issue, but need confirmation as there are several things !

Happens with 8bit wav, while doing:
SDL_Convert AUDIO_U8 -> AUDIO_F32 (using NEON) (SDL_Convert_U8_to_F32_NEON)
SDL_Convert AUDIO_F32 -> AUDIO_S16 (using NEON) (SDL_Convert_F32_to_S16_NEON)


1/ 
"SDL_Convert_U8_to_F32_NEON()" has a blatant issue: mis-placed parameter "one":
vmlsq_f32(a, b, c) is doing r=(a - b * c)
So  vmlsq_f32(a, b, one), should be in fact: vmlsq_f32(one, b, c)

2/ For "SDL_Convert_U8_to_F32_NEON()" and "SDL_Convert_U16_to_F32_NEON()"
formula seems incorrect as it performs: (1 - src * DIVBY128 ) instead of (src * DIVBY128 - 1)

It should use "-1" and "vmlaq_f32"
vmlaq_f32(a, b, c) is doing r=(a + b * c)
to do ( -1  + src * DIVBY128 )
Comment 5 Sylvain 2018-12-06 20:13:18 UTC
Created attachment 3535 [details]
patch

Update the patch to rename "one" -> "negone"
Comment 6 Sylvain 2018-12-06 20:15:50 UTC
Also: 

Android-x86 can probably run SSE2 (at least __SSE2__ is defined and it compiles. it runs also on emulator, but I have not device to try it).

Android armeabi-v7a could also run those NEON cvt, but it needs also the fallback!
Comment 7 Sylvain 2019-04-02 08:02:16 UTC
@Manuel Alfayate Corchete, could you try the patch ?
Comment 8 Ethan Lee 2019-06-09 15:45:00 UTC
We ended up merging this patch into FAudio's copy of the U8->F32 converter:

https://github.com/FNA-XNA/FAudio/commit/68b13c1ff04784e31ffffd51d147cb890fe88803

AFAIK this works, but I don't have the original reporter's data to check with.
Comment 9 Manuel Alfayate Corchete 2019-06-09 16:12:46 UTC
I am sorry about not testing this sooner, I have been very busy!
But I will do so today or tomorrow: I just have to build rawgl with SDL_Mixer support again (As a side note: I have ported it to sts_mixer instead, since SDL_Mixer is very slow on the Pi3 because of the SLOOOOOW SDL_BuildAudioCVT() that is called inside Mix_LoadWAV_RW()).
Comment 10 Sylvain 2019-06-14 09:10:27 UTC
No problem, but let us know!

Maybe this could be added to next 2.10 version ?
Comment 11 Ryan C. Gordon 2019-06-14 19:53:54 UTC
I'm applying this patch, these fixes look correct to me. If it's a problem, we can reopen this bug!

--ryan.
Comment 12 Manuel Alfayate Corchete 2019-08-18 16:41:04 UTC
Hi,

I was testing some games with 2.0.10 today, and I found that this ARM_NEON resampling code is still a big problem. It causes massive audio glitches in devilutionX (Diablo 1 engine using SDL2: https://github.com/diasurgical/devilutionX), and ECWolf, where the closing door audio FX is also broken.
In both cases, the audio is "robotic" and strange.

How to reproduce the issue:
-Build devilutionX against stable 2.0.10 (I cant see any changes in SDL2 repo regarding SDL2 resampling code)
-Talk to any NPC and you will see audio is metallic and robotic, very evidently broken.
-Now, rebuild SDL2 after editing src/audio/SDL_audiotypecvt.c and changing
#define HAVE_NEON_INTRINSICS 1
into
#define HAVE_NEON_INTRINSICS 0
You will see that *audio without HAVE_NEON_INTRINSICS is perfectly correct*.

So I am reopening this issue, as its still clearly unresolved.
Comment 13 Sylvain 2019-08-18 19:40:44 UTC
Maybe you can upload the sound file, and/or also add some trace in SDL code to see which conversion path is taken
Comment 14 Manuel Alfayate Corchete 2019-08-19 11:20:52 UTC
@Sylvain

I have tried recording the ALSA output on the Raspberry PI so I could provide both the traces and file, but I was unable to do so. I tried redirecting the ALSA output to the alsa snd-loop device after building a kernel with this module, then recording from it with arecord. All I got was a blank, silent file.

Anyway, the audio conversion that results in highly broken audio goes thru these two functions each time a sound has to be converted (both functions are called for each single sound conversion):

SDL_Convert_U8_to_F32_NEON
SDL_Convert_F32_to_S16_NEON


This call sequence happens both in DevilutionX and ECWolf, which shows the wrong sounds when NEON_INTRINSICS is active, too.

Some sounds seem affected, and others are unaffected, resulting in good audio. The same call sequence is done for affected and for unaffected sounds.
Comment 15 Sylvain 2019-08-19 12:25:42 UTC
there is not just the sound file in your game data ?

to debug, since you have the test-case ...

can you replace/copy paste the implementation of 

SDL_Convert_U8_to_F32_NEON
SDL_Convert_F32_to_S16_NEON

by the non NEON, eg:

SDL_Convert_F32_to_S16_scalar
SDL_Convert_F32_to_S16_scalar

so that we hopefully know which one has the issue
Comment 16 Sylvain 2019-08-19 12:33:31 UTC
by the way the previous patch was applied here
https://hg.libsdl.org/SDL/rev/30204ee1c7c5
Comment 17 Manuel Alfayate Corchete 2019-08-19 14:03:25 UTC
@Sylvain:

There is no independent audio data file for the game because Diablo 1 stores all its data in a single file, called diabdat.mpq, which is 500MB. DevilutionX engine uses this data file, and voices are inside it.
-The open source DevilutionX engine is here:
https://github.com/diasurgical/devilutionX
-The diabdat.mpq datafile is inside this ISO you can legally get from archive.org:
https://archive.org/download/Diablo_1996_Blizzard/Diablo%20%281996%29%28Blizzard%29.iso

As for what function could be the problematic one, I did this:
-I swapped the SDL_Convert_U8_to_F32_NEON code with SDL_Convert_U8_to_F32_Scalar.
Voices and FX sounded good!
-Restored original SDL_Convert_U8_to_F32_NEON and then swapped the SDL_Convert_F32_to_S16_NEON code with SDL_Convert_F32_to_S16_Scalar.
Voices and FX sounded wrong again!

So SDL_Convert_U8_to_F32_NEON seems the function to blame, because swapping it with the scalar imlpementation fixes sound problems.
Comment 18 Sylvain 2019-08-19 15:02:27 UTC
Ok, I think it's the order the sample were stored which was bad:
 https://hg.libsdl.org/SDL/rev/644eeed7ce2f
let me know if this fixes it?


I think it should also be the same to S8_to_F32 ?
maybe also S/F 16_to_F32 ?

and also, I wonder if the NEON checks for src aligned are important: 

/* Make sure src is aligned too. *
if ((((size_t) src) & 15) == 0)
there were false  in my case ?!
Comment 19 Ethan Lee 2019-08-19 15:04:42 UTC
> and also, I wonder if the NEON checks for src aligned are important:

I think whenever that's false it just means it can't do any NEON operations at all, so it just falls back and does everything in scalar at the bottom of the function. Ideally you'd have your input/output memory be aligned, but it's not required.
Comment 20 Manuel Alfayate Corchete 2019-08-19 15:45:38 UTC
@Sylvain: Ah, yes, it works! Those changes fix the audio problems in DevilutionX and ECWolf. You found the solution. Great!

It is pity the recently released 2.0.10 has inherited the problem, which is pretty much a major audio problem. I suppose its partly my fault for not testing more SDL2 software earlier in the 2.0.10 release cycle...
Comment 21 Ethan Lee 2019-08-19 16:09:22 UTC
> It is pity the recently released 2.0.10 has inherited the problem, which is pretty much a major audio problem. I suppose its partly my fault for not testing more SDL2 software earlier in the 2.0.10 release cycle...

For what it's worth, it's all the more reason to enforce the new release cycle!

https://libsdl.org/next.php

We've integrated this into FAudio as well, and nothing appears to have fallen apart:

https://github.com/FNA-XNA/FAudio/commit/a916bf05edf18668748430b994197bca23fabe08
Comment 22 Sylvain 2019-08-19 18:51:26 UTC
I've dump all outputs of all converting functions, and SDL_Convert_S8_to_F32_NEON was also failing, now fixed in
https://hg.libsdl.org/SDL/rev/33b64882e322

beside, all seems quite clean for NEON. Only sometimes saturation values changes between scalar/neon: 1/0/255/254, but some are expected.
Comment 23 Sylvain 2019-08-19 18:59:35 UTC
(In reply to Ethan Lee from comment #19)
> > and also, I wonder if the NEON checks for src aligned are important:
> 
> I think whenever that's false it just means it can't do any NEON operations
> at all, so it just falls back and does everything in scalar at the bottom of
> the function. Ideally you'd have your input/output memory be aligned, but
> it's not required.

NEON instructions can operate on aligned or not addresses. When they are aligned, it's simply faster. (unlike SSE where there are aligned/non-aligned instruction for aligned/non-aligned memories). So I would just always do the NEON instruction as long as there are enough samples.

And if src and dst must be the same buffer and have to be aligned at the same time, it might be difficult to happen. I haven't checked all the condition and the code is a little bit confusing ...
Comment 24 Sylvain 2019-08-19 19:25:04 UTC
I also double-check SSE2 conversion and spot this one, constant was set to 1, instead of -1
https://hg.libsdl.org/SDL/rev/69cc269eec53
Comment 25 Sylvain 2019-08-19 19:51:18 UTC
Created attachment 3928 [details]
dump audio conversion datas

Attached patch to dump and compare audio datas
Comment 26 Sylvain 2019-08-20 08:27:31 UTC
Marked as fixed!
Comment 27 Manuel Alfayate Corchete 2019-08-20 18:30:40 UTC
@Sylvain: Thanks A LOT for your work on this. SDL2 on ARM is a wonder and it deserves to be cared about!