| Summary: | SDL_RunAudio serializes data processing and WaitDevice, leading to risk of runouts | | |
|---|---|---|---|
| Product: | SDL | Reporter: | bugmenot_0 <kajema2739> |
| Component: | audio | Assignee: | Ryan C. Gordon <icculus> |
| Status: | ASSIGNED --- | QA Contact: | Sam Lantinga <slouken> |
| Severity: | normal | | |
| Priority: | P2 | Keywords: | target-2.0.16 |
| Version: | 2.0.12 | | |
| Hardware: | All | | |
| OS: | All | | |
Ryan, can you look at this for the next release?

I'll look into this, but it's worked like this since the 1990s, so I'm going to bump this to 2.0.16. --ryan.
A high-level overview of `SDL_RunAudio` (if a stream is used) is this:

```
while(true) {
  // Section 1 (generate new data)
  callback(udata, data, data_len);
  [...]
  SDL_AudioStreamPut(device->stream, data, data_len);

  // Section 2 (feed data to device)
  while (SDL_AudioStreamAvailable(device->stream) >= ((int) device->spec.size)) {
    [...]
    SDL_AudioStreamGet(device->stream, <device buffer>, device->spec.size);
    [...]
    current_audio.impl.PlayDevice(device);
    current_audio.impl.WaitDevice(device);
  }
}
```

This is not ideal, because in section 1:

- The application might block in the callback or do expensive processing.
- The audio conversion happens in `SDL_AudioStreamPut` and might be slow on some platforms.

So the larger the callback buffer, the longer both of these steps will take. This section is very CPU intensive and might take a while. Because we don't feed the audio device during this time, runouts can happen.

In section 2:

- We split the data into consumable chunks. Because the data is already converted, this step is "cheap".
- We block while the device frees another buffer for the next iteration of the loop.

This section mostly idles the CPU and requires little work. During this step, we keep feeding the audio device, so runouts won't happen.

Unfortunately, by the time we reach section 2, we might have spent a long time in section 1, so we might already have had an audio runout because we had stopped feeding the device. We were also stressing the CPU in section 1, but now leave it idle in section 2. Ideally, sections 1 and 2 would run interleaved.
A better loop might be this:

```
bool must_wait = false;
while(true) {
  // Generate more data and just buffer it
  callback(udata, data, data_len);
  SDL_AudioStreamPut_SkipConversion(...); // Buffer the rest

  // Check if we have enough data for playback
  while (SDL_AudioStreamPreparableSize() >= device->spec.size) {
    // Do the expensive conversion, but only convert a chunk
    SDL_AudioStreamPrepare(..., device->spec.size);

    // Wait for playback to finish if we already had a chunk playing.
    // Shouldn't take long, because we were busy with conversion.
    if (must_wait) {
      current_audio.impl.WaitDevice(device);
    }

    // Feed data to device
    SDL_AudioStreamGet(..., device->spec.size);
    current_audio.impl.PlayDevice(device);

    // Defer the WaitDevice until the next loop iteration (until after conversion and callback)
    must_wait = true;
  }
}
```

This way, we always wait after the expensive operations, thereby reducing idle time. This would allow smaller buffer sizes, lower audio latency, and less risk of running out of audio.

A further optimization would be to keep using `SDL_AudioStreamPut` alongside `SDL_AudioStreamPut_SkipConversion`, so that we avoid a quick succession of:

1. `SDL_AudioStreamPut_SkipConversion`
2. `SDL_AudioStreamPrepare`

Running these back to back would create a copy of the input data in step 1 and then have to read it back for step 2; by fusing these steps, we could avoid this useless copy.
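To make the timing argument concrete, here is a toy timeline simulation of the two loop orderings. It does not call SDL. Every name in it (`Sim`, `sim_play`, `sim_wait`, `simulate_serialized`, `simulate_interleaved`) and every cost is a made-up assumption for illustration. The model assumes that `WaitDevice` returns once at most one chunk is still queued (a double-buffered device), and that a runout happens whenever `PlayDevice` is reached after the queued audio has already ended.

```c
/* Toy model only: all costs are illustrative assumptions, not measurements. */
enum {
    CHUNK_MS = 10,            /* playback time of one device buffer */
    CHUNKS_PER_CALLBACK = 4,  /* callback buffer = 4 device buffers */
    CALLBACK_MS = 1,          /* assumed callback cost per chunk */
    CONVERT_MS = 3,           /* assumed conversion cost per chunk */
    ITERATIONS = 5            /* outer loop iterations to simulate */
};

typedef struct {
    int now;       /* current time on the audio thread, in ms */
    int play_end;  /* time at which the queued audio runs out */
    int started;   /* whether anything has been queued yet */
    int runouts;   /* how often the device ran dry */
} Sim;

/* Stand-in for PlayDevice: queue one chunk, counting a runout if we are late. */
static void sim_play(Sim *s) {
    if (s->started && s->now > s->play_end) {
        s->runouts++;
    }
    if (!s->started || s->now > s->play_end) {
        s->play_end = s->now;  /* playback (re)starts now */
    }
    s->play_end += CHUNK_MS;
    s->started = 1;
}

/* Stand-in for WaitDevice: block until at most one chunk is still queued. */
static void sim_wait(Sim *s) {
    int free_at = s->play_end - CHUNK_MS;
    if (s->now < free_at) {
        s->now = free_at;
    }
}

/* Current ordering: callback + full conversion, then play/wait per chunk. */
int simulate_serialized(void) {
    Sim s = {0, 0, 0, 0};
    for (int i = 0; i < ITERATIONS; i++) {
        s.now += CHUNKS_PER_CALLBACK * (CALLBACK_MS + CONVERT_MS);
        for (int c = 0; c < CHUNKS_PER_CALLBACK; c++) {
            sim_play(&s);
            sim_wait(&s);
        }
    }
    return s.runouts;
}

/* Proposed ordering: convert one chunk, then do the deferred wait, then play. */
int simulate_interleaved(void) {
    Sim s = {0, 0, 0, 0};
    int must_wait = 0;
    for (int i = 0; i < ITERATIONS; i++) {
        s.now += CHUNKS_PER_CALLBACK * CALLBACK_MS;  /* callback only buffers */
        for (int c = 0; c < CHUNKS_PER_CALLBACK; c++) {
            s.now += CONVERT_MS;
            if (must_wait) {
                sim_wait(&s);
            }
            sim_play(&s);
            must_wait = 1;
        }
    }
    return s.runouts;
}
```

With these assumed numbers, both orderings do the same 16 ms of CPU work per 40 ms of audio. `simulate_serialized()` returns 4 (one runout per outer iteration after the first, because 16 ms of uninterrupted work exceeds the 10 ms still queued), while `simulate_interleaved()` returns 0. Real devices and backends will behave differently, so this only illustrates the ordering argument.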