XAudio2
1.
2.
XAudio2 Performance Tips
Tom Mathews
Lead Developer
Advanced Technology Group
Microsoft
3.
Overview
XAudio2 overview
Voice & Graph optimization
xAPO optimization
Voice reuse
Compression
Streaming
Debugging / Performance analysis
4.
What Is XAudio2
Low-level cross-platform game audio API
Play hundreds of sounds at once
Loop, start, stop, adjust sounds at any time
Volume, pitch, filter, reverb, DSP
Identical code on both platforms
Building block for higher-level sound design tools
such as the XACT3 engine
Replaced XAudio1
Replaced DirectSound for gaming purposes
5.
Features
Flexible channel routing
Any channel can be sent to any other channel with
attenuation/amplification
Multistage submixing
For example, each car can have a submix (exhaust,
transmission, engine, etc.), and each car’s mix can then
be fed into another submix for environmental effects
6.
Advanced Features
Deferred commands
Most operations (Start, SetParameter, SetOutputVoice,
SetEffectChain) can be grouped and applied as atomic,
sample-accurate operations
xAPOs (DSPs)
In-box APOs (Reverb, notch, etc.)
Create custom equalizers, compressors, limiters, monitors,
phase shifters, attenuators, delays, …
And they can be cross-platform, like the in-box APOs.
7.
XAudio2: Minimum CPU
Vectorized signal processing
XAudio2 requires at least SSE
Available since 1999 for PCs
Makes extensive use of it in processing code
Your processing code may do the same
XAudio2 also makes use of SSE2/FTZ/DAZ
Available since 2001 for PCs
XAudio2 makes use of XMA hardware-accelerated
decode and VMX instructions for 360
8.
Audio Flow
[Diagram: Source Voices (XMA2 or xWMA data; pitch/SRC + filter; Effect1 … EffectN) feed Submix Voices (sample rate conversion; filter; Effect1 … EffectN), which feed the Mastering Voice. Rates and channel counts labeled in the figure: 24k (Mono), 32k (Mono), 32k (5.1), 44k (5.1), 48k (5.1).]
9.
Graph Optimization
SUBMIX!
Apply FX to many voices at once for the price of
one
Make use of lower-rate sub-graphs
Lower rate == fewer samples == less CPU
Run expensive global send FX at a lower rate/channels
than the final mix
Provides for more detailed control of performance
characteristics
Allows for smooth crossfades between disparate FX
e.g. Environmental reverb crossfade
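The "lower rate == fewer samples" point can be made concrete with a little arithmetic. The sketch below is not XAudio2 code: `samplesPerSecond` is a hypothetical helper, the rates and channel counts are illustrative assumptions, and sample count is only a rough proxy for CPU cost.

```cpp
#include <cstdint>

// Hypothetical helper: number of individual samples an effect has to
// process per second for a given sample rate and channel count.
constexpr std::uint64_t samplesPerSecond(std::uint32_t sampleRateHz,
                                         std::uint32_t channels) {
    return static_cast<std::uint64_t>(sampleRateHz) * channels;
}

// Illustrative figures only: the same send effect run on a 48 kHz 5.1
// final mix versus on a 24 kHz mono submix.
constexpr std::uint64_t kFullRate  = samplesPerSecond(48000, 6);
constexpr std::uint64_t kSendRate  = samplesPerSecond(24000, 1);
constexpr std::uint64_t kReduction = kFullRate / kSendRate;
```

Under these assumed figures the low-rate mono send touches one twelfth of the samples, which is why expensive global effects are good candidates for a reduced-rate submix.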
10.
Source Voices
Setting up for best performance
Use XAUDIO2_VOICE_NOPITCH and XAUDIO2_VOICE_NOSRC when
possible
Minimize MaxFrequencyRatio when used
Stopped voices are not touched by
the real-time processing thread
Voice Pooling
Much faster than repeated allocation/free
SetFrequencyRatio may be applied to reuse voices for
data of a different sampling rate
11.
Voice Pooling
Create pools of Voices
Each Pool is unique on Source Content (xWMA, XMA,
ADPCM) and Channel Count
When you need a new Voice
Identify a lower priority voice in the pool
Call Stop(), then FlushSourceBuffers()
With February XDK, you no longer have to wait for the
next Process() before reusing
If needed: Call SetSourceSampleRate()
Remember: Stopped voices are CPU-free
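The pooling steps above can be sketched as a small data structure. This is a minimal model, not XAudio2 code: `PooledVoice`, `VoicePool`, `Format`, and `reuseRatio` are hypothetical names, the priority scheme (lower number == less important) is an assumption, and a real pool would hold an `IXAudio2SourceVoice*` and call Stop() and FlushSourceBuffers() where the comment indicates.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <utility>
#include <vector>

enum class Format { XWMA, XMA, ADPCM };

// Stand-in for a pooled source voice (hypothetical).
struct PooledVoice {
    int id;
    int priority;                 // lower value == less important (assumption)
    bool playing;
    std::uint32_t createdRateHz;  // sample rate the voice was created with
};

class VoicePool {
public:
    // One pool per (source content, channel count) pair.
    void add(Format fmt, std::uint32_t channels, PooledVoice v) {
        pools_[std::make_pair(fmt, channels)].push_back(v);
    }

    // Prefers a stopped voice; otherwise steals the lowest-priority playing
    // voice, but only if the new sound outranks it. Returns nullptr if no
    // voice qualifies.
    PooledVoice* acquire(Format fmt, std::uint32_t channels, int newPriority) {
        auto it = pools_.find(std::make_pair(fmt, channels));
        if (it == pools_.end()) return nullptr;
        PooledVoice* victim = nullptr;
        for (PooledVoice& v : it->second) {
            if (!v.playing) { victim = &v; break; }
            if (!victim || v.priority < victim->priority) victim = &v;
        }
        if (!victim) return nullptr;
        if (victim->playing && victim->priority >= newPriority) return nullptr;
        // Real code: Stop(), then FlushSourceBuffers(), on the voice here.
        victim->playing = true;
        victim->priority = newPriority;
        return victim;
    }

private:
    std::map<std::pair<Format, std::uint32_t>, std::vector<PooledVoice>> pools_;
};

// Frequency ratio that would let a voice created at voiceRateHz play data
// recorded at dataRateHz; it must stay within the voice's MaxFrequencyRatio.
inline float reuseRatio(std::uint32_t dataRateHz, std::uint32_t voiceRateHz) {
    assert(voiceRateHz != 0);
    return static_cast<float>(dataRateHz) / static_cast<float>(voiceRateHz);
}
```

Keying the pools on format and channel count mirrors the rule that each pool is unique on source content and channel count; only the sample rate is allowed to differ between the voice and the data it is reused for.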
12.
FX Optimization
XAPO_BUFFER_SILENT
Indicates silent data should be assumed
Actual memory may be uninitialized
Buffers are 16-byte aligned & interleaved per channel
Use VMX128 instructions
Use in-place processing
In-place: Input buffer == Output buffer
Use EnableEffect/DisableEffect
More convenient than destroying and recreating the
voice/FX
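A minimal sketch of how an effect might honor the silent flag while processing in place. It is not an xAPO: `BufferState` is a local stand-in for the XAPO buffer flags, `applyGainInPlace` is a hypothetical function, and vectorization is left out for brevity.

```cpp
#include <cstddef>

// Local stand-in for the XAPO buffer flags (the real definitions live in
// the XAPO headers).
enum class BufferState { Silent, Valid };

// Applies a gain in place to interleaved float samples and returns the
// state of the resulting buffer. Input buffer == output buffer.
inline BufferState applyGainInPlace(float* samples, std::size_t frameCount,
                                    std::size_t channels, float gain,
                                    BufferState inputState) {
    if (inputState == BufferState::Silent) {
        // Silent input: the memory may be uninitialized, so it is never
        // read. Scaled silence is still silence, so no work is needed.
        return BufferState::Silent;
    }
    const std::size_t total = frameCount * channels;
    for (std::size_t i = 0; i < total; ++i) {
        samples[i] *= gain;
    }
    return BufferState::Valid;
}
```

Effects with a tail (reverb, delay) cannot simply pass silence through like this; they would need to keep producing output until the tail has decayed.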
13.
XAudio2 Memory Pool
All internal XAudio2 allocations pooled
Allows for efficient parameter passing without imposing
cumbersome parameter scope requirements
XAudio2 allocates sooner, rather than later
Pool reset when last IXAudio2 instance released
Gives applications control of memory pool lifespan
Possible uses include reclaiming memory between levels
Remember this?
Memory is pooled for many things, including SRCs and
Pitch Shifting
14.
Compression
Always use compression to minimize
disk/memory/cache footprint
Reduce XMA/xWMA quality per sound for optimal
quality/size tradeoff
Seek tables:
Allows caller to skip past unwanted packets, without
having to load the data itself.
15.
Compression - Tradeoffs
PCM
Not compressed, so highest fidelity
ADPCM (Windows Only)
Slight Compression (~4:1, lossy)
XMA (360 Only)
Hardware-accelerated decode (316 concurrent streams)
Good compression (~6+:1)
xWMA
Software decode (Mono/Stereo ≈ 0.6-1.2% of a 360 core)
Excellent compression (~20+:1)
Good for voices/music, no seamless looping
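To get a feel for the size side of the tradeoff, the approximate ratios above can be applied to a concrete clip. The helpers below are hypothetical, the clip (60 seconds of 48 kHz, 16-bit stereo) is an assumed example, and the ratios are the rough figures from this slide, so the results are estimates only.

```cpp
#include <cstdint>

// Size of uncompressed PCM audio in bytes.
constexpr std::uint64_t pcmBytes(std::uint32_t rateHz, std::uint32_t channels,
                                 std::uint32_t bytesPerSample,
                                 std::uint32_t seconds) {
    return static_cast<std::uint64_t>(rateHz) * channels * bytesPerSample * seconds;
}

// Estimated size after compressing at roughly ratio:1.
constexpr std::uint64_t compressedBytes(std::uint64_t pcm, std::uint32_t ratio) {
    return pcm / ratio;
}

// Assumed example clip: 60 s, 48 kHz, stereo, 16-bit.
constexpr std::uint64_t kClipPcm   = pcmBytes(48000, 2, 2, 60);
constexpr std::uint64_t kClipAdpcm = compressedBytes(kClipPcm, 4);   // ~4:1
constexpr std::uint64_t kClipXma   = compressedBytes(kClipPcm, 6);   // ~6:1
constexpr std::uint64_t kClipXwma  = compressedBytes(kClipPcm, 20);  // ~20:1
```

For this assumed clip the estimate drops from about 11.5 MB of PCM to under 0.6 MB of xWMA, which is the kind of difference that drives the per-sound format choice.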
16.
Streaming
Cycle a circular queue of buffers to submit new
data to XAudio2
Submit new data within voice’s OnBufferEnd
callback
Increasing read-ahead before starting the voice
decreases chance of glitching, but can increase
perceptible latency depending on implementation
Consider streaming several buffers into the engine
before throttling
XMA2 Block Size should be in increments of 32K to
mirror DVD I/O patterns
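The circular queue of buffers can be modeled as below. This is a sketch of the bookkeeping only: `StreamRing` and `roundUpToBlock` are hypothetical names, disk reads and the actual SubmitSourceBuffer call are left to the caller, and the synchronization needed between the game thread and the audio callback is omitted.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Rounds a requested size up to the next multiple of 32K, following the
// block-size guidance above.
constexpr std::size_t kBlockGranularity = 32 * 1024;
constexpr std::size_t roundUpToBlock(std::size_t bytes) {
    return (bytes + kBlockGranularity - 1) / kBlockGranularity * kBlockGranularity;
}

// Hypothetical ring of streaming buffers.
class StreamRing {
public:
    StreamRing(std::size_t bufferCount, std::size_t bufferBytes)
        : buffers_(bufferCount, std::vector<std::uint8_t>(bufferBytes)) {}

    // True while a buffer is free to be filled and submitted.
    bool canSubmit() const { return queued_ < buffers_.size(); }

    // Claims the next buffer in ring order; the caller fills it with
    // stream data and submits it to the voice.
    std::vector<std::uint8_t>& claimNext() {
        assert(canSubmit());
        std::vector<std::uint8_t>& b = buffers_[writeIndex_];
        writeIndex_ = (writeIndex_ + 1) % buffers_.size();
        ++queued_;
        return b;
    }

    // Call when the voice reports OnBufferEnd: the oldest buffer is free.
    void onBufferEnd() {
        if (queued_ > 0) --queued_;
    }

    std::size_t queued() const { return queued_; }

private:
    std::vector<std::vector<std::uint8_t>> buffers_;
    std::size_t writeIndex_ = 0;
    std::size_t queued_ = 0;
};
```

A caller would fill the ring completely before starting the voice (the read-ahead), then refill one buffer each time OnBufferEnd fires. More buffers mean more read-ahead and fewer glitches at the cost of memory and, depending on the implementation, latency.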
17.
xWMA Streaming
Each xWMA file contains a list of offsets
(DPDS chunk)
Each submit needs a modified form of this list
DPDS Chunk: 1000, 2000, 3000, 5000, 7000, 12000
1st Submit list: 1000, 2000, 3000
2nd Submit list: 2000 (5000-3000), 4000 (7000-3000), 9000 (12000-3000)
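The rebasing shown in the DPDS example (offsets 1000 through 12000 split across two submits) can be sketched as follows. `rebaseDpds` is a hypothetical helper; it assumes the DPDS entries are cumulative decoded-byte offsets and that each submit covers a contiguous run of packets. In XAudio2 the resulting list would be handed over through the XAUDIO2_BUFFER_WMA structure.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Returns the cumulative offsets for packets [first, first + count),
// rebased so they are relative to the start of that submit.
inline std::vector<std::uint32_t> rebaseDpds(const std::vector<std::uint32_t>& dpds,
                                             std::size_t first, std::size_t count) {
    assert(first + count <= dpds.size());
    // Everything decoded before this submit is subtracted out.
    const std::uint32_t base = (first == 0) ? 0u : dpds[first - 1];
    std::vector<std::uint32_t> out;
    out.reserve(count);
    for (std::size_t i = 0; i < count; ++i) {
        out.push_back(dpds[first + i] - base);
    }
    return out;
}
```

With the slide's chunk, packets 0-2 give 1000, 2000, 3000 unchanged, and packets 3-5 give 2000, 4000, 9000 after subtracting the 3000 bytes covered by the first submit.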
18.
Blocking Calls – XAudio2 Thread
The XAudio2 realtime thread can be blocked by:
StopEngine and IXAudio2::Release()
DestroyVoice()
Thus, the need for voice reuse
XAudio2 callbacks
Check time spent in effect chain
Your code can be blocked by any XAudio2 API call,
waiting on internal realtime thread locks.
19.
Debugging
Use the debug versions of XAudio2, X3DAudio,
XAPOBase, etc.
SetDebugConfiguration may be used to control
debug behavior for XAudio2
VolumeMeter xAPO useful for detecting clipping
PIX counters available to track CPU, memory, and
voice statistics
Similar data available via
IXAudio2::GetPerformanceData
Watch for other threads on the core that may be
slowing down XAudio2
20.
Audio performance analysis with PIX
21.
A Case Study
[Diagram: starting audio graph. Mono, Stereo, Quad, and 5.1 source voices, each with pitch/SRC + filter and Effect1 … EffectN, pass through sample rate conversion, a filter, and a reverb.]
22.
PIX
23.
Timing Capture
24.
OnProcessingPassEnd Callback
Use callbacks to notify Hardware Thread 5 that it
can resume execution
25.
xbPerfView w/ Sampling Capture
26.
A Case Study
Adding submixes
[Diagram: the same graph with submixes added. Mono, Stereo, Quad, and 5.1 source voices (pitch/SRC + filter, Effect1 … EffectN) are routed through sample rate conversion, a filter, and a reverb.]
27.
xbPerfView w/ Submixing
28.
A Case Study
SRC & Reverb
Change to Mono->5.1 Reverb
[Diagram: the graph after the SRC and reverb changes. Mono, Stereo, Quad, and 5.1 source voices (pitch/SRC + filter, Effect1 … EffectN) with 32k and 48k stages, sample rate conversion, a filter, and the reverb.]
29.
xbPerfView
Final Numbers
Component     Start CPU%    Final CPU%    % Freed
MatrixMix     17.48%        4.25%         13.23%
Reverb        6.37%         4.94%         1.43%
Resampling    14.74%        11.41%        3.33%
Total         38.59%        20.60%        17.99%
Idle          27.95%        48.47%        20.52%
30.
With Processing to Spare…
31.
Summary
SUBMIX!
Use OnBufferEnd callbacks to stream data
Intentionally choose your compression methods
Carefully manage your voice interactions
Watch for Blocking Calls
Pool voices where possible
Use EnableEffect/DisableEffect
Profile your title to focus your efforts
32.
www.microsoftgamefest.com
© 2009-2010 Microsoft Corporation. All rights reserved.
This presentation is for informational purposes only. Microsoft makes no warranties, express or implied, in this summary.