You have a working spectrum analyzer on your iPhone. This section looks back at what we built, then looks forward at what comes next — distribution, better algorithms, production hardening, and further learning.
Let's take stock. Over the course of this tutorial, you built a real-time spectrum analyzer on top of AVAudioEngine: it captures microphone audio, runs an FFT, and renders live spectrum bars, a VU meter, and the detected musical note. The entire project is four Swift files totaling roughly 200 lines of code:
| File | Lines | Responsibility |
|---|---|---|
| VUMeterApp.swift | ~8 | App entry point |
| AudioEngine.swift | ~55 | Audio capture, RMS, thread dispatch |
| SpectrumAnalyzer.swift | ~80 | FFT, log-spaced binning, note detection |
| ContentView.swift | ~70 | All UI — spectrum bars, VU meter, controls |
Those 200 lines touch a surprising breadth of topics: SwiftUI layout and animation, real-time audio capture, signal processing with the FFT, unsafe pointer interop with C APIs, and SIMD-optimized vector math. If you were coming from C# and Windows with no iOS experience, you now have working knowledge of all of these.
If you have an Apple Developer account ($99/year), you can share your app with others before publishing it to the App Store. Apple's beta distribution platform is called TestFlight, and it is remarkably easy to use.
TestFlight builds expire after 90 days, which is fine for beta testing. Testers can provide feedback directly through the TestFlight app, including screenshots. This is an excellent way to share your spectrum analyzer with friends, get feedback on the UI, or test on devices you do not own.
TestFlight is roughly equivalent to distributing a ClickOnce installer or sharing an MSIX package, but with Apple managing the hosting and updates. The key difference is that Apple performs automated checks on every build you upload (crash analysis, API usage review, privacy manifest validation), which catches some issues before your testers ever see them.
If you want to make your spectrum analyzer available to anyone, you publish it on the App Store. The high-level process: create an app record in App Store Connect, archive and upload a build from Xcode, fill out the listing (screenshots, description, privacy declarations), and submit it for Apple's review.
Apple requires every app to declare what data it collects. These declarations appear on your App Store listing as "privacy nutrition labels." For our spectrum analyzer, the declaration is minimal: all audio is processed on-device in real time and never stored or transmitted, so you can declare that no data is collected.
This makes for a clean privacy label. Users increasingly care about this, and "no data collected" is a genuine selling point.
Our spectrum analyzer identifies the peak frequency by finding the FFT bin with the largest magnitude and mapping it to a musical note. This works, but it has real limitations. If you want to build a guitar tuner or a vocal pitch tracker, you need better tools.
With 4096 samples at a 44,100 Hz sample rate, each FFT bin spans approximately 10.8 Hz (44100 / 4096 ≈ 10.77). That means our peak-bin method can only identify frequencies in 10.8 Hz steps.
At middle frequencies this is tolerable. A4 is 440 Hz and A#4 is 466 Hz — a gap of 26 Hz, so we have two or three bins between adjacent notes. But at lower pitches the gap narrows in absolute terms: A2 is 110 Hz and A#2 is 116.5 Hz — only 6.5 Hz apart, which is less than one bin width. Our peak-bin method literally cannot distinguish between these two notes.
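To make the comparison concrete, here is the standard equal-temperament formula (A4 = 440 Hz at MIDI note 69) as a quick sketch; this helper is illustrative and not part of the tutorial project:

```swift
import Foundation

/// Frequency of MIDI note n in equal temperament, anchored at A4 = 440 Hz (MIDI 69).
func noteFrequency(midi: Int) -> Double {
    440.0 * pow(2.0, Double(midi - 69) / 12.0)
}

let binWidth = 44100.0 / 4096.0                                 // ~10.77 Hz per FFT bin
let gapA4 = noteFrequency(midi: 70) - noteFrequency(midi: 69)   // ~26 Hz: a few bins wide
let gapA2 = noteFrequency(midi: 46) - noteFrequency(midi: 45)   // ~6.5 Hz: under one bin
```

Because note spacing is geometric (each semitone is a factor of 2^(1/12)), the absolute gap shrinks as you go down in pitch while the FFT bin width stays fixed.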
You can improve resolution by increasing the FFT size (8192 or 16384 samples), but this adds latency. At 44,100 Hz, 16384 samples is 0.37 seconds of audio — the pitch display would lag noticeably behind the sound.
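The tradeoff is easy to tabulate: doubling the FFT size halves the bin width but doubles the amount of audio you must collect before each analysis. A small sketch:

```swift
import Foundation

// Resolution vs. latency at 44.1 kHz for a few candidate FFT sizes.
let sampleRate = 44100.0
for fftSize in [4096.0, 8192.0, 16384.0] {
    let binWidth = sampleRate / fftSize     // Hz per bin (resolution)
    let latency = fftSize / sampleRate      // seconds of audio per window
    print(String(format: "%6.0f samples: %5.2f Hz bins, %.3f s latency",
                 fftSize, binWidth, latency))
}
```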
Real musical instruments do not produce a single frequency. When you pluck the A string on a guitar, the string vibrates at 110 Hz (the fundamental) but also at 220 Hz, 330 Hz, 440 Hz, and so on. These are harmonics. In many instruments, a harmonic (often the second or third) is louder than the fundamental.
Our peak-bin method reports the loudest frequency, not the lowest. If the 220 Hz harmonic is stronger than the 110 Hz fundamental, we report A3 instead of A2. This is a full octave error — a serious problem for a tuner.
Accurately detecting the fundamental frequency of a musical note is a surprisingly hard problem that has been studied for decades. The FFT tells you what frequencies are present and how loud each one is. It does not directly tell you which one is the fundamental. You need an additional algorithm on top of the FFT — or a completely different approach.
The simplest improvement to our peak-bin method is parabolic interpolation. The idea: the true peak frequency almost certainly falls between two bins. If bin 42 has the highest magnitude, the real peak is somewhere between bins 41 and 43.
Take the magnitudes of the peak bin and its two neighbors, fit a parabola through the three points, and solve for the vertex. The vertex gives you a fractional bin index (like 42.3) which you convert to a frequency. This is a few lines of math, no extra data needed, and it typically improves accuracy by a factor of 5–10. It does not solve the harmonics problem, but for applications where the fundamental is reliably the strongest frequency (like whistling or a flute), it works well.
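A sketch of that math, assuming frequencies are laid out as bin × sampleRate / fftSize as in our analyzer; the function names are illustrative, not from the tutorial project:

```swift
import Foundation

/// Fractional offset (-0.5...0.5) of the parabola's vertex from the center bin,
/// given the magnitudes of the peak bin and its two neighbors.
func parabolicOffset(left: Float, peak: Float, right: Float) -> Float {
    let denom = left - 2 * peak + right
    guard denom != 0 else { return 0 }      // flat top: no refinement possible
    return 0.5 * (left - right) / denom
}

/// Refine a peak-bin estimate into an interpolated frequency.
func interpolatedFrequency(magnitudes: [Float], peakBin: Int,
                           sampleRate: Float, fftSize: Int) -> Float {
    var bin = Float(peakBin)
    if peakBin > 0 && peakBin < magnitudes.count - 1 {
        bin += parabolicOffset(left: magnitudes[peakBin - 1],
                               peak: magnitudes[peakBin],
                               right: magnitudes[peakBin + 1])
    }
    return bin * sampleRate / Float(fftSize)
}
```

When the neighbors are equal the vertex sits exactly on the peak bin; when the right neighbor is larger, the offset is positive and the estimate shifts toward it.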
Autocorrelation takes a completely different approach. Instead of looking at frequency magnitudes, it looks at the periodicity of the signal in the time domain.
The algorithm compares the signal to a time-shifted copy of itself. At a lag of zero samples, the signal matches perfectly (correlation = 1.0). As you increase the lag, the correlation drops. But when the lag equals exactly one period of the fundamental frequency, the signal aligns with itself again and the correlation spikes.
For a 110 Hz signal at 44,100 Hz, one period is about 401 samples. The autocorrelation function will show a peak at lag 401, from which you compute 44100 / 401 ≈ 110 Hz.
The key advantage: autocorrelation inherently detects the fundamental, not the strongest harmonic. A signal with harmonics at 110, 220, and 330 Hz repeats every 401 samples regardless of which harmonic is loudest. Autocorrelation finds that repetition period directly.
The downside is computational cost. A naive autocorrelation is O(n²), though you can compute it in O(n log n) with two FFTs: forward-transform the signal, multiply the spectrum by its own conjugate, and inverse-transform. Accelerate's vDSP_conv function computes correlation directly, and the vDSP FFT routines you already used can implement the fast version.
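For illustration, the naive O(n²) version is only a few lines. This is a sketch, not production code: no windowing, and no correction for the overlap shrinking at larger lags. The function name and parameter defaults are my own:

```swift
import Foundation

/// Naive O(n²) autocorrelation pitch estimate. Searches lags corresponding
/// to a plausible pitch range and returns the best-matching frequency.
func autocorrelationPitch(_ samples: [Float], sampleRate: Float,
                          minHz: Float = 60, maxHz: Float = 1000) -> Float? {
    let minLag = Int(sampleRate / maxHz)    // shortest period we consider
    let maxLag = min(Int(sampleRate / minHz), samples.count - 1)
    guard minLag > 0, minLag < maxLag else { return nil }

    var bestLag = minLag
    var bestCorr = -Float.infinity
    for lag in minLag...maxLag {
        var corr: Float = 0
        // Compare the signal to a copy of itself shifted by `lag` samples.
        for i in 0..<(samples.count - lag) {
            corr += samples[i] * samples[i + lag]
        }
        if corr > bestCorr {
            bestCorr = corr
            bestLag = lag
        }
    }
    return sampleRate / Float(bestLag)
}
```

Restricting the lag range matters: without a minimum lag, lag 0 (perfect self-correlation) always wins.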
The YIN algorithm (de Cheveigné and Kawahara, 2002) is an improved autocorrelation method. It adds two refinements: it replaces the raw correlation with a squared-difference function, which is less prone to octave errors, and it normalizes that function by its cumulative mean, which removes the trivial dip at lag zero and lets you pick the period with a simple absolute threshold.
YIN and its descendants are widely used in commercial guitar tuner apps, and it runs comfortably in real time on an iPhone. If you want to build a serious pitch detection tool after this tutorial, YIN is where you should start. The original paper is freely available and very readable.
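To make the two refinements concrete, here is a compact sketch of YIN's core steps (difference function, cumulative mean normalization, absolute threshold). The parameter values and names are illustrative; a real implementation also interpolates the minimum parabolically, as described in the paper:

```swift
import Foundation

/// Sketch of YIN's core: squared-difference function, cumulative mean
/// normalization, and an absolute-threshold search for the period.
func yinPitch(_ x: [Float], sampleRate: Float,
              maxLag: Int = 600, threshold: Float = 0.1) -> Float? {
    let n = x.count
    guard maxLag + 1 < n else { return nil }
    let window = n - maxLag                 // fixed comparison window

    // Step 1: difference function d(τ) = Σ (x[j] - x[j+τ])²
    var d = [Float](repeating: 0, count: maxLag + 1)
    for tau in 1...maxLag {
        for j in 0..<window {
            let delta = x[j] - x[j + tau]
            d[tau] += delta * delta
        }
    }

    // Step 2: cumulative mean normalized difference; d'(0) is defined as 1
    var dNorm = [Float](repeating: 1, count: maxLag + 1)
    var runningSum: Float = 0
    for tau in 1...maxLag {
        runningSum += d[tau]
        dNorm[tau] = d[tau] * Float(tau) / runningSum
    }

    // Step 3: take the first dip below the absolute threshold,
    // then walk to the bottom of that dip
    for tau in 2..<maxLag where dNorm[tau] < threshold {
        var best = tau
        while best + 1 <= maxLag, dNorm[best + 1] < dNorm[best] {
            best += 1
        }
        return sampleRate / Float(best)
    }
    return nil                              // no clear periodicity found
}
```

The normalization in step 2 is what makes the absolute threshold work: near lag zero the normalized value stays high, so only genuine period dips cross it.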
Our tutorial code is intentionally minimal. We skipped error handling and edge cases to keep the focus on the core concepts. For a production app — one you would ship to real users — you would add several layers of hardening.
Every iOS app that uses audio should configure its AVAudioSession. This tells the system what kind of audio your app needs:

```swift
import AVFoundation

let session = AVAudioSession.sharedInstance()
try session.setCategory(.record, mode: .measurement)
try session.setActive(true)
```
The .record category tells iOS this is a recording app (not playback, not a
phone call). The .measurement mode disables any system-level signal processing
(noise cancellation, automatic gain control) that would alter the audio before we analyze it.
Without this, iOS might apply voice-optimized processing that skews our frequency data.
When a phone call comes in, iOS interrupts your audio session. When the call ends, you need
to restart it. In production code, you observe the
AVAudioSession.interruptionNotification and respond to begin/end events. Similarly,
route changes (plugging in headphones, connecting to Bluetooth) trigger
AVAudioSession.routeChangeNotification. A robust app listens for these and
reconfigures as needed.
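A minimal sketch of that interruption handling; restartEngine() here is a hypothetical hook standing in for whatever your AudioEngine class exposes to stop and restart capture:

```swift
import AVFoundation

// Hypothetical restart hook: in the tutorial app this would restart
// AVAudioEngine and re-install the input tap.
func restartEngine() { /* app-specific */ }

// Keep the token so you can remove the observer later.
let observer = NotificationCenter.default.addObserver(
    forName: AVAudioSession.interruptionNotification,
    object: AVAudioSession.sharedInstance(),
    queue: .main
) { notification in
    guard let info = notification.userInfo,
          let typeValue = info[AVAudioSessionInterruptionTypeKey] as? UInt,
          let type = AVAudioSession.InterruptionType(rawValue: typeValue) else { return }

    switch type {
    case .began:
        break              // a call, alarm, or Siri took the audio hardware
    case .ended:
        restartEngine()    // safe to resume capture
    @unknown default:
        break
    }
}
```

Route-change handling follows the same pattern with AVAudioSession.routeChangeNotification and its own userInfo keys.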
If the user denies microphone access, a production app should:

- Detect the denial by checking AVAudioSession.sharedInstance().recordPermission.
- Explain in the UI why the microphone is needed, instead of showing a dead display.
- Offer a shortcut to the app's Settings page via UIApplication.shared.open(URL(string: UIApplication.openSettingsURLString)!).

The audio engine can fail for various reasons: hardware error, resource contention with another app, or a system-level audio reset. In production, you would wrap engine.start() in a retry loop with exponential backoff, or at minimum surface the error to the user with a "Retry" button.
Our tutorial code covers the 80% case — the happy path where everything works. The remaining 20% (error handling, edge cases, interruption recovery) is what separates a tutorial project from a production app. It is also where 80% of the code ends up living. This is true in every platform and language.
Audio processing code is notoriously tricky to test. You cannot easily automate
"play a guitar and check the display." But the architecture we chose — separating
SpectrumAnalyzer from AudioEngine — gives us a clean seam.
The analyzer is a pure computation: feed in samples, get back bars and notes. No hardware,
no microphone, no audio session. This is the code we test.
The key insight: you do not need a microphone to test an FFT. You can generate perfect sine waves in code and feed them directly to the analyzer. A 440 Hz sine wave is just math:
```swift
func sineWave(frequency: Float, sampleRate: Float, count: Int) -> [Float] {
    (0..<count).map { i in
        sin(2 * .pi * frequency * Float(i) / sampleRate)
    }
}
```
This produces a buffer identical to what AVAudioEngine would deliver if a
perfect 440 Hz tone were playing. No hardware variance, no background noise, no
nondeterminism. The test either passes or it does not.
With synthetic signals, you can write deterministic tests for every behavior of the analyzer: the note reported for a known input frequency, the bar a given frequency lands in, and the output for silence.
The most valuable test is the one that would have caught our sample rate bug (Section 8). If we had written this test on day one:
```swift
func testSampleRateMismatchProducesWrongNote() {
    // Analyzer thinks the sample rate is 44100, but the signal was sampled at 48000
    let wrongAnalyzer = SpectrumAnalyzer(binCount: 48, sampleRate: 44100)
    let signal = sineWave(frequency: 440, sampleRate: 48000, count: 4096)
    let (_, _, note) = wrongAnalyzer.process(buffer: signal)
    // 440 Hz signal is misinterpreted as ~409 Hz → G#4
    XCTAssertNotEqual(note, "A4")
}

func testCorrectSampleRateFixesNote() {
    let analyzer = SpectrumAnalyzer(binCount: 48, sampleRate: 48000)
    let signal = sineWave(frequency: 440, sampleRate: 48000, count: 4096)
    let (_, _, note) = analyzer.process(buffer: signal)
    XCTAssertEqual(note, "A4")
}
```
The first test documents the bug: it proves that mismatched sample rates produce wrong notes. The second test proves the fix: passing the correct sample rate gives the correct answer. Together, they ensure this bug never comes back.
You cannot unit test AVAudioEngine — it needs a real device and a
real microphone. But you can unit test everything downstream of it. By designing
SpectrumAnalyzer as a pure struct that takes [Float] and returns
results, we made the core logic fully testable. This is not unique to audio: in any app
that processes external input (network data, sensor readings, user gestures), separate the
I/O from the computation and test the computation directly.
To run the tests: open the project in Xcode, press ⌘U (or
Product → Test). Xcode builds the test target, launches the app in the
simulator, injects the test bundle, and runs every method that starts with
test. Green checkmarks appear next to passing tests in the Test Navigator
(⌘6).
You can also run a single test by clicking the diamond icon in the gutter next to the test method. This is useful when iterating on a failing test — you do not have to wait for the entire suite.
XCTest is Apple’s built-in test framework — roughly equivalent to NUnit or
xUnit. Test classes inherit from XCTestCase, test methods start with
test (no [Test] attribute needed), and assertions use
XCTAssert* instead of Assert.*. The lifecycle is familiar:
setUp() and tearDown() run before and after each test.
There is no separate test runner — Xcode handles discovery, execution, and
reporting.
You now have a working iOS app and the foundational skills to build more. Here are concrete next steps, organized by topic.
Apple's official SwiftUI tutorials
are excellent — interactive, well-paced, and free. They cover navigation, lists, custom
drawing with Canvas and Path, gestures, and data flow patterns. If
you want to add features like a settings screen, frequency labels on the spectrum bars, or a
scrollable history view, these tutorials will give you the tools.
"The Swift Programming Language" is Apple's official language reference, available free on Apple Books or at docs.swift.org/swift-book. You have already used many Swift features in this tutorial (optionals, closures, generics, protocols, pattern matching). The book fills in the gaps and serves as an excellent reference when you encounter unfamiliar syntax.
Build something. That is the fastest way to solidify what you have learned. Here are a few projects in roughly increasing order of complexity:

- Guitar tuner: replace the peak-bin method with YIN and display how far the detected pitch deviates from the nearest note. A natural follow-on from the pitch detection discussion above.
- Audio file visualizer: read a recording with AVAudioFile, display the waveform as a scrollable Path in SwiftUI. Touches on file I/O, large dataset rendering, and gesture handling.
- Voice control: run Apple's speech recognizer (SFSpeechRecognizer) alongside the audio engine to build an app that responds to voice commands. Combines audio processing with on-device machine learning.
- Effects processor: route the microphone input through AVAudioUnitDelay, AVAudioUnitReverb, or a custom AVAudioUnit to the speaker. Introduces the full audio graph with processing nodes between input and output. This is how music production apps work.

You came into this tutorial as a C# developer who had never written a line of Swift. You now have a working iOS app that does real-time audio processing. The concepts transfer: reactive UI is reactive UI, signal processing is signal processing, and good engineering practices are universal. The syntax is different, the toolchain is different, but the thinking is the same. Go build something.