.package(url: "https://github.com/mesqueeb/SwiftSound", from: "1.0.2")
An Experimental Swift CLI for FFT Analysis
This repository was created as an experiment to explore sound analysis using Swift. Through this project, I aimed to understand how sound works, how it is stored digitally, and how we can visualize it using techniques like Fast Fourier Transform (FFT). The journey covered everything from raw audio amplitude extraction to spectrogram generation.
This tool reads audio files (including .mp3, .wav, .flac, and .raw) and processes them into raw amplitude arrays and spectrograms, while supporting file format conversion via SoX and generating FFT visualizations.
A complete example is located at examples:
- We started with elephant.flac
- It gets converted into elephant_flac.raw
- The raw file is then converted to raw amplitude values visible at elephant_raw_amplitudes.json
[0.0018920898,0.0014038086,0.001739502,0.0013427734,0.0008544922,0.0007324219,-0.00012207031,-0.00036621094,-0.00048828125,-0.00076293945,-0.00033569336,-0.00088500977,0.00012207031,-0.00039672852,0.0004272461,-0.00021362305,0.00021362305,-0.00021362305,0,-0.0005187988,3.0517578e-05,-0.00039672852,0.00015258789,-0.0005493164,0.00091552734,6.1035156e-05,0.0014038086,0.00076293945,0.0013427734,0.00076293945,0.00061035156,0.00033569336,9.1552734e-05,-0.00021362305,-0.00018310547,...]
- The raw amplitude values are then converted into a spectrogram at elephant_spectrogram.svg
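The raw-to-amplitude step above can be sketched as follows. This is a minimal illustration, not the repository's actual code: it assumes the .raw file holds headerless, mono, signed 16-bit little-endian PCM, and the function name `amplitudes(fromRawPCM16:)` is hypothetical.

```swift
import Foundation

/// Converts headerless signed 16-bit little-endian PCM bytes into
/// amplitudes in the range [-1, 1). The sample encoding is an assumption;
/// a trailing odd byte, if any, is ignored.
func amplitudes(fromRawPCM16 data: Data) -> [Float] {
    let bytes = [UInt8](data)
    let sampleCount = bytes.count / 2
    var result = [Float]()
    result.reserveCapacity(sampleCount)
    for i in 0..<sampleCount {
        // Little-endian: low byte first, then high byte.
        let lo = UInt16(bytes[2 * i])
        let hi = UInt16(bytes[2 * i + 1])
        let sample = Int16(bitPattern: (hi << 8) | lo)
        // Dividing by 32768 maps Int16.min to exactly -1.0.
        result.append(Float(sample) / 32768.0)
    }
    return result
}

// Three hand-written samples: 62, -1, and Int16.min.
let demoBytes = Data([0x3E, 0x00, 0xFF, 0xFF, 0x00, 0x80])
let values = amplitudes(fromRawPCM16: demoBytes)
print(values)
```

Under this assumption, a sample value of 62 maps to 62 / 32768, which is consistent with the first amplitude in the JSON excerpt above.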
- Ensure sox is installed:
  brew install sox
- Clone the repository:
  git clone https://github.com/mesqueeb/SwiftSound.git
  cd SwiftSound
- Run the tool:
  swift run fft <input-file> [options]
OVERVIEW: A Swift command-line tool for FFT analysis.
USAGE: fft <input-file> [--json] [--svg] [--open] [--output <output>] [--sample-rate <sample-rate>] [--max-frequency <max-frequency>] [--verbose]
ARGUMENTS:
<input-file> The path to the raw or audio file.
OPTIONS:
--json Save the extracted raw amplitude as JSON.
--svg Save an SVG spectrogram to a file.
--open Open the SVG in Preview.
-o, --output <output> Output SVG spectrogram filename.
-s, --sample-rate <sample-rate> (default: 44100.0)
-m, --max-frequency <max-frequency> (default: 1000.0)
-v, --verbose Enable verbose output.
-h, --help Show help information.
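To see how --sample-rate and --max-frequency could relate to FFT output, here is a small sketch of the standard bin-to-frequency relation: bin k of an N-sample window corresponds to k × sampleRate / N Hz. Whether the tool crops its spectrogram exactly this way is an assumption, and the helper names and the 1024-sample window size are hypothetical.

```swift
/// Frequency in Hz represented by FFT bin `index` for a window of
/// `windowSize` samples recorded at `sampleRate`.
func binFrequency(index: Int, windowSize: Int, sampleRate: Double) -> Double {
    return Double(index) * sampleRate / Double(windowSize)
}

/// Number of bins (starting at bin 0) whose frequency is at or below
/// `maxFrequency`, capped at the Nyquist frequency (sampleRate / 2).
func binCount(upTo maxFrequency: Double, windowSize: Int, sampleRate: Double) -> Int {
    let limit = min(maxFrequency, sampleRate / 2)
    return Int((limit * Double(windowSize) / sampleRate).rounded(.down)) + 1
}

// The CLI defaults (44100 Hz, 1000 Hz) with a hypothetical 1024-sample window.
let keptBins = binCount(upTo: 1000.0, windowSize: 1024, sampleRate: 44100.0)
let highestKept = binFrequency(index: keptBins - 1, windowSize: 1024, sampleRate: 44100.0)
print(keptBins, highestKept)
```

With these numbers each bin is roughly 43 Hz wide, so a lower --max-frequency keeps fewer rows of the spectrogram, while a larger window would give finer frequency resolution.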
- Convert and Analyze a .mp3 File:
  swift run fft elephant.mp3 --json --svg --open --verbose
- Analyze a .wav File and Save JSON Only:
  swift run fft sound.wav --json
- Create an SVG Spectrogram with Custom Sample Rate:
  swift run fft music.flac --svg --sample-rate 48000 --max-frequency 5000
- Frequency (Pitch): How many times per second the air vibrates, measured in Hertz (Hz). Higher frequency = higher pitch.
- Amplitude (Gain): The strength of the wave (how loud it is).
- Timbre: The unique fingerprint of a sound, which is a combination of:
  - Harmonics: Additional frequencies layered on top of the fundamental pitch.
  - ADSR (Envelope): The Attack, Decay, Sustain, and Release profile of a sound.
  - Noise Characteristics: Subtle imperfections or unique textures that distinguish sounds.
- Sampling: Measuring sound waves thousands of times per second (e.g., 44100 samples/sec).
- Fourier Transform: Breaking a complex wave into simple frequencies (FFT analysis).
- Spectrograms: Visualizing sound as time vs. frequency vs. amplitude.
- File Formats: .raw is headerless, uncompressed samples; .wav typically wraps uncompressed PCM samples in a header; .flac compresses losslessly; and .mp3 compresses lossily.
- SoX Integration: Using sox can convert between audio formats and sample rates.
- FFT Analysis: FFT stands for Fast Fourier Transform, an efficient algorithm that converts a sampled sound wave into its frequency components.
- STFT Analysis: STFT stands for Short-Time Fourier Transform, which breaks sound into time slices on which FFT is applied, resulting in a spectrogram of frequencies over time.
- SVG Rendering: Visualizing STFT results with SwiftPlot.
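The Fourier transform and STFT ideas listed above can be sketched in a few lines of Swift. This is an illustration only, not the repository's implementation: it uses a naive O(N²) discrete Fourier transform rather than a real FFT, applies a Hann window to each time slice, and the names `dftMagnitudes` and `stft`, the frame size, and the hop size are all hypothetical.

```swift
import Foundation

/// Naive discrete Fourier transform (not a fast FFT): returns the
/// magnitude of bins 0...N/2 for a real-valued signal.
func dftMagnitudes(_ samples: [Double]) -> [Double] {
    let n = samples.count
    guard n > 0 else { return [] }
    return (0...(n / 2)).map { k -> Double in
        var re = 0.0
        var im = 0.0
        for (t, x) in samples.enumerated() {
            let angle = -2.0 * Double.pi * Double(k) * Double(t) / Double(n)
            re += x * cos(angle)
            im += x * sin(angle)
        }
        return (re * re + im * im).squareRoot()
    }
}

/// Short-time Fourier transform: slices the signal into overlapping
/// frames, applies a Hann window, and transforms each frame.
/// Result is indexed as [time slice][frequency bin].
func stft(_ samples: [Double], frameSize: Int, hop: Int) -> [[Double]] {
    guard frameSize > 1, hop > 0, samples.count >= frameSize else { return [] }
    let window = (0..<frameSize).map { i -> Double in
        0.5 - 0.5 * cos(2.0 * Double.pi * Double(i) / Double(frameSize - 1))
    }
    return stride(from: 0, through: samples.count - frameSize, by: hop).map { start -> [Double] in
        let frame = (0..<frameSize).map { samples[start + $0] * window[$0] }
        return dftMagnitudes(frame)
    }
}

// Demo: a sine completing 8 cycles per 64 samples should peak at bin 8.
let tone = (0..<256).map { sin(2.0 * Double.pi * 8.0 * Double($0) / 64.0) }
let spectrogram = stft(tone, frameSize: 64, hop: 32)
let peakBins = spectrogram.map { frame in
    frame.indices.max(by: { frame[$0] < frame[$1] })!
}
print(peakBins)
```

Each inner array is one column of a spectrogram: time runs along the outer index, frequency along the inner index, and the magnitude gives the intensity to draw. A production tool would swap the naive DFT for an actual FFT routine, which computes the same values far faster.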
This project started from a simple question: “What is sound?” and became a deep dive into audio analysis, signal processing, and digital representations of sound. It also became an exploration of how Swift can be used for scientific computation.
I hope this experiment inspires others to explore the intersection of programming, science, and art. 🎶💻✨
Feel free to fork this project, contribute, or reach out with suggestions!
Author: Luca Ban
GitHub: mesqueeb