granular synthesis SMC 2026
SMC 2026 Paper Submission | Tom Didiot-Cook, University of Bristol
Companion code and web demo for the paper submitted to the Sound and Music Computing Conference (SMC 2026).
We present a granular synthesis technique that splits audio at rising zero-crossings to produce a signal-adaptive grain decomposition requiring no amplitude windowing. Grain boundaries naturally align with the signal's waveform cycles at near-zero amplitude, so grains can be concatenated directly without envelopes. A merge algorithm absorbs short segments to guarantee that every sample is preserved: concatenating grains in order exactly reproduces the original signal (SNR = infinity). Each grain's duration encodes a pitch via a logarithmic mapping to MIDI note numbers. A real-time diffusion parameter continuously interpolates between faithful sequential playback and a dense granular cloud. A dual-source cross-synthesis mode separates structure (grain ordering) from corpus (grain pool), enabling timbral transplantation between recordings via pitch-class matching.
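The split, merge and lossless-reconstruction steps described above can be sketched in a few lines of NumPy. The function names match those in 01_granular_synthesis.py, but the bodies are illustrative assumptions, not the repository's implementation. In particular, the merge strategy here (accumulate forward, then fold a short tail into the previous grain) is one plausible reading of "two-pass", and the test signal and min_len=16 are arbitrary.

```python
import numpy as np


def rising_zero_crossings(y):
    """Indices i where y[i-1] < 0 <= y[i], found in one vectorised O(N) scan."""
    y = np.asarray(y)
    return np.flatnonzero((y[:-1] < 0) & (y[1:] >= 0)) + 1


def slice_segments_covering_all(y, zc):
    """Cut y at the crossings, keeping the head and tail so no sample is lost."""
    bounds = np.unique(np.concatenate(([0], zc, [len(y)])))
    return [y[a:b] for a, b in zip(bounds[:-1], bounds[1:])]


def merge_short_segments(segs, min_len):
    """Merge adjacent segments until each grain has at least min_len samples.

    Assumed strategy: pass 1 accumulates segments forward; pass 2 folds any
    short leftover tail into the previous grain. If the whole signal is
    shorter than min_len, a single short grain is returned.
    """
    merged, pending, pending_len = [], [], 0
    for seg in segs:
        pending.append(seg)
        pending_len += len(seg)
        if pending_len >= min_len:
            merged.append(np.concatenate(pending))
            pending, pending_len = [], 0
    if pending:
        tail = np.concatenate(pending)
        if merged:
            merged[-1] = np.concatenate((merged[-1], tail))
        else:
            merged.append(tail)
    return merged


# Null test on a synthetic signal: concatenating the grains in order
# must give back the input sample for sample.
sr = 8000
t = np.arange(sr) / sr
y = np.sin(2 * np.pi * 220 * t) + 0.3 * np.sin(2 * np.pi * 1760 * t + 1.0)
grains = merge_short_segments(
    slice_segments_covering_all(y, rising_zero_crossings(y)), min_len=16
)
assert np.array_equal(np.concatenate(grains), y)
```

Because merging only ever concatenates neighbouring segments, every grain boundary that survives is still a rising zero-crossing (or the start or end of the signal), which is why no windowing is needed in this sketch.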
Live demo: https://granular-270882994369.us-central1.run.app/
cd web
npm install
npx vite
Open http://localhost:5173 in Chrome (Web Audio works best in Chrome/Edge).
The diffuse/ directory contains the core Python implementation.
cd diffuse
pip install numpy soundfile sounddevice
# Split and play with diffusion
python 01_granular_synthesis.py path/to/audio.wav --diffusion 0.5
# See all options
python 01_granular_synthesis.py --help
The splitting algorithm in 01_granular_synthesis.py provides three functions:
- rising_zero_crossings(y) - detect rising zero-crossing indices in O(N)
- slice_segments_covering_all(y, zc) - partition signal into contiguous segments
- merge_short_segments(segs, min_len) - two-pass merge ensuring all grains meet minimum length

The paper source is in paper/. The compiled PDF is at paper/0X-granular.pdf.
cd paper
tectonic smc2026template.tex
# or: pdflatex smc2026template.tex && bibtex smc2026template && pdflatex smc2026template.tex && pdflatex smc2026template.tex
The evaluation scripts require numpy and librosa:
pip install numpy librosa
# Null test (lossless reconstruction verification)
cd paper/evaluation
python null_test.py path/to/audio.wav
# Boundary comparison (ZC vs rectangular vs Hann-windowed)
python boundary_compare_hann.py path/to/audio1.wav path/to/audio2.wav ...
# Original detailed boundary comparison
python boundary_compare.py path/to/audio.wav
cd paper/figures
pip install matplotlib librosa numpy
python fig1_grain_boundaries.py path/to/audio.wav
python fig2_pitch_distribution.py path/to/audio.wav
python fig3_diffusion_continuum.py path/to/audio.wav
python fig_cross_synthesis.py structure.wav corpus.wav
.
├── web/ # Browser-based interactive demo
│ ├── index.html # Main UI
│ ├── main.js # Application logic + state management
│ ├── audio-engine.js # Web Audio API orchestration
│ ├── grain-worklet.js # AudioWorklet grain renderer
│ ├── corpus-manager.js # Grain corpus + cross-synthesis matching
│ ├── splitter.js # Zero-crossing splitter (Web Worker)
│ ├── streaming-splitter.js # Incremental splitter for live mic
│ ├── tonnetz.js # Tonnetz hex grid visualisation
│ ├── piano.js # 88-key piano visualisation
│ └── ...
├── diffuse/ # Python toolkit
│ ├── 01_granular_synthesis.py # Core engine
│ └── README.md # Detailed Python documentation
├── paper/ # SMC 2026 paper
│ ├── smc2026template.tex # Paper source
│ ├── 0X-granular.pdf # Compiled paper
│ ├── evaluation/ # Evaluation scripts
│ └── figures/ # Figure generation scripts + PDFs
└── DEVELOPMENT.md # Research notes and roadmap
Each grain's sample count determines a MIDI pitch:
midi = 60 + 12 * log2(d_ref / d_grain)
where d_grain is the grain duration in seconds (sample count divided by sample rate) and d_ref = 0.00382225643 s (the period of middle C at 261.626 Hz). Shorter grains map to higher pitches.
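As a worked example of the mapping, here is a minimal sketch. The function name grain_midi is hypothetical, and it assumes the grain duration is the sample count divided by the sample rate. The result is left as a float; whether the project rounds to the nearest integer note is not stated here.

```python
import math

D_REF = 0.00382225643  # seconds: period of middle C (261.626 Hz), from the text


def grain_midi(n_samples, sample_rate):
    """Map a grain's length in samples to a (fractional) MIDI note number."""
    d_grain = n_samples / sample_rate  # grain duration in seconds
    return 60 + 12 * math.log2(D_REF / d_grain)


# A 100-sample grain at 44.1 kHz lasts about 2.27 ms (a 441 Hz period),
# which lands just above A4 (MIDI 69).
note = grain_midi(100, 44100)
```

Halving a grain's duration raises the result by exactly 12, i.e. one octave, which follows directly from the log2 term.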
A single parameter delta in [0, 1] controls playback:
- delta = 0: sequential grain playback reproduces the original signal exactly
- delta = 1: dense granular cloud with Poisson-spawned overlays
The backbone cursor makes bidirectional jumps with probability delta, while an overlay cloud spawns grains from the cursor neighbourhood.
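The playback behaviour described above could be scheduled roughly as follows. This is a sketch under several assumptions that the text does not specify: the function name diffuse_order, the jump size (max_jump), the overlay rate (overlay_rate), the neighbourhood radius (radius), and wrapping the cursor at the ends of the grain list are all hypothetical choices made for illustration.

```python
import numpy as np


def diffuse_order(n_grains, delta, n_steps,
                  max_jump=8, overlay_rate=4.0, radius=8, seed=0):
    """Return (backbone, overlays): one backbone grain index per step and,
    for each step, a list of overlay grain indices near the cursor."""
    rng = np.random.default_rng(seed)
    cursor = 0
    backbone, overlays = [], []
    for _ in range(n_steps):
        backbone.append(cursor)

        # Overlay cloud: Poisson-distributed count, scaled by delta so that
        # delta = 0 spawns nothing. Grains come from the cursor neighbourhood.
        k = rng.poisson(delta * overlay_rate)
        lo = max(0, cursor - radius)
        hi = min(n_grains - 1, cursor + radius)
        overlays.append(rng.integers(lo, hi + 1, size=k).tolist())

        # Backbone: with probability delta jump forwards or backwards,
        # otherwise advance to the next grain.
        if rng.random() < delta:
            step = int(rng.integers(1, max_jump + 1)) * int(rng.choice([-1, 1]))
        else:
            step = 1
        cursor = (cursor + step) % n_grains  # wrap-around is an assumption
    return backbone, overlays


# delta = 0 degenerates to plain sequential playback with no overlays.
backbone, overlays = diffuse_order(n_grains=10, delta=0.0, n_steps=10)
assert backbone == list(range(10))
```

Intermediate values of delta mix the two behaviours, which is the continuum the parameter is meant to expose.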
Separate structure (grain ordering) from corpus (grain pool). For each structure grain, a corpus grain is selected via 4-tier MIDI pitch matching: exact match, nearest (within ±6 semitones), same pitch class, global nearest. This preserves the temporal/harmonic structure of one recording while replacing its timbral content.
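The four-tier fallback can be sketched as below, assuming integer MIDI note numbers. The function name match_grain, the returned tier labels, and the tie-breaking rule (first index wins) are hypothetical; the actual matcher in corpus-manager.js may choose among candidates differently.

```python
def match_grain(target_midi, corpus_midi):
    """Pick a corpus grain index for one structure grain.

    Returns (index, tier) where tier names the rule that fired.
    Ties are broken by taking the first candidate (an assumption).
    """
    if not corpus_midi:
        raise ValueError("corpus is empty")
    indices = range(len(corpus_midi))

    def distance(i):
        return abs(corpus_midi[i] - target_midi)

    # Tier 1: exact pitch match.
    for i in indices:
        if corpus_midi[i] == target_midi:
            return i, "exact"

    # Tier 2: nearest grain within +/- 6 semitones.
    near = [i for i in indices if distance(i) <= 6]
    if near:
        return min(near, key=distance), "nearest"

    # Tier 3: same pitch class (an octave or more away).
    same_pc = [i for i in indices if corpus_midi[i] % 12 == target_midi % 12]
    if same_pc:
        return min(same_pc, key=distance), "pitch_class"

    # Tier 4: whatever is closest in the whole corpus.
    return min(indices, key=distance), "global"


# One corpus grain per structure grain, in structure order.
structure = [60, 64, 67]
corpus = [72, 62, 55]
order = [match_grain(m, corpus)[0] for m in structure]
```

Running the matcher over the structure recording's grain sequence yields a list of corpus indices that keeps the original ordering while drawing every sound from the other recording.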
Grain pitch classes are mapped onto an axial hex grid where midi = 4q + 7r (+q = major thirds, +r = perfect fifths). Active grains highlight cells in real time, colour-coded by the circle of fifths.
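The grid formula can be checked with a short sketch. The helper names are hypothetical, and restricting q to 0..2 and r to 0..3 is an assumption made here to get one cell per pitch class; tonnetz.js may tile or offset the grid differently. The circle-of-fifths position is computed as 7 * pc mod 12, which is one common way to order pitch classes by fifths, not necessarily the colour scheme the demo uses.

```python
def hex_to_pitch_class(q, r):
    """Pitch class of axial cell (q, r): +q is a major third, +r a fifth."""
    return (4 * q + 7 * r) % 12


def pitch_class_to_hex(pc):
    """Find the cell with q in 0..2 and r in 0..3 holding this pitch class.

    Those 12 cells cover each pitch class exactly once, so the search
    always succeeds.
    """
    pc %= 12
    for q in range(3):
        for r in range(4):
            if hex_to_pitch_class(q, r) == pc:
                return q, r


def fifths_position(pc):
    """Index of a pitch class on the circle of fifths (C=0, G=1, D=2, ...)."""
    return (7 * pc) % 12


# A grain at MIDI 67 (G) lights the cell one fifth above C.
cell = pitch_class_to_hex(67 % 12)
```

Stepping along q adds 4 semitones and stepping along r adds 7, so a step of (-1, +1) adds 3 semitones: the minor third, which is the third axis of a standard Tonnetz.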
This is an open-access project distributed under the terms of the Creative Commons Attribution 3.0 Unported License.