Song Matching and IDing by Analyzing and Hashing Audio Fingerprints
Milos Miljkovic
November 11, 2015
Slides from the talk given at PyData NYC 2015.
Transcript
Song Matching & IDing: Analyzing and Hashing Audio With Python Scientific Stack and SQL DB
Link to talk material (GitHub): bit.ly/MMNYC15
Talk outline • Airing of grievances • Feats of strength • Pontification
Who is this guy? Miloš Miljković @miishke
Data science mule • Slow, not too smart, and hard-working under burden.
?
SOTU 2015
SOTU 2015 – applause occurrence time [min] and duration [s]
SOTU 2015 – speech and applause duration [s]
Somebody said something on Twitter
Literature search
Who is doing it?
Why do it?
outlier
Step one
Steps 2 to N
Awesome paper • Avery Li-Chun Wang • An Industrial-Strength Audio Search Algorithm • Clearly written, informative figures • Seven (7) pages long
Sound & human hearing • 20 – 20,000 Hz • Loudness in dB (logarithmic scale) • Quiet study room ~40 dB • Jackhammer ~95 dB • Jet engine ~140 dB
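To make the logarithmic scale concrete, here is a minimal sketch. It assumes the standard sound pressure level definition, L = 20·log10(p / p0) with reference pressure p0 = 20 µPa; the function names are illustrative and not from the talk.

```python
import math

P_REF = 20e-6  # assumed reference sound pressure in pascals (20 µPa)


def spl_db(pressure_pa):
    """Sound pressure level in dB relative to P_REF."""
    return 20 * math.log10(pressure_pa / P_REF)


def pressure_ratio(db_a, db_b):
    """How many times larger the sound pressure at db_a is than at db_b."""
    return 10 ** ((db_a - db_b) / 20)


# Jet engine (~140 dB) versus quiet study room (~40 dB), figures from the slide.
jet_vs_room = pressure_ratio(140, 40)
```

A 100 dB gap therefore corresponds to a sound pressure ratio of 10^5, which is why a linear scale would be impractical.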
Human auditory system
Recording & encoding • ADC → pulse-code modulation • 16-bit signed integer • Shannon–Nyquist theorem • 20 kHz → 40 kHz • CD audio 44.1 kHz
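The "20 kHz → 40 kHz" step follows from the sampling theorem: the sampling rate must be at least twice the highest frequency of interest. A small sketch of what goes wrong otherwise, using only the standard library; the tone frequencies and helper name are assumptions chosen for illustration.

```python
import math


def sample_tone(freq_hz, rate_hz, n_samples):
    """Sample a cosine of the given frequency at the given sampling rate."""
    return [math.cos(2 * math.pi * freq_hz * n / rate_hz) for n in range(n_samples)]


rate = 40_000                              # Hz, Nyquist limit is rate / 2 = 20 kHz
in_band = sample_tone(10_000, rate, 16)    # below the Nyquist limit
too_high = sample_tone(30_000, rate, 16)   # above the Nyquist limit

# The 30 kHz tone produces the same samples as the 10 kHz tone (aliasing).
max_diff = max(abs(a - b) for a, b in zip(in_band, too_high))
```

After sampling, the two tones are indistinguishable, so anything above half the sampling rate has to be filtered out before the ADC.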
Audio bit rate • sampling rate × bit depth × n channels • 44,100 Hz × 16 bit × 2 channels ≈ 1,411 kb/s • MP3 highest quality is 320 kb/s
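The arithmetic on this slide can be checked directly. The function name below is illustrative, and n is taken to be the number of channels.

```python
def pcm_bit_rate(sample_rate_hz, bit_depth, channels):
    """Uncompressed PCM bit rate in bits per second."""
    return sample_rate_hz * bit_depth * channels


cd_bps = pcm_bit_rate(44_100, 16, 2)   # stereo CD audio
cd_vs_mp3 = cd_bps / 320_000           # relative to a 320 kb/s MP3
```

CD audio comes to 1,411,200 b/s, about 4.4 times the bit rate of a 320 kb/s MP3.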
Short-time Fourier transform (STFT)
bit.ly/FTSEA15 William Cox @gallamine
STFT • Analysis of frequencies in a signal when frequency varies in time • Speech, music, seismology, ECG…
STFT in a nutshell • Apply DFT to windowed segments of data • Windows overlap • Windows are apodized
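The three bullets above can be sketched in a few lines of NumPy. This is a minimal illustration, not the code from the talk's notebook: the window length (1024), hop size (512, i.e. 50% overlap), and the Hann window used for apodization are all assumptions.

```python
import numpy as np


def stft(signal, window_size=1024, hop=512):
    """Magnitude STFT: rows are frequency bins, columns are time frames."""
    window = np.hanning(window_size)  # apodization: taper each segment's edges
    n_frames = 1 + (len(signal) - window_size) // hop
    frames = np.stack([
        signal[i * hop : i * hop + window_size] * window  # overlapping segments
        for i in range(n_frames)
    ])
    return np.abs(np.fft.rfft(frames, axis=1)).T  # DFT of every segment


rate = 8000                              # Hz, hypothetical sampling rate
t = np.arange(rate) / rate               # one second of samples
tone = np.sin(2 * np.pi * 1000 * t)      # 1 kHz test tone
spec = stft(tone)
freqs = np.fft.rfftfreq(1024, d=1 / rate)
peak_hz = freqs[spec[:, 0].argmax()]     # strongest frequency in the first frame
```

For the test tone, the strongest bin in every frame sits at 1 kHz. A shorter window would localize events better in time at the cost of coarser frequency resolution.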
STFT Q&A • How long are window segments? • Why do windows overlap? • Why do we mess with windows? • What is apodization?
Jupyter notebook time!
Peak quadrants
Churning out numbers • Make quadrant for each peak • In each quadrant for each peak pair record f1, f2, Δt • Pass [f1, f2, Δt] through hash function
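A possible reading of these steps, as a sketch: the "quadrant" is treated here as a rectangular target zone that starts at the anchor peak and extends forward in time. The zone limits (`max_dt`, `max_df`), the `fan_out` cap, the choice of SHA-1, and the truncation to 20 hex characters are assumptions made for illustration, not details from the talk.

```python
import hashlib


def fingerprint(peaks, max_dt=50, max_df=100, fan_out=5):
    """Hash peak pairs. `peaks` is a list of (time_bin, freq_bin) sorted by time.

    Returns a list of (hash, anchor_time) tuples.
    """
    prints = []
    for i, (t1, f1) in enumerate(peaks):
        paired = 0
        for t2, f2 in peaks[i + 1:]:
            dt = t2 - t1
            if dt > max_dt:
                break                      # later peaks are even further away
            if dt <= 0 or abs(f2 - f1) > max_df:
                continue                   # outside the target zone
            key = f"{f1}|{f2}|{dt}".encode()
            prints.append((hashlib.sha1(key).hexdigest()[:20], t1))
            paired += 1
            if paired == fan_out:
                break
    return prints


peaks = [(0, 10), (5, 20), (8, 300), (100, 15)]
shifted = [(t + 7, f) for t, f in peaks]   # same audio, starting 7 bins later
prints_a = fingerprint(peaks)
prints_b = fingerprint(shifted)
```

Because only f1, f2, and Δt enter the hash, the same passage produces identical hashes regardless of where in the recording it starts; only the stored anchor times differ.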
Beauty of simplicity
Align and match • For each peak record offset time and its hashes • Find matching hashes and get song IDs • d = offset_song - offset_recording
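The alignment idea can be sketched as a vote over (song ID, d) pairs: hashes from a true match all share the same offset difference d, while accidental hash collisions scatter. The in-memory dictionary below stands in for the database lookup, and all names and sample values are hypothetical.

```python
from collections import Counter


def best_match(db_index, query_prints):
    """Find the song whose hashes line up best with the recording.

    db_index: dict mapping hash -> list of (song_id, offset_in_song)
    query_prints: list of (hash, offset_in_recording)
    Returns (song_id, d, votes) or None when no hash matches.
    """
    votes = Counter()
    for h, t_rec in query_prints:
        for song_id, t_song in db_index.get(h, []):
            votes[(song_id, t_song - t_rec)] += 1   # d = offset_song - offset_recording
    if not votes:
        return None
    (song_id, d), count = votes.most_common(1)[0]
    return song_id, d, count


db_index = {
    "a": [(1, 100)],
    "b": [(1, 110), (2, 5)],
    "c": [(1, 130)],
}
query = [("a", 0), ("b", 10), ("c", 30)]
result = best_match(db_index, query)
```

In this toy example song 1 collects three votes at d = 100, while song 2 gets a single stray vote, so the recording is placed 100 time bins into song 1.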
SQL DB • CREATE TABLE fingerprint(hash, songID, offset); • CREATE TABLE song(songID, song_name);
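A runnable version of this schema, sketched with Python's built-in sqlite3 module. The slide does not name the database engine, column types, or indexes, so SQLite, the types, and the index on `hash` are assumptions; `offset` is quoted because it is also an SQL keyword.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # throwaway in-memory database
conn.executescript("""
CREATE TABLE song(
    songID    INTEGER PRIMARY KEY,
    song_name TEXT NOT NULL
);
CREATE TABLE fingerprint(
    hash     TEXT    NOT NULL,
    songID   INTEGER NOT NULL REFERENCES song(songID),
    "offset" INTEGER NOT NULL
);
-- Lookups are always by hash, so index that column (assumed optimization).
CREATE INDEX idx_fingerprint_hash ON fingerprint(hash);
""")

conn.execute("INSERT INTO song(songID, song_name) VALUES (?, ?)", (1, "demo song"))
conn.executemany(
    'INSERT INTO fingerprint(hash, songID, "offset") VALUES (?, ?, ?)',
    [("abc", 1, 100), ("def", 1, 130)],  # hypothetical fingerprints
)
rows = conn.execute(
    'SELECT songID, "offset" FROM fingerprint WHERE hash = ?', ("abc",)
).fetchall()
```

Each query hash from a recording becomes one such lookup, returning the song IDs and in-song offsets needed for alignment.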
Power of Python • scipy → read in raw audio • matplotlib → spectrogram • scikit-image → find peaks • Python → hash • Python → set intersection
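The peak-finding step can be illustrated without extra dependencies. The talk uses scikit-image for this; the sketch below swaps in a plain NumPy local-maximum search instead, so it is a stand-in for the idea rather than the talk's code. The neighborhood size and threshold are assumed parameters.

```python
import numpy as np


def local_peaks(spec, neighborhood=1, min_value=0.0):
    """Return (row, col) cells strictly greater than every neighbor.

    A cell is compared against all cells within `neighborhood` steps in each
    direction; cells at or below `min_value` are ignored.
    """
    spec = np.asarray(spec, dtype=float)
    rows, cols = spec.shape
    # Pad with -inf so border cells can still qualify as peaks.
    padded = np.pad(spec, neighborhood, mode="constant", constant_values=-np.inf)
    is_peak = spec > min_value
    for dr in range(-neighborhood, neighborhood + 1):
        for dc in range(-neighborhood, neighborhood + 1):
            if dr == 0 and dc == 0:
                continue
            shifted = padded[
                neighborhood + dr : neighborhood + dr + rows,
                neighborhood + dc : neighborhood + dc + cols,
            ]
            is_peak &= spec > shifted
    return [(int(r), int(c)) for r, c in zip(*np.nonzero(is_peak))]


toy = np.zeros((5, 5))   # tiny stand-in for a spectrogram
toy[1, 1] = 5.0
toy[3, 3] = 1.0          # overshadowed by its louder neighbor
toy[3, 4] = 2.0
peaks = local_peaks(toy)
```

On a real spectrogram the rows would be frequency bins and the columns time frames, and the resulting coordinates would feed the peak-pair hashing step.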
Performance • DB size • Search time • No messing with play speed • False positives, or are they?