Alright folks, let me tell you about my little adventure with NBA announcers. It all started with me being super annoyed by the same old voices commentating on every game.

So, I thought, “Hey, why not try to build something that can identify these guys?” I figured it’d be a fun way to learn a bit about audio processing and maybe even create a cool little tool for myself.
First thing I did was gather some data. I spent hours scouring YouTube, finding clips of different NBA games and announcers. I’m talking about folks like Mike Breen, Kevin Harlan, Reggie Miller – the whole crew. I downloaded a bunch of these clips, making sure I had a good variety of audio samples for each announcer. This was honestly the most tedious part.
Next, I needed to clean up the audio. You know, remove background noise, normalize the volume, and all that jazz. I used a tool called Audacity for this. It’s free and pretty easy to use. I basically went through each clip, trimmed it down to just the announcer’s voice, and made sure the audio quality was consistent.
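If you'd rather script the cleanup than click through Audacity, the normalization part is easy to reproduce in Python. Here's a minimal sketch with NumPy — `normalize_peak` is my own helper name, not anything built into Audacity, but it does the same thing as its Normalize effect:

```python
import numpy as np

def normalize_peak(samples, target_db=-1.0):
    """Scale audio samples so the loudest peak sits at target_db dBFS.

    samples: float array of audio in the range [-1, 1]
    target_db: desired peak level in decibels relative to full scale
    """
    peak = np.max(np.abs(samples))
    if peak == 0:
        return samples  # silent clip, nothing to scale
    target = 10 ** (target_db / 20)  # convert dBFS to linear amplitude
    return samples * (target / peak)
```

Running every clip through the same target level is what keeps the volume consistent across announcers.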
Then came the fun part: feature extraction! I wanted to convert the audio into something a computer could understand. I ended up using a library called Librosa in Python. It allowed me to extract features like MFCCs (Mel-Frequency Cepstral Coefficients), which are commonly used in speech recognition. I won’t bore you with the details, but basically, MFCCs summarize the spectral shape of short slices of audio, which turns out to be a pretty good fingerprint for a person’s voice.
With the features extracted, I needed a model to classify them. I went with a simple machine learning model: a Support Vector Machine (SVM). I used scikit-learn in Python to train the model. I split my data into training and testing sets, trained the SVM on the training data, and then evaluated its performance on the testing data.
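The training step in scikit-learn looks something like this. The data here is fake — random vectors standing in for per-clip MFCC features from three announcers — so it's a sketch of the workflow, not my actual pipeline:

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

rng = np.random.default_rng(0)
# fake features: 3 "announcers", 40 clips each, 13 MFCC-style values per clip
X = np.vstack([rng.normal(loc=i, scale=1.0, size=(40, 13)) for i in range(3)])
y = np.repeat([0, 1, 2], 40)

# hold out a quarter of the clips for testing, keeping classes balanced
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# scaling before an RBF-kernel SVM matters a lot; a pipeline keeps it tidy
clf = make_pipeline(StandardScaler(), SVC(kernel="rbf", C=1.0))
clf.fit(X_train, y_train)
acc = clf.score(X_test, y_test)
```

The `StandardScaler` step isn't optional in spirit: SVMs are sensitive to feature scale, and raw MFCC coefficients vary wildly in magnitude.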
To be honest, the initial results were not great. The accuracy was around 60%, which wasn’t terrible, but definitely not good enough. I realized that the model was probably overfitting to the training data, so I tried a few things to improve it.
- I added more data. I went back to YouTube and found even more clips of the announcers.
- I adjusted the model’s parameters. I played around with the regularization parameter of the SVM to prevent overfitting.
- I tried different feature extraction techniques. I experimented with different MFCC parameters and even tried adding other features like pitch and energy.
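For the parameter tuning in particular, scikit-learn's `GridSearchCV` makes the "play around with C" step systematic instead of guesswork. Again a sketch on fake data — the grid values are illustrative, not the ones I settled on:

```python
import numpy as np
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

rng = np.random.default_rng(1)
# fake two-announcer feature matrix, 30 clips each, 13 features per clip
X = np.vstack([rng.normal(loc=i, scale=1.0, size=(30, 13)) for i in range(2)])
y = np.repeat([0, 1], 30)

# cross-validate every combination of C (regularization) and gamma (kernel width)
param_grid = {"C": [0.1, 1, 10], "gamma": ["scale", 0.01]}
search = GridSearchCV(SVC(kernel="rbf"), param_grid, cv=3)
search.fit(X, y)
best_C = search.best_params_["C"]
```

Smaller `C` values regularize harder, which is exactly the knob to turn when the model is overfitting.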
After a few iterations, I managed to get the accuracy up to around 80%, which I was pretty happy with. It’s still not perfect, but it’s good enough for my purposes.
Finally, I built a simple command-line interface that allows me to input an audio file and get the predicted announcer. It’s not the most polished tool, but it gets the job done.
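The command-line wrapper is just `argparse`. A minimal sketch — the flag names and the `announcer_svm.joblib` model file are made up for illustration, not what I actually shipped:

```python
import argparse

def build_parser():
    # one positional argument for the clip, one optional flag for the model
    p = argparse.ArgumentParser(
        description="Guess the NBA announcer in an audio clip"
    )
    p.add_argument("audio_file", help="path to a WAV/MP3 clip")
    p.add_argument(
        "--model",
        default="announcer_svm.joblib",
        help="trained model file to load",
    )
    return p

# in the real script: load the model, run MFCC extraction on args.audio_file,
# and print the predicted announcer from clf.predict([features])
args = build_parser().parse_args(["clip.wav"])
```

Keeping the parser in its own function makes the script easy to test without actually hitting `sys.argv`.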

So yeah, that’s the story of how I built my NBA announcer identifier. It was a fun and challenging project, and I learned a lot about audio processing and machine learning along the way. Maybe one day I’ll turn it into a real app, but for now, it’s just a fun little project that I can use to impress my friends.
Things I Learned:
- Data collection is key. More (and more varied) clips per announcer made a bigger difference to accuracy than any single modeling tweak.
- Audio processing can be tricky. There are a lot of factors that can affect audio quality, so it’s important to clean up your data carefully.
- Machine learning is fun! It’s amazing how you can train a computer to recognize patterns in data.
Anyways, hope you found this interesting! Let me know if you have any questions. Peace out!