
When Your Spotify Wrapped Won't Wait: Building My Own Music Data Pipeline
Music runs in the background of everything I do. From the moment I wake up until I crash at night, there's always something playing - and honestly, even when it's quiet, I'm still mentally jamming. So when Spotify Wrapped rolls around each December, it's not just a fun recap; it's a genuine moment of "wait, that's what I was listening to in March?"
That fleeting annual insight sparked something: What if I didn't have to wait a year? What if I could see my listening patterns shift in real-time, track my moods through music, and maybe... just maybe... understand why I went through that regrettable pop phase last summer?
So I started saving my Spotify data. Every play. Every skip. Every late-night deep cut. And in the process, I learned that connecting to the Spotify API is less about authentication flows and more about deciding what story you want your data to tell.
The Spotify API: More Than Just "Now Playing"
When I first dove into the Spotify Web API docs, I expected a straightforward REST API. And it is, but it's also surprisingly nuanced in how it categorizes and serves music data.
The API offers several endpoints that sound similar but serve different purposes:
- Recently Played Tracks: Your last 50 tracks, with timestamps
- Top Tracks/Artists: Aggregated data over short, medium, and long-term periods
- Currently Playing: Real-time snapshot of what's playing right now
- Audio Features: The secret sauce (valence, energy, danceability, acousticness, and more)
For my project, I needed recently played tracks. I wanted the raw timeline, not Spotify's pre-aggregated "top" lists. I wanted to see the sequence of songs, the context switches between moods, the 3 AM playlist choices.
```typescript
// Fetching recently played tracks
const getRecentlyPlayed = async (accessToken: string) => {
  const response = await fetch(
    'https://api.spotify.com/v1/me/player/recently-played?limit=50',
    {
      headers: {
        'Authorization': `Bearer ${accessToken}`,
      },
    }
  );

  const data = await response.json();
  return data.items; // Array of track objects with played_at timestamps
};
```

Simple enough. But here's where it gets interesting: that limit=50 is both generous and limiting. You can fetch your last 50 tracks in one call, but if you want more history, you need to paginate using before and after cursors based on Unix timestamps. No bulk historical export. No "give me everything from last month."
This means if you want a true personal music database, you need to poll regularly and store it yourself.
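To make that pagination concrete, here's a sketch of walking back through history with the before cursor. The endpoint and its limit/before parameters are from the Spotify Web API; the helper names (nextBeforeCursor, fetchAllRecent) and the page cap are my own inventions for this example.

```typescript
type PlayedItem = { played_at: string; track: { id: string; name: string } };

// Pure helper: derive the next `before` cursor (Unix ms) from the oldest
// item in a page. Items arrive newest-first, so the oldest is last.
const nextBeforeCursor = (items: PlayedItem[]): number | null => {
  if (items.length === 0) return null;
  const oldest = items[items.length - 1].played_at;
  return new Date(oldest).getTime();
};

// Walk backwards through listening history, one 50-track page at a time.
const fetchAllRecent = async (accessToken: string, maxPages = 5) => {
  const all: PlayedItem[] = [];
  let before: number | null = null;
  for (let page = 0; page < maxPages; page++) {
    const url = new URL('https://api.spotify.com/v1/me/player/recently-played');
    url.searchParams.set('limit', '50');
    if (before !== null) url.searchParams.set('before', String(before));
    const res = await fetch(url, {
      headers: { Authorization: `Bearer ${accessToken}` },
    });
    const data = await res.json();
    all.push(...data.items);
    before = nextBeforeCursor(data.items);
    if (before === null) break; // no more history available
  }
  return all;
};
```

(The response also carries a cursors object you can read directly; computing the cursor from played_at works either way and keeps the logic visible.)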
Building the Pipeline: Supabase as My Music Vault
I chose Supabase for storage - partly because their free tier is generous, partly because PostgreSQL feels right for relational music data, and partly because I wanted to experiment with their real-time subscriptions (future feature: live dashboard updates).
The schema evolved as I learned more about the API's data structure:
```sql
-- Simplified schema for tracks table
create table tracks (
  id uuid primary key default uuid_generate_v4(),
  spotify_track_id text unique not null,
  played_at timestamp not null,
  track_name text not null,
  artist_name text not null,
  album_name text,
  duration_ms integer,
  audio_features jsonb, -- Store the full features object
  created_at timestamp default now()
);
```

One of the first things I noticed: Spotify's data structure doesn't always match what you think you need. The recently-played endpoint gives you track info, but audio features (BPM, energy, valence) require separate calls to /audio-features/{id}. More API calls. More rate limiting to consider. More caching strategies to implement.
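One mitigation worth knowing: the audio-features endpoint also accepts a batch form, /v1/audio-features?ids=..., taking up to 100 comma-separated track IDs per call. A sketch of batching those lookups (the chunk helper is mine, not part of any API):

```typescript
// Split an array into fixed-size groups.
const chunk = <T>(arr: T[], size: number): T[][] => {
  const out: T[][] = [];
  for (let i = 0; i < arr.length; i += size) out.push(arr.slice(i, i + size));
  return out;
};

// Fetch audio features for many tracks, 100 IDs per request instead of
// one request per track.
const getAudioFeatures = async (accessToken: string, trackIds: string[]) => {
  const features: Record<string, unknown>[] = [];
  for (const ids of chunk(trackIds, 100)) {
    const res = await fetch(
      `https://api.spotify.com/v1/audio-features?ids=${ids.join(',')}`,
      { headers: { Authorization: `Bearer ${accessToken}` } }
    );
    const data = await res.json();
    features.push(...data.audio_features); // parallel to the ids order
  }
  return features;
};
```

That turns 500 per-track calls into 5, which matters once rate limits enter the picture.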
And here's a reality check I had early on: the data Spotify serves you through the API isn't exactly the same as what powers their algorithm. When I compared my stored "top tracks" to what Spotify showed in-app, there were discrepancies. Their algorithm weights recent plays differently, considers skip rates, and probably has a dozen other signals I can't access.
My database is pure play history. Their Wrapped is curated narrative. Both are valid, but they tell different stories.
The Visualizer Dream (and Free API Reality)
I had this vision: Winamp-style visualizations that pulse and shift with the actual BPM of each track. If you remember early 2000s Winamp (IYKYK), those visualizers were hypnotic. I wanted that, but my music.
Then I hit the wall: The free Spotify API doesn't give you precise BPM data.
It gives you tempo, which is close, but it's an estimated value and sometimes wildly off for electronic music or anything with polyrhythms. I wanted the visualizer bars to sync perfectly to the beat. What I got was "approximately 120-128 BPM, probably."
So I improvised. I used energy and valence instead:
```typescript
// Mapping audio features to visualization properties
const getVisualizerConfig = (audioFeatures: AudioFeatures) => {
  return {
    intensity: audioFeatures.energy, // 0.0 to 1.0
    color: mapValenceToColor(audioFeatures.valence), // Happy = warm, sad = cool
    movement: audioFeatures.danceability, // How much the bars "bounce"
    smoothness: 1 - audioFeatures.acousticness, // Electronic = sharp, acoustic = smooth
  };
};
```

It's not what I originally wanted, but it works. And honestly? It might be better. The visualizer doesn't just react to tempo... it reacts to mood. Low energy, low valence songs get slow, muted blues. High energy, high valence tracks explode in yellows and reds with rapid movement.
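For the curious, one possible shape of that mapValenceToColor helper: interpolate an HSL hue from cool blue (low valence) to warm orange-yellow (high valence). The hue endpoints here are my own choice for illustration, not anything the original mapping prescribes.

```typescript
// Map Spotify's valence (0.0 = sad, 1.0 = happy) onto an HSL hue:
// 220° (blue) at valence 0, sliding to 40° (warm orange-yellow) at valence 1.
const mapValenceToColor = (valence: number): string => {
  const clamped = Math.min(1, Math.max(0, valence));
  const hue = Math.round(220 - clamped * 180);
  return `hsl(${hue}, 85%, 55%)`;
};
```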
Sometimes constraints force better solutions.
Patterns in the Data (and in My Life)
The most fascinating part isn't the tech... it's what the data reveals when I look back.
There's the week in August where every other song was a different language because I was deep in a Duolingo streak and wanted immersion. There's the entire month where I apparently listened to the same four songs on repeat because I was in flow state building the SportsFest Dashboard and didn't want lyrical distractions.
And yes, there are the cringe moments. That viral pop phase in June that I'd completely blocked out? The data doesn't forget.
But there's also validation: seeing that I did listen to that indie artist 47 times in a week confirms that yes, I was in a moment. The data backs up the memory.
What This Taught Me About Working with APIs
This side project reinforced a few frontend fundamentals that I'd known intellectually but only really internalized through building:
API data shapes your UI, not the other way around. I designed the visualizer around what data was available, not what data I wished I had. That's the same constraint-driven design we do with product APIs every day.
Polling vs. webhooks vs. scheduled jobs... pick your poison. I ended up with a combination: a cron job that runs every hour to fetch new tracks, and a manual "sync now" button for when I want immediate updates. Real-time webhooks would be overkill (and aren't offered for this use case anyway).
Data freshness has a cost. Every additional API call is a tradeoff between having the latest data and staying within rate limits. I had to decide: Is it worth 50 extra API calls per day to get audio features for every track, or can I batch-fetch them once a week? (I went with weekly batches.)
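The hourly sync step itself can be sketched as a mapping plus an upsert, so replaying an overlapping page of history doesn't create duplicates. This assumes a supabase-js v2 client passed in from outside; toRow and syncTracks are names I made up, and the row shape follows the schema above.

```typescript
type SpotifyPlayedItem = {
  played_at: string;
  track: {
    id: string;
    name: string;
    duration_ms: number;
    artists: { name: string }[];
    album: { name: string };
  };
};

// Pure mapping from the Spotify payload to a row in the tracks table.
const toRow = (item: SpotifyPlayedItem) => ({
  spotify_track_id: item.track.id,
  played_at: item.played_at,
  track_name: item.track.name,
  artist_name: item.track.artists.map((a) => a.name).join(', '),
  album_name: item.track.album.name,
  duration_ms: item.track.duration_ms,
});

// `supabase` is an initialized @supabase/supabase-js v2 client.
// onConflict matches the schema's unique column; note that to keep *every*
// play of the same track, you'd want uniqueness on (spotify_track_id,
// played_at) instead.
const syncTracks = async (supabase: any, items: SpotifyPlayedItem[]) => {
  const { error } = await supabase
    .from('tracks')
    .upsert(items.map(toRow), { onConflict: 'spotify_track_id' });
  if (error) throw error;
};
```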
Where This Goes Next
Right now, this project is a functional curiosity. I can see my listening history, watch visualizations respond to different tracks, and spot patterns I'd never notice otherwise.
But I'm collecting data daily, and the longer I run this, the more interesting it becomes. Yearly comparisons. Seasonal shifts. Correlations between work stress and genre choices. The compound growth of a personal dataset.
I'm not trying to recreate Spotify Wrapped. I'm building something weirder and more personal: a music journal that happens to be queryable with SQL.
And maybe that's the real parallel to frontend work. We're all building something that already exists somewhere else, but the act of building it yourself, with your own constraints and goals and weird ideas, teaches you things that just using the polished product never would.
If you're building with the Spotify API or have your own music data projects, I'd love to hear about them. What patterns have you found in your listening history? What constraints led to your best creative solutions?