DragRaceData.io catalogues 16 years of RuPaul's Drag Race across multiple franchises. It provides searchable, queryable access to 400+ episodes, 200+ queens, 400+ songs, and thousands of lipsync performances.
The site gets thousands of organic page views per month and ranks on page 1 of Google for hundreds of long-tail queries.
The technical trick
The entire site regenerates from TypeScript seed files. Add one song to a .ts file, and 50+ pages automatically update with new statistics, rankings, and cross-references.
This is possible because of an end-to-end type-safe pipeline:
TypeScript seed files → PostgreSQL → tRPC API → statically generated pages
The database is rebuilt from typed code on every deploy. TypeScript files are the source of truth, not the database itself. This means compile-time validation catches all errors before they reach production.
Why this architecture matters
Traditional content sites separate data entry from code. Writers update a CMS, developers maintain the site separately. When data changes, pages might break.
This project flips that model. Data is code. Code is data. They're validated together.
The result: one person can maintain 1,400+ pages that update automatically.
Why this was hard
Drag Race data is messy. Reality TV competition data mixed with music metadata mixed with dozens of edge cases. Three concrete problems required domain-specific solutions.
Challenge 1: The entity problem
- The absurdity
Rosé the drag queen vs. Rosé from BLACKPINK. Ariana Grande, the guest judge, vs. Ariana Grande, the artist, vs. Ariana Grande, the Snatch Game character. The same name can refer to entirely different entities. The same person can appear in multiple contexts.
- The solution
A unified entities table with two key fields: entityType and existentialCategory. The key innovation: composite unique constraints on (name, entityType) and (slug, entityType). Names and slugs must be unique within each entity type, but can duplicate across entity types. This allows Rosé the drag queen (entityType: "queen") and Rosé the K-pop artist (entityType: "artist") to coexist in the same table with identical names and slugs. The database enforces data integrity automatically while relationship tables like songCredits and queenSeasons simply reference entities.id without caring about entity type.
- The impact
One system handles real people, fictional characters, and mythological figures. Bigfoot and Meryl Streep coexist in the same data model.
Challenge 2: Music metadata chaos
- The absurdity
400+ songs with duplicate titles, complex credits (Artist A ft. Artist B & Artist C), and overlapping identities (queens who are also recording artists). "Physical" by Dua Lipa vs. "Physical" by Olivia Newton-John. "Hello" by Allie X vs. "Hello" by Martin Solveig & Dragonette. Multiple artists release different songs with identical names. The same person can be both a drag queen and a musical artist.
- The solution
A Songs table with title, releaseYear, and originalArtist fields for disambiguation. Song credits system with role types: "primary", "featured", and "collaboration". Slug-based routing handles duplicate names by combining all entities with that slug.
- The impact
Queryable by the artist with accurate attribution. Every Ariana Grande song used on Drag Race returns correct results despite complex collaborations and featured artists.
Challenge 3: Reality TV ≠ clean data
- The absurdity
Main competition vs. Queen of She Done Already Had Herses (side competition) vs. team challenges vs. guest appearances. Different competitions overlap within the same episodes. You can't just track placements without understanding which competition you're tracking.
- The solution
Separate but linked data structures: episodePlacements (individual challenge results, excludes team scores) and seasonPlacements (overall season rankings). Clear type definitions prevent accidentally mixing competition contexts.
- The impact
Accurate statistics that filter signal from noise. Side competitions don't corrupt primary season data.
The results
SEO performance
- 1,400+ pages of structured, queryable data
- Page 1 Google rankings for hundreds of long-tail queries
- Artist pages particularly successful: Ariana Grande Drag Race, Beyoncé lip sync songs
- Thousands of organic page views per month and growing rapidly
Technical achievements
- Zero errors
End-to-end type safety catches all issues at compile time
- Solo maintainable at scale
One person managing 1,400+ pages with weekly updates
- Automated data journalism
Statistical articles regenerate automatically with fresh data
- Idempotent deployments
Complete database wipes and re-seeds without data loss
Real-world impact
- Media outlets can now quote accurate statistics instead of random Reddit users.
- Fans have a single queryable source instead of scattered wiki pages.
- Complex queries answered instantly: Queens who lipsynced the most, Most used artist on the show, Snatch Game winners by season.
Key technical wins
- Type safety as a superpower
End-to-end type safety eliminates entire categories of bugs. TypeScript at the data layer provides compile-time validation for all content changes. tRPC makes API mismatches impossible.
- Database-as-code benefits
Seed files serve as the source of truth, enabling version control for data, not just code. Idempotent seeding means fearless schema changes and data corrections. Complete database resets become routine, not risky.
- Static generation for scale
All pages are statically generated at build time from database queries. Aggressive caching (staleTime: Infinity) since all data is read-only historical records. Search engines can crawl all 1,400+ pages.
- Automated data journalism
This is the moat. Statistical articles update themselves when new data is added. Rankings recalculate. Cross-references rebuild. Literally no one else in the world can maintain data journalism at this scale solo.
What's next
Adding more seasons and franchises as they air. Expanding data properties (prize money, guest judges, runway themes).
Most importantly: more automated data journalism pieces. This is where domain knowledge meets technical capability in a way that can't be replicated.
Technology stack
- Frontend
Next.js 15, React, TypeScript, Tailwind CSS, Shadcn/UI
- Backend
tRPC (end-to-end type safety), Drizzle ORM
- Database
PostgreSQL
- Architecture
Database-as-code with TypeScript seed files as the single source of truth
