Everything Is Showbiz: Lessons from a PHP + AI Side Project

Paul Conroy / @conroyp

Paul Conroy
• 🇮🇪 From Dublin, Ireland
• 👴 Started playing with the web 30+ years ago (Notepad, FrontPage & Geocities!)
• CTO at Square1
• 🌍 conroyp.com / @conroyp

• Interview comedians and celebrities
• Ask them what they did yesterday
• That’s it!

What do I want?
• Some kind of episode transcription
• Searchable
• Easily share specific parts
• That’s it!

A Simple Plan!
INGEST → TRANSCRIBE → SERVE

[Venn diagram: “Things I already know how to do” and “Challenges in this project”]
• Audio Transcription, Semantic Search, Dynamic front end, Model evaluation
• RSS Parsing, Queues, Cost-effective hosting
• Saying “we’ll tidy that up later”, Convincing myself it’s “MVP”, Making confusing Venn diagrams

I needed: → I get:
• Something to run an ingest twice a week → Scheduler
• Process files in the background → Queues
• Try different AI tools → Service classes & container
• A simple UI to browse results → Blade, Alpine
• Deploy & run at low cost → Cache, config, cheap VPS

A Simple Plan
INGEST → TRANSCRIBE → SERVE

Let’s get started!
• Design the entities / models
• Map out the Controllers, Jobs, Services
• Set up config and environments
• Set up test / validation rig
• Fix problems with test / validation rig
• Build queue infrastructure
• Fix problems with queues not running
• ….

Whisper
• OpenAI API
• Fast!
• Supports prompting
• Post-processing
https://developers.openai.com/api/docs/guides/speech-to-text/

    curl --request POST \
      --url https://api.openai.com/v1/audio/transcriptions \
      --header "Authorization: Bearer $OPENAI_API_KEY" \
      --header 'Content-Type: multipart/form-data' \
      --form file=@/path/to/file/audio.mp3 \
      --form model=gpt-4o-transcribe

    {
      "text": "Imagine the wildest idea that you've ever had, and you're curious about how it might scale to something that's a 100, a 1,000 times bigger....."
    }

• Runs locally!
• Plain C/C++ implementation
• No Python required
• Different models can be used
• CPU/GPU-hungry
https://github.com/ggml-org/whisper.cpp

• Large variations in model quality
• --max-context keeps internal consistency high
• --prompt can help with noun recognition
• Prompt too long, and the quality degrades: a balancing act!

• Sporadic issues with proper nouns
• Challenges with punctuation & run-on sentences
• Occasionally gets a little confused…
https://github.com/openai/whisper/discussions/194

Symphony of the butt: https://everythingisshowbiz.com/?episode=323&segment=390151

• Transcription breaks on pause
• Lots of short segments
• Not great for search!
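
One way to make pause-delimited segments more searchable is to merge neighbours until each block reaches a minimum length. The sketch below is an illustration in Python, not the project's own PHP code; the `(start_seconds, text)` tuple shape, the function name and the 200-character threshold are all assumptions.

```python
from typing import List, Tuple

Segment = Tuple[float, str]  # (start time in seconds, text); hypothetical shape


def merge_segments(segments: List[Segment], min_chars: int = 200) -> List[Segment]:
    """Merge consecutive short segments into blocks of at least min_chars."""
    merged: List[Segment] = []
    buffer_start = None
    buffer_text: List[str] = []

    for start, text in segments:
        if buffer_start is None:
            buffer_start = start  # a block keeps the timestamp of its first segment
        buffer_text.append(text.strip())
        if len(" ".join(buffer_text)) >= min_chars:
            merged.append((buffer_start, " ".join(buffer_text)))
            buffer_start, buffer_text = None, []

    if buffer_text:  # flush whatever is left at the end of the episode
        merged.append((buffer_start, " ".join(buffer_text)))
    return merged
```

Keeping the first segment's timestamp means a merged block can still deep-link to the right point in the audio.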

A Simple(-ish) Plan
INGEST → TRANSCRIBE → REFINE TEXT → SERVE

What’s in a name?
• Models trained on “mid-Atlantic” accent
• Show features strong regional accents
• Names pronounced differently in each episode
• Post-transcription processing needed
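
A minimal sketch of one possible post-transcription step: fuzzy-matching each word against a glossary of known names and swapping in the canonical spelling. This is written in Python for illustration and is not the project's actual approach; the function name, the glossary and the 0.8 cutoff are assumptions.

```python
import difflib
import re
from typing import Iterable


def correct_names(text: str, known_names: Iterable[str], cutoff: float = 0.8) -> str:
    """Replace words that closely match a known name with its canonical spelling."""
    canonical = {name.lower(): name for name in known_names}

    def fix(match):
        word = match.group(0)
        close = difflib.get_close_matches(
            word.lower(), list(canonical), n=1, cutoff=cutoff
        )
        return canonical[close[0]] if close else word

    # Only alphabetic words are examined; punctuation and spacing are left untouched.
    return re.sub(r"[A-Za-z']+", fix, text)
```

A conservative cutoff matters here: set it too low and ordinary words start being "corrected" into names.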

Actual Conversation Original Transcription

Responsibility
• Content is AI-generated, but responsibility is ours
• Need guardrails to prevent incorrect publication
• Errors already costing companies
https://www.theguardian.com/world/2024/feb/16/air-canada-chatbot-lawsuit

🎉🥳

A Plan
INGEST → TRANSCRIBE → REFINE TEXT → SEARCH → SERVE

Semantic Search
• Keyword search asks: “Does this text contain these words?”
• Semantic search asks: “Is this about the same thing?”
• It tolerates synonyms, paraphrasing, and noise
• Under the hood: high-dimensional math
• ✨✨ Embeddings! ✨✨

• Complex data reduced to simpler information: [159, 837, 528, 162, …]
• The more dimensions in the embedding, the more “meaning” it can capture
• Similar words or phrases are close to each other in this multi-dimensional space
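
"Close to each other" is commonly measured with cosine similarity. The sketch below uses made-up three-dimensional vectors purely for illustration (real embeddings have hundreds or thousands of dimensions), and it is written in Python rather than the project's PHP.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors: 1.0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat an all-zero vector as similar to nothing
    return dot / (norm_a * norm_b)


# Toy "embeddings" with invented values, not output from a real model.
comedy = [0.9, 0.1, 0.0]
stand_up = [0.8, 0.2, 0.1]
plumbing = [0.0, 0.1, 0.9]
```

With these toy values, `comedy` scores far closer to `stand_up` than to `plumbing`, which is the property semantic search relies on.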

Embeddings
• Postgres extension pgvector
• Generate embedding for each text block
• Store embedding alongside original text
• Generate embedding for each query
• Search by similarity
https://github.com/pgvector/pgvector
My interesting query → [938, 324, 5, 234.. ]
[11, 903, 675, 837.. ]
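
The "search by similarity" step might look roughly like the query built below. This is a hedged Python sketch, not the project's code: the `segments` table and its `body` and `embedding` columns are hypothetical names, and the `%s` placeholders assume a psycopg-style driver. The `<=>` operator is pgvector's cosine distance, where a smaller value means a closer match.

```python
from typing import List, Sequence, Tuple


def to_vector_literal(embedding: Sequence[float]) -> str:
    """Format numbers in pgvector's text form, e.g. '[0.1,0.2,0.3]'."""
    return "[" + ",".join(str(float(x)) for x in embedding) + "]"


def similarity_query(
    embedding: Sequence[float], limit: int = 5
) -> Tuple[str, List[object]]:
    """Return SQL and parameters for a nearest-neighbour search.

    Table and column names are assumptions made for this example.
    """
    sql = (
        "SELECT id, body, embedding <=> %s::vector AS distance "
        "FROM segments ORDER BY distance LIMIT %s"
    )
    return sql, [to_vector_literal(embedding), limit]
```

The query embedding is passed as a parameter rather than interpolated into the SQL string, so the same statement can be reused for every search.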

Embeddings
• Use local model (Ollama) or API (OpenAI)
• Really basic model is ok! (text-embedding-3-small)
• Makes remote APIs very cheap

How cheap exactly?

Observers
• Hook into model lifecycle events
• Automate side-effects cleanly
• Remove glue code from controllers
• Keep business logic cohesive

Real-world users!
• Tend to use really short queries
• Hard to parse intent!
• How to handle both?

• Boost score for exact match
• Sub query, avoid recalculation
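
The two annotations above can be sketched outside SQL as follows. This Python example is an assumption-laden illustration, not the query from the slide: the similarity for each candidate is taken as already computed once (the job the sub-query does), and the function name and the size of the boost are invented.

```python
from typing import List, Tuple


def rank_results(
    query: str,
    candidates: List[Tuple[str, float]],
    exact_match_boost: float = 0.2,
) -> List[Tuple[str, float]]:
    """Rank (text, similarity) pairs, boosting texts that contain the query verbatim.

    Each similarity is computed once upstream and reused here, never recalculated.
    """
    needle = query.lower()
    scored = []
    for text, similarity in candidates:
        boost = exact_match_boost if needle in text.lower() else 0.0
        scored.append((text, similarity + boost))
    return sorted(scored, key=lambda pair: pair[1], reverse=True)
```

A boost like this helps very short queries, where a literal hit is often what the user wanted, while longer queries still rank mainly on semantic similarity.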

• Search for text segments
• Click segment to autoplay
• Share deeplink
• URLs always update!

🤨 🤖 🚀 📦 Issue Spotted Magic Fix Change Integrated
But then…

What did I expect?
• Discrete function classes / jobs
• Logical separation
• Testable
• That’s it!

• Side by side of different code
• Highlight “same but different” approach
• AI solving from 10ft, I’m approaching this from 10,000ft

What’s happening?
• Lots of near-duplicate code
• Huge amount of dead code
• AI is focusing on low-level context alone
  ◦ Working at 10ft while I’m at 10,000ft
• Need to match the code to my mental model

Cognitive debt https://margaretstorey.com/blog/2026/02/09/cognitive-debt/

Use AI to tidy up AI
• Much tighter feedback loops
• Personas
  ◦ “You are a senior PHP dev…”
• Less vibes, more control
• Closer to pair programming

A Plan
INGEST → TRANSCRIBE → REFINE TEXT → SEARCH → SERVE

Workload

Generation 💸💸💸
• Heavy GPU load
• But only 2-3x a week
• Admin tooling

Public site 💸
• Effectively read-only
• Updated 2-3x a week
• Strong caching candidate

[Diagram: 🧑‍💻, RSS Feed, Mac Mini, (Cheap!) VPS]

A Plan
INGEST → TRANSCRIBE → REFINE TEXT → SEARCH → SERVE → TIDY UP

Late 2025
• Step-change in AI tooling
• Moving from the IDE to the CLI
• From “check this code I wrote” to “I’ll check the code you wrote”

AI-Assisted Engineering
• Vibe coding optimises for speed
• Engineering optimises for sustainability
https://addyosmani.com/blog/ai-assisted-engineering-idea/

Parallel & Structured
• Sub-agents with defined roles
• Reusable skills (Laravel clean code, frontend design…)
• Worktrees for isolation
• Inter-agent communication
🤖 The Boss · 🤖 Design · 🤖 Security · 🤖 Architect

From prompting to planning
• Interview with the model first
• Capture decisions in plan.md
• Make constraints explicit
• Break work into checkable steps

Ralphing
• Models cut corners on long tasks
• Hook into the “finished” state
• “Did you really finish? Hmmmm???”
• Keep going relentlessly until it’s done
https://github.com/anthropics/claude-code/blob/main/plugins/ralph-wiggum/README.md
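
The idea can be sketched as a loop that refuses to accept "done" until an independent check passes. This Python sketch is not the linked plugin's implementation: `run_agent` and `is_done` are hypothetical callables supplied by the caller, and the `max_rounds` cap is an added safety assumption.

```python
from typing import Callable


def run_until_done(
    run_agent: Callable[[str], str],
    is_done: Callable[[str], bool],
    task: str,
    max_rounds: int = 10,
) -> int:
    """Keep re-prompting until is_done passes; return the number of rounds used."""
    prompt = task
    for round_number in range(1, max_rounds + 1):
        output = run_agent(prompt)
        if is_done(output):
            return round_number
        # Not finished: send the task back with a nudge instead of accepting the claim.
        prompt = f"{task}\n\nDid you really finish? Previous output:\n{output}"
    raise RuntimeError(f"Task not finished after {max_rounds} rounds")
```

In practice `is_done` would be something verifiable, such as a passing test suite, rather than the model's own say-so.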

Cleanup Complete!
• It works (mostly..)
• Documented
• Tested
• Public
https://github.com/conroyp/podcast-transcription

Side Projects!
• Abstract away the boring boilerplate
• Dopamine hit of fast feedback
• Keep the momentum going
• Quick to prototype & validate
• From noodling on phone to live app

https://www.kidssudoku.com
https://www.tictacgoaway.com
https://www.jdcaptcha.com

Where does that leave us developers?
• AI is already faster than most mid-level devs
• Productivity up 5-6× → headcount down?
• “Good enough” beats “well-engineered.”
• Why hire juniors when AI does the boilerplate?

“[I]f you work as a radiologist you are like Wile E. Coyote in the cartoon. You’re already over the edge of the cliff, but you haven’t yet looked down. It’s just completely obvious that in five years deep learning is going to do better than radiologists.”
Geoffrey Hinton, “Godfather of AI”

What Happened Next?
• Demand went up
• Salaries increased!
• The nature of the work changed
https://worksinprogress.co/issue/the-algorithm-will-see-you-now/

Who gets to build?
• Doctors can ship apps
• Lawyers can prototype tools
• Contractors can automate their workflows
• Founders can validate ideas without devs (?)

We’ve been here before
• Assembly → C
• C → PHP
• PHP → Low-/no-code tools
• Manual infra → DevOps
• On-prem → Cloud
“Every time we’ve made it easier to write software, we’ve ended up writing exponentially more of it.”
https://addyosmani.com/blog/the-efficiency-paradox/

What changes for us?
• Execution is getting cheaper
• Prototypes are cheaper
• Boilerplate is getting cheaper
• Judgment is not
The real question is whether we’re prepared for a world where the bottleneck shifts from “can we build this?” to “should we build this?”
https://addyosmani.com/blog/the-efficiency-paradox/

Thank you!
🌍 Conroyp.com
🌐 @conroyp
📧 [email protected]
https://joind.in/event/php-uk-conference-2026/everything-is-showbiz-lessons-from-a-php--ai-side-projects
