Dear founder,
Before we dive into this week's topic of bootstrapper constraints, a quick just-in-time mention:
Next Tuesday, on the 8th of October, I'll be joining a live workshop with Castos founder and CEO Craig Hewitt, where we'll talk all about effectively Building in Public.
If you want to join a masterclass on building an audience around yourself and your work, this free event is for you.
In recent podcast episodes, I’ve been discussing constraints and how I deal with limited resources as a bootstrapper.
Today, I want to dive deeper into this topic, sharing my experiences with Podscan.fm and how I navigate the challenges of limited adjustable income and fixed expenses.
🎧 Listen to this on my podcast.
The Balancing Act
Running Podscan involves a delicate balance between various processes:
- Transcription of podcasts
- Data extraction
- Question answering and alert verification
And unfortunately, all of these need a GPU to work efficiently. Graphics cards (and their respective compute cycles) are a rare commodity. And even with new providers flooding the market — just this week, DigitalOcean started offering GPU droplets — this stuff is expensive.
So my core business processes compete for resources, and maintaining a steady flow of podcasts while extracting valuable data is crucial for customers to see Podscan's value. When extracted data matches an alert — the main selling point of my business — it needs verification.
That’s quite a bit of GPU compute.
It creates an interesting balance, especially when it comes to managing queue sizes and server capacity — because that’s really what I’m limited by.
The Search Server Dilemma
As a main part of my UI and the underlying API, I'm running Meilisearch on a fairly sizable machine, but the structure of the data I push there requires massive indexing effort. This can lead to the queue on that server falling behind, resulting in outdated search data.
And that's a dilemma: I need to balance volume (customers being able to find everything) with fidelity (data being as accurate as it can be) and recency (finding the most recent episodes first).
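The first step is simply knowing how far behind that queue is. Here's a minimal sketch of what such a backlog check could look like against Meilisearch's documented tasks endpoint, assuming a recent Meilisearch version that reports a total; the host, key, and threshold below are placeholders, not my actual setup.

```python
import requests

MEILI_URL = "http://127.0.0.1:7700"  # placeholder host
MEILI_KEY = "masterKey"              # placeholder API key

def enqueued_task_count() -> int:
    """Ask Meilisearch how many indexing tasks are still waiting or running."""
    resp = requests.get(
        f"{MEILI_URL}/tasks",
        params={"statuses": "enqueued,processing", "limit": 1},
        headers={"Authorization": f"Bearer {MEILI_KEY}"},
        timeout=10,
    )
    resp.raise_for_status()
    return resp.json()["total"]  # total matching tasks, not just this page

if __name__ == "__main__":
    backlog = enqueued_task_count()
    # Purely illustrative threshold: throttle full-transcript indexing
    # when the queue gets too deep, and let metadata-only updates through.
    if backlog > 100_000:
        print(f"{backlog} tasks waiting: index only titles and metadata for now")
    else:
        print(f"{backlog} tasks waiting: safe to index full transcripts")
```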
So prioritizing becomes important.
It wouldn’t make sense for Podscan to prioritize a four-year-old episode over one released two hours ago. To my customers, recent episodes offer far more potential: engagement, collaborations, or placement opportunities for podcasting agencies. As I often tell my users, if you respond to a recent episode, talk to the host, or post about it on social media, you could get a partnership going or find a spot to place an agency client. That’s why timeliness is more important than completeness, but both are relevant, as I want search to work historically too. I can’t just ignore one in favor of the other.
Setting Priorities
In my constrained world of 30-some servers, 30 graphics cards, and one search server with limited compute capacity, I’m constantly deciding where data should flow and who gets priority. It’s not just a technical thing; it’s a mental exercise in prioritization.
I’ve found that it helps to conceptualize software system priorities as a reflection of customer priorities. I could theoretically build a perfect system with ideal balance, but the ultimate decider is how my customers react and what they truly need.
So I talked to them. And I listen to them when they have something to say.
Their feedback, both explicit and implicit, has been instrumental in figuring out priorities.
Here’s how I came up with my priorities and what triggered them:
Customer-Driven Priorities
- Data Extraction: Customers were most vocal about extracted data not being present when expected. They didn’t care as much about transcription speed, as long as it happened within the same day. This was a surprise to me initially, as I had assumed transcription speed would be the primary concern. But they’d rather wait for data to be complete and enriched.
- AI Verification: The speed of my AI verification tool (ensuring keywords are mentioned in the right context) wasn’t a major concern, as long as it happened roughly on the same day. This tool is crucial for maintaining the accuracy of our alerts, but customers were more forgiving of slight delays here. Again: accuracy and completeness over speed.
- Enriched Data: I had several calls and chat messages where people told me they value maximally useful, enriched data as input for their next process, their Job to be Done — whether it’s integrating into a CRM, outreach tool, or database. I’ve come to think of business processes as machines with input and output chutes. For my customers, the more enriched data they get, the better their following process works. My output is their input. They need the highest-fidelity data possible.
- API Reliability: For API users, having reliable and consistent data is crucial. They prefer having all possibly extractable data, even if it means a slight delay in other processes. APIs don’t do well with change, so having data there reliably and predictably is very important to people. I used to expose new episodes even before they were transcribed and extracted, but customers have asked for flags to hide incomplete data from their tooling. Did not expect that.
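To make that last point concrete, here's a tiny sketch of what such a completeness flag could look like on the serving side. This isn't Podscan's actual API; the fields and the only_complete parameter are invented for illustration.

```python
from dataclasses import dataclass

@dataclass
class Episode:
    # Hypothetical fields for illustration; not Podscan's real schema.
    title: str
    transcribed: bool
    extracted: bool

def list_episodes(episodes: list[Episode], only_complete: bool = False) -> list[Episode]:
    """Return episodes, optionally hiding those still missing transcripts or extracted data."""
    if not only_complete:
        return episodes
    return [e for e in episodes if e.transcribed and e.extracted]

# A consumer whose tooling can't handle half-populated records would opt in:
episodes = [
    Episode("Fresh episode, still processing", transcribed=False, extracted=False),
    Episode("Fully processed episode", transcribed=True, extracted=True),
]
print([e.title for e in list_episodes(episodes, only_complete=True)])
```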
Prioritizing Data Extraction
Based on customer feedback, I’ve learned that my priority should be extracting as much data as possible from the podcasts people actually want to learn about, rather than tracking all podcasts as quickly as possible.
This realization led me to focus on improving our data extraction capabilities. I’ve been running a lot of experiments, trying to find the right balance between speed and quality of extraction.
This has been taking up quite some time during the last week or so.
Experimenting with New Models
I’m currently experimenting with Meta’s latest local large language model, Llama 3.2. While the bigger variants have vision capabilities (which aren’t particularly useful for podcasts), I’m more interested in its text models, particularly the 1 billion and 3 billion parameter models that can run on edge devices or servers with limited resources.
These smaller models are exciting because they can run within about 10 gigabytes of graphics RAM, which is something that even normal graphics cards can handle. This means they’ll be very performant in their normal state, and even more so when quantized and pre-loaded on my backend servers.
If I can get the Llama 3.2 3 billion instruct model running where I currently have the Llama 3.0 8 billion instruct model, I could potentially speed up extraction significantly without losing much quality in terms of summary accuracy or data extraction precision. I might even try the 11 billion model if I can make it work within my constraints.
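For anyone curious what that swap looks like in practice, here's a minimal sketch of transcript data extraction with the 3 billion instruct model via Hugging Face transformers. It assumes access to the gated meta-llama repository and a GPU with enough VRAM; my actual extraction pipeline and prompts look different.

```python
# Minimal sketch: pulling structured data out of a transcript chunk
# with Llama 3.2 3B Instruct. Not my actual extraction pipeline.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "meta-llama/Llama-3.2-3B-Instruct"

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForCausalLM.from_pretrained(
    MODEL_ID,
    torch_dtype=torch.bfloat16,  # fits comfortably in ~10 GB of VRAM
    device_map="auto",
)

def extract_guests(transcript_chunk: str) -> str:
    """Ask the model to list guest names mentioned in a transcript chunk."""
    messages = [
        {"role": "system", "content": "You extract structured data from podcast transcripts."},
        {"role": "user", "content": f"List the guests mentioned in this transcript:\n\n{transcript_chunk}"},
    ]
    inputs = tokenizer.apply_chat_template(
        messages, add_generation_prompt=True, return_tensors="pt"
    ).to(model.device)
    output = model.generate(inputs, max_new_tokens=200)
    # Strip the prompt tokens and return only the newly generated text.
    return tokenizer.decode(output[0][inputs.shape[-1]:], skip_special_tokens=True)

print(extract_guests("...welcome back to the show, today I'm joined by Jane Doe..."))
```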
The idea here is to find the perfect balance between accuracy of extraction and the cost of the GPUs used for it. Fortunately, every few weeks, new models or new implementations of inference strategies appear that make this a little bit faster and cheaper. So the true challenge is to keep up with the tech.
But I have another challenge I want to share, because it highlights just how hard it is to deal with things that don’t scale well.
The Diarization Challenge
One of the most computationally intensive processes in podcast analysis is diarization - the process of detecting and labeling different speakers in a podcast. This became a significant bottleneck in my system, and it’s a great example of how constraints force you to think creatively.
Before I started diarizing episodes, I could easily process 120,000 episodes a day with medium-to-high quality transcription. If I went for extremely high quality, it would be around 80,000, still well above the roughly 30,000 new episodes released daily. I could easily cover today’s episodes and go two days into the past.
However, after introducing diarization, this number dropped dramatically to 50,000 or less per day, sometimes as low as 35,000. Unlike transcription, where you can choose different models to balance speed and accuracy, diarization is a linear affair. It takes the same time to diarize a high-quality expert panel with 10 people as it does a simple monologue.
To address this, I built an internal prioritization system that considers factors like podcast popularity, user interactions, and search frequency. This system determines which podcasts get diarized and which receive regular transcription.
I’ve adopted an 80/20 approach: provide standard treatment by default, and flag podcasts for special treatment based on user interaction and importance. If a podcast gets a certain amount of interaction, it automatically gets flagged by the system to be one of the important ones. If people don’t interact with a podcast, then maybe it’s not important enough to be prioritized for diarization.
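Under the hood, that flag is little more than a scoring rule. Here's a simplified sketch; the signals, weights, and threshold are illustrative stand-ins, not the exact ones Podscan uses.

```python
from dataclasses import dataclass

@dataclass
class PodcastStats:
    # Illustrative signals only; the real system tracks more than this.
    follower_count: int      # users tracking this podcast
    alert_matches_30d: int   # how often it triggered customer alerts recently
    search_hits_30d: int     # how often it showed up in searches

def deserves_diarization(stats: PodcastStats) -> bool:
    """Flag a podcast for the expensive diarization path (the '20' in 80/20)."""
    score = (
        stats.follower_count * 5
        + stats.alert_matches_30d * 3
        + stats.search_hits_30d
    )
    return score >= 50  # everything below gets regular transcription only

print(deserves_diarization(PodcastStats(follower_count=0, alert_matches_30d=1, search_hits_30d=4)))   # False
print(deserves_diarization(PodcastStats(follower_count=8, alert_matches_30d=2, search_hits_30d=10)))  # True
```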
And as you can imagine, all of these transcripts add up to a lot of data. 30,000+ episodes a day occupy quite a lot of space in my database.
Which brings me to another constraint: search data indexing.
Search Functionality and API Performance
Podscan serves two main functions:
- Google Alerts for podcast mentions (very much real-time)
- Google-like search for podcasts (a lot of historical data)
The search function is accessed through our interface, which some people use, and our API, which more people use. When I encountered issues with Meilisearch being millions of items behind in indexing, API users reached out about the lack of fresh information. They were very clear: “I use the API to get fresh information. If information isn’t fresh, then this is a problem for me.”
This feedback led me to develop another prioritization system for search database updates. Initially, I had the approach that every episode with a transcript should be in the search database. When that overflowed the database, I switched to only including important episodes. But that meant even the title and description of less popular podcasts weren’t searchable, which wasn’t ideal.
Now, I’ve found a balance. Every episode’s title, description, and metadata are written to the database, but full transcripts are only included for episodes from podcasts with a certain level of popularity (e.g., iTunes rating over 10). This dynamic priority system adjusts based on the current queue size, allowing more room for popular podcasts likely to contain information relevant to Podscan’s professional users.
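Stripped down to a sketch, that decision looks something like this. The field names and thresholds are placeholders; the real system weighs a few more signals.

```python
def build_search_document(episode: dict, backlog: int) -> dict:
    """Decide how much of an episode goes into the search index.

    Every episode keeps its title, description, and metadata searchable;
    full transcripts are reserved for sufficiently popular podcasts, and the
    popularity bar rises as the indexing backlog grows. Illustrative only.
    """
    doc = {
        "id": episode["id"],
        "title": episode["title"],
        "description": episode["description"],
        "podcast_name": episode["podcast_name"],
        "published_at": episode["published_at"],
    }

    # The busier the queue, the pickier we get about whose transcripts we index.
    min_popularity = 10 if backlog < 500_000 else 50

    # "popularity" stands in for whatever rating/review signal is available.
    if episode.get("popularity", 0) >= min_popularity and episode.get("transcript"):
        doc["transcript"] = episode["transcript"]

    return doc
```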
It’s not a perfect solution - I kind of hate the idea that I can’t include everything - but it’s a necessary compromise given my current constraints.
The Bootstrapper’s Dilemma
As a bootstrapper, I often face the unfortunate reality of not being able to do everything simultaneously. Prioritization has to happen because without it, I won’t attract customers, and without customers, the business won’t be profitable.
I’m very much enjoying the limitations I have around my backend servers, because they allow me to really min-max the efficiency of these models and see which of the many open-source models can do the most work and create the most value for customers.
Every component of Podscan that interacts with the world of podcasts and their data is a balance of priorities derived from customer conversations and observed behavior. I continually ask questions to understand what customers need and automatically prioritize these aspects in future interactions, focusing on data quality, fidelity, timeliness, and overall quality of service.
For instance, I get a lot of podcasts every day from city council meetings or religious readings - content with a small but well-defined audience. Unless that audience is a customer of Podscan, I transcribe these quickly to catch most of the content and summarize them, but I don’t need 100% accuracy. That’s different for podcasts that people are actively following - every episode of those needs to be high quality.
That is a bootstrapper’s mindset. I can always adjust things once I find customers who care about this data.
In the end, navigating limitations as a bootstrapper is about understanding customer needs — and having those customers in the first place. I need to set clear priorities and continuously optimize my systems to deliver the most value within my constraints. It’s a challenging but rewarding process that forces you to think creatively and stay closely attuned to your users’ needs.
This approach has not only helped me make the most of my limited resources but has also given me valuable insights into what my customers truly value — and what I thought they did. It’s a constant process of adjustment and refinement, but it’s what allows a bootstrapped business like Podscan to compete and thrive in a resource-intensive field.
If you want to track your brand mentions on podcasts, please check out podscan.fm — and tell your friends!
Thank you for reading this week’s essay edition of The Bootstrapped Founder. Did you enjoy it? If so, please spread the word and share this issue on Twitter.
If you want to reach tens of thousands of creators, makers, and dreamers, you can apply to sponsor an episode of this newsletter. Or just reply to this email!