Retiring Gen-1 TopicDiscovery and Hunting Invisible Stalls

What we shipped on 2026-07-02

We finally killed the Gen-1 TopicDiscovery orchestrator (PR #2061). It felt good to delete nearly 900 lines of legacy code and over 1,500 lines of tests, collapsing our logic into a single topic path: taps → topic_pool → TopicBatchService. We’d been running parallel ingest paths since mid-June, but we’ve now fully cut over to the pool reader (PR #2060).

The transition required some nuance in how we pull from that pool. Because our sources accumulate at wildly different rates (internal RAG is pushing thousands of rows while Devto pushes dozens), a flat limit would have drowned out the smaller sources. We implemented a balanced read per source using row_number() OVER (PARTITION BY source...) and gated it with a new niche_pool_read_per_source_limit setting (PR #2060).
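For readers who have not used window functions for this, here is a minimal sketch of a balanced per-source read. It is not the query from PR #2060: the table layout (id, source, title, created_at), the newest-first ordering inside each partition, and the function names are all assumptions made for illustration, and SQLite stands in for whatever database the pool actually lives in.

```python
import sqlite3

def balanced_read(conn, per_source_limit):
    """Return up to per_source_limit (source, title) rows for each source."""
    query = """
        SELECT source, title FROM (
            SELECT source, title,
                   row_number() OVER (
                       PARTITION BY source
                       ORDER BY created_at DESC, id DESC  -- assumed ordering
                   ) AS rn
            FROM topic_pool                               -- hypothetical schema
        )
        WHERE rn <= ?
        ORDER BY source, rn
    """
    return conn.execute(query, (per_source_limit,)).fetchall()

def build_demo_pool():
    """Create an in-memory pool with one loud source and one quiet one."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE topic_pool ("
        "id INTEGER PRIMARY KEY, source TEXT, title TEXT, created_at INTEGER)"
    )
    rows = [("internal_rag", f"rag topic {i}", i) for i in range(1000)]
    rows += [("devto", f"devto topic {i}", i) for i in range(12)]
    conn.executemany(
        "INSERT INTO topic_pool (source, title, created_at) VALUES (?, ?, ?)",
        rows,
    )
    return conn
```

With a plain LIMIT 10 ordered by recency, every row in this demo would come from internal_rag. With the per-source cap, each source contributes at most its own quota, so the quiet source still gets read.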

While the migration was landing, we were also fighting an invisible stall-crash-requeue loop in our GPU locks (PR #2054). We caught a task that entered caption_images, went silent for 22 minutes, and was eventually force-crashed by the brain probe because there was no graph-node progress. The root cause was an unbounded gpu.lock() acquisition; we’ve now bounded the acquire/release to stop these deadlocks from poisoning the Prefect queue.
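The general shape of a bounded acquire looks something like the sketch below. It uses a standard threading.Lock as a stand-in, since the real gpu.lock() interface is not shown here; bounded_lock, GpuLockTimeout, and the timeout value are hypothetical names for illustration, and the sketch only bounds the acquire side.

```python
import threading
from contextlib import contextmanager

class GpuLockTimeout(RuntimeError):
    """Raised when the lock cannot be acquired within the allowed time."""

@contextmanager
def bounded_lock(lock, timeout_s):
    """Hold `lock` for the duration of the block, waiting at most timeout_s."""
    # Lock.acquire returns False if the timeout expires before the lock is free.
    if not lock.acquire(timeout=timeout_s):
        raise GpuLockTimeout(f"could not acquire GPU lock within {timeout_s}s")
    try:
        yield
    finally:
        # Always release, even if the body raises.
        lock.release()
```

The point of the pattern is that a stuck holder now produces a loud, attributable exception after a known delay instead of a silent wait that only an external watchdog can end.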

We also found a frustratingly silent failure in our prompt management (PR #2056). Because langfuse_secret_key is marked as a secret, it was being excluded from the sync SiteConfig cache. This meant Prefect subprocesses never saw the key and were silently falling back to YAML, ignoring every hardened prompt edit we’d made in the Langfuse UI. We fixed this by wiring the same preload into the subprocess bootstrap that our worker lifespan uses.
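To make the failure mode concrete, here is a toy model of a config cache that skips secrets unless they are explicitly preloaded. This is not the real SiteConfig class: ToyConfigCache, preload_secrets, and prompt_source are hypothetical names, and the only detail taken from the fix is that a secret key missing from the sync cache leads to a quiet YAML fallback.

```python
class ToyConfigCache:
    """Hypothetical stand-in for a sync config cache that excludes secrets."""

    def __init__(self, store, secret_keys):
        self._store = dict(store)
        self._secret_keys = set(secret_keys)
        # Secrets are left out of the sync cache by default.
        self._cache = {
            k: v for k, v in self._store.items() if k not in self._secret_keys
        }

    def preload_secrets(self, keys):
        """Copy the named secrets into the cache (the bootstrap step)."""
        for key in keys:
            if key in self._store:
                self._cache[key] = self._store[key]

    def get(self, key, default=None):
        return self._cache.get(key, default)

def prompt_source(cache):
    """Pick where prompts come from; falls back to YAML without a key."""
    return "langfuse" if cache.get("langfuse_secret_key") else "yaml"
```

A process that builds the cache but never runs the preload step gets "yaml" from prompt_source with no error, which matches the symptom: nothing fails, the edits just never show up.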

The LLMs are still trying to trick us. The internal-RAG distiller started emitting “No topic found” as an actual topic string, which sailed right through our sanity gate and burned full pipeline runs (PR #2059). We’ve added deterministic rules to reject these failure sentinels and truncated titles, such as phrases ending on a bare copula (a trailing “is” or “are”), before they hit the queue. Similarly, we had to expand our planning-dump vocabulary to handle a new “assignment-spec” dialect Gemma keeps inventing when it restates its brief (PR #2058).
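A deterministic gate of this kind can be quite small. The sketch below is an illustration, not the rule set from PR #2059: only “No topic found” comes from the incident above, while the other sentinel strings, the copula list, and the reject_reason helper are assumptions.

```python
import re

# Only "no topic found" is from the incident; the rest are assumed examples.
FAILURE_SENTINELS = {"no topic found", "no topics found", "none", "n/a"}
BARE_COPULAS = {"is", "are", "was", "were", "be", "been", "being"}

def reject_reason(title):
    """Return a short rejection reason, or None if the title passes."""
    # Drop punctuation and quotes, collapse whitespace, and lowercase.
    stripped = re.sub(r"[^\w\s/]", "", title)
    normalized = " ".join(stripped.split()).lower()
    if not normalized:
        return "empty"
    if normalized in FAILURE_SENTINELS:
        return "failure_sentinel"
    # A title whose last word is a copula was almost certainly cut off.
    if normalized.split()[-1] in BARE_COPULAS:
        return "truncated_title"
    return None
```

Because the rules are plain string checks, they run before any model call and cost nothing compared with a full pipeline run on a junk topic.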

We rounded out the day with some aggressive housekeeping: culling 24 vanity counters from Grafana that were just adding noise (PR #2057), cleaning up stale state rows for retired embedding jobs to stop them from looking like “dead” daily tasks in our metrics (PR #2055), and shipping a data-freshness dead-man’s switch for our feeds (PR #2043).
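For anyone unfamiliar with the dead-man's-switch idea, the core check is just "has each feed produced anything recently enough?". The sketch below shows that check in isolation; stale_feeds, the feed names, and the threshold are hypothetical, and it says nothing about how PR #2043 wires the result into alerting.

```python
from datetime import datetime, timedelta, timezone

def stale_feeds(last_seen, max_age, now=None):
    """Return the sorted names of feeds with no data newer than max_age.

    last_seen maps feed name -> timezone-aware datetime of the newest row,
    or None if the feed has never produced anything.
    """
    now = now or datetime.now(timezone.utc)
    return sorted(
        name
        for name, ts in last_seen.items()
        if ts is None or now - ts > max_age
    )
```

The switch fires on silence rather than on errors, which is what makes it useful for feeds that can stop updating without ever raising an exception.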

The system feels leaner today. With the Gen-1 orchestrator gone, we have a clean, unified pipeline from discovery to publication. Now we just need to see if N=3 is the right clean-run window for our QA thresholds as the data accumulates.

Auto-compiled by Poindexter from today’s commits and PRs. See the work: github.com/Glad-Labs/poindexter.