Lessons Learned

146 hard-won lessons from building and operating 10+ websites, 4 autonomous AI agents, automated pipelines, and a venture fund. Every lesson came from a real production incident.

Security

Never use git add -A in automation. Staging everything leaked credentials. Always add specific files by name.
Secrets persist in git history after deletion. Removing the file in a later commit is not enough. Rotate the key and scrub the history, and treat the old secret as compromised either way.
Hardcoded tokens in bot code are a ticking bomb. Discord tokens, Supabase keys, API secrets — always use environment variables, never inline.
RLS policies must be tested from every role. A policy that works for admin may silently fail for anon. Test each role explicitly.
Validate env vars at startup, not at first use. Failing 3 hours into a pipeline because a key is missing wastes entire cycles.
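
A minimal sketch of the startup check from the last lesson above. The helper name require_env and the variable names in the usage note are hypothetical. The idea is to collect every missing variable and fail once, before any work starts.

```python
import os


def require_env(names, environ=None):
    """Return the requested variables, or raise listing every missing one."""
    # Allow a plain dict to be passed in so the check is easy to test.
    env = os.environ if environ is None else environ
    # Treat unset and empty-string values the same way: both are unusable.
    missing = [name for name in names if not env.get(name)]
    if missing:
        raise RuntimeError("Missing required env vars: " + ", ".join(missing))
    return {name: env[name] for name in names}
```

Called as the first line of a pipeline entry point, for example require_env(["DISCORD_TOKEN", "SUPABASE_KEY"]), this turns a failure 3 hours in into a failure in the first second.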

AI & LLM Operations

LLM fallback must catch ALL exceptions. Catching only rate-limit errors left timeouts, auth errors, and network failures unhandled. Catch the general exception class (Exception in Python) at the fallback boundary.
Reorder provider chains based on time of day. Morning pipelines should use Cerebras first to preserve Gemini quota for interactive afternoon use.
Agents will speak in third person unless told otherwise. "HH held the outposts" vs "I held the outposts" — persona prompts must explicitly specify POV.
LLMs add unrequested features. When matching a reference design, explicitly instruct: "Match exactly, do not improve."
Persona tuning takes multiple iterations. First attempts are always too clinical. Expect 3-5 rounds of voice refinement.
AI will fabricate references. Always verify any citation, URL, or factual claim an LLM generates. Never publish unverified AI output.
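
A sketch of the fallback lesson at the top of this section, assuming each provider is wrapped in a plain callable that takes a prompt and returns text. The function name and the (name, callable) pair shape are illustrative, not any provider's real API.

```python
def call_with_fallback(providers, prompt):
    """Try each (name, call) pair in order; return (name, reply) from the first success."""
    errors = []
    for name, call in providers:
        try:
            return name, call(prompt)
        except Exception as exc:
            # Deliberately broad: timeouts, auth errors, network failures and
            # rate limits all move on to the next provider.
            errors.append(f"{name}: {type(exc).__name__}: {exc}")
    # Every provider failed; surface all the reasons in one message.
    raise RuntimeError("All providers failed: " + "; ".join(errors))
```

Because the chain is just an ordered list, reordering by time of day comes down to building the list differently before the call.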

Deployment

The pipeline commits but does NOT push. The push is a human gate that is easy to forget. Either automate the push or make the gate explicit and visible.
Preview visual changes locally before pushing. A change that looked right in code broke the layout in production. Always render before deploy.
A single stray character in HTML <head> breaks everything. One "1" in the head tag caused production layout issues across all pages.
Validate HTML before deploying. Pre-push hooks that check for syntax errors catch more bugs than code review.
Wrangler project names must match Cloudflare dashboard exactly. Case-sensitive, no aliases. Wrong name = silent deploy failure.
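
One way to catch the stray-character bug above before deploying, sketched with Python's standard html.parser. The class and function names are hypothetical, and the set of head elements allowed to hold text is an assumption. It is a narrow check for loose text in the head, not a full HTML validator.

```python
from html.parser import HTMLParser

# Elements inside <head> that are allowed to contain text (assumed list).
TEXT_OK = {"title", "script", "style", "noscript"}


class HeadTextChecker(HTMLParser):
    """Collect non-whitespace text that sits directly inside <head>."""

    def __init__(self):
        super().__init__()
        self.in_head = False
        self.text_ok_depth = 0
        self.stray = []

    def handle_starttag(self, tag, attrs):
        if tag == "head":
            self.in_head = True
        elif self.in_head and tag in TEXT_OK:
            self.text_ok_depth += 1

    def handle_endtag(self, tag):
        if tag == "head":
            self.in_head = False
        elif self.in_head and tag in TEXT_OK and self.text_ok_depth > 0:
            self.text_ok_depth -= 1

    def handle_data(self, data):
        # Text in <head> outside title/script/style/noscript is suspicious.
        if self.in_head and self.text_ok_depth == 0 and data.strip():
            self.stray.append(data.strip())


def find_stray_head_text(html):
    """Return a list of stray text fragments found directly inside <head>."""
    checker = HeadTextChecker()
    checker.feed(html)
    checker.close()
    return checker.stray
```

A pre-deploy script could run find_stray_head_text over every generated page and refuse to continue when any page returns a non-empty list.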

Code Quality

One missing import causes 25+ cascading failures. A missing import time triggered NameError in every retry path. Always test error handling paths, not just happy paths.
Multiple bugs hide behind each other. One fix required three changes at once: async→sync, a corrected token, and a format switch. Look for ALL bugs, not just the first one.
Test Python code, not just syntax, after editing large dicts. A persona dict entry defined outside the parent dict compiled fine but failed at runtime.
"Replace" ≠ "Add". When asked to add content, existing content was replaced instead. Preserve what exists unless explicitly told to replace.

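The missing-import lesson in the Code Quality list can be guarded against by a test that forces the failure path to run. A minimal sketch, with a hypothetical retry helper and a simulated flaky call. If import time were missing, this test would raise NameError instead of passing.

```python
import time


def retry(fn, attempts=3, delay=1.0):
    """Call fn until it succeeds or attempts run out, sleeping between tries."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error = None
    for attempt in range(1, attempts + 1):
        try:
            return fn()
        except Exception as exc:
            last_error = exc
            if attempt < attempts:
                time.sleep(delay)  # only reached when a call has failed
    raise last_error


def test_retry_path_runs():
    calls = {"n": 0}

    def flaky():
        # Fail twice, then succeed, so the sleep line is actually executed.
        calls["n"] += 1
        if calls["n"] < 3:
            raise ConnectionError("simulated failure")
        return "ok"

    assert retry(flaky, attempts=3, delay=0) == "ok"
    assert calls["n"] == 3
```

The happy path alone would never reach the sleep call, which is why the test has to fail on purpose first.
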
Operations

Idempotency guards save queues. A bot tried to refill a tweet queue that already had 28 items. Always check current state before adding.
Track processed IDs to prevent duplicate work. A mention scanner re-processed the same mentions on every restart. Dedup by ID, not by timestamp.
Rapid restarts mask root causes. 4 bot restarts in 20 minutes re-triggered the same failure each time. Investigate crash cause before restarting.
Daily scan JSON overwrites break dedup. Dedup should check the ledger (persistent), not the scan file (overwritten daily).
Subprocess calls need an explicit PATH. Scripts that work in an interactive shell fail under launchd or GitHub Actions (GHA) because PATH is different. Use absolute paths to executables, or set PATH explicitly.
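
A sketch of the dedup lessons above: processed IDs live in a persistent ledger file rather than in the daily scan output. The file layout (a JSON list of string IDs), the "id" key, and the function names are assumptions for illustration.

```python
import json
from pathlib import Path


def load_ledger(path):
    """Return the set of already-processed IDs (empty if no ledger exists yet)."""
    path = Path(path)
    if not path.exists():
        return set()
    return set(json.loads(path.read_text()))


def process_new(items, ledger_path, handle):
    """Run handle() once per unseen item ID and persist the ID afterwards."""
    seen = load_ledger(ledger_path)
    handled = []
    for item in items:
        if item["id"] in seen:
            continue  # already processed in an earlier run
        handle(item)
        seen.add(item["id"])
        handled.append(item["id"])
        # Persist after every item so a crash mid-batch does not cause rework.
        # sorted() assumes all IDs share one type, e.g. all strings.
        Path(ledger_path).write_text(json.dumps(sorted(seen)))
    return handled
```

Restarting the scanner then becomes safe: a second run over the same mentions handles nothing, because the ledger survives the restart and the daily overwrite.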

Prediction Markets

Data velocity beats domain expertise. NBA: 465 markets, 253 near 50%, 382 resolve this week. BTC: 98 markets, 7 near 50%. Pick the lane with the most data points.
API tag filters lie. Gamma API tag=crypto returns everything. Build your own classifier from titles and descriptions.
Distinguish data from signal from alpha. Having data is step one. Extracting signal is step two. Finding alpha (edge over the market) is step three. Most systems never get past step one.
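
A minimal sketch of the "build your own classifier" lesson above, using whole-word keyword matching on title and description. The keyword lists and category names are hypothetical and would need tuning against real market titles.

```python
import re

# Hypothetical keyword lists; tune them against real market titles.
CATEGORY_KEYWORDS = {
    "crypto": ["bitcoin", "btc", "ethereum", "eth", "crypto"],
    "nba": ["nba", "lakers", "celtics"],
}


def classify_market(title, description=""):
    """Return the first category whose keyword appears as a whole word."""
    text = f"{title} {description}".lower()
    for category, keywords in CATEGORY_KEYWORDS.items():
        for keyword in keywords:
            # \b keeps "eth" from matching inside unrelated words.
            if re.search(rf"\b{re.escape(keyword)}\b", text):
                return category
    return "other"
```

Whole-word matching matters here: a plain substring test would file any title containing a name like "Seth" under crypto.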

SEO & Content

og:image MUST be absolute URLs. Relative paths break social card previews on Twitter, LinkedIn, and Slack. Fixed across 192 venue pages.
SEO is not "set and forget." Adding one experiment page required updating 6 files: meta, OG, Schema.org, llms.txt, sitemap, about.html.
Navigation drift happens silently. One site's nav pointed to a page that no longer existed. Automated link checking catches what human review misses.
Title truncation must happen at word boundaries. Cutting mid-word creates broken previews in search results. Slice to the length limit, then drop the partial last word with rsplit(' ', 1)[0].
Sitemap markers can be empty without anyone noticing. The sitemap should have held 784 video URLs between its markers, but the block between the markers had zero content. Verify sitemap contents, not just marker presence.
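
Two of the lessons above, sketched as small helpers: truncating a title at a word boundary and turning an og:image path into an absolute URL. The function names, the 60-character default, and the example.com URLs are assumptions.

```python
from urllib.parse import urljoin


def truncate_title(title, max_len=60):
    """Shorten title to at most max_len characters without cutting a word."""
    if len(title) <= max_len:
        return title
    cut = title[:max_len]
    # If the next character is a space, the cut already ends on a whole word.
    if title[max_len] != " " and " " in cut:
        cut = cut.rsplit(" ", 1)[0]
    return cut.rstrip()


def absolute_og_image(page_url, image_path):
    """Resolve image_path against page_url; absolute URLs pass through unchanged."""
    return urljoin(page_url, image_path)
```

For example, truncate_title("Automated link checking catches broken navigation", 20) returns "Automated link", and absolute_og_image("https://example.com/venues/bar.html", "/img/card.png") returns "https://example.com/img/card.png".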

Meta-Lessons

If the same fix is needed twice, the root cause is upstream. Fix the generator, not 600 HTML files. Fix the config, not the output.
Don't add jobs to feel safe. No new workflows, pre-push hooks, or monitoring unless the empire has actually grown enough to justify it.
Find patterns that break → remove them at the source. Recurring issues are a code smell. Centralize the fix.
Shipped and ugly beats planned and pretty. Get it live, iterate in production. Perfection is the enemy of shipped.
