The Loop Library 
27 agent workflows that run themselves — and know when to stop. Copy one into Claude, GPT, or Gemini, point it at your work, and let it loop until the job is actually done.
What is a loop?
A normal prompt asks for one thing and stops. A loop hands the agent a goal plus a way to check its own work, then repeats until that check passes. The magic isn't the task — it's the stop condition. Without one, an agent either quits too early or "improves" forever.
Every loop below has three parts: the goal, the repeating steps (do work → verify), and the stop condition that ends it with a real deliverable. Swap the bracketed bits for your own context and run.
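The three parts can be sketched in a few lines of Python. This is a minimal illustration of the pattern, not a description of how any particular agent runs internally: `run_loop`, `do_work`, `verify`, and the `max_rounds` safety cap are all hypothetical names introduced here.

```python
from typing import Callable, Tuple, TypeVar

State = TypeVar("State")


def run_loop(
    state: State,
    do_work: Callable[[State], State],
    verify: Callable[[State], bool],
    max_rounds: int = 50,
) -> Tuple[State, int, bool]:
    """Repeat do_work until verify passes or the round budget runs out.

    Returns (final_state, rounds_used, passed).
    """
    for rounds_used in range(max_rounds):
        # Check before working: if the goal is already met, do nothing.
        if verify(state):
            return state, rounds_used, True
        state = do_work(state)
    # Budget exhausted: report whether the last round happened to finish the job.
    return state, max_rounds, verify(state)


# Toy example: the "work" fixes one stale doc page per round.
stale_pages = ["install.md", "cli.md", "config.md"]
final, rounds, passed = run_loop(
    stale_pages,
    do_work=lambda pages: pages[1:],       # fix one page
    verify=lambda pages: len(pages) == 0,  # the stop condition
)
print(final, rounds, passed)  # [] 3 True
```

The round cap is an extra guard assumed for this sketch; the prompts below rely on the stop condition alone.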
The Overnight Docs Sweep
Keep documentation matching the code, automatically, while you sleep.
Keep our documentation in sync with the code. Loop until the docs match the current implementation: 1. Diff the docs against the actual code, configs, and public APIs. 2. List every disagreement: stale steps, renamed flags, dead links, undocumented features. 3. Fix the DOCS to match reality. Do not change code. 4. Re-check. Stop when everything matches. Open a PR titled "Docs sync" with a bullet list of what changed. If nothing was stale, say so and stop.
Stops when: docs match the implementation, with a reviewable PR.
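Step 1's diff can be pictured as a set comparison. The sketch below assumes, purely for illustration, that the flag names mentioned in the docs and the flag names the code accepts have already been extracted; `flag_drift`, `docs_match`, and the sample flags are hypothetical.

```python
from typing import Dict, List, Set


def flag_drift(documented: Set[str], implemented: Set[str]) -> Dict[str, List[str]]:
    """Compare flag names found in the docs with those the code accepts."""
    return {
        # Mentioned in the docs but gone from the code: stale or renamed.
        "stale_in_docs": sorted(documented - implemented),
        # Accepted by the code but never mentioned: undocumented features.
        "undocumented": sorted(implemented - documented),
    }


def docs_match(documented: Set[str], implemented: Set[str]) -> bool:
    """Stop condition for this slice of the sweep: no disagreements left."""
    drift = flag_drift(documented, implemented)
    return not drift["stale_in_docs"] and not drift["undocumented"]


# Hypothetical example data.
docs = {"--verbose", "--out-dir", "--dry-run"}
code = {"--verbose", "--output-dir", "--dry-run", "--json"}
print(flag_drift(docs, code))
# {'stale_in_docs': ['--out-dir'], 'undocumented': ['--json', '--output-dir']}
```

A real sweep would cover steps, links, and configs as well; flags are just the easiest case to show.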
The Architecture Satisfaction Loop
Refactor in small, verified steps until the design is clean and green.
Refactor [module] until the architecture is clean and all checks pass. Loop: 1. Make ONE focused refactor (extract, rename, decouple) — the smallest viable change. 2. Run the full test suite and the linter. 3. Live-test the affected path end to end. 4. Log progress in /tmp/refactor.md so you don't repeat yourself. Stop when the design is satisfactory AND tests + lint pass with no regressions. Commit, then summarize what changed and why.
Stops when: architecture is satisfactory and all checks pass.
The Sub-50ms Page-Load Loop
Hunt the slowest page, fix it, repeat — until every page is fast.
Get every page loading in under 50 ms. Loop until met: 1. Measure load time for all pages under identical conditions (same cache state, same network profile). 2. Take the slowest page; find the real bottleneck (query, asset, render). 3. Apply one fix; re-measure it and spot-check the rest for regressions. Stop when every page is under 50 ms and none is slower than before. Report the before/after table.
Stops when: every page loads under 50 ms with no regressions.
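The selection and stop logic of this loop can be sketched as follows. It assumes load times have already been measured into a dictionary of page to milliseconds; `next_target`, `regressions`, and the sample timings are hypothetical.

```python
from typing import Dict, List, Optional

BUDGET_MS = 50.0  # the loop's target: every page strictly under this


def next_target(timings_ms: Dict[str, float]) -> Optional[str]:
    """Return the slowest page still at or over budget, or None when all pass."""
    if not timings_ms:
        return None
    slowest = max(timings_ms, key=timings_ms.get)
    return slowest if timings_ms[slowest] >= BUDGET_MS else None


def regressions(before: Dict[str, float], after: Dict[str, float]) -> List[str]:
    """Pages that got slower between two measurement runs."""
    return sorted(p for p in before if p in after and after[p] > before[p])


# Hypothetical measurements before and after one fix.
before = {"/": 42.0, "/search": 180.0, "/checkout": 95.0}
after = {"/": 44.0, "/search": 48.0, "/checkout": 95.0}

print(next_target(before))         # /search
print(next_target(after))          # /checkout
print(regressions(before, after))  # ['/']
```

In this example the loop would not stop yet: one page is still over budget, and the home page got slower after the fix.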
The Production Error Sweep
Clear the error backlog by root cause, not by symptom.
Clear the production error backlog. Loop: 1. Read the last 24h of error logs; group by ROOT CAUSE, not by symptom. 2. Take the highest-impact group; trace it to the actual cause. 3. Fix it; add a test that would have caught it; verify the error stops. Stop when logs are clean or only known/accepted errors remain. Open a PR per fix and post a one-line summary to Slack. If logs are already clean, confirm and stop.
Stops when: errors fixed with PRs + summary, or logs confirmed clean.
The 100% Test Coverage Loop
Add meaningful tests, lowest-covered file first, until the suite is green at 100%.
Bring this codebase to 100% meaningful test coverage. Loop: 1. Run coverage; find the least-covered file with real logic. 2. Add tests for its real behavior and edge cases — not trivial getters or generated code. 3. Re-run the full suite. Stop when the suite passes at 100% coverage. Exclude generated and vendor files, and tell me which you excluded and why.
Stops when: the full suite passes at 100% coverage.
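Step 1 amounts to picking a minimum after exclusions. The sketch below assumes a per-file coverage report is already available as a dictionary of path to percentage; the exclusion patterns, `least_covered`, and the sample report are hypothetical. Judging whether a file has "real logic" is left to the agent and is not modeled here.

```python
from fnmatch import fnmatchcase
from typing import Dict, Optional, Sequence

# Hypothetical patterns for vendor and generated files.
EXCLUDE_PATTERNS = ("vendor/*", "*_generated.py")


def least_covered(
    coverage_pct: Dict[str, float],
    exclude: Sequence[str] = EXCLUDE_PATTERNS,
) -> Optional[str]:
    """Return the in-scope file with the lowest coverage, or None at 100%."""
    in_scope = {
        path: pct
        for path, pct in coverage_pct.items()
        if not any(fnmatchcase(path, pattern) for pattern in exclude)
    }
    below = {path: pct for path, pct in in_scope.items() if pct < 100.0}
    return min(below, key=below.get) if below else None


# Hypothetical coverage report.
report = {
    "billing/invoice.py": 62.5,
    "auth/login.py": 88.0,
    "vendor/lib/x.py": 0.0,
    "api/schema_generated.py": 10.0,
    "utils/dates.py": 100.0,
}
print(least_covered(report))  # billing/invoice.py
```

A return value of None is the stop condition: every in-scope file is fully covered.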
The SEO / GEO Visibility Loop
Make priority pages findable by search engines and AI answer engines.
Make our priority pages findable by search engines and AI answer engines. Loop: 1. Audit crawlability, indexation, titles/meta, internal links, and structured data. 2. Rank the issues by traffic impact. 3. Fix the top issue; re-run the audit. Stop when every priority page is indexable and answer-ready with no high-impact gaps. List what you changed and what you intentionally left.
Stops when: priority pages are indexable and answer-ready, no high-impact gaps.
The Logging Coverage Loop
Make production debuggable from logs alone — no silent failures.
Make sure we can debug production from logs alone. Loop: 1. Walk the important paths (auth, payments, data writes, external calls). 2. Find paths that fail silently or log nothing useful. 3. Add structured, tested log lines. Avoid noise. Never log secrets. Stop when every important path emits a useful, tested log. Summarize what you added.
Stops when: every important path emits a useful, tested log.
The Nightly Changelog Loop
Translate yesterday's commits into plain-language, user-facing release notes.
Keep a human-readable changelog current. Loop over yesterday's merged changes: 1. Read the commits/PRs since the last entry. 2. Translate the user-facing ones into plain-language changelog lines. Skip internal noise. 3. Add them under today's date. Stop when every user-relevant change is captured. If there were none, record "no user-facing changes" and stop.
Stops when: all user-relevant changes are documented, or a no-change is recorded.
The Quality Streak Loop
Prove reliability by passing N realistic runs in a row — a failure resets the streak.
Prove [feature] is reliable by passing N realistic runs in a row. (Set N = 20.) Loop: 1. Run a realistic end-to-end scenario. 2. If it passes, increment the streak. If it fails: document the failure, add regression coverage, fix it, and reset the streak to 0. Stop when you hit N consecutive passes. Report every failure you fixed along the way.
Stops when: the latest N realistic cases pass consecutively, with protections documented.
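The streak rule is easy to get subtly wrong, so here is a minimal sketch of the counting logic. It assumes each run reduces to a pass/fail boolean; `run_streak` and the sample outcomes are hypothetical.

```python
from typing import Iterable, Optional, Tuple


def run_streak(outcomes: Iterable[bool], n: int) -> Optional[Tuple[int, int]]:
    """Walk through run outcomes in order.

    Returns (runs_needed, failures_seen) as soon as n passes occur in a row,
    or None if the outcomes end before that happens.
    """
    streak = 0
    failures = 0
    for run_number, passed in enumerate(outcomes, start=1):
        if passed:
            streak += 1
            if streak == n:
                return run_number, failures
        else:
            failures += 1
            streak = 0  # any failure resets the streak
    return None


# Hypothetical outcomes with N = 3: one failure on the third run.
print(run_streak([True, True, False, True, True, True], n=3))  # (6, 1)
```

The two passes before the failure do not count toward the final streak, which is why six runs are needed rather than four.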
The Full Product Evaluation Loop
Define realistic scenarios with success criteria, then fix until they all pass.
Evaluate whether this product actually works, end to end. Setup: write N realistic user scenarios (set N = 15), each with explicit success criteria. Loop: 1. Run all scenarios under identical conditions. 2. For each failure, find the cause and fix it. 3. Re-run the full set. Stop when all N scenarios meet their success bar under the original conditions. Report the scorecard.
Stops when: all N scenarios meet the defined quality bar under original conditions.
The Test-Suite Speed Loop
Make tests faster without ever deleting an assertion.
Make the test suite faster without losing coverage. Loop: 1. Profile the suite; find the slowest tests or setup steps. 2. Speed up ONE (parallelize, mock I/O, share fixtures) — never by removing assertions. 3. Re-run; confirm timing improved and coverage + behavior are unchanged. Stop when the suite is meaningfully faster with proven-equal coverage. Report before/after timing.
Stops when: the suite is faster with timing and coverage proven, no regressions.
The Repository Cleanup Loop
Recover valuable abandoned work, then clear out the truly stale.
Tidy this repo without losing anything valuable. Loop: 1. Inventory branches, open PRs, stale commits, and worktrees. 2. Identify abandoned-but-valuable work; preserve it (cherry-pick or document it). 3. Remove what's truly stale. Stop when valuable work is recovered and everything remaining is intentional, with a note explaining each decision.
Stops when: valuable work is recovered and the remaining state is intentional, with evidence.
The Stale-Safe Batch Release Loop
Bundle only current, complete changes into one clean release.
Ship a clean combined release. Loop: 1. Review all merged changes awaiting release. 2. Exclude anything stale, half-finished, or behind a dead flag. 3. Combine the valid changes; run the full pre-release check. Stop when only current, complete changes are bundled. Produce the release notes and a go/no-go summary.
Stops when: only current, complete changes ship in the combined release.
The Production Data Cleanup Loop
Bring a table into compliance and improve the rule that let bad rows in.
Bring [table] into compliance with its definition. Loop: 1. Sample records; flag any that violate the allowed definition. 2. Remove or correct violations; improve the classification rule that let them in. 3. Re-scan with the improved rule. Stop when every remaining record passes, backed by classification tests. Report counts removed/fixed.
Stops when: every record meets the allowed definition, with classification tests.
The Post-Release Baseline Loop
Capture a fresh, labeled performance baseline after every release.
Capture a fresh performance baseline after a release. Steps: 1. Wait until the release is fully deployed and warm. 2. Run the standard benchmark suite under documented conditions. 3. Record results as the new baseline, with revision, environment, and conditions noted. Stop once the baseline is saved and labeled. Flag any metric that regressed versus the prior baseline.
Stops when: a new baseline is established with revision, environment, and conditions documented.
The Inbox Triage Loop
Every email gets a decision — reply, delegate, schedule, or archive.
Get my inbox to a decided state. Loop until the inbox is triaged: 1. Take the oldest un-triaged email. 2. Decide: reply now (draft it), delegate (draft the handoff), schedule (add a dated task), or archive. 3. Apply the label and move on. Stop when every email has a decision. Give me the drafts to approve and a list of what you scheduled or archived. Never send without my OK.
Stops when: every email has a decision and drafts await your approval.
The SOP Freshness Loop
Keep written procedures matching how the work is actually done today.
Keep our SOPs matching how the work is actually done. Loop: 1. Take one SOP; compare each step to the current tools, screens, and owners. 2. Flag steps that are wrong, missing, or reference a retired tool. 3. Rewrite the SOP to match reality; note the change date. 4. Once that SOP is accurate, move to the next. Stop when every SOP has been checked. Report which SOPs changed and which were already correct.
Stops when: every SOP matches current reality, with change dates recorded.
The Subscription Audit Loop
Find unused, duplicate, and oversized plans — with the savings attached.
Cut wasted recurring spend. Loop over our active subscriptions/vendors: 1. For each, check last-used date, seats used vs. seats paid, and whether a cheaper tier covers us. 2. Flag unused, duplicate, or oversized plans with the monthly savings. 3. Draft the cancel/downgrade action for my approval. Stop when every line item has a keep / cut / downsize recommendation. Total the potential savings.
Stops when: every subscription has a keep/cut/downsize call with savings totaled.
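The keep / cut / downsize call and the savings total can be sketched as below. This assumes simple per-seat pricing and a 90-day "unused" threshold, both chosen only for illustration; `Subscription`, `recommend`, and the sample vendors are hypothetical.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class Subscription:
    name: str
    monthly_cost: float
    seats_paid: int
    seats_used: int
    days_since_last_use: int


def recommend(sub: Subscription, unused_after_days: int = 90) -> Tuple[str, float]:
    """Return (recommendation, monthly_savings) for one line item."""
    if sub.seats_used == 0 or sub.days_since_last_use > unused_after_days:
        return "cut", sub.monthly_cost
    if sub.seats_used < sub.seats_paid:
        # Assumes cost scales linearly with seats, which real plans may not.
        per_seat = sub.monthly_cost / sub.seats_paid
        return "downsize", round(per_seat * (sub.seats_paid - sub.seats_used), 2)
    return "keep", 0.0


# Hypothetical line items.
subs = [
    Subscription("design-tool", 300.0, seats_paid=10, seats_used=4, days_since_last_use=2),
    Subscription("old-crm", 120.0, seats_paid=5, seats_used=0, days_since_last_use=200),
    Subscription("chat", 80.0, seats_paid=8, seats_used=8, days_since_last_use=0),
]
calls = {s.name: recommend(s) for s in subs}
total_savings = sum(savings for _, savings in calls.values())
print(calls)
print(total_savings)  # 300.0
```

Checking whether a cheaper tier covers the team needs vendor pricing data and is left out of this sketch.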
The Content Repurposing Loop
Turn one long-form asset into a full multi-channel content set.
Turn one source asset into a full content set. Input: [link or paste the long-form piece]. Loop until the set is complete: 1. Pull the next distinct key idea from the source. 2. Repurpose it into the next target format (LinkedIn post, X thread, newsletter blurb, short-video hook). 3. Keep our voice. Don't invent facts not in the source. Stop when every key idea is covered across all target formats. Deliver them grouped by channel.
Stops when: every key idea is repurposed across all target formats.
The Content Gap Loop
Close the gaps between your pages and the top-ranking results.
Close the gaps between us and the top results for our target topics. Loop: 1. Take the next target keyword; review what the top-ranking pages cover that we don't. 2. List the missing subtopics, questions, and entities. 3. Draft an outline (or section) that closes the highest-value gap. Stop when each target keyword has a gap-closing plan. Rank them by opportunity.
Stops when: every target keyword has a ranked gap-closing plan.
The Brand Voice Loop
Make every public page read like one consistent brand.
Make all our public copy sound like one brand. Reference: [our voice guide — tone, reading level, banned phrases]. Loop: 1. Take the next page/post; compare it to the voice guide. 2. Flag every off-voice line. 3. Rewrite it to match — meaning unchanged. Stop when each asset reads on-voice. Show before/after for anything you changed. Never alter pricing, claims, or legal lines without flagging them first.
Stops when: every asset reads on-voice, with before/after shown.
The CRM Hygiene Loop
Make the pipeline data clean enough to trust your forecast.
Get the pipeline data clean and trustworthy. Loop: 1. Take the next batch of records; check for missing fields, wrong stage, stale next-steps, and duplicates. 2. Fix what's unambiguous; flag what needs a human decision. 3. Move on. Stop when every active deal has a stage, an owner, a dated next step, and no duplicate. Report what you fixed and what needs my call.
Stops when: every active deal has a stage, owner, dated next step, and no duplicate.
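This stop condition is mechanical enough to check in code. The sketch below assumes each deal is a dictionary with `id`, `account`, `title`, `stage`, `owner`, and `next_step_date` fields, and treats two deals with the same account and title as possible duplicates; the field names and the duplicate rule are hypothetical.

```python
from collections import Counter
from typing import Dict, List, Tuple

REQUIRED_FIELDS = ("stage", "owner", "next_step_date")


def _dedupe_key(deal: dict) -> Tuple[str, str]:
    """Normalize account and title so trivial variations still collide."""
    return deal["account"].strip().lower(), deal["title"].strip().lower()


def deal_problems(deals: List[dict]) -> Dict[str, List[str]]:
    """Return {deal_id: [problems]} for every deal that fails the check."""
    counts = Counter(_dedupe_key(d) for d in deals)
    problems: Dict[str, List[str]] = {}
    for deal in deals:
        issues = [f"missing {field}" for field in REQUIRED_FIELDS if not deal.get(field)]
        if counts[_dedupe_key(deal)] > 1:
            issues.append("possible duplicate")
        if issues:
            problems[deal["id"]] = issues
    return problems


# Hypothetical pipeline records.
deals = [
    {"id": "D1", "account": "Acme", "title": "Renewal", "stage": "Proposal",
     "owner": "sam", "next_step_date": "2024-03-01"},
    {"id": "D2", "account": "acme ", "title": "renewal", "stage": "Proposal",
     "owner": "", "next_step_date": "2024-03-05"},
    {"id": "D3", "account": "Globex", "title": "Pilot", "stage": "Discovery",
     "owner": "lee", "next_step_date": None},
]
print(deal_problems(deals))
```

An empty result is the stop condition. Duplicates are only flagged here, in line with the prompt's rule to leave ambiguous cases to a human.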
The Lead Follow-up Loop
No warm lead goes cold — every overdue one gets a real, personal draft.
No warm lead goes cold. Loop over open leads with no reply: 1. Find leads past their follow-up date. 2. Draft a personalized, context-aware follow-up — reference their actual situation, not a template. 3. Queue it for my approval with a send date. Stop when every overdue lead has a drafted follow-up. List them by priority. Don't send without approval.
Stops when: every overdue lead has a prioritized, drafted follow-up awaiting approval.
The Competitive Intel Watch Loop
Re-check every tracked competitor and surface what actually changed.
Keep a current read on our competitors. Loop over our tracked competitor list: 1. Check each one's site, pricing, releases, and public posts for changes since our last snapshot. 2. Note what changed and why it matters to us. 3. Update the tracker. Stop when every competitor is re-checked. Deliver a "what changed and what we should do" brief, with anything urgent flagged at the top.
Stops when: every competitor is re-checked and the tracker + brief are updated.
The Source Synthesis Loop
Build a cited answer where major claims are corroborated by multiple sources.
Build a trustworthy synthesis on [topic] from real sources. Loop until the question is answered: 1. Find and read the next credible source. 2. Extract claims with citations; note where sources agree and conflict. 3. Add to the running synthesis; name the biggest remaining unknown. Stop when the key questions are answered and major claims are corroborated by 2+ sources. Deliver a cited brief plus an explicit "what we still don't know" list.
Stops when: key questions are answered, major claims corroborated by 2+ sources.
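The "2+ sources" rule can be checked with a small helper. The sketch assumes the running synthesis is kept as a mapping from each claim to the sources that support it; `uncorroborated` and the placeholder claims and source names are hypothetical.

```python
from typing import Dict, List


def uncorroborated(claims: Dict[str, List[str]], min_sources: int = 2) -> List[str]:
    """Return claims backed by fewer than min_sources distinct sources."""
    return sorted(
        claim
        for claim, sources in claims.items()
        # Use a set so the same source cited twice counts once.
        if len(set(sources)) < min_sources
    )


# Placeholder data: claim B cites one source twice, claim C cites one source.
claims = {
    "claim A": ["source-1", "source-2"],
    "claim B": ["source-3", "source-3"],
    "claim C": ["source-1"],
}
print(uncorroborated(claims))  # ['claim B', 'claim C']
```

Whatever this returns belongs either in the next round of reading or in the "what we still don't know" list.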
The Expense Categorization Loop
Categorize every transaction, and flag — never guess — the ambiguous ones.
Get the books categorized and consistent. Loop over uncategorized transactions: 1. Take the next one; assign the right category from vendor + memo + past patterns. 2. Flag anything ambiguous or unusually large for my review instead of guessing. 3. Continue. Stop when every transaction is categorized or flagged. Report the flagged ones and any new vendor patterns you learned.
Stops when: every transaction is categorized or flagged for review.
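The "flag, never guess" rule can be sketched as a function that returns either a category or a reason for review, never both. The vendor rules table, the 1000.0 threshold for "unusually large", and the sample transactions are hypothetical.

```python
from typing import Dict, Optional, Tuple


def categorize(
    txn: dict,
    vendor_rules: Dict[str, str],
    large_amount: float = 1000.0,
) -> Tuple[Optional[str], Optional[str]]:
    """Return (category, flag_reason); exactly one of the two is None."""
    if abs(txn["amount"]) >= large_amount:
        return None, "unusually large"
    vendor = txn["vendor"].strip().lower()
    category = vendor_rules.get(vendor)
    if category is None:
        # No known pattern: send to review instead of guessing.
        return None, "no known pattern for vendor"
    return category, None


# Hypothetical rules learned from past patterns, and sample transactions.
rules = {"aws": "Hosting", "delta air": "Travel"}
txns = [
    {"vendor": "AWS", "amount": 212.40},
    {"vendor": "Delta Air", "amount": 1840.00},
    {"vendor": "Corner Cafe", "amount": 18.50},
]
results = [categorize(t, rules) for t in txns]
print(results)
# [('Hosting', None), (None, 'unusually large'), (None, 'no known pattern for vendor')]
```

Every transaction ends up with one or the other, which is exactly the stop condition.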
The Invoice Reconciliation Loop
Match billed to paid, then tell you exactly who to chase.
Match what we billed to what we got paid. Loop: 1. Take the next open invoice; match it to incoming payments. 2. Mark matched ones paid; flag short-pays, overpayments, and overdue invoices. 3. Continue. Stop when every invoice is matched or flagged. Deliver an aging summary and a list of who to chase, with draft reminders for my approval.
Stops when: every invoice is matched or flagged, with an aging summary and draft reminders.
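The matching and flagging steps can be sketched as follows. This assumes payments are already keyed by invoice ID and that amounts are stored as integer cents to avoid rounding surprises; `reconcile`, `to_chase`, the status names, and the sample invoices are hypothetical.

```python
from datetime import date
from typing import Dict, List


def reconcile(invoices: List[dict], payments_cents: Dict[str, int], today: date) -> Dict[str, str]:
    """Assign each invoice exactly one status."""
    statuses: Dict[str, str] = {}
    for inv in invoices:
        paid = payments_cents.get(inv["id"], 0)
        owed = inv["amount_cents"]
        if paid == owed:
            status = "paid"
        elif paid > owed:
            status = "overpaid"
        elif paid > 0:
            status = "short-pay"
        elif inv["due"] < today:
            status = "overdue"
        else:
            status = "open"  # unpaid but not yet due
        statuses[inv["id"]] = status
    return statuses


def to_chase(statuses: Dict[str, str]) -> List[str]:
    """Invoices where money is still owed and a reminder is warranted."""
    return sorted(i for i, s in statuses.items() if s in {"short-pay", "overdue"})


# Hypothetical invoices and payments.
invoices = [
    {"id": "INV-1", "amount_cents": 120000, "due": date(2024, 1, 15)},
    {"id": "INV-2", "amount_cents": 50000, "due": date(2024, 1, 20)},
    {"id": "INV-3", "amount_cents": 80000, "due": date(2024, 1, 10)},
    {"id": "INV-4", "amount_cents": 30000, "due": date(2024, 3, 1)},
]
payments = {"INV-1": 120000, "INV-2": 45000}
statuses = reconcile(invoices, payments, today=date(2024, 2, 1))
print(statuses)
print(to_chase(statuses))  # ['INV-2', 'INV-3']
```

In practice the hard part is deciding which payment belongs to which invoice; that matching is taken as given here.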
How to run a loop well
1. Give it a real verifier. A loop is only as good as its check. "Looks done" isn't a stop condition; "all 15 scenarios pass under identical conditions" is.
2. Scope the blast radius. Add "do not touch X," "draft for my approval," or "one change at a time" when the work is risky or irreversible.
3. Make it report. Every loop ends with a deliverable — a PR, a table, a list of drafts — so you can audit what it did, not just trust that it ran.
Loops are the everyday face of harness engineering: the model is the engine, and the loop is the car that gets it to the destination.
Frequently asked
What is an AI agent loop?
A loop gives an AI agent a goal plus a way to check its own work, then repeats — do work, verify — until that check passes. The check is called the stop condition, and it's what separates a loop from a one-shot prompt.
Why does every loop need a stop condition?
Without a clear stop condition, an agent either quits too early or keeps improving forever. A good stop condition is verifiable: "all 15 scenarios pass under identical conditions," not "looks done."
When should I use a loop instead of a single prompt?
Use a loop when the work has a checkable "done" state and benefits from repetition — clearing an error backlog, reaching full test coverage, triaging an inbox. For a one-off answer, a single prompt is enough.
Do these loops work in Claude, ChatGPT, and Gemini?
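Yes. Each loop is a plain-language prompt with no tool-specific syntax. Paste it into Claude, GPT, or Gemini, swap the bracketed placeholders for your context, and run. Loops that open PRs, read logs, or work through an inbox also need the agent to have access to those systems.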