Most LLM interactions follow the same pattern: ask a question, get an answer in seconds. The model does one pass — searches the web, reads a few pages, synthesises a response — and you're done. The result is usually competent. It's also usually a first draft pretending to be a final answer.

term-llm has a feature called progressive mode that changes this. Instead of answering once and stopping, the agent gets a time budget — say, 5 or 10 minutes — and keeps working. It drafts, critiques its own work, searches for more sources, verifies claims, corrects errors, and only finalises when the budget runs out. The model explicitly knows how much time it has and can choose how to spend it.

I wanted to know: does this actually produce better results? Or does the model just pad its answer with filler?

The Experiment

Same question, same model, three configurations. The question is deliberately one where surface-level answers are easy but thorough answers require cross-referencing multiple sources:

What are the top 10 audible sci fi audiobooks released in the past year? Search the web for current listings.

The three runs:

| Run | Command | Time budget |
| --- | --- | --- |
| Vanilla | term-llm ask "..." --agent jarvis | None (finishes when done) |
| Progressive 5m | term-llm ask "..." --agent jarvis --progressive --timeout 5m | 5 minutes + finalization grace |
| Progressive 10m | term-llm ask "..." --agent jarvis --progressive --timeout 10m | 10 minutes + finalization grace |

Model: Claude Opus (via claude-bin provider). All three runs used the same agent configuration with web search enabled. No cherry-picking — these are the actual outputs from one run of each.

Vanilla: The First Draft

The vanilla run took about 90 seconds. It searched the web, found Audible's "Best of 2025" editorial list, read the page, and presented all 10 titles with genre annotations:

Full vanilla output (3.3 KB):
Straight from Audible's own "Best of 2025" list, here are their top 10
sci-fi & fantasy audiobooks of the past year. Fair warning: it's a mixed
bag — some are hard sci-fi, others lean fantasy or LitRPG. I've flagged
the genre for each:
  1. The Incandescent – Emily Tesh (dark fantasy / magical school)
  2. There Is No Antimemetics Division – qntm (sci-fi / horror)
  3. The Fourth Consort – Edward Ashton (space opera / sci-fi comedy)
  4. What We Can Know – Ian McEwan (speculative / cli-fi)
  5. This Inevitable Ruin – Matt Dinniman (LitRPG / sci-fi)
  6. 1% Lifesteal – Robert Blaise (progression fantasy / LitRPG)
  7. The Prophecy, Season 2 – Kerry Washington (supernatural thriller)
  8. The River Has Roots – Amal El-Mohtar (fairy tale / literary fantasy)
  9. Bury Our Bones in the Midnight Soil – V.E. Schwab (gothic / vampire)
  10. Greenteeth – Molly O’Neill (cozy fantasy)

The two strongest pure sci-fi picks: There Is No Antimemetics Division and The Fourth Consort. If you only want hard SF and skip the fantasy/LitRPG, start there.

This is a perfectly fine answer. The genre flags are helpful. The recommendations at the bottom are sensible. If you asked a friend who'd spent 90 seconds researching, you'd be satisfied.

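But it has blind spots it doesn't know about. It rests on a single editorial list that, as the longer runs establish, was published in November 2025, so it misses anything released since, and it never checks awards, critics' lists, or the current Audible chart.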

The model didn't lie. It just stopped too early.

Progressive 5 Minutes: The Self-Critique

The 5-minute run is where things get interesting. After producing its initial draft (similar to vanilla), the model did something the vanilla run never does: it criticised its own output and kept going.

Here's the exact moment it pivots, visible in the raw output:

The current answer is solid but has gaps: Audible's list is from
November 2025 (misses anything since), it's mixed sci-fi/fantasy
rather than sci-fi-focused, and I haven't checked current bestseller
charts or early 2026 releases. Let me fix that.

It then searched for the AudioFile critics' list, the 2025 Hugo winner, the Nebula Awards, and the current Audible live chart. It found The Tainted Cup (Hugo winner), Operation Bounce House (Feb 2026, currently charting), and The Dragonfly Gambit (Nebula winner). It also did something subtle: it started verifying its own claims.

Risky claims to verify: The Tainted Cup's actual release date
(2024 or 2025?), whether Shroud by Tchaikovsky has a firm Audible
listing and release date, whether The Dragonfly Gambit is available
as an audiobook...

The final output includes 10 entries with sources, award data, and honest caveats about where the Audible editorial list falls short. Here's entry #1 — which vanilla missed entirely:

5-minute output excerpt — The Tainted Cup (entry #1)
1. The Tainted Cup — Robert Jackson Bennett
Detective sci-fi/fantasy | Hugo Award Winner 2025 · World Fantasy
Award Winner 2025 | Narrated by Andrew Fallaize

A Holmes-and-Watson detective duo investigate an imperial officer who died when a tree erupted from his body. Set in an empire built on leviathan blood and biological contagion — strange, layered worldbuilding wrapped in a fiendishly clever murder mystery. Ana Dolabra wears a blindfold and never leaves home. Her assistant Din has a magically perfect memory. T.J. Klune called it “a fantasy version of Knives Out.” NYT, NPR, and Elle all named it a best book of the year. It’s the year’s most award-laden genre novel and inexplicably absent from Audible’s own November list.

Confirmed on Audible — also recommended: the just-released sequel A Trade of Blood (2025).

Progressive 10 Minutes: The Full Audit

The 10-minute run does everything the 5-minute run does, then keeps going. It consulted more sources (Penguin Random House audiobook lists, Book Riot new releases, Goodreads), fact-checked individual Audible product pages, and verified narrator credits and runtimes.

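But the most interesting difference is structural. The 10-minute run produced entry cards with narrator and runtime metadata, a UK vs US comparison, and explicit self-corrections. Excerpts of each follow.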

10-minute output excerpt — entry cards with metadata
3. There Is No Antimemetics Division — qntm
Narrator: Rebecca Calder · Runtime: 8h 47m · Genre: Sci-Fi Horror

Expanded from the SCP Foundation universe. Entities that destroy your memory of their existence — how do you fight something you can’t remember exists? Non-linear, redacted-document structure. Raises questions about deepfakes and fascism without being heavy-handed. The genuinely weird pick of the list.

4. The Fourth Consort — Edward Ashton

Narrator: Barrie Kreinik · Runtime: 8h 25m · Genre: Sci-Fi (Space Comedy)

Humans as a client civilization in a galactic power structure. A human diplomat and a stick-man alien stranded on an insect-politics planet. Mickey7 author doing highbrow sci-fi meets Simon Pegg buddy comedy.

10-minute output — UK vs US analysis (vanilla had no idea this existed)
UK vs. US: The UK "Best of 2025" shares 7 of these 10 titles.
The UK-only swaps include:
- Flybot by Dennis E. Taylor (standalone AI thriller, narrated by
  Ray Porter — not part of Bobiverse but same author/narrator energy)
- The Devils by Joe Abercrombie
- Harry Potter Full-Cast Production

The 3 US-only titles that get swapped out: The Fourth Consort, 1% Lifesteal, and The Prophecy Season 2.

10-minute output — self-correction in action
A few things worth verifying: the list was published Nov 17,
2025 — it misses ~4 months of releases up to today (March 2026).
Also, half the list is fantasy, not strict sci-fi. Let me check
for December 2025–2026 notable releases and cross-reference
against other major lists.

[searches Penguin Random House, Book Riot, GoodEReader]

Earlier claim “Flybot is in the Bobiverse” — confirmed correct in spirit but imprecise. Flybot is a standalone novel, not in the Bobiverse series. The Audible listing calls Taylor “the number one best-selling author of the Bobiverse series” and reviewers note “echoes of the Bobiverse” — but it’s its own thing.

The Lists Side by Side

| # | Vanilla | Progressive 5m | Progressive 10m |
| --- | --- | --- | --- |
| 1 | The Incandescent (fantasy) | The Tainted Cup ★ (Hugo winner) | The Incandescent (fantasy) |
| 2 | Antimemetics Division | Shroud ★ (hard sci-fi) | The River Has Roots (fantasy) |
| 3 | The Fourth Consort | Antimemetics Division | Antimemetics Division |
| 4 | What We Can Know | The Fourth Consort | The Fourth Consort |
| 5 | This Inevitable Ruin | This Inevitable Ruin | This Inevitable Ruin |
| 6 | 1% Lifesteal (LitRPG) | Operation Bounce House ★ (Feb 2026) | The Prophecy S2 |
| 7 | The Prophecy S2 | What We Can Know | What We Can Know |
| 8 | The River Has Roots (fantasy) | Katabasis ★ (AudioFile pick) | 1% Lifesteal (LitRPG) |
| 9 | Bury Our Bones (vampire fantasy) | The Dragonfly Gambit ★ (Nebula winner) | Bury Our Bones (vampire fantasy) |
| 10 | Greenteeth (cozy fantasy) | The Incandescent (fantasy) | Greenteeth (cozy fantasy) |

★ = unique to that run. The vanilla and 10-minute runs both faithfully reproduced Audible's editorial list — but the 10-minute run added extensive context around it (UK vs US analysis, pure sci-fi filter, post-list releases, narrator/runtime metadata for every entry). The 5-minute run diverged most aggressively: it replaced 5 of the 10 titles with picks from Hugo/Nebula awards, AudioFile critics' lists, and the current Audible chart — surfacing the year's most acclaimed sci-fi that Audible's own editorial missed.
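The "replaced 5 of the 10 titles" figure can be checked mechanically. The sketch below copies the titles from the side-by-side table (shortened as they appear there) and computes which entries in each progressive list are absent from the vanilla list. The helper name `new_titles` is made up for this illustration and is not part of term-llm.

```python
# Titles copied from the side-by-side table, using its shortened names.
vanilla = [
    "The Incandescent", "Antimemetics Division", "The Fourth Consort",
    "What We Can Know", "This Inevitable Ruin", "1% Lifesteal",
    "The Prophecy S2", "The River Has Roots", "Bury Our Bones", "Greenteeth",
]
progressive_5m = [
    "The Tainted Cup", "Shroud", "Antimemetics Division",
    "The Fourth Consort", "This Inevitable Ruin", "Operation Bounce House",
    "What We Can Know", "Katabasis", "The Dragonfly Gambit", "The Incandescent",
]
progressive_10m = [
    "The Incandescent", "The River Has Roots", "Antimemetics Division",
    "The Fourth Consort", "This Inevitable Ruin", "The Prophecy S2",
    "What We Can Know", "1% Lifesteal", "Bury Our Bones", "Greenteeth",
]


def new_titles(baseline, candidate):
    """Return titles in candidate that are not in baseline, keeping candidate order."""
    seen = set(baseline)
    return [title for title in candidate if title not in seen]


swapped_5m = new_titles(vanilla, progressive_5m)
swapped_10m = new_titles(vanilla, progressive_10m)

print(len(swapped_5m), swapped_5m)
print(len(swapped_10m), swapped_10m)
```

The 5-minute list yields five new titles, which are exactly the starred entries. The 10-minute list yields none: it is the vanilla list reordered, with the extra value living in the surrounding context rather than in the ranking.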

The Numbers

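| Metric | Vanilla | Progressive 5m | Progressive 10m |
| --- | --- | --- | --- |
| Output size | 3.3 KB | 15 KB | 24 KB |
| Sources consulted | 2 | 8 | 14 |
| Wall time | ~90 sec | ~6 min | ~8 min |
| Hugo/Nebula winners found | No | Yes | Yes |
| Self-corrections | 0 | 3 | 5+ |
| UK vs US comparison | No | No | Yes |
| Narrator/runtime metadata | No | Partial | Full |
| Post-Nov 2025 releases | No | Yes | Yes |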

The vanilla answer is not wrong — it's just the first thing the model found presented as the final thing. Progressive mode changes the incentive: the model knows it has time remaining and uses it to question its own work.

How It Works

Progressive mode gives the agent two synthetic tools: update_progress and finalize_progress. As the agent works, it checkpoints its best-so-far state with update_progress. When the time budget expires, a finalization pass runs: the agent reviews its accumulated state, writes a polished prose response, and calls finalize_progress to commit it.
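To make the checkpoint-then-finalize flow concrete, here is a minimal Python sketch of the idea. It is not term-llm's implementation: the `ProgressiveRun` class, the `run_progressive` loop, and the `work_step` and `polish` callbacks are all hypothetical, and only the names `update_progress` and `finalize_progress` come from the description above. The clock is injectable so the example runs instantly instead of waiting out a real budget.

```python
import time
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class ProgressiveRun:
    """Best-so-far state for one run (hypothetical, for illustration only)."""
    budget_seconds: float
    clock: Callable[[], float] = time.monotonic
    checkpoints: List[str] = field(default_factory=list)
    final_answer: Optional[str] = None

    def __post_init__(self):
        self.started_at = self.clock()

    def remaining(self) -> float:
        """Seconds left in the budget, never negative."""
        return max(0.0, self.budget_seconds - (self.clock() - self.started_at))

    def update_progress(self, draft: str) -> None:
        """Checkpoint the current best-so-far draft."""
        self.checkpoints.append(draft)

    def finalize_progress(self, answer: str) -> None:
        """Commit the polished answer."""
        self.final_answer = answer


def run_progressive(run, work_step, polish):
    """Checkpoint after every work step until the budget is spent, then finalize."""
    step = 0
    while run.remaining() > 0:
        previous = run.checkpoints[-1] if run.checkpoints else ""
        # The step sees the remaining time, mirroring "the model sees the budget".
        run.update_progress(work_step(step, previous, run.remaining()))
        step += 1
    best = run.checkpoints[-1] if run.checkpoints else ""
    run.finalize_progress(polish(best))
    return run.final_answer


class FakeClock:
    """Deterministic clock so the demo does not actually sleep."""
    def __init__(self):
        self.now = 0.0

    def __call__(self):
        return self.now

    def advance(self, seconds):
        self.now += seconds


fake = FakeClock()
run = ProgressiveRun(budget_seconds=300, clock=fake)


def work_step(step, previous, remaining):
    fake.advance(120)  # pretend each research pass takes two minutes
    return f"draft v{step + 1}"


answer = run_progressive(run, work_step, polish=lambda draft: draft + " (final)")
print(run.checkpoints, answer)
```

In this toy run the third step starts with a minute left and overruns the 5-minute budget before finalization happens, which is one plausible reading of why the table's time budget column says "+ finalization grace".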

The key insight is that the model sees the time budget in its prompt. A 10-minute budget doesn't just mean "run longer" — it means "you have time to consult additional sources, verify claims, and restructure." The model adjusts its strategy based on the budget it has.

If the model decides it's done before the timeout, what happens next depends on the stop_when setting. With stop_when set to "timeout", the run keeps going: the model searches for more sources, verifies entries, and corrects errors until the budget is spent. With stop_when set to "done", the run stops there. When you pass --timeout, stop_when defaults to "timeout", so the model uses the full budget.
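The early-finish rule can be written as a small decision function. This is a sketch of the behaviour as described here, not code from term-llm: the function names are invented, and the fallback to "done" when no timeout is given is an assumption, since the text only states the default for the case where --timeout is passed.

```python
from typing import Optional

VALID_STOP_WHEN = ("timeout", "done")


def resolve_stop_when(timeout: Optional[str], stop_when: Optional[str] = None) -> str:
    """Pick the effective stop_when value.

    An explicit value wins. Otherwise a timeout implies "timeout" (as the
    text describes); falling back to "done" without a timeout is an assumption.
    """
    if stop_when is not None:
        if stop_when not in VALID_STOP_WHEN:
            raise ValueError(f"stop_when must be one of {VALID_STOP_WHEN}")
        return stop_when
    return "timeout" if timeout else "done"


def should_continue(model_says_done: bool, stop_when: str, remaining_seconds: float) -> bool:
    """Decide whether the run keeps working after the latest step."""
    if remaining_seconds <= 0:
        return False  # budget spent: move on to finalization
    if not model_says_done:
        return True  # still mid-task with time left
    # The model thinks it is finished early; only "timeout" keeps it going.
    return stop_when == "timeout"


effective = resolve_stop_when(timeout="10m")
print(effective, should_continue(True, effective, remaining_seconds=240.0))
```

With a 10-minute timeout and no explicit setting, a model that declares itself done with four minutes left is told to keep working.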

How to Use It

Install term-llm, configure a provider and agent, then:

# Vanilla — fast, single-pass
term-llm ask "your research question" --agent my-agent

# Progressive — 5 minute budget
term-llm ask "your research question" --agent my-agent --progressive --timeout 5m

# Progressive — 10 minute budget
term-llm ask "your research question" --agent my-agent --progressive --timeout 10m

Progressive mode also works with scheduled jobs. Set progressive: true and timeout_seconds in the job's runner config, and the job runner handles the time budget, finalization, and result storage.

When to Use It

Progressive mode isn't for everything. Quick factual questions don't benefit; the model will just pad. It shines on research tasks where the quality difference comes from consulting more sources, cross-referencing claims, and catching its own errors.

The tradeoff is cost. A 10-minute progressive run uses more tokens than a vanilla run — roughly 3–5x in this experiment. Whether that's worth it depends on whether you're optimising for speed or for getting it right the first time.

The Takeaway

The vanilla answer reads better than it deserves to. It's confident, well-structured, and wrong by omission. The progressive answers are messier in their working — you can see the self-doubt, the corrections, the "let me check that" moments — but they converge on something significantly more trustworthy.

The interesting thing isn't that more time produces more output. It's that more time produces self-correction. The model catches its own mistakes when it has the budget to look back at what it wrote. That's the real value of progressive mode: not "think longer" but "think, then check your work."