Claude Code Looping, Planning, and Orchestration Playbook / Handoff Artifact
Artifact date: 2026-03-17
Scope: Research-grade extraction and improvement of the discussion about how to get Claude Code to do more useful looping, especially for planning, and whether wrappers such as OpenClaw or tools such as Task Master materially help.
Prepared for: Another AI or human operator continuing the work without needing the original conversation.
Executive Summary
This discussion was fundamentally not about “make Claude Code loop forever.” It was about how to get Claude Code to do more useful planning passes and avoid stopping too early during the planning stage.
Highest-confidence conclusions
- [Verified] The strongest official path is to use Claude Code’s native features first:
  Plan Mode, subagents, hooks, skills, CLAUDE.md, sandboxing, and aggressive context management are all first-party mechanisms specifically designed to improve multi-step planning and execution workflows.
  Why this matters: These are the least brittle, most policy-safe, and most directly supported ways to make Claude Code behave more like a deliberate staged agent rather than a single-turn chatbot.
  Evidence: [R1], [R2], [R3], [R4], [R5], [R6], [R7], [R8], [R9], [R10], [R11], [R22]
- [Verified] /loop is real, but it is not durable orchestration.
  Claude Code can re-run prompts on a schedule with /loop, but those scheduled tasks are session-scoped, disappear when the Claude Code process exits, and recurring tasks expire after 3 days.
  Why this matters: /loop is useful for polling or re-running packaged workflows inside a live session, but it is not the right primitive for long-lived autonomous planning orchestration.
  Evidence: [R4]
- [Verified] Hooks are one of the biggest levers for “keep going until planning is actually done.”
  Claude Code supports Stop hooks that can block stopping and feed back what remains to be done. It also supports agent hooks for stronger verification.
  Why this matters: This directly addresses the user’s complaint that Claude Code “stops” before enough planning cycles have happened.
  Evidence: [R3]
- [Verified] Anthropic explicitly forbids using OAuth tokens from Free/Pro/Max in other products, tools, or services.
  Anthropic’s current Claude Code legal/compliance documentation says OAuth authentication for Free/Pro/Max is intended exclusively for Claude Code and Claude.ai, and using those tokens in any other product, tool, or service is not permitted.
  Why this matters: This is the most important policy constraint in the entire discussion. It materially weakens “subscription-auth wrappers” as a recommended path.
  Evidence: [R7]
- [Verified] OpenClaw can technically work with Anthropic setup-token / subscription auth, but even OpenClaw’s own docs say Anthropic API keys are the safer recommended production path.
  OpenClaw documentation explicitly recommends API key auth over Anthropic subscription setup-token auth for long-lived gateways, and warns that Anthropic subscription use outside Claude Code has been restricted for some users before.
  Why this matters: OpenClaw is not fake vaporware, but it is also not the cleanest answer to “how do I make Claude Code plan better on my subscription.”
  Evidence: [R14], [R15], [R16]
- [Verified + flagged conflict] Task Master is genuinely relevant for task decomposition, but the documentation is mixed.
  - Task Master’s installation docs say “Now that you have Node.js and your first API Key…”
  - A GitHub example for Task Master shows a Claude Code provider with no API key required, plus settings like maxTurns.
  - Another Task Master configuration page also emphasizes OAuth for ChatGPT and optional API key usage in that context.
  Why this matters: Task Master may be useful as a structured planning/task layer, but its auth story and “recommended” path are not cleanly aligned across all docs. It must be treated carefully.
  Evidence: [R12], [R13], [R17]
- [Assistant-stated but strongly supported] The best operational pattern is staged orchestration, not brute-force infinite looping.
  The most robust pattern is:
  - explore first,
  - plan deeply,
  - write the plan to a file,
  - break it into PR-sized tasks,
  - add stop gates,
  - execute the next task in a fresh or narrowed pass.
  Why this matters: This is the synthesis most consistent with official docs on Plan Mode, hooks, CLAUDE.md, skills, context management, and programmatic/headless usage.
  Evidence basis: Inference from [R1], [R2], [R3], [R5], [R8], [R9], [R10], [R11]
Purpose of This Document
This document is intended to function as all of the following:
- a guide for the user,
- a playbook for implementation,
- a briefing memo for decision-making,
- and a handoff document for another AI continuing the work.
It does not merely summarize the conversation. It extracts the important claims, verifies what can be verified, labels the evidence status of each major point, highlights conflicts and policy risks, adds missing considerations, and turns the discussion into an actionable operating model.
Discussion Context
User goals and constraints
- [User-stated] The user wants to understand the best ways to push Claude Code to do more looping, especially more planning steps.
- [User-stated] The user’s main pain is that Claude Code appears to stop after a limited number of cycles, even though the planning stage still needs several more passes.
- [User-stated] The user is curious whether OpenClaw or similar tools sitting “on top of Claude Code” using a subscription model are actually helpful.
- [User-stated] The user is primarily interested in practical leverage, not abstract theory.
- [User-stated] The user cares about current, up-to-date information and does not want a stale answer.
Framing correction
- [Assistant-stated, now strengthened] The real problem is not “more looping” in the abstract. The real problem is:
- preserving context,
- forcing a real planning workflow,
- avoiding early exits,
- reducing permission friction,
- and structuring work so Claude Code can complete it in bounded stages.
Key Facts and Verified Findings
1) Native Claude Code features are the main official answer
1.1 Plan Mode
- Status: Verified
- Claim: Claude Code’s Plan Mode is meant for exploring a codebase, planning complex changes, and iterating on direction before editing. It can be started interactively or with claude --permission-mode plan, including -p / headless use.
- Evidence: [R1]
- Why it matters: This directly matches the user’s need for “more planning steps.”
1.2 Subagents
- Status: Verified
- Claim: Claude Code subagents run in their own context windows, can have custom tool access and prompts, and are useful for specialized tasks and context isolation. Claude Code includes built-in subagents such as Explore, Plan, and general-purpose.
- Evidence: [R2]
- Why it matters: Offloading exploration/research/review to subagents prevents the main conversation from bloating and helps planning stay coherent longer.
1.3 Hooks
- Status: Verified
- Claim: Claude Code hooks can run at key lifecycle events and can be used to:
- re-inject context after compaction,
- block or allow actions,
- enforce checks before stopping,
- and automate workflow rules.
- Evidence: [R3]
- Why it matters: Hooks are one of the cleanest ways to force more planning rigor and stop Claude Code from exiting prematurely.
1.4 Skills
- Status: Verified
- Claim: Claude Code skills are reusable prompt/playbook packages in SKILL.md. Claude can invoke them automatically when relevant or via slash commands, and bundled skills can orchestrate non-trivial work.
- Evidence: [R5]
- Why it matters: This is the official mechanism for packaging repeatable planning workflows such as /spec-feature, /break-into-prs, /review-plan.
1.5 CLAUDE.md and memory
- Status: Verified
- Claim: CLAUDE.md provides persistent instructions loaded at the start of every session. Claude Code also has auto memory. CLAUDE.md is the official place for stable project rules and workflow instructions.
- Evidence: [R9]
- Why it matters: Planning instructions should not live only in conversation history; they should be anchored in CLAUDE.md.
1.6 Programmatic usage / Agent SDK
- Status: Verified
- Claim: Claude Code has an official programmatic path via the Agent SDK and claude -p, previously called headless mode.
- Evidence: [R6]
- Why it matters: If the user wants deterministic multi-pass orchestration, this is the official path rather than trying to “smuggle” a consumer login through third-party tooling.
2) /loop is useful but limited
2.1 Session-scoped only
- Status: Verified
- Claim: Scheduled tasks created through /loop are session-scoped and disappear when the current Claude Code process exits.
- Evidence: [R4]
- Why it matters: /loop is not durable automation.
2.2 Three-day expiry
- Status: Verified
- Claim: Recurring scheduled tasks in Claude Code automatically expire 3 days after creation.
- Evidence: [R4]
- Why it matters: Even if a loop works, it is bounded and temporary.
2.3 Good use case
- Status: Verified
- Claim: /loop is appropriate for polling, re-running packaged commands, or doing time-based checks inside an active session.
- Evidence: [R4]
- Why it matters: /loop is best treated as a convenience layer, not the core solution for deep planning.
2.4 Wrong use case
- Status: Assistant-stated but strongly supported
- Claim: /loop should not be the foundation for long-lived architecture-planning orchestration.
- Evidence basis: limitations in [R4]
- Why it matters: Using /loop as the main answer to “do more planning cycles” will create brittle behavior.
3) Hooks are a major leverage point for deeper planning
3.1 Stop hook can keep Claude working
- Status: Verified
- Claim: A Stop hook can ask whether all tasks are complete; if not, Claude keeps working and receives the reason as the next instruction.
- Evidence: [R3]
- Why it matters: This directly answers the user’s complaint that Claude Code stops too soon.
3.2 Agent-based stop verification
- Status: Verified
- Claim: Agent-based hooks can inspect files or run commands to verify conditions, for example verifying tests pass before allowing Claude to stop.
- Evidence: [R3]
- Why it matters: This is stronger than a simple prompt gate.
3.3 Re-inject context after compaction
- Status: Verified
- Claim: A SessionStart hook with matcher compact can re-inject critical context after compaction.
- Evidence: [R3]
- Why it matters: This is a missing but important consideration that materially improves long planning sessions.
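A minimal sketch of the script such a SessionStart hook could run, assuming that whatever the hook command writes to stdout is added to Claude's context, as the hooks guide [R3] describes. The plan path docs/current-plan.md and the rule list are illustrative placeholders, not part of Claude Code.

```python
#!/usr/bin/env python3
"""SessionStart hook sketch: print reminders that should survive compaction."""
from pathlib import Path

PLAN_PATH = Path("docs/current-plan.md")  # hypothetical plan file location

# Hypothetical project rules; replace with the rules that matter for the repo.
CRITICAL_RULES = [
    "Explore first, then plan, then code.",
    "Keep docs/current-plan.md up to date after every task.",
    "Do not stop until acceptance criteria are addressed or deferred.",
]


def build_reminder(plan_text: str, max_plan_lines: int = 15) -> str:
    """Combine the fixed rules with the first non-empty lines of the plan."""
    lines = ["Reminder after compaction:"]
    lines += [f"- {rule}" for rule in CRITICAL_RULES]
    plan_head = [ln for ln in plan_text.splitlines() if ln.strip()][:max_plan_lines]
    if plan_head:
        lines.append("Current plan (truncated):")
        lines += plan_head
    return "\n".join(lines)


def main() -> None:
    plan_text = PLAN_PATH.read_text() if PLAN_PATH.exists() else ""
    # Assumption: stdout from this hook is injected into Claude's context.
    print(build_reminder(plan_text))


if __name__ == "__main__":
    main()
```

Keeping the reminder short is deliberate: the point is to restore a handful of rules and the head of the plan, not to refill the context that compaction just freed.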
3.4 Infinite stop-hook loops are a real risk
- Status: Verified
- Claim: Claude Code docs explicitly document the need to guard against Stop hooks that keep firing forever. The hook should inspect stop_hook_active and exit early when appropriate.
- Evidence: [R3]
- Why it matters: The cure for early stopping can itself create a failure mode if configured badly.
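The guard can be sketched as follows. This assumes the hook receives a JSON payload on stdin that includes stop_hook_active, and that printing a JSON object with decision set to "block" plus a reason keeps Claude working, per the hooks guide [R3]. The plan path and the "- [ ]" checkbox convention for open tasks are illustrative assumptions.

```python
#!/usr/bin/env python3
"""Stop hook sketch: block stopping while the plan file has unchecked tasks."""
import json
import sys
from pathlib import Path
from typing import Optional

PLAN_PATH = Path("docs/current-plan.md")  # hypothetical plan file location


def decide(payload: dict, plan_text: str) -> Optional[dict]:
    """Return a block decision, or None to let Claude stop."""
    # Guard: if a Stop hook already forced this continuation, do not block again.
    if payload.get("stop_hook_active"):
        return None
    # Assumed convention: open tasks are markdown checkboxes "- [ ] ...".
    open_tasks = [ln.strip() for ln in plan_text.splitlines()
                  if ln.strip().startswith("- [ ]")]
    if not open_tasks:
        return None
    next_task = open_tasks[0][len("- [ ]"):].strip()
    return {
        "decision": "block",
        "reason": f"{len(open_tasks)} plan task(s) still open; next: {next_task}",
    }


def main() -> None:
    """Entry point for the hook: read the payload, print a decision if any."""
    payload = json.load(sys.stdin)
    plan_text = PLAN_PATH.read_text() if PLAN_PATH.exists() else ""
    decision = decide(payload, plan_text)
    if decision is not None:
        print(json.dumps(decision))
```

To use this as a hook script, call main() at the bottom of the file. Because of the guard, the hook blocks at most once per stop attempt, which trades some persistence for a hard bound on looping.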
4) Context management is a first-class problem
4.1 Context degradation is real
- Status: Verified
- Claim: Claude Code docs say context fills quickly, performance degrades as it fills, and Claude may forget earlier instructions as context becomes saturated.
- Evidence: [R10], [R11]
- Why it matters: A giant wandering session is usually worse than staged work.
4.2 What to persist
- Status: Verified
- Claim: Persistent rules belong in CLAUDE.md, not only in conversation history.
- Evidence: [R9], [R10]
- Why it matters: This is the right place for planning rules, acceptance criteria, and stop conditions.
4.3 Helpful commands and patterns
- Status: Verified
- Claim: Claude Code supports /context, /clear, /resume, /rename, and custom compaction guidance; these are part of official context hygiene.
- Evidence: [R8], [R10]
- Why it matters: Better context hygiene is often more important than “more looping.”
4.4 Skills and subagents reduce context pressure
- Status: Verified
- Claim: Skills load on demand and subagents use separate fresh context, reducing bloat in the main conversation.
- Evidence: [R5], [R10]
- Why it matters: The right architecture is “fan out work,” not “pile more into the same thread.”
4.5 Parallel sessions are official and often better than one giant loop
- Status: Verified
- Claim: Claude Code supports parallel work through separate sessions and documents git worktrees as the clean way to run parallel Claude sessions. It also warns that resuming the same session in multiple terminals will interleave messages and make the conversation messy; for a clean branch-off, use a forked session.
- Evidence: [R10]
- Why it matters: For some workflows, the right answer is not “more looping.” It is “split the work cleanly.”
5) Sandboxing is a missing but important autonomy lever
5.1 Reduced approval friction
- Status: Verified
- Claim: Claude Code sandboxing reduces the need for constant bash permission prompts by defining allowed boundaries up front.
- Evidence: [R11]
- Why it matters: Some “Claude stopped” complaints are really “Claude kept waiting for permission and momentum died.”
5.2 Security and autonomy balance
- Status: Verified
- Claim: Sandboxing is designed to increase autonomy while maintaining OS-level filesystem and network restrictions.
- Evidence: [R11]
- Why it matters: This is relevant when the user wants Claude Code to “keep doing its thing” with less babysitting.
5.3 Caveat
- Status: Verified
- Claim: Sandboxing only applies to Bash and child processes; it does not replace broader permission rules.
- Evidence: [R11]
- Why it matters: Sandboxing helps, but it is not a universal autonomy switch.
6) Agent teams are real, but not the default answer
6.1 Experimental status
- Status: Verified
- Claim: Claude Code agent teams are experimental, disabled by default, and have known limitations.
- Evidence: [R15]
- Why it matters: They should not be the first recommendation for a user who mainly needs better planning loops.
6.2 Communication advantage
- Status: Verified
- Claim: Agent teams are distinct from subagents because teammates can communicate directly and coordinate through a shared task list.
- Evidence: [R15]
- Why it matters: Use them only when agents genuinely need to collaborate, debate, or self-coordinate.
6.3 Token cost
- Status: Verified
- Claim: Agent teams use significantly more tokens than a single session.
- Evidence: [R15], [R8]
- Why it matters: They are powerful but expensive and potentially overkill for planning-only use cases.
6.4 Better default
- Status: Assistant-stated but strongly supported
- Claim: For most planning workflows, subagents are the better first step; move to agent teams only when independent collaborators need to talk to each other.
- Evidence basis: [R2], [R15]
- Why it matters: This is the cleaner and cheaper default.
7) Usage limits may be part of the “it stops” problem
7.1 Shared limits
- Status: Verified
- Claim: Claude Pro and Max plan usage limits are shared across Claude and Claude Code.
- Evidence: [R18]
- Why it matters: Hitting usage limits in chat or coding work can cause terminal work to stop unexpectedly.
7.2 Extra usage exists
- Status: Verified
- Claim: Paid plans can enable extra usage after hitting included limits, billed at standard API rates.
- Evidence: [R19]
- Why it matters: If the user’s real bottleneck is capacity rather than orchestration, this changes the solution path.
7.3 Incomplete diagnosis
- Status: Tentative / speculative
- Claim: The user’s observed stop behavior may be caused by usage limits, permission prompts, context saturation, task size, or early-stop behavior.
- Why it matters: The exact failure mode still needs diagnosis.
8) Official policy risk around subscription-based wrappers is real
8.1 Anthropic restriction
- Status: Verified
- Claim: Anthropic says Free/Pro/Max OAuth auth is intended only for Claude Code and Claude.ai, and using that auth in other tools/services is not permitted.
- Evidence: [R7]
- Why it matters: This makes “subscription-based wrapper” advice materially riskier than it might have been in the past.
8.2 OpenClaw’s own caution
- Status: Verified
- Claim: OpenClaw’s OAuth docs say Anthropic subscription use outside Claude Code has been restricted for some users before and recommend API key auth for production.
- Evidence: [R14], [R16]
- Why it matters: Even third-party tooling acknowledges the risk.
8.3 Practical conclusion
- Status: Assistant-stated but strongly supported
- Claim: Third-party tools that try to route consumer Claude subscription auth outside native Claude Code should be treated as policy-risk and operational-risk, even if they technically work at a given moment.
- Evidence basis: [R7], [R14], [R16]
- Why it matters: This should heavily influence recommendations.
9) Task Master is promising, but should be used carefully
9.1 Why it is relevant
- Status: Verified
- Claim: Task Master provides structured task workflows such as parsing a PRD, analyzing complexity, showing next task, and updating task status.
- Evidence: [R13]
- Why it matters: This directly maps to the user’s desire for more structured planning loops.
9.2 Claude Code provider example
- Status: Verified
- Claim: A Task Master GitHub example shows a claude-code provider that claims no API key is required and supports settings such as maxTurns.
- Evidence: [R13]
- Why it matters: This explains why users are attracted to it for orchestration.
9.3 Documentation conflict
- Status: Verified + conflict flagged
- Claim: Task Master’s installation docs say “Now that you have Node.js and your first API Key…”, while other docs/examples show alternative auth/provider paths.
- Evidence: [R12], [R13], [R17]
- Why it matters: Another AI should not assume the auth story is simple or settled.
9.4 Recommendation status
- Status: Assistant-stated but strongly supported
- Claim: Task Master is best treated as a task decomposition layer, not as a blanket “subscription loophole.”
- Evidence basis: [R12], [R13], [R17], [R7]
- Why it matters: The value is in the workflow, not the auth hack.
10) OpenClaw is broader than the user’s immediate problem
10.1 What it is good at
- Status: Verified
- Claim: OpenClaw is positioned as a long-lived gateway / agent environment with multiple auth paths, optional channels, and broader orchestration possibilities.
- Evidence: [R14], [R20], [R21]
- Why it matters: OpenClaw may be useful for connectors, gateways, or autonomous multi-system workflows.
10.2 Anthropic auth recommendation
- Status: Verified
- Claim: For Anthropic in production, OpenClaw recommends API key auth over subscription setup-token auth.
- Evidence: [R14], [R16]
- Why it matters: This weakens the case for using OpenClaw mainly as a subscription-auth layer for Claude Code.
10.3 Practical recommendation
- Status: Assistant-stated but strongly supported
- Claim: OpenClaw is not the best first answer to “how do I get Claude Code to do deeper planning loops?”
- Evidence basis: [R1], [R3], [R4], [R6], [R7], [R14], [R16]
- Why it matters: Native Claude Code features solve the core planning problem more directly.
Major Decisions and Conclusions
- [Verified + conclusion] Prefer native Claude Code workflow controls first.
  Use Plan Mode, subagents, hooks, CLAUDE.md, skills, sandboxing, and context management before reaching for wrappers.
  Evidence: [R1], [R2], [R3], [R5], [R9], [R10], [R11]
- [Assistant-stated but strongly supported] The default operating model should be staged, not monolithic.
  Explore → Plan → Write plan file → Break into PR-sized tasks → Stop-gate → Execute next task.
  Evidence basis: [R1], [R2], [R3], [R6], [R9], [R10]
- [Verified + conclusion] Use /loop only for polling or repeated checks within a live session.
  Do not treat it as durable orchestration.
  Evidence: [R4]
- [Verified + conclusion] Use subagents before agent teams.
  Agent teams are experimental and expensive.
  Evidence: [R2], [R15]
- [Verified + conclusion] Treat third-party subscription-auth wrappers as policy-risk for Anthropic.
  Evidence: [R7], [R14], [R16]
- [Assistant-stated but strongly supported] OpenClaw is a broader automation/gateway tool, not the cleanest planning-loop fix.
  Evidence basis: [R14], [R16], [R20], [R21]
- [Assistant-stated but strongly supported] Task Master is the most relevant external planning layer discussed, but should be used with compliant auth and careful expectations.
  Evidence basis: [R12], [R13], [R17], [R7]
Reasoning, Tradeoffs, and Why It Matters
| Option | Evidence status | Best use | Strengths | Weaknesses / risks | Recommendation |
|---|---|---|---|---|---|
| Native Claude Code: Plan Mode + hooks + subagents + CLAUDE.md | Verified | Better planning inside Claude Code itself | First-party, direct fit, policy-safe, designed for this | Requires setup discipline | Primary recommendation |
| /loop | Verified | Polling / repeated checks in active session | Simple and built-in | Session-scoped, expires, not durable | Use narrowly |
| Agent SDK / claude -p | Verified | Deterministic orchestration and scripting | Official programmatic path | More setup work | Strong secondary path |
| Subagents | Verified | Isolated research / review / task-specific work | Separate context, flexible, cheaper than teams | Less inter-agent collaboration | Use before teams |
| Agent teams | Verified | Parallel work where agents must communicate | Powerful for debate/review/cross-layer work | Experimental, expensive, more moving parts | Use selectively |
| Sandboxing | Verified | Reduce permission friction for bash autonomy | Fewer prompts, stronger boundaries | Needs correct config; only applies to bash | Important missing lever |
| Task Master | Verified + conflicting docs | Task decomposition, next-task workflow | Strong fit for planning structure | Doc/auth ambiguity; policy caution if misused | Potentially useful, careful use |
| OpenClaw | Verified | Long-lived gateway / broader automation | Flexible, multi-model, broader orchestration | Anthropic subscription path is risky; not the cleanest fix | Not first choice for this problem |
| One giant long session | Assistant-stated but supported | Almost never | Feels simple | Context degradation, drift, early forgetting | Avoid |
| Multiple independent sessions / worktrees | Verified | Parallel branches or separate task streams | Cleaner than forcing everything into one thread | Requires process discipline | Often better than “more looping” |
Recommended Playbook / Process
Phase 0 — Baseline setup
- [Verified] Update Claude Code and check version.
  Why: important features and fixes are version-gated.
  - Scheduled tasks require v2.1.72+. [R4]
  - Agent teams require v2.1.32+. [R15]
  - Auto memory requires v2.1.59+. [R9]
- [Verified] Pay attention to recent changelog fixes.
  Particularly relevant:
  - 2.1.78 fixed plan mode being lost after compaction.
  - 2.1.78 fixed an infinite loop when API errors triggered stop hooks.
  Why: If the user experienced weird stop/plan-mode behavior on older versions, the latest release may have already improved it.
  Evidence: [R21]
- [Tentative / needs verification] Diagnose the actual stop condition.
  Check whether the current pain is:
  - usage limit,
  - permission friction,
  - context saturation,
  - giant task scope,
  - or stop-hook / scheduling behavior.
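The version gates listed in Phase 0 can be checked mechanically. The sketch below assumes the installed version is available as text containing a dotted x.y.z number (for example, captured from claude --version; the exact output format is an assumption) and uses the minimum versions stated in this playbook.

```python
import re
from typing import List, Optional, Tuple

Version = Tuple[int, int, int]

# Minimum versions as listed in this playbook ([R4], [R15], [R9]).
MIN_VERSIONS = {
    "scheduled tasks (/loop)": (2, 1, 72),
    "agent teams": (2, 1, 32),
    "auto memory": (2, 1, 59),
}


def parse_version(text: str) -> Optional[Version]:
    """Extract the first x.y.z version number found in the text."""
    match = re.search(r"(\d+)\.(\d+)\.(\d+)", text)
    if match is None:
        return None
    major, minor, patch = (int(part) for part in match.groups())
    return (major, minor, patch)


def unavailable_features(version: Version) -> List[str]:
    """List features whose minimum version is newer than the given version."""
    return [name for name, minimum in MIN_VERSIONS.items() if version < minimum]
```

Tuple comparison handles the ordering, so (2, 1, 60) is correctly treated as older than (2, 1, 72) without any string comparison pitfalls.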
Phase 1 — Make planning persistent, not conversationally fragile
- [Verified] Create or tighten CLAUDE.md.
  Put stable project and workflow rules there:
  - architecture constraints,
  - build/test commands,
  - “explore first, then plan, then code,”
  - “write/update the task file,”
  - “do not stop until acceptance criteria are addressed or deferred.”
  Evidence: [R9], [R10]
- [Verified] Keep CLAUDE.md concise and specific.
  Claude docs advise concise, well-structured instructions and suggest targeting under ~200 lines per file.
  Evidence: [R9]
- [Verified] Add compact instructions.
  Use a “Compact Instructions” section or /compact guidance so compaction preserves what matters.
  Evidence: [R10]
Phase 2 — Run planning as a deliberate workflow
Default prompt pattern
Status: Assistant-stated but strongly supported
Use a plan-first prompt such as:
Enter plan mode.
Study the codebase first.
Write a detailed plan to docs/current-plan.md that includes:
- assumptions
- architecture options
- edge cases
- migration risks
- test strategy
- rollout/rollback notes
- PR-sized tasks with acceptance criteria
Do not implement yet.
Why this is recommended: It aligns with official Plan Mode, context, and persistence guidance.
Evidence basis: [R1], [R9], [R10], [R22]
After plan creation
Use a second-pass prompt such as:
Read docs/current-plan.md.
Select only the next task.
Implement only that task.
Run the relevant tests.
Update docs/current-plan.md with:
- what changed
- what remains
- blockers
- next step
Do not begin the next task unless the current one is complete.
Why this is recommended: It forces bounded execution rather than drifting across the entire project.
Phase 3 — Use subagents to preserve the main session
- [Verified] Use subagents for exploration, review, and edge-case checks.
  Example roles:
  - architecture reviewer,
  - test reviewer,
  - security reviewer,
  - codebase explorer.
  Evidence: [R2]
- [Assistant-stated but strongly supported] Good planning pattern:
  - main session = synthesis and decision-making
  - subagents = research and critique
  This keeps planning sharper and cheaper than stuffing everything into one thread.
Phase 4 — Add stop gates so Claude does not stop early
- [Verified] Add a prompt-based or agent-based Stop hook.
  Prompt-based example logic:
  - “Check if all requested tasks are complete. If not, return what remains.”
  Evidence: [R3]
- [Verified] Use agent-based stop hooks for real validation.
  Example:
  - verify tests actually pass,
  - check that the docs/task file was updated,
  - inspect repository state.
  Evidence: [R3]
- [Verified] Guard against infinite loop behavior.
  Respect stop_hook_active or equivalent logic.
  Evidence: [R3]
Practical stop criteria to enforce
Status: Assistant-stated but strongly supported
Before Claude stops, require all of the following:
- task file updated,
- acceptance criteria checked,
- tests run or explicitly deferred with reason,
- open questions logged,
- next action written down.
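One crude way to enforce the criteria above is a text check over the plan file that a Stop hook or wrapper script could call. The marker phrases below are a hypothetical convention for how the plan file is written; they are not defined by Claude Code, and a presence check cannot confirm that the content behind each heading is meaningful.

```python
from typing import List

# Hypothetical convention: each criterion leaves a recognizable phrase
# in the plan file (matched case-insensitively).
REQUIRED_MARKERS = {
    "acceptance criteria checked": "acceptance criteria",
    "tests run or explicitly deferred": "tests",
    "open questions logged": "open questions",
    "next action written down": "next step",
}


def unmet_criteria(plan_text: str) -> List[str]:
    """Return the stop criteria whose marker phrase is missing from the plan."""
    lowered = plan_text.lower()
    return [criterion for criterion, marker in REQUIRED_MARKERS.items()
            if marker not in lowered]
```

An empty result means every marker is present. The first criterion in the list, "task file updated", would need a timestamp or git diff comparison, which this sketch does not attempt.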
Phase 5 — Reduce autonomy friction
- [Verified] Enable sandboxing where appropriate.
  This can reduce bash approval friction materially.
  Evidence: [R11]
- [Verified] Use sandbox auto-allow within safe boundaries.
  This helps Claude keep moving when its work is mostly inside approved directories/domains.
  Evidence: [R11]
- [Assistant-stated but strongly supported] This is especially important if “it stops” often really means “it keeps waiting for permission.”
Phase 6 — Choose the right orchestration layer
Use native Claude Code only
Best when: one repo, one operator, planning and execution inside the terminal.
Recommendation: default path.
Use Agent SDK / claude -p
Best when: you want deterministic scripted orchestration, CI/CD, or a custom controller.
Recommendation: official secondary path.
Evidence: [R6]
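A minimal controller sketch for this path, using only the invocations named in this document: claude -p for a headless pass and --permission-mode plan for a plan-only pass. It assumes the plan-mode pass returns the plan text on stdout, which the controller then saves, and that permissions for the execution passes are configured separately. The plan path, prompt wording, and function names are illustrative.

```python
import subprocess
from pathlib import Path
from typing import List

PLAN_PATH = Path("docs/current-plan.md")  # hypothetical plan file location

EXECUTE_PROMPT = (
    "Read docs/current-plan.md. Select only the next task. Implement only that task. "
    "Run the relevant tests. Update docs/current-plan.md with what changed, "
    "what remains, blockers, and the next step."
)


def build_command(prompt: str, plan_only: bool = False) -> List[str]:
    """Build the argument list for one headless Claude Code pass."""
    cmd = ["claude"]
    if plan_only:
        cmd += ["--permission-mode", "plan"]
    return cmd + ["-p", prompt]


def run_pass(prompt: str, plan_only: bool = False, timeout_s: int = 1800) -> str:
    """Run one pass and return its stdout; raises if the CLI exits non-zero."""
    result = subprocess.run(
        build_command(prompt, plan_only),
        capture_output=True, text=True, timeout=timeout_s, check=True,
    )
    return result.stdout


def run_staged(plan_prompt: str, max_tasks: int = 3) -> None:
    """Plan once, save the plan, then run a bounded number of task passes."""
    PLAN_PATH.parent.mkdir(parents=True, exist_ok=True)
    PLAN_PATH.write_text(run_pass(plan_prompt, plan_only=True))
    for _ in range(max_tasks):
        print(run_pass(EXECUTE_PROMPT))
```

Each pass starts with fresh context and reads state from the plan file, and max_tasks bounds the loop so the controller cannot run indefinitely.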
Use Task Master
Best when: you want structured decomposition, dependencies, “next task,” and planning state.
Recommendation: useful, but use with auth caution.
Evidence: [R12], [R13], [R17]
Use OpenClaw
Best when: you want a broader gateway / multi-channel automation environment.
Recommendation: not the cleanest first answer to deeper Claude Code planning loops; prefer API-based auth.
Evidence: [R14], [R16], [R20]
Use agent teams
Best when: multiple collaborators need to communicate directly, challenge each other, or coordinate complex multi-part work.
Recommendation: not default; use selectively.
Evidence: [R15]
Use git worktrees / forked sessions
Best when: you want parallel work without turning one session into chaos.
Evidence: [R10]
Recommendation: often a better answer than “make one session loop harder.”
Tools, Resources, Links, and References
Official Anthropic / Claude Code sources
- [R1] Common workflows: Plan Mode and everyday workflows
  https://code.claude.com/docs/en/common-workflows
- [R2] Subagents
  https://code.claude.com/docs/en/sub-agents
- [R3] Hooks guide
  https://code.claude.com/docs/en/hooks-guide
- [R4] Scheduled tasks / /loop
  https://code.claude.com/docs/en/scheduled-tasks
- [R5] Skills
  https://code.claude.com/docs/en/skills
- [R6] Programmatic usage / headless / Agent SDK
  https://code.claude.com/docs/en/headless
- [R7] Legal and compliance: OAuth restriction
  https://code.claude.com/docs/en/legal-and-compliance
- [R8] Costs / context management
  https://code.claude.com/docs/en/costs
- [R9] Memory / CLAUDE.md / auto memory
  https://code.claude.com/docs/en/memory
- [R10] How Claude Code works / context / sessions / worktrees
  https://code.claude.com/docs/en/how-claude-code-works
- [R11] Sandboxing
  https://code.claude.com/docs/en/sandboxing
- [R15] Agent teams
  https://code.claude.com/docs/en/agent-teams
- [R21] Changelog
  https://code.claude.com/docs/en/changelog
- [R22] Best Practices for Claude Code
  https://code.claude.com/docs/en/best-practices
Official support / usage docs
- [R18] Using Claude Code with your Pro or Max plan
  https://support.claude.com/en/articles/11145838-using-claude-code-with-your-pro-or-max-plan
- [R19] Extra usage for paid Claude plans
  https://support.claude.com/en/articles/12429409-extra-usage-for-paid-claude-plans
Task Master sources
- [R12] Task Master installation
  https://docs.task-master.dev/getting-started/quick-start/installation
- [R13] Task Master Claude Code provider example
  https://github.com/eyaltoledano/claude-task-master/blob/main/docs/examples/claude-code-usage.md
- [R17] Task Master configuration docs
  https://github.com/eyaltoledano/claude-task-master/blob/main/docs/configuration.md
OpenClaw sources
- [R14] OpenClaw gateway authentication
  https://docs.openclaw.ai/gateway/authentication
- [R16] OpenClaw OAuth concepts
  https://docs.openclaw.ai/concepts/oauth
- [R20] OpenClaw Anthropic provider docs
  https://docs.openclaw.ai/providers/anthropic
Risks, Caveats, and Red Flags
- [Verified] Policy risk: Using Anthropic consumer-plan OAuth in external tools/services is not permitted by Anthropic.
  Evidence: [R7]
- [Verified] Operational risk: /loop is not durable orchestration. It dies with the session and recurring tasks expire after 3 days.
  Evidence: [R4]
- [Verified] Configuration risk: A badly written Stop hook can create an infinite loop or brittle behavior.
  Evidence: [R3], [R21]
- [Verified] Cost risk: Agent teams use significantly more tokens than a single session.
  Evidence: [R15]
- [Verified] Security risk: Overly permissive sandboxing or write access can create escalation/exfiltration risk.
  Evidence: [R11]
- [Verified + conflict flagged] Documentation inconsistency risk: Task Master has mixed auth/provider messaging across docs and examples.
  Evidence: [R12], [R13], [R17]
- [Assistant-stated but strongly supported] Process risk: Users often misdiagnose “Claude stopped too early” when the real issue is one of:
  - usage exhaustion,
  - context saturation,
  - insufficient persistent instructions,
  - giant task size,
  - or constant permission friction.
- [Verified] Session hygiene risk: Resuming the same session in multiple terminals can jumble history; use forked sessions or worktrees for cleaner parallel work.
  Evidence: [R10]
Open Questions / What Still Needs Verification
- [Tentative] What exact failure mode is the user seeing when Claude Code “stops”?
  Possibilities:
  - usage limits,
  - permission waits,
  - context compaction,
  - end-of-turn stop behavior,
  - or oversized task structure.
- [Tentative] What Claude Code version is the user currently running?
  This matters because recent versions fixed:
  - plan mode lost after compaction,
  - stop-hook-related infinite loop behavior.
  Relevant source: [R21]
- [Tentative] Which Claude plan and auth path is the user using right now?
  Pro / Max / Team / API / Bedrock / Vertex will affect the cleanest solution path.
- [Tentative] Is the user actually trying to orchestrate Claude Code from another tool, or do they mainly want Claude Code itself to behave better?
  This determines whether the next deliverable should be:
  - a CLAUDE.md,
  - hooks,
  - skills,
  - or an Agent SDK controller.
- [Tentative] What is the current appetite for API spend versus staying inside subscription-only constraints?
  This materially affects whether OpenClaw or Task Master are worth considering at all.
- [Tentative] Are there newer public clarifications from Task Master maintainers about the compliant way to combine Task Master with Claude Code after Anthropic’s current OAuth restrictions?
  This should be checked again before making any production recommendation.
Suggested Next Steps
- Create or revise a project CLAUDE.md with explicit planning and stop rules.
- Add a Stop hook that refuses to stop until:
  - the plan/task file is updated,
  - acceptance criteria are checked,
  - next action is written.
- Add a SessionStart compact hook to re-inject critical rules after compaction.
- Turn on sandboxing for safer autonomy and fewer bash prompts.
- Use Plan Mode by default for non-trivial work.
- Move research/review into subagents instead of bloating the main session.
- Use /loop only for in-session polling, not as a durable orchestration layer.
- If scripting is needed, use the official Agent SDK / claude -p path.
- Treat OpenClaw as optional and API-first, not as the primary fix.
- Treat Task Master as a possible planning layer, but verify auth/compliance assumptions before production use.
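The SessionStart compact hook in the steps above can be sketched as a small script whose stdout is added back into context after compaction. This is an illustration under assumptions, not a verified configuration: it assumes the hook payload carries a `source` field equal to `compact`, that a `matcher` of `compact` selects that case in settings, and that plain stdout from a SessionStart hook is injected as context. The rule text and the script path `.claude/hooks/reinject_rules.py` are hypothetical.

```python
import json
from typing import TextIO

# Hypothetical rules; in practice these would mirror the project's CLAUDE.md.
CRITICAL_RULES = [
    "Stay in planning until the plan file has acceptance criteria.",
    "Record the next action in the plan file before ending a turn.",
]

# Assumed shape of the settings fragment that registers this script so it
# runs only when a session restarts after compaction.
SETTINGS_FRAGMENT = {
    "hooks": {
        "SessionStart": [
            {
                "matcher": "compact",
                "hooks": [{"type": "command",
                           "command": "python3 .claude/hooks/reinject_rules.py"}],
            }
        ]
    }
}


def reminder_for(payload: dict) -> str:
    """Return the text to re-inject, or '' when this is not a compaction restart."""
    # Defensive second check: the matcher should already filter on "compact".
    if payload.get("source") != "compact":
        return ""
    lines = ["Rules to keep following after compaction:"]
    lines += [f"- {rule}" for rule in CRITICAL_RULES]
    return "\n".join(lines) + "\n"


def main(stdin: TextIO, stdout: TextIO) -> int:
    """Read the hook payload from stdin and print the reminder, if any."""
    stdout.write(reminder_for(json.load(stdin)))
    return 0


# Hook entry point (left commented so importing this file has no side effects):
# import sys; sys.exit(main(sys.stdin, sys.stdout))
```

Keeping the rule list short matters here, because everything the hook prints is spent from the freshly compacted context window.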
Handoff Notes for Another AI
If you are continuing this work, assume the following:
What the user actually cares about
- The user does not primarily want theoretical “agent loops.”
- The user wants more reliable planning depth, better continuity, and less early stopping.
- The user is open to external tooling, but the recommendation should remain practical and policy-aware.
What should be done next
The most useful next artifact is probably not another research memo. It is one of these:
- a real CLAUDE.md template for this workflow,
- a Stop hook config,
- a compact re-injection hook,
- a small skill set such as:
  - /spec-feature
  - /break-into-prs
  - /execute-next-task
- or a minimal Agent SDK orchestration script.
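For the last item, a minimal orchestration script could drive several planning passes through `claude -p` and carry the session forward between passes. The sketch below is an assumption-laden starting point rather than the official Agent SDK: it assumes the CLI accepts `--output-format json`, `--permission-mode plan`, and `--resume <session_id>`, and that the JSON reply contains `session_id` and `result` fields. Confirm each flag and field against the CLI reference before relying on it. The function names are hypothetical.

```python
import json
import subprocess
from typing import Callable, List, Optional


def build_command(prompt: str, resume: Optional[str] = None,
                  plan_only: bool = True) -> List[str]:
    """Build the argv for one non-interactive planning pass."""
    cmd = ["claude", "-p", prompt, "--output-format", "json"]
    if plan_only:
        cmd += ["--permission-mode", "plan"]  # assumed flag: keep the pass read-only
    if resume:
        cmd += ["--resume", resume]  # assumed flag: continue the earlier session
    return cmd


def run_pass(prompt: str, resume: Optional[str] = None,
             runner: Callable = subprocess.run) -> dict:
    """Run one pass and parse the JSON reply printed on stdout."""
    proc = runner(build_command(prompt, resume),
                  capture_output=True, text=True, check=True)
    return json.loads(proc.stdout)


def plan_in_passes(prompts: List[str],
                   runner: Callable = subprocess.run) -> List[str]:
    """Run the prompts in order, resuming the same session after the first pass."""
    session_id: Optional[str] = None
    results: List[str] = []
    for prompt in prompts:
        reply = run_pass(prompt, resume=session_id, runner=runner)
        session_id = reply.get("session_id", session_id)
        results.append(reply.get("result", ""))
    return results
```

A call such as `plan_in_passes(["Draft a plan for the feature.", "Critique the plan and list gaps.", "Revise the plan and write acceptance criteria."])` gives a fixed number of planning passes. The `runner` parameter exists so the loop can be exercised with a stub instead of a live CLI.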
Guardrails for future advice
- Prefer official Claude Code features over wrappers when the user’s goal is deeper planning.
- Do not recommend third-party consumer OAuth routing as “safe” or “approved.”
- Clearly separate:
- what is verified by official docs,
- what is documented by third parties,
- and what is inference or opinion.
If more research is needed
Prioritize:
- Anthropic official docs,
- Anthropic support docs,
- official docs from the third-party tool,
- then GitHub examples / maintainer docs,
- then community reports only as clearly labeled user-reported evidence.
Reviewer Notes and Improvements Made
Reviewer mode used: Serious self-review pass (no separate reviewer agent capability was available here).
Improvements made beyond the original discussion
- Added official-policy verification around Anthropic OAuth restrictions, which is the most important constraint for subscription-based wrappers.
- Added sandboxing as a missing autonomy lever; this was not sufficiently emphasized in the original answer.
- Added context-management and CLAUDE.md guidance as a core part of the solution rather than a side note.
- Added recent changelog fixes that may explain previously observed weirdness around plan mode and hooks.
- Flagged the Task Master documentation conflict instead of pretending the auth story is clean.
- Reframed the answer from “loop more” to “staged orchestration with persistent planning artifacts.”
- Added a clearer decision model for when to use:
- native Claude Code,
- Agent SDK,
- Task Master,
- OpenClaw,
- subagents,
- agent teams,
- and worktrees.
- Explicitly separated Verified, User-stated, Assistant-stated but unverified, and Tentative / speculative claims throughout the document.
Remaining limitations of this artifact
- It still does not diagnose the user’s exact stop condition because no runtime logs, screenshots, or version output were provided in this thread.
- It does not provide a fully tested production configuration for Task Master + Claude Code under Anthropic’s current policy constraints.
- It does not include live operational testing of OpenClaw or Claude Code on the user’s system.
Optional Appendix — Structured Summary
title: "Claude Code Looping, Planning, and Orchestration Playbook / Handoff Artifact"
date: "2026-03-17"
primary_question:
- "How do we get Claude Code to do more useful looping, especially for planning?"
- "Are OpenClaw or other tools sitting on top of Claude Code using the subscription model actually helpful?"
top_recommendation:
  status: "Assistant-stated but strongly supported"
  summary: "Use native Claude Code features first: Plan Mode, subagents, hooks, CLAUDE.md, sandboxing, context management, then optionally Agent SDK."
verified_key_points:
- "Plan Mode exists and is designed for multi-step planning."
- "Subagents use separate context windows."
- "Hooks can block stop and re-inject context after compaction."
- "/loop is session-scoped and recurring tasks expire after 3 days."
- "Agent SDK / claude -p is the official programmatic path."
- "Anthropic forbids using Free/Pro/Max OAuth tokens in other products/tools/services."
- "OpenClaw recommends Anthropic API keys over subscription setup-token for long-lived gateways."
- "Task Master is relevant for structured task decomposition, but its docs/auth story is mixed."
major_risks:
- "Policy risk around third-party subscription-auth wrappers."
- "Context saturation from giant sessions."
- "Infinite-loop or brittle behavior from badly configured stop hooks."
- "Token/cost explosion from agent teams."
- "Security risk from loose sandbox configuration."
best_next_artifacts:
- "CLAUDE.md template"
- "Stop hook config"
- "compact reinjection hook"
- "planning/execution skills"
- "minimal Agent SDK controller"
open_questions:
- "What exact failure mode causes Claude Code to stop for this user?"
- "What version and auth path is the user on?"
- "How much API spend is acceptable versus staying subscription-only?"