a flock of claudes
1. linear issues
I’ve been tinkering with coding agents for a few years now, and so far, they’ve felt burdened by the conceptual weight of their tools.
however, that’s been changing, primarily due to three things:
(1) they’ve been trained on tool traces, so they actually know when and how to call functions
(2) post‑training moved from pure ‘be helpful’ rlhf to reasoning‑centric feedback, verifier/unit‑test rewards, and scaled rl that favors planning and self‑correction
(3) the runtime is sturdier: function calling, strict structured outputs, and agentic loops make multi‑step tool use feel more natural.
when I tried sonnet‑3.7 via the claude code cli, it was the first model that seemed genuinely eager to use tools (and could do so coherently) for significant stretches of time, iteratively editing, testing, and retrying without being asked.
opus 4 feels smart but leans deferential in free‑form chat, unless you keep it engaged. however, when it’s in a console with well‑shaped tools? it’s a beast: comfortably wielding the toolbox, quick to plan, surgical with edits, and noticeably better the more tools you give it.
lately, I’ve been giving it a whole lot of tools.
issue tracking
the first limitation I ran into was constantly needing to explain everything.
initially, every conversation started from scratch: “this is a next.js project, use bun to build, etc.” eventually I added some CLAUDE.md files with general project info, which helped.
but I was still constantly providing context like
“so we were working on this bug in this file and it looked like we fixed it, but now it’s happening again”
my role was mainly that of narrator, and I wasn’t very good at keeping track of things.
so, I added the linear mcp server.
linear is a project management platform designed for software teams. it provides a structured model where each entity (issues, projects, cycles, teams, and users) has well-defined relationships and properties accessible through a graphql api. the platform’s github integration automatically syncs pull requests with issues and enables linking between code and tasks.
this interconnected data structure makes it well-suited for llm orchestration. an agent can query the api to understand the context of a team’s work, from high-level project roadmaps down to individual task dependencies. the server gives claude a comprehensive set of tools to view and edit issues in a workspace, and it laid the foundation for the rest of my “higher-order system.”
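to make this concrete, here’s roughly what pulling one issue’s context out of the graphql api looks like. this is an illustrative sketch, not the mcp server itself: the query shape is my approximation of linear’s schema, and whether `issue(id:)` accepts the human identifier or needs a uuid is an assumption.

```python
# illustrative sketch: fetch one issue plus its related objects from linear's graphql api
import os
import requests

QUERY = """
query($id: String!) {
  issue(id: $id) {
    identifier
    title
    state { name }
    project { name }
    comments { nodes { body } }
  }
}
"""

resp = requests.post(
    "https://api.linear.app/graphql",
    headers={"Authorization": os.environ["LINEAR_API_KEY"]},
    json={"query": QUERY, "variables": {"id": "GUI-10"}},  # "GUI-10" is a placeholder identifier
)
resp.raise_for_status()
print(resp.json()["data"]["issue"])
```

everything an agent might need (state, project, comment history) hangs off the issue, which is what makes the workspace traversable.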
I created a new linear team for my project and updated the user-level CLAUDE.md to explain the workflow:
- Every task starts with a linear issue (create one if not provided)
- Track all progress via comments on the linear issue
now, whenever I had a task for claude, instead of typing it into the cli, I’d document it in a new issue with logs, screenshots, and context. then I’d note the issue id (e.g., GUI-10) and open the cli with “Let’s get started on GUI-10.”
this approach seemed to provide tremendous clarity of purpose. output code became more focused, and claude was less likely to veer off into major refactors of unrelated sections.
as each issue progresses, claude leaves a persistent trail of comments documenting insights, realizations, and decisions. once complete, the code comes with a condensed report on the entire decision-making process.
beyond the utility of external memory, I suspect that including linear tools, and operating through them, subtly contextualizes that claude is working on a “real team” contributing to a professional codebase.
as I settled into this workflow, most of my explanations shifted from ephemeral cli queries to durable issues and comments with easy-to-reference ids. for example:
hey that bug from GUI-12 is happening again
this concisely gives the model detailed records of the issue and the attempted solutions, without my needing to retype everything from memory.
legibility issue
to an llm, the world is nothing more than the tokens you provide.
context-providing tools should deliver relevant information and nothing else. yet we keep offering clunky, unintuitive tools that output tons of useless structure and fluff, and require robotic interactions to use.
it might seem strange that we need to make machine structures human-readable before feeding them back to a machine, but I think this is an incredibly important and underappreciated aspect of creating llm-facing software.
consider the official linear mcp server: it provides a standard implementation of their graphql api, which, while perfectly functional, refers to all objects (issues, comments, labels, states) via 32-character hex uuids.
this means that to move TEST-123 into “Planning,” claude must make the following tool calls:
find_issue('TEST-123')
> uuid 'fcb357b6-7719-4550-b6e0-8fa5d8554d69'
> team_uuid 'a210c784-b5d2-4dad-9dab-2ddd404b831e'
find_status('Planning', team='a210c784-b5d2-4dad-9dab-2ddd404b831e')
> 'de936f4d-5d04-41bd-8063-c5d85a319db6'
update_issue('fcb357b6-7719-4550-b6e0-8fa5d8554d69', 'de936f4d-5d04-41bd-8063-c5d85a319db6')
try reading this out loud and imagine keeping track of all these 32-character uuids and what each one refers to. it’s an immense waste of these models' cognitive effort to spend so many tokens on illegible tool calls.
further, the model messes up these complicated strings fairly often. it figures things out eventually, but each failed tool call introduces a history of failure into the chat record that it may be inclined to continue.
I want my agents to cluster around the “efficient teams building robust solutions” area of vector space rather than “stupid robot can’t even call basic tool.”
the easiest way to achieve this is to make things as intuitive and familiar as possible.
if a tool isn’t intuitive, why not?
I try to talk to opus like a coworker: friendly, knowledgeable, casual, and using as many human words as possible:
"hey, what are you working on?"
"oh just `CLA-110: Fix Backend Test Structure`"
"ah is it `Ready to Merge` or still `In Review`?"
instead of:
"hey, what are you working on?"
"oh just `fcb357b6-7719-4550-b6e0-8fa5d8554d69`"
"ah is it `de936f4d-5d04-41bd-8063-c5d85a319db6`
or `a210c784-b5d2-4dad-9dab-2ddd404b831e`?"
but the uuid problem was only half of it. the linear mcp also has a 65kb default limit on its api responses, which means that once an issue gets long enough (and with claude leaving detailed progress comments, they get long fast), anything beyond that threshold becomes invisible to the model.
overall, it was complex to get information, and even once you got it, you might not actually have all of it.
so I built my own, or rather, claude did.
a better bridge
this linear mcp server gives claude direct, legible access to linear, designed to keep both the information and the interactions simple while ensuring no data is truncated or lost.
everything is readable:
- team keys (SOFT)
- issue identifiers (SOFT-123)
- state names (“In Progress”)
- label names (“Sonnet”)
the server handles all uuid resolution internally, so claude never has to juggle those 32-character hex strings again.
to move TEST-123 to “Planning”, just call
update_issue('TEST-123', 'Planning')
when claude requests an issue, it gets back clean markdown formatting with just the essential information, no JSON clutter. the server strips out the noise and presents issues as readable documents with threaded comments. in addition, any images attached to linear issues are automatically downloaded and cached locally.
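the core trick is small. here’s a sketch of its shape, built on the official mcp python sdk; the graphql queries are my approximation of linear’s schema, not the actual server code.

```python
# sketch: an mcp tool that accepts "TEST-123" + "Planning" and does the uuid juggling itself
import os
import requests
from mcp.server.fastmcp import FastMCP

LINEAR_URL = "https://api.linear.app/graphql"
HEADERS = {"Authorization": os.environ["LINEAR_API_KEY"]}

mcp = FastMCP("linear-bridge")

def gql(query: str, variables: dict) -> dict:
    """Send one GraphQL request to Linear and return its data payload."""
    r = requests.post(LINEAR_URL, json={"query": query, "variables": variables}, headers=HEADERS)
    r.raise_for_status()
    return r.json()["data"]

@mcp.tool()
def update_issue(issue_id: str, state: str) -> str:
    """Move an issue (e.g. 'TEST-123') into a workflow state (e.g. 'Planning')."""
    # resolve the readable identifier and state name to uuids internally,
    # so the model never has to see or repeat them
    issue = gql(
        """query($id: String!) {
             issue(id: $id) { id team { states { nodes { id name } } } }
           }""",
        {"id": issue_id},
    )["issue"]
    state_id = next(s["id"] for s in issue["team"]["states"]["nodes"] if s["name"] == state)
    gql(
        """mutation($id: String!, $state: String!) {
             issueUpdate(id: $id, input: { stateId: $state }) { success }
           }""",
        {"id": issue["id"], "state": state_id},
    )
    return f"{issue_id} moved to {state}"

if __name__ == "__main__":
    mcp.run()
```

the same pattern (readable names in, uuid resolution inside, clean markdown out) covers the read side too.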
the difference in practice is dramatic. instead of struggling with each update, claude treats linear as a natural extension of the TodoWrite tool it’s already comfortable using.
2. scope & concurrency
after a few days with the linear mcp, I settled into a routine.
user: Let's work on GUI-32
claude: [gets issue via mcp]
[reads files]
Oh! I get it now!
[edits files]
[build/test/commit/pr]
this loop worked great for isolated tasks. but as the changes got more complex, two problems kept surfacing:
- for cross-file work, claude would max out its context and “compact” midway through, losing track of details. I’d find half-finished refactors with dangling imports.
- I was finding issues faster than a single agent could fix them. only one claude could safely edit the codebase at a time.
two-step workflow
the fix was to split each task into planning and implementation.
planning: claude operates read-only on the main branch. a slash command tells it to identify critical files, document existing patterns, and build a focused roadmap. everything gets compressed into linear comments.
implementation: a bash script spins up a fresh claude with:
- dedicated branch and worktree
- the condensed plan from linear
- full edit privileges
- clear instructions to build, test, and pr
this killed both problems at once. multiple agents could work in parallel on different branches. and each implementation agent started with pre-digested context instead of burning tokens on exploration.
the planning agent became a “context compiler”: reading widely and summarizing tightly, while the implementation agent got to spend its entire context budget on actually building things.
$ claude "/plan TEAM-123"
...
[plan complete]
$ ./cimplement.sh TEAM-123
...
[pr ready]
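under the hood, the implementation dispatch does roughly the following. it’s sketched here in python rather than the actual bash, and the details (the `claude -p` invocation, the plan-fetching helper) are stand-ins, not the real script.

```python
# sketch of the implementation dispatch: fresh branch + worktree, condensed plan from linear,
# one headless claude run with instructions to build, test, and open a pr
import subprocess
import sys

def fetch_plan(issue_id: str) -> str:
    # placeholder: in practice this pulls the planning comments for the issue out of linear
    return f"(condensed plan for {issue_id} goes here)"

def dispatch(issue_id: str) -> None:
    branch = f"agent/{issue_id.lower()}"
    worktree = f"../worktrees/{issue_id.lower()}"

    # dedicated branch and worktree so parallel agents never share a checkout
    subprocess.run(["git", "worktree", "add", worktree, "-b", branch], check=True)

    prompt = (
        f"You are implementing {issue_id}.\n\n"
        f"Plan from the planning agent:\n{fetch_plan(issue_id)}\n\n"
        "Build, run the tests, commit, and open a PR when everything passes."
    )
    # headless run: `claude -p` prints the result instead of opening an interactive session
    subprocess.run(["claude", "-p", prompt], cwd=worktree, check=True)

if __name__ == "__main__":
    dispatch(sys.argv[1])
```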
automation
typing commands got repetitive, so I automated it.
I spun up n8n in docker and wrote a fastapi server with a /dispatch/[task]/[issue_id] endpoint. when a linear issue moves to the “Plan” or “Build” state, n8n catches the webhook and dispatches the right agent.
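the endpoint itself is tiny, something in this shape. the exact commands each task kicks off are illustrative, including the assumption that /plan can be run headlessly via `claude -p`.

```python
# sketch of the dispatch server sitting behind n8n: a linear webhook moves an issue
# to "Plan" or "Build", n8n calls this endpoint, and the matching agent gets spawned
import subprocess
from fastapi import FastAPI, HTTPException

app = FastAPI()

def command_for(task: str, issue_id: str) -> list[str]:
    # illustrative commands; whether /plan runs headlessly like this is an assumption
    if task == "plan":
        return ["claude", "-p", f"/plan {issue_id}"]   # planning agent
    if task == "build":
        return ["./cimplement.sh", issue_id]           # implementation agent
    raise HTTPException(status_code=404, detail=f"unknown task: {task}")

@app.post("/dispatch/{task}/{issue_id}")
def dispatch(task: str, issue_id: str):
    # fire and forget: progress comes back as comments on the linear issue
    subprocess.Popen(command_for(task, issue_id))
    return {"dispatched": task, "issue": issue_id}
```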
the downside: my visibility dropped to just linear comments. usually these were end-of-work summaries, making it hard to catch misunderstandings early.
idea-having
traditionally, having ideas is expensive. not the idea itself, but everything after: implementation, testing, documentation, convincing people. the overhead becomes a subconscious tax on innovation, and we self-censor.
but when implementation is nearly free, the math changes. ideas can turn into code in the background. parallel agents mean you don’t need to finish one idea before starting the next.
the constraint shifts from “can I build this” to “what should exist.”
part 3 - swarm soon