I Mass-Deleted 53,000 Lines of My Own Code (And the Plugin Got Better)
How claudeops went from an overengineered CLI to a zero-dependency Claude Code plugin in 10 days, and what I learned about building tools for AI agents
I Mass-Deleted 53,000 Lines of My Own Code (And the Plugin Got Better)
Ten days ago I started building a CLI tool to make Claude Code better at multi-agent orchestration. Today, the project is a directory with some markdown files and a few JavaScript scripts. No build step, no dependencies, no TypeScript compiler. The git history tells the story of someone who kept adding complexity and then violently ripping it out.
This is that story.
The Problem
Claude Code is good out of the box. But it works alone, forgets everything between sessions, and doesn't know your project's conventions unless you explain them every time. The agent delegation system (the Task tool) is powerful but Claude doesn't always use it well without guidance. It'll try to do everything itself when it should be fanning out work to parallel agents.
I wanted something that made Claude Code better at the thing it already does. Not a replacement. Not a wrapper. Just... better defaults.
What Already Existed
Before writing anything, I looked at what people had already built. The Claude Code plugin ecosystem was already crowded by late January 2026.
oh-my-claudecode (4.1k stars) was the biggest pure plugin. 32 agents, 31 skills, seven execution modes (autopilot, ultrapilot, ralph, ultrawork, ecomode, swarm, pipeline). It had smart model routing between Haiku and Opus, magic keyword shortcuts, and a persistent execution mode called "Ralph" that won't stop until verification passes. The kitchen-sink approach. If you wanted every possible workflow pre-built, this was it.
everything-claude-code (37.5k stars) took a different angle. It's a curated configuration collection from someone who'd been using Claude Code daily for 10+ months. 15 agents, 30 skills, 20 slash commands. Less about novel orchestration, more about battle-tested defaults. It also had a learning system that extracted patterns from your sessions.
claude-flow (13.5k stars) went full enterprise. 60+ agents, swarm topologies (mesh, hierarchical, ring, star), Raft/Byzantine/Gossip consensus protocols, a WASM-based "Agent Booster" that skips LLM calls for simple transforms, Q-Learning routing, and something called "RuVector Intelligence" with sub-50-microsecond adaptation times. It claims 84.8% on SWE-Bench. Whether any of that is real or just README theater, I genuinely don't know.
claude-mem (16.9k stars) focused on just one thing: memory. A Bun HTTP worker service with SQLite + Chroma vector search, lifecycle hooks that capture everything, and progressive disclosure that reduces token consumption by ~90%. Clean single-purpose design.
compound-engineering-plugin (7k stars) from Every Inc. took the most opinionated approach: four commands (plan, work, review, compound) in a cycle where each iteration builds on the last. The thesis is that 80% of engineering should be planning and review, 20% execution. Lean and focused.
So the landscape ranged from "here's one thing done well" to "here's a distributed system with Byzantine consensus." I wanted something in the middle.
Day 1: claude-kit (32,000 Lines of TypeScript)
My first commit was January 22nd at 3am. It was called "claude-kit" and it was already overengineered.
130 files. TOML configuration with Zod validation. Seven built-in "setups" (minimal, fullstack, frontend, backend, data, devops, enterprise). Three safety addons. A profile management system with inheritance. An addon marketplace with a local registry. A sync engine that generated ~/.claude files. MCP server management with context budget tracking. Cost tracking with budget alerts. 407 tests passing.
I'd built all of this before actually using it on a real project.
The architecture was a proper Node CLI: Citty for the command framework, @clack/prompts for interactive UI, tsdown for bundling, the whole deal. It was built on top of oh-my-claudecode as a dependency, pulling in its agents and skills.
I renamed it to "claude-code-kit" within hours because I realized "claude-kit" was too generic. Then renamed it again to "claudeops" later that same day when I decided the oh-my-claudecode dependency was the wrong call.
Three names in one day. That should've been a signal.
Day 2: v2.0 (Rip Out Everything, Start Over)
By the next morning I'd already realized the setup/profile/addon system was solving problems nobody had. Who's managing seven different Claude Code profiles with TOML inheritance? Nobody. People just want Claude to be smarter about their project.
So I did a breaking change 24 hours after the first commit. Consolidated 35 agents down to 12. Archived 23 agents and 9 skills. Removed the oh-my-claudecode dependency entirely. Wrote a migration guide (for a tool with zero users, because of course I did).
The rest of that day was fighting CI. Six consecutive fix commits trying to get npm OIDC trusted publishing to work with semantic-release. I ended up just using a plain NPM_TOKEN. Should've started there.
Then I added hooks for keyword detection and continuation checking, and updated all the docs. Version 2.0.1 shipped. It was cleaner, but still a full TypeScript CLI that needed npm install -g.
Day 2 (still): v3.0 (Add It All Back, But Different)
Same day. I couldn't help myself.
v3 added semantic intent classification: instead of keyword-matching "fix" or "build", it classified the user's intent and routed to the right agent/model/parallelism strategy. I built a guardrails layer (deletion protection, secret scanning, dangerous command detection). And an "AI-powered pack system" for installing capabilities from any GitHub repo.
This added 11,790 lines across 51 files. New modules for classification, routing, guardrails, and pack management. 595 tests passing. I wrote architecture docs, implementation plans, migration guides, and a pack system spec.
If you're keeping score: in two days, I'd shipped three major versions and accumulated roughly 45,000 lines of TypeScript.
Days 3-7: Feature Creep
The next five days were a blur of additions:
- v3.1: Wired up intent classification end-to-end
- v3.2: 21 skills library
- v3.3: Hooks library + profile polish
- v3.4: Documentation sprint
- v3.5: Removed the pack module (finally admitted nobody would use it), enhanced the hooks and skills CLI
- Swarm orchestration for multi-agent task coordination
- A codebase scanner that detected languages, frameworks, build systems, test runners, linters, CI/CD pipelines, databases, API styles, monorepo tools, and code conventions
- Profiles consolidated into the scanner output
- Session memory with team sharing
- A learning system for automated knowledge capture
The CLI had ballooned to 12 commands. I stripped it to 6, then kept adding features that needed more commands. The codebase scanner alone was a significant chunk of code. I was solving real problems now (the scanner genuinely produces useful output) but burying them inside an increasingly heavy CLI.
The Realization
On January 31st I wrote a test workflow document for validating the whole system. Writing it forced me to actually walk through the install process. npm install -g claudeops. Run cops init. Configure your profile. Run cops scan. Set up hooks manually.
It was too much friction. And the dirty secret was that most of the value came from three things:
- The agent definitions (markdown files that tell Claude how to delegate)
- The skills (markdown files that define workflows)
- The hooks (JavaScript files that run on events)
Everything else was infrastructure to deliver those three things. The TypeScript CLI, the TOML config parser, the profile system, the sync engine, the test suite. All of it existed to copy some markdown files and JSON into the right places.
Claude Code had shipped its plugin system by this point. A plugin is literally a directory with agents/, skills/, hooks/, and a plugin.json manifest. The runtime does the discovery, loading, and namespacing. Everything my CLI did, but built into the tool itself.
Day 10: Delete Everything
February 1st. I deleted the entire TypeScript application.
248 files changed, 5,782 insertions(+), 53,186 deletions(-)Gone: src/, tests/, profiles/, addons/, templates/, docs/, the build system, npm config, tsconfig, eslint config, vitest config. All of it.
What remained: agents/, skills/, hooks/, scripts/scan.mjs, and a plugin.json. The scanner survived because it does real work (1,245 lines of plain ESM JavaScript that produces structured JSON about your codebase). Everything else was markdown and small hook scripts.
Then I consolidated further. 18 agents down to 7. 12 skills down to 9. Removed custom model routing because Claude Code handles that natively now. The orchestration skill got absorbed into the init skill's CLAUDE.md template, since orchestration instructions belong in project config, not in a separate workflow.
The plugin format meant I had to figure out some new constraints. Skills need specific frontmatter. The name field in skill YAML interacts with plugin namespacing in non-obvious ways (I toggled it on, off, on, off across four commits). The commands/ directory enables slash-command autocomplete but its relationship to skills/ wasn't documented well. I symlinked the scanner binary into skill directories because CLAUDE_PLUGIN_ROOT wasn't available where I needed it.
Eight fix commits in a row. Not glamorous. But by the end of the day, it worked.
What Survived
The final plugin is small enough to list:
claudeops/
├── agents/ # 7 agent definitions
│ ├── architect.md # opus — deep analysis, debugging, system design
│ ├── executor.md # sonnet — code changes of any complexity
│ ├── explore.md # haiku — fast file search
│ ├── designer.md # sonnet — UI/UX and styling
│ ├── tester.md # sonnet — test planning and TDD
│ ├── security.md # opus — security audit and threat modeling
│ └── researcher.md # sonnet — tech evaluation
├── skills/ # 9 workflow skills
│ ├── autopilot/ # autonomous execution with swarm or pipeline
│ ├── debug/ # hypothesis testing, CI fixing, paste-and-fix
│ ├── review/ # parallel multi-agent code review
│ ├── scan/ # codebase analysis and tech debt
│ ├── init/ # project setup + CLAUDE.md generation
│ ├── doctor/ # plugin diagnostics
│ ├── learn/ # session knowledge capture
│ ├── query/ # natural language to SQL
│ └── create-skill/ # scaffold new skills
├── hooks/ # 11 event-driven hooks
│ ├── smart-approve.mjs # auto-approve safe commands, flag dangerous ones
│ ├── security-scan.mjs # catch secrets before they hit git
│ ├── git-branch-check.mjs # warn on main branch commits
│ ├── lint-changed.mjs # lint only edited files
│ ├── session-restore.mjs # restore state on session start
│ ├── session-save.mjs # save state on session end
│ └── ...
├── scripts/
│ └── scan.mjs # 1,245 lines of deterministic codebase analysis
└── plugin.json # manifestNo build step. No dependencies. No runtime services. You install it by cloning a directory and pointing Claude Code at it.
What I Learned About This Space
The Claude Code plugin ecosystem is having its "jQuery plugin" moment. Everyone's building the thing they personally need, and the results range from genuinely useful to wildly overengineered.
claude-flow is the extreme end. Byzantine consensus for AI agents is... a choice. I don't think coordinating Claude Code subagents requires the same fault tolerance as distributed databases. But maybe I'm wrong. The README makes claims I can't verify, and the architecture diagrams look more like a PhD thesis than a developer tool. To be fair, if even half of it works as described, it's impressive engineering. I just don't think most people need it.
oh-my-claudecode is closer to what I wanted but went wider instead of deeper. 32 agents means most of them are thin wrappers that add little over Claude's default behavior. Seven execution modes is six too many for most workflows. But the core ideas are right: model routing, persistent execution, and keyword-based skill invocation. I borrowed the proactive delegation pattern from their agent design.
everything-claude-code is probably the most practically useful of the lot if you just want good defaults. It's less a plugin and more a reference configuration. The 10+ months of daily usage shows in the quality of the prompts.
claude-mem gets my respect for doing one thing well. Memory is genuinely the biggest gap in Claude Code's UX, and a purpose-built solution with vector search is the right call. I went with a simpler approach (JSON files + hook-based save/restore) that works for solo use but doesn't scale the same way.
compound-engineering-plugin has the best thesis: plan more, code less. Four commands. No bloat. The "compounding" concept (each cycle improves the next) is the kind of opinionated design that actually changes behavior. My autopilot skill borrowed the plan-first mode idea from their workflow.
The Meta-Lesson
I built a 53,000-line CLI to deliver what turned out to be a directory of text files. The plugin system existed the whole time. I just didn't trust that markdown and JSON would be enough.
They are. Every agent definition is a markdown file with YAML frontmatter. Every skill is a SKILL.md. Every hook is a small JavaScript file that reads stdin and writes stdout. The Claude Code runtime handles discovery, namespacing, model routing, and tool access. All the infrastructure I built was duplicating work the host application already does.
This isn't unique to my project. Most of the tools in this space are wrapping Claude Code rather than extending it. The plugin format is new and under-documented, so people build external CLIs, background services, and web UIs to compensate. But the native extension points are already good enough for most use cases.
If you're thinking about building a Claude Code plugin, here's my advice: start with the directory. Put a SKILL.md in skills/your-thing/. Write an agent in agents/your-thing.md. Add a hook if you need event-driven behavior. Test it. Resist the urge to build a CLI around it.
You can always add complexity later. You can't always delete 53,000 lines of it.
claudeops is on GitHub if you want to try it or read the commit history yourself.