The Slingshot
Method.
271 AI skills on board. We carry almost none of them, and reach for any in milliseconds. Same firepower, ~1% of the weight.
Every skill you keep loaded is dead weight you carry forever.
A skill is only useful when its full instructions are in context. But to even know it exists, its name and description sit in the model's memory every single turn — used or not.
Stack all 271 and that's 14,651 tokens of cargo riding along on every message, most of it for skills you'll never touch this session. Pure drag. The bigger the toolbox, the heavier you fly, the sooner you hit the wall.
Don't carry the cargo. Slingshot to it the instant you need it.
Three layers. Cheapest first, smartest last. You only ever pay for the one that fires.
Keyword pass
instantThe prompt is matched against 271 skill names in pure local code. "django tests" lands django-tdd in ~50ms. No model, no cost. Catches the obvious, which is most of the time.
The local brain
on your GPUWhen keywords miss — "make my page load faster" shares no word with react-performance — a local Ollama model reads the task and decides what's actually required, by meaning. Runs on your own silicon, so the reasoning is free.
Fetch on demand
pay-per-useOnly the one chosen skill's full playbook is read into context, only at the moment it's needed, then it's gone. You pay for a tool the second you pick it up — never for the 270 still in the rack.
Same payload. Read the gauges.
On a flat plan it isn't a smaller bill. It's range. Fuel saved becomes distance you can travel before the wall.
Efficiency compounds into edge.
Longer runs
~14.5k tokens freed per turn means sessions go far deeper before they choke on their own context. More work per conversation.
Snappier turns
Less dead weight to re-read every message. The model spends its attention on your problem, not on a manifest of 271 tools.
Add freely
10 skills or 10,000 — baseline cost barely moves. The library grows; the drag doesn't. Tooling stops being a tax.
Free intelligence
The routing brain runs on hardware you already own. Every decision it makes is compute you don't rent. Local silicon, doing real thinking, at zero marginal cost.
Five moves. All local, all free.
The full walkthrough lives in BUILD.md. The shape of it:
SKILL.md — a name, a description, instructions.index.json: every skill's name + description + path.find) — score the prompt against skill names. Instant, no model.route) — Ollama embeds + reasons over the misses. 0 API cost.