I dictate most of my work now. It’s faster than typing, and it feeds directly into the tools I use. But there’s a catch.

A cardioid microphone picks up sound mostly from the front. You want to stay ten to twenty centimetres away, facing it. The mic is on a boom arm. The boom arm is attached to the desk. Your chair doesn’t rotate around the mic - it faces the screen. So your head doesn’t really move. The direction you’re looking doesn’t change.

Before you know it, you’ve had workdays where you failed to notice that the cardboard box you’ve been meaning to recycle has been sat next to your desk for a week.

The tether

Walking helps you think. A lot of people report this. They feel trapped on Zoom calls - it would seem strange to get up and start pacing, but the urge is there. There are real benefits to getting away from your desk when thinking hard.

But the microphone keeps you in place. And the screen keeps your eyes fixed. And the hyperfocus that comes with working on hard problems means you don’t notice you’re tethered until you stand up.

Useful fictions

At the end of the day, what are we doing? Mashing binary around to achieve some goals. That’s all we ever do.

But we’ve built up whole constellations of concepts to make it intelligible. Files, classes, methods. Tests, runs, runners, suites. Entities, events, interactions. Concepts upon concepts, each with their own instances - lots of different tests, lots of different runs of the same tests, and so on.

These are ergonomic fictions. They exist to help us think, not to describe some fundamental reality. The trade-off is supposed to be worth it - and you hope it is. But case in point: you can forget the fiction is even there.

Anthropic’s Hannah Moran put it simply: “Agents are models using tools in a loop.” When you describe it that way, it feels like a mirage. A user-facing fiction that doesn’t really describe the underlying thing. There isn’t an entity at all - just a process that comes and goes.
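The definition is almost short enough to be code. Here’s a minimal sketch of “a model using tools in a loop” - illustrative only, with a stub standing in for the model, not any real API:

```python
# Illustrative sketch: "agents are models using tools in a loop".
# The "model" here is a hand-written stub, not an actual LLM.

def stub_model(history):
    # Decide the next action from the conversation so far:
    # either call a tool, or produce a final answer.
    if not any(kind == "tool_result" for kind, _ in history):
        return ("call_tool", "add", (2, 3))
    return ("answer", "2 + 3 = 5")

TOOLS = {"add": lambda a, b: a + b}

def agent(model, tools, prompt):
    history = [("user", prompt)]
    while True:  # the loop
        action = model(history)
        if action[0] == "answer":
            return action[1]
        _, name, args = action
        result = tools[name](*args)
        history.append(("tool_result", result))

print(agent(stub_model, TOOLS, "What is 2 + 3?"))  # prints "2 + 3 = 5"
```

Notice what’s missing: there is no agent object anywhere. Just a function that loops until the model stops asking for tools.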

The mouse and GUI of AI

We’re already building abstractions for this era. Coding harnesses, context windows, tokenizers. Hallucinations, sycophancy, evals. These are just the things we’ve got names for so far - probably a tiny fraction of what’s already in play.

Steve Yegge’s latest post lists eight stages of AI engineering - from “zero or near-zero AI” through to “building your own orchestrator.” He’s naturally describing concepts that aren’t yet in common use. Partly because they help him express himself, but also because he’s proposing they might be useful to others.

The mouse and the GUI of AI are sat right there, waiting to be named. But I wonder if the eras of AI engineering will be stratified like computing’s eras - the 60s and 70s distinct from the 80s, then the web - but compressed. Much more rapid. Humans might struggle to develop the right metaphors before progress overturns things once again.

Something a little tragic

Alan Kay is at once optimistic and grumpy about the state of computing. Optimistic because he thinks things could be so much better. Grumpy because he thinks we’ve alighted on the wrong metaphors and practices far too early.

I suspect he’d look at us now - sat in chairs, eyes fixed on screens, talking into fixed-position microphones - and see something a little tragic. We’re still using QWERTY keyboards designed to slow down typewriters. We’re still using the same editors. Even terminal agents are basically thin shims between us and normal terminal output, showing diffs as they change files.

The first thing we want to do with AI isn’t something new. It’s to adapt it to our existing working practices.

Sooner or later we’re going to realise we’ve been staring at screens unnecessarily, and that the room around us is a mess, and we’ll feel silly for not having noticed earlier.

I don’t think it’s today yet, though.