Strudel
I keep coming back to this video. Someone making music by editing JavaScript while the loop runs. Not writing notes - tweaking the rules that govern what happens next. The music plays continuously; you intervene, adjust, listen, adjust again.
I’ve been doing something similar. A verbose org-mode to-do list, with a long-running Codex CLI instance watching and editing the same file. We interact primarily through the document. It updates me in bold text as it works. I still have the chat open, still jump in when needed - but it’s surprising how viable this already is. Same pattern: the loop runs, you tweak the rules.
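The shape of that loop is easy to sketch. Here's a minimal, hypothetical version in Python (invented names; a real agent harness would debounce, diff, and guard against edit races): poll a file, and whenever it changes, hand the full current text to whatever acts on it.

```python
import time
from pathlib import Path


def watch(path: Path, act, poll_seconds: float = 1.0, max_ticks: int = 100):
    """Poll a file's mtime; on each change, pass the current text to `act`.

    `act` is free to edit the file itself - the next tick picks that up,
    which is exactly the shared-document pattern: both sides write, both
    sides read. Toy sketch, not a production watcher.
    """
    last_mtime = None
    for _ in range(max_ticks):
        mtime = path.stat().st_mtime
        if mtime != last_mtime:
            last_mtime = mtime
            act(path.read_text())
        time.sleep(poll_seconds)
```

The key property is that there is no conversation object anywhere - the document is the entire interface.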
This is what software engineering is becoming. I’ve been thinking about it, and a few things are clicking into place:
- Orchestration, not composition. The pipeline runs. Agents pick up tickets, write code, submit PRs, review each other’s work. You’re not in there writing the code. You’re watching the loop, and when you see a pattern you don’t like, you update a prompt. You give better guidance. The loop keeps running.
- It’s not a chatbot anymore. When ChatGPT came out, you naturally thought of it as conversation - a growing sequence of messages, you and the model taking turns. But there was always the option to hook in other stuff. A spreadsheet. A codebase. Jira plus GitHub. Once you do that, the data can change under the model’s feet while you work. You’re having an LLM-infused interaction with a data source. The form of the tool changes completely depending on what you plug it into - the same way Excel can be accounting software or a wedding planner depending on what’s in the cells.
- Stateless agents, persistent state. Git and issues become the system of record. The agents are stateless - like a contract project manager who comes in fresh and just looks at the board. If the board is enough to act on, it works. No context window growing forever. No conversation history. Just current state, decision, move on.
- I can’t see past this. Until now I’ve had a rough sense of where things are heading, and I think I’ve been largely accurate. This is the first time I can’t see past the horizon. Make the models 10x more reliable, 10x more capable beyond here - what changes? Maybe you automate the watching-and-intervening that the human orchestrator is doing now. But the shape feels right. Loop running, human intervening, system of record underneath.
- Mutation testing, chaos engineering. If implementation is becoming a solved problem, maybe human attention shifts to designing the stress tests that make systems robust. I’ve been meaning to read up on these properly. But I suspect even that won’t stay human for long. Models could pick it up quickly. More of a checkbox to tick soon than a new discipline to master.
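The core idea of mutation testing is small enough to show in a toy (invented example, not any real framework like mutmut): flip an operator in the code under test, re-run the tests, and see whether they notice. A mutant that survives is a gap in the suite.

```python
# Toy mutation testing. We mutate the source of clamp(), rebuild it
# with exec(), and re-run a tiny "suite". Mutants the suite fails to
# kill point at untested behavior. Invented example, not a framework.

SOURCE = """
def clamp(x, lo, hi):
    if x < lo:
        return lo
    if x > hi:
        return hi
    return x
"""

MUTATIONS = [("<", "<="), (">", ">="), ("return lo", "return hi")]


def run_tests(ns):
    """A deliberately thin suite: it never probes the boundaries."""
    clamp = ns["clamp"]
    try:
        assert clamp(5, 0, 10) == 5
        assert clamp(-1, 0, 10) == 0
        assert clamp(99, 0, 10) == 10
        return True
    except AssertionError:
        return False


def surviving_mutants():
    survivors = []
    for old, new in MUTATIONS:
        ns = {}
        exec(SOURCE.replace(old, new, 1), ns)  # build the mutated function
        if run_tests(ns):                      # tests still pass: survived
            survivors.append((old, new))
    return survivors
```

Running it, the two boundary mutants (`<` to `<=`, `>` to `>=`) survive, because the suite never calls `clamp` at exactly `lo` or `hi` - which is precisely the kind of gap a human (or, soon, a model) designing stress tests would go hunting for.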
Maybe that’s the ceiling. Maybe I just can’t see further yet.