Why Pi is worth trying
Lately I’ve been using Pi for all my agentic workflows outside of Claude. Here is why you should try it too.
What is Pi? Pi is an open-source, third-party, command-line AI harness, an alternative to Claude Code and Codex. Those vendor harnesses are intended to be used with their vendors’ models (Claude and the GPT models), and they come with many features built in.
Pi is different. You can use it with any model. It’s minimalist, so it starts with relatively few features. However, it’s extremely customizable, thanks to a plugin system which allows you to define extensions which add deep features to the system. For a deeper look at Pi’s design philosophy and its relationship to OpenClaw, see Armin Ronacher’s excellent writeup.
Pi forces you to learn things worth learning
As often happens, extreme customizability is a double-edged sword: you can change things, but you are also obliged to learn new things. For Pi, right now, that is a good trade, because what Pi obliges you to learn is maximally valuable knowledge. In particular, it teaches you which AI to use, how best to prompt an AI for agentic coding, and what your best workflow is for using an agentic harness.
The value of trying different AIs is obvious. They are subtly different: GPT-5.4 is indefatigable and nitpicky; Opus-4.6 is tasteful but a bit lazy; and so on. Vendor harnesses lock you into just one model. With Pi, you can register half a dozen models and switch between them mid-session. You can watch a local model like gpt-oss-120b fail at an agentic workload, then switch to gpt-5.4 and watch it neatly dissect which failures were due to tool-call precision and which to agentic judgement. You can discover that glm-5.1 is really quite strong, but not as strong as folks on Twitter pretend. All of this is very educational, and a good way to save money if you discover any weak models that are strong enough.
By “how to prompt,” I mean literally what material to put into the context window and send to an AI.
For instance, the vendor harnesses come with integrated search providers. With Pi, you need to select the search provider the AI should use. This is a bit annoying, but then you get to see what kind of search results the AI is actually using, and that is worth understanding.
More generally, vendor harnesses make it hard to see exactly what they send to the AI. Claude Code, for example, automatically inserts messages which help to keep the AI focused on the primary task. If those work, I want to understand the pattern so I can copy it in my own systems. If they don’t, I’d like to shut them off. Similarly, Codex performs compaction by calling a remote endpoint which returns encrypted blobs. The CLI is open source, but you still can’t see what’s inside those blobs or how compaction actually works.
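Since Codex’s blobs are opaque, we can only guess at the general shape of compaction. A common, simple strategy is summarize-and-drop: replace everything but the most recent messages with a single summary message. The sketch below is purely illustrative (the function name, message shape, and placeholder summary are my assumptions, not Codex’s or Pi’s actual logic; a real harness would ask the model itself to write the summary):

```python
# Illustrative summarize-and-drop compaction. This is a guess at the
# general technique, not Codex's (opaque) implementation.
def compact(transcript: list[dict], keep_last: int = 4) -> list[dict]:
    """Replace all but the last `keep_last` messages with one summary stub."""
    if len(transcript) <= keep_last:
        return transcript  # nothing worth compacting yet
    dropped = transcript[:-keep_last]
    # In practice the harness would prompt the model to summarize `dropped`;
    # here we just insert a placeholder stub.
    summary = {
        "role": "user",
        "content": f"[Summary of {len(dropped)} earlier messages goes here]",
    }
    return [summary] + transcript[-keep_last:]

history = [{"role": "user", "content": f"msg {i}"} for i in range(10)]
compacted = compact(history, keep_last=4)
```

The payoff of being able to see this step, rather than receiving an encrypted blob, is that you can judge whether the summary is losing information your task still needs.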
We’re still learning how to use AIs well, and small changes in technique can have a large effect. So if some new technique works, I want to understand why. And if it doesn’t, then I want to be able to turn it off.
Maybe in a year, these choices will be understood, optimized, standardized, and blackboxed behind an interface, so you will not need to understand it any more than your CPU’s register count. But for now, knowing more translates directly into doing more.
The second half of what Pi forces you to learn is which features you want — i.e., how you would actually prefer to use your harness.
Pi makes it easy to add and create new features via skills or extensions. I’ve made one that accounts for context-window usage. From it I learned what now seems obvious to me: tool outputs fill most of the window in an agentic session. I made another that exactly replicates Codex’s remote compaction logic. I still don’t know whether Codex is doing anything special, but now I can at least copy what it is doing. I have another extension that provides smoother interop with Emacs. And I’m noticing extensions from the community that provide streamlined workflows for Karpathy’s autoresearch and other fashionable new methods which were named and publicized just weeks ago.
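The core of a context-window accounting extension is trivial: walk the transcript and tally size per message role. Here is a minimal sketch of that idea, not Pi’s extension API; the message format and the 4-characters-per-token heuristic are illustrative assumptions:

```python
# Minimal sketch of context-window accounting by message role.
# The transcript shape and chars/4 heuristic are assumptions for
# illustration, not Pi's actual extension API.
from collections import Counter

def approx_tokens(text: str) -> int:
    # Rough heuristic: ~4 characters per token for English-like text.
    return max(1, len(text) // 4)

def usage_by_role(transcript: list[dict]) -> Counter:
    """Sum approximate tokens per message role ('user', 'assistant', 'tool')."""
    totals = Counter()
    for msg in transcript:
        totals[msg["role"]] += approx_tokens(msg["content"])
    return totals

transcript = [
    {"role": "user", "content": "Fix the failing test in parser.py"},
    {"role": "assistant", "content": "Reading the file first."},
    {"role": "tool", "content": "def parse(tokens):\n" * 400},  # long tool output
]
usage = usage_by_role(transcript)
```

Even on this toy transcript, the tool message dwarfs everything else, which is exactly the pattern the real extension surfaced in my sessions.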
Being able to spin up features on demand encourages exploration and invention of new workflows, which is just what is needed in such a new area.
Back to the future of live environments
In truth, I didn’t need to care about the implementation details of any of those extensions, but it strikes me as very significant that the AI was able to produce them.
It is a stroke of genius in Pi’s extension architecture that you can just ask an AI to add a new feature, and then it works. (Consider how many application codebases make it difficult even for the human who developed them to add a new feature!) What is more, you usually don’t even need to restart Pi for the feature to take effect. The only other application I know of with this property is Emacs.
This kind of extreme hackability may be both a vestige of the past and a window into the future. Maybe AI is bringing the era of the user interface to an end. Maybe in the future, the only stable interface will be English, and we will ask AI to generate on demand whatever custom feature or interface we want. If that is the future, then Pi represents a piece of it which you can play with right now.
If you want to hear more thoughts on Pi, Stefano and I discussed it in the latest episode of the 15-Minute Share It: