A few recent numbers, taken together, are quite telling.
Sam Altman has clearly been energized since the GPT-5.5 launch. Codex weekly downloads surged to 90 million, and paid users climbed from 3 million in March to over 4 million by late April. My own gut feeling lines up with that: before 5.5, I was on a single 800-a-month plan. Many are saying 5.5 shouldn't be called 5.5 at all, that it should be GPT-6. I once posted that "this is not a minor version," and that claim is being validated.
The Anthropic side is even more extreme. Dario Amodei did the math in a CNBC interview last week. From the start, they bet on exponential AI growth and built infrastructure for "10x per year." Q1 growth annualized to 80x, and annualized revenue hit a $30 billion run rate in April. Their infrastructure was overwhelmed, so they signed a deal with SpaceX to lease the entire 220,000-GPU Colossus 1 data center in Memphis, unlocking 5-hour usage limits for Claude Pro / Max users.
That's the backdrop.
Against that backdrop, I spent a few hours yesterday talking with a friend. He uses AI heavily, knows his way around Terminal, builds things on open-source frameworks like OpenClaw, and is a power user at his company. A firm offered him a role as an AI transformation expert and he asked what I thought.
After the conversation, I realized a few judgments from it are worth pulling out on their own.
1. When Evaluating an Offer, Look for "Unrestricted Access to the Strongest Models"
I told him that at this particular moment, the offer itself isn't the most important thing. You're already in a good spot.
What truly matters is whether the platform can give you resources for near-unlimited use of the absolute best models, plus enough freedom to tinker in whatever direction you want.
What do I mean? Someone like me, with a bit more tenure, has ways around resource limits: I can burn $1,000 a month out of my own pocket for top-tier access and set my own direction. But for younger people just getting fired up about AI, the company still matters, because you're creating value for someone else during work hours. If the model isn't the best, you're always one step behind. You can feel the gap, but you can't close it.
Without that foundational access, it's actually very hard to be an AI transformation expert.
2. Stop Running GPT-5.5 on Medium Effort
On the second point, I guessed right. I asked him: when you use GPT-5.5, you don't crank effort to the max, do you? To save money and control costs, you do "routing" and dial effort down, or just switch to Kimi, DeepSeek, or Minimax?
He confirmed it. Most people I know use them that way.
I think there's a problem here.
I've reached a conclusion through repeated trial and error. Back with Opus 4.6, dropping effort from high to medium on the same model slashed accuracy from around 80% to roughly 30%. Same model, just one effort tier lower, and the entire workflow performed completely differently. After that, I never used medium effort again; Claude has stayed on x high ever since.
With GPT, I've been on x high from day one. The reason is simple: if a task runs well in Claude Code, I generally won't bother trying GPT, so what reaches GPT is exactly the hard residue. The moments that made me think "this thing is genuinely different" all came at maximum effort. That was true in the 5.4 era, and the gap is even more pronounced since 5.5.
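For concreteness, here is roughly how I pin both CLIs to maximum effort. A minimal sketch: the Codex config key `model_reasoning_effort` and the Claude Code variable `MAX_THINKING_TOKENS` match the documentation as I know it, but treat the exact names and the available tiers as assumptions to verify against your installed version.

```bash
# Codex: pin reasoning effort globally in ~/.codex/config.toml
# ("xhigh" availability is model-dependent; fall back to "high" if rejected)
cat >> ~/.codex/config.toml <<'EOF'
model_reasoning_effort = "xhigh"
EOF

# Or override per run without touching the config file
codex -c model_reasoning_effort="xhigh"

# Claude Code: raise the thinking-token budget via environment variable
export MAX_THINKING_TOKENS=31999
claude
```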
So in one sentence: don't optimize costs prematurely at this stage.
Let's do the math. One top-tier model, one account, $400 a month, which is about 3,000 RMB. That's cheaper than hiring an intern, yet the output approaches PhD-level researcher quality. What is the point of saving that small amount of money?
I currently spend 10,000 a month.
Why not optimize? Because right now you don't know where the strongest model's boundary lies. The researchers don't know either—they haven't tested it in your domain. This is uncharted territory. What you need to do is take the strongest, most expensive, highest-effort setup and slam against that boundary, pushing it to the limit. If it truly can't do something, you'll know for certain.
As for "why can't we control costs first, the way traditional software does": wait until everyone has mapped out the boundaries and we're in the mass-deployment phase, then consider cost-effectiveness. We're not in that phase right now.
3. Ten Weeks Left
My friend asked: What about time? Can't I just take more time and experiment slowly?
I said, let me do the time math for you.
The world hasn't been broadly stunned yet because top-tier models are arriving too fast and too densely packed, with intervals too short for the work built on them to land. From my own use, I've found something counterintuitive. Even with the most elite models like GPT-5.5 and Claude Opus, cranked to maximum effort, it still takes time to run a valuable direction to completion. It's fast, but not so fast that you "think it and it's instantly done." It proceeds step by step. What used to take a team months gets compressed to weeks. But you still have to walk through the steps.
Plot this on a timeline:
- GPT-5.5 launched on April 23—two weeks ago.
- At that point, a cohort already realized this time was different and started running their most ambitious directions on top of it.
- Conservatively, within three months, a batch of things will ship from people no one expected, in directions and domains no one expected.
Three months = 12 weeks. Two have passed. Ten remain.
In ten weeks, remarkable results will enter the world and make everyone realize "the world is different." These results won't wait for you. In ten weeks, your market position, your label, your place in the seniority hierarchy may all need to be recalculated.
Is ten weeks long? No. It's short enough that you cannot afford to waste a week building an inefficient workflow or hesitating over which tool to use.
4. Don't Use IDEs, and Don't Mess with Third-Party Frameworks
Next topic: so what should I actually use?
My answer might offend some people, but I want to be clear.
Don't use IDEs. In the coding-agent scenario, the IDE is, in my view, a form factor that has already been made obsolete. It's especially unfriendly to people who aren't already senior programmers: you have to spend time learning a complex interface that means nothing to you, and in the end you still can't read the code the agent writes. And squeezing a tiny Terminal window into the middle of all that is fundamentally awkward.
The name "Terminal" has done Claude Code and Codex a disservice. When many people hear CLI or Terminal, they immediately think "programmer stuff." But after I helped some friends with zero programming background install Claude Code and Codex, not a single one said "I can't figure this out." It shows you what's important, compresses unimportant steps into logs, and you just judge at the key decision points. Beginners actually pick it up faster than an IDE.
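If you want to try this yourself, both official CLIs install with a single command. A minimal sketch; the npm package names below are correct as of this writing but may change.

```bash
# Install the two official CLIs (package names current as of this writing)
npm install -g @anthropic-ai/claude-code   # Claude Code
npm install -g @openai/codex               # Codex

# Launch from any project directory; each walks you through login on first run
cd ~/my-project && claude    # or: codex
```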
If you can avoid it, don't use third-party frameworks like OpenClaw or Hermes. This is a bit counterintuitive. I have indeed helped people install these before. But looking at it now, Terminal plus the official CLI has matured to the point where it can do everything those frameworks do, and better.
Why? Because the official CLI is tailored to the official model's behavior. Claude Code connects to Claude models; Codex connects to GPT models. Caching mechanisms, error recovery, risk guardrails, context compression—all tuned for that specific model. Switch to "using OpenClaw with GPT" or "using Claude Code with Kimi," and it may run in theory, but in practice the effect is noticeably worse.
A recently popular open-source project proves this point. Someone built a Deep Code CLI specifically for DeepSeek V4, similar in form to Claude Code but tailored solely for the DeepSeek model. Many find this counterintuitive: aren't API relays supposed to "connect to every model"? But this path is actually the right one. Models have their own behavior; a harness customized around a specific model delivers better results and cost efficiency.
5. Never Use a Completely Black-Boxed Agent
OpenClaw has another "advantage" that I find dangerous. It can dispatch tasks remotely and deliver results without you watching the process. Sounds great.
For some people this is a good thing. But for those exploring the boundaries, this feature should be disabled.
The foundation of collaborating with an agent is understanding. What kinds of work it does well, what it does poorly, what cognitive habits it has—you only learn these by watching it work. Once it becomes a black box, what you lose is judgment, not just the details.
Managing AI is like managing a new hire. The fastest way to learn is to watch them do every step. Treating it as a wishing machine that gives you results won't make you a stronger leader.
6. Want to Operate Anywhere, Anytime? SSH + tailscale + tmux Is Enough
My friend said another reason he likes OpenClaw is that it can be controlled from a phone, letting him dispatch tasks anytime, anywhere. This point has been overlooked far too much, so I'll address it specifically.
If you only use one laptop, feel free to skip this section.
If you want to control your home desktop from your phone, the required infrastructure is actually very mature. SSH is an ancient protocol that lets you log into one machine from another with high privileges. Tailscale is a free mesh VPN that puts your desktop, laptop, and phone on the same private network so they can talk directly via stable internal IPs. tmux is a background-session tool: open a session on the desktop, cd into the project directory, launch Claude Code or Codex, and that session keeps running in the background indefinitely. Disconnecting from the network or turning off your phone doesn't affect it. You can attach anytime to check progress.
On the phone side, pair it with a terminal app like Termius and connect in. The whole setup takes less than an hour.
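A minimal sketch of the whole chain, assuming Tailscale is already installed on both devices; the hostname, user, and session name below are placeholders:

```bash
# On the desktop: join your tailnet, then start a persistent named session
tailscale up
tmux new -s agent
# Inside the tmux session:
cd ~/my-project && claude    # or: codex
# Detach with Ctrl-b d; the agent keeps running in the background

# From the phone (Termius or any SSH client), reattach over the tailnet
ssh you@my-desktop -t 'tmux attach -t agent'
# Disconnecting does nothing to the session; attach again whenever you like
```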
The workflow after setup looks something like this: Before leaving home in the morning, drop a task into the desktop's tmux session. During the commute, attach from your phone to take a look; if progress looks off, adjust direction. While you're in meetings at the office, the agent keeps running. At lunch, attach again to review and give more feedback. When you get home, pick up right where you left off at the computer.
The entire chain is seamless. I currently have about 50 hours a week where the agent works on its own, out of sight, but I have a clear sense of what it's doing. The phone is the controller.
This kind of infrastructure is common among programmers, but it's severely undervalued in the context of "using AI to drive work transformation." It lets you scale from one laptop to a building's worth of compute without any middleware.
Putting All of the Above Together
Back to my friend's original question: Should I take the AI transformation expert offer?
I didn't give him a direct answer; I gave him my decision framework. First, see if that company can give you near-unlimited access to the strongest models. If not, the offer's value is limited. Second, in your daily work you must use the top-tier model at maximum effort. The vanguard phase is no time to save money. Third, there are only about ten weeks left; don't waste them on inefficient toolchains or debating whether "third-party frameworks might be better."
It all comes down to one sentence. Right now, no one knows the true boundary of the strongest model. What you need to do is not optimize costs, not adapt to existing workflows, but take the most powerful tools available and slam into that boundary to see if it can be pushed outward.
In ten weeks, the world will be shaken by a wave of unexpected breakthroughs. At that moment, the last thing you want is to look back and realize you spent the past few months tweaking IDE configurations and optimizing model routing costs.
Time is the most expensive resource. Attention is second. Money is last. Don't get this order reversed for the next ten weeks.
References
- OpenAI Codex Hits 90 Million Weekly Downloads — Crypto Briefing
- Codex Weekly Active Developers 4M+ Growth Curve — Codex Blog
- OpenAI Codex User Surge to 1.6M / Enterprise AI Agent Positioning — Fortune
- Anthropic Q1 Growth 80x, Dario Amodei Explains Compute Crunch — CNBC
- Anthropic $30 Billion Annualized Revenue Run Rate — VentureBeat
- Anthropic × SpaceX: Colossus 1 Data Center, 220,000 GPUs, Pro/Max Unlock 5-Hour Limit — Anthropic Official
- Anthropic-SpaceX Deal: 300 Megawatts, Over 220,000 NVIDIA GPUs — Bloomberg
- Deep Code: DeepSeek-V4 Dedicated CLI Agent — DeepSeek Official Docs
- DeepSeek-TUI: DeepSeek Coding Agent in the Terminal — DEV.to
