https://x.com/rahulgs/status/2006090208823910573
rahul @rahulgs
yes things are changing fast, but also I see companies (even faang) way behind the frontier for no reason.
you are guaranteed to lose if you fall behind.
the no unforced-errors ai leader playbook:
For your team:
- use coding agents. give all engineers their pick of harnesses, models, background agents: Claude Code, Cursor, Devin, with closed/open models. Hearing Meta engineers are forced to use Llama 4. Opus 4.5 is the baseline now.
- give your agents tools to ALL dev tooling: Linear, GitHub, Datadog, Sentry, any Internal tooling. If agents are being held back because of lack of context that’s your fault.
- invest in your codebase-specific agent docs. stop saying “doesn’t do X well”. If that’s an issue, try better prompting, linting, and code rules. Tell it how you want things. Every manual edit you make is an opportunity for improvement
- invest in robust background agent infra - get a full development stack working on VM/sandboxes. yes it’s hard to set up but it will be worth it, your engineers can run multiple in parallel. Code review will be the bottleneck soon.
- figure out security issues. stop being risk averse and do what is needed to unblock access to tools.
in your product:
- always use the latest generation models in your features (move things off of last gen models asap, unless robust evals indicate otherwise). Requires changes every 1-2 weeks - eg: GitHub Copilot mobile still offers code review with gpt 4.1 and Sonnet 3.5 @jaredpalmer. You are leaving money on the table by being on Sonnet 4, or gpt 4o
- Use embedding semantic search instead of fuzzy search. Any general embedding model will do better than Levenshtein / fuzzy heuristics.
- leave no form unfilled. use structured outputs and whatever context you have on the user to do a best-effort pre-fill
- allow unstructured inputs on all product surfaces - must accept freeform text and documents. Forms are dead.
- custom finetuning is dead. Stop wasting time on it. Frontier is moving too fast to invest 8 weeks into finetuning. Costs are dropping too quickly for price to matter. Better prompting will take you very far and this will only become more true as instruction following improves
- build evals to make quick model-upgrade decisions. they don’t need to be perfect but at least need to allow you to compare models relative to each other. most decisions become clear on a Pareto cost vs benchmark perf plot
- encourage all engineers to build with ai: build primitives to call models from all code bases / models: structured output, semantic similarity endpoints, sandboxed code execution, etc.
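A minimal sketch of the "Pareto cost vs benchmark perf" comparison from the evals bullet, assuming you already have one cost figure and one eval score per model. The model labels, numbers, and units below are hypothetical placeholders, not real measurements; `EvalResult`, `dominates`, and `pareto_frontier` are names made up for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EvalResult:
    model: str    # hypothetical label, not a real model name
    cost: float   # assumed unit: dollars per full eval run
    score: float  # assumed unit: pass rate in [0, 1] on your own evals


def dominates(a: EvalResult, b: EvalResult) -> bool:
    """True if a is at least as cheap and at least as good as b, and strictly better on one axis."""
    no_worse = a.cost <= b.cost and a.score >= b.score
    strictly_better = a.cost < b.cost or a.score > b.score
    return no_worse and strictly_better


def pareto_frontier(results: list[EvalResult]) -> list[EvalResult]:
    """Keep only the models that no other model dominates, sorted by cost."""
    frontier = [r for r in results if not any(dominates(o, r) for o in results)]
    return sorted(frontier, key=lambda r: r.cost)


# Placeholder data for illustration only.
results = [
    EvalResult("model-a", cost=1.0, score=0.62),
    EvalResult("model-b", cost=2.5, score=0.81),
    EvalResult("model-c", cost=3.0, score=0.78),  # dominated by model-b
    EvalResult("model-d", cost=6.0, score=0.90),
    EvalResult("model-e", cost=1.0, score=0.55),  # dominated by model-a
]

frontier = pareto_frontier(results)
for r in frontier:
    print(f"{r.model}: cost={r.cost:.2f} score={r.score:.2f}")
```

Anything off the frontier is an easy "no"; the remaining decision is how much extra cost a given score gain is worth for the feature in question.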
What else am I missing?
Dec 30, 2025
Tuesday, December 30, 2025