Muratcan Koylan @koylanai
Thursday, December 18, 2025


Since Manus is trending again, it's a good time to revisit their blog on context engineering & multi-agent systems. One of the reasons many people choose their agent instead of ChatGPT or others is that the Wide Research agent can run for a very long time, create tables or docs for you, and, frankly, it's so far the best computer-use agent.

Deep researching multiple entities is more of a context problem than a planning problem. By item 8-9 in any multi-subject research task, LLMs start making things up. Not simplifying. Fabricating.

- Items at the beginning and end get recalled; middle items get forgotten.
- 400K tokens isn't 2x the cost of 200K. It's disproportionately more expensive in time and compute.
- Models trained on chatbot-style interactions feel "impatient" when assistant messages get long. They rush to summarize, shift to bullet points, and start cutting.

This is why we should separate what we store from what the model sees. Your session holds everything. The working context is a filtered, compressed snapshot built fresh for each call.

Manus's solution: parallel sub-agents with fresh context. Instead of one processor handling n items sequentially, they deploy n sub-agents processing n items simultaneously. With this "context isolation," sub-agents don't communicate with each other; all coordination flows through the main controller. This prevents context pollution: an error in one sub-agent doesn't propagate to the others.

Their full context-engineering blog goes deeper on principles that apply beyond Wide Research:

- Manus constantly rewrites a todo.md file during tasks. This pushes the global plan into the model's recent attention span, counteracting "lost in the middle" decay.
- File system as memory. Treat files as externalized context: unlimited, persistent, and operable by the agent. Compression should always be restorable. A URL can replace page content; a path can replace document contents.
- Keep errors visible. Don't hide failed actions. Leaving wrong turns in context helps the model avoid repeating mistakes. Error recovery is underrated in benchmarks but critical in production.

Decompose. Parallelize. Synthesize.
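The decompose/parallelize/synthesize loop can be sketched in a few lines. This is not Manus's implementation; the names (`research_item`, `controller`) are hypothetical, and the sub-agent is a stand-in function rather than a real LLM call:

```python
from concurrent.futures import ThreadPoolExecutor

def research_item(item: str) -> dict:
    # Stand-in for a sub-agent: in the real system this would be an LLM
    # call that starts from a fresh, isolated context, seeing only its
    # own item and never its siblings' outputs.
    return {"item": item, "summary": f"findings for {item}"}

def controller(items: list[str]) -> list[dict]:
    # Decompose: one sub-agent per item. Parallelize: run them at once.
    with ThreadPoolExecutor(max_workers=max(1, len(items))) as pool:
        results = list(pool.map(research_item, items))
    # Synthesize: only the controller sees all results, so a bad output
    # from one sub-agent cannot pollute another sub-agent's context.
    return results
```

The key design point is that sub-agents never read each other's state; all fan-in happens in the controller, which matches the "context isolation" idea above.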
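The restorable-compression idea (a path standing in for document contents) can be sketched the same way; `FileMemory` is a hypothetical helper, not anything from Manus:

```python
import os
import tempfile

class FileMemory:
    """Externalized context: store full content on disk, keep a pointer."""

    def __init__(self):
        self.root = tempfile.mkdtemp()

    def compress(self, name: str, content: str) -> str:
        # Drop the bulky content from the working context;
        # what remains in-context is just a short, restorable path.
        path = os.path.join(self.root, name)
        with open(path, "w") as f:
            f.write(content)
        return path

    def restore(self, path: str) -> str:
        # Restorable by construction: the full text can be re-read
        # whenever the agent actually needs it again.
        with open(path) as f:
            return f.read()
```

Because nothing is summarized away, the compression is lossless: the working context shrinks, but the session still holds everything.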