Twitter AI Evaluation (legacy)

Thursday, April 9, 2026

AI Evaluated

Tweets

@danshipper Explore Further

We use OpenClaws to do all of our work at @every. We have 25 full-time employees, so we’re one of the few companies in the world that has seen how work changes when everyone has their own personal agent in the company Slack.

I chatted with @every COO Brandon (@bran_don_gell) and @every head of platform Willie (@bigwilliestyle) to share what we’ve learned.

We get into:

- Why agents become mirrors of their owners, and how that influences how other people on the team interact with them
- How a parallel AI org chart forms on its own. People have stopped tagging me on Slack with questions about Proof, the document editor I vibe coded, because they knew my agent R2-C2 can step in
- The etiquette for human-agent collaboration is being invented in real time. Brandon's rule is that if there's an established process or documented answer, always ask the agent, not their human
- Why everyone is a manager now, and why even experienced managers carry limiting beliefs about what their agents can do

This is a must-watch for anyone trying to understand how AI workers change daily operations, not just in theory, but inside a company that’s half-agent.

Watch below!

Timestamps

- Introduction:
- How Brandon built Zosia, an AI agent to run his household:
- Brandon’s “aha” moment:
- What happened when everyone on the team got their own agent:
- How agents take on their owners' personalities, and why that matters inside an org:
- Why it’s important for agents to work in public:
- What we’re still figuring out when it comes to agent behavior, including memory gaps, group chat etiquette, and the "ant death spiral" problem:
- How we built Plus One, our hosted OpenClaw product:
- The cultural shift required to make agents work at scale:

Quick Insight

This is a case study of a 25-person company (Every) that has deployed personal AI agents for all employees via Slack, sharing real operational learnings about agent-human collaboration patterns. It's notable because it's actual implementation data rather than theoretical AI agent speculation.

Actionable Takeaway

Prototype a basic AI agent in your fintech team's Slack that answers common webhook debugging questions or AWS CDK deployment issues. Start small with documented processes, as Brandon's rule suggests.
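Brandon's rule (documented answers go to the agent, everything else to its human) can be sketched as a small routing function. This is a minimal illustration, not Every's implementation: the `DOCUMENTED_TOPICS` set, the `route_question` name, and the keyword matching are all assumptions made for the example, and no Slack API is involved.

```python
# Hypothetical topics that already have a documented process; replace with your own.
DOCUMENTED_TOPICS = {"webhook", "cdk", "deploy"}


def route_question(question: str) -> str:
    """Return "agent" if the question mentions a documented topic, else "human"."""
    # Normalize each word: drop surrounding punctuation and lowercase it.
    words = {word.strip(".,?!:;").lower() for word in question.split()}
    return "agent" if words & DOCUMENTED_TOPICS else "human"
```

A real prototype would likely match against the team's actual runbooks rather than a keyword set, but the decision boundary is the same: route to the agent only when a documented answer exists.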

Related to Your Work

Directly relevant to your webhook integrations and analytics work: an AI agent could handle common merchant onboarding questions, troubleshoot offer platform issues, or help team members navigate your event-driven architecture without escalating to you every time.

Thread/Source Worth Reading

The tweet mentions a video discussion with timestamps covering specific implementation details like agent personalities, parallel org charts, and collaboration etiquette. Worth watching for the operational specifics rather than high-level AI hype.

@garrytan Save Insight

How I get my claw to be a durable AI agent I never have to instruct twice

Paste this into your OpenClaw's AGENTS.md or send it as a message:

You are not allowed to do one-off work. If I ask you to do something and it's the kind of thing that will need to happen again, you must:

1. Do it manually the first time (3-10 items)
2. Show me the output and ask if I like it
3. If I approve, codify it into a SKILL.md file in workspace/skills/
4. If it should run automatically, add it to cron with `openclaw cron add`

Every skill must be MECE: each type of work has exactly one owner skill. No overlap, no gaps.

Before creating a new skill, check if an existing one already covers it. If so, extend it instead.

The test: if I have to ask you for something twice, you failed. The first time I ask is discovery. The second time means you should have already turned it into a skill running on a cron.

When building a skill, follow this cycle:

- Concept: describe the process
- Prototype: run on 3-10 real items, no skill file yet
- Evaluate: review output with me, revise
- Codify: write SKILL.md (or extend existing)
- Cron: schedule if recurring
- Monitor: check first runs, iterate

Every conversation where I say "can you do X" should end with X being a skill on a cron, not a memory of "he asked me to do X that one time."

The system compounds. Build it once, it runs forever.
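The MECE requirement in the prompt above (exactly one owner skill per type of work) is easy to check mechanically. The sketch below is an illustration only: the `check_mece` function, the skill names, and the idea of listing work types per skill are assumptions for the example, not part of OpenClaw or of the SKILL.md format.

```python
from collections import defaultdict


def check_mece(skills: dict[str, set[str]], work_types: set[str]):
    """Return (overlaps, gaps) for a hypothetical skill registry.

    overlaps: work types claimed by more than one skill, with the claiming skills.
    gaps: known work types that no skill claims.
    """
    owners = defaultdict(list)
    for skill, covered in skills.items():
        for work_type in covered:
            owners[work_type].append(skill)
    overlaps = {wt: sorted(names) for wt, names in owners.items() if len(names) > 1}
    gaps = sorted(work_types - set(owners))
    return overlaps, gaps


# Hypothetical registry: two skills both claim "partner report".
skills = {
    "weekly-report": {"partner report"},
    "digest": {"partner report", "news digest"},
}
overlaps, gaps = check_mece(skills, {"partner report", "news digest", "webhook retry"})
```

Here `overlaps` flags "partner report" as owned by two skills and `gaps` lists "webhook retry" as unowned, which are the two conditions the prompt forbids.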

Quick Insight

Garry is sharing a standing prompt that instructs an AI agent to turn any recurring manual request into a reusable, scheduled skill. Instead of repeating the same task on request, the agent prototypes it once, codifies it in a SKILL.md file, and schedules it with cron so it runs without further prompting. This is practical automation philosophy disguised as AI agent instructions.

Actionable Takeaway

Apply this "manual → prototype → codify → automate" pattern to your own repetitive dev tasks. Next time you find yourself doing the same workflow twice (deploying side projects, processing webhook data, generating reports), force yourself to script it instead of just doing it again manually.
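One way to notice the "second time" is to keep a plain log of tasks done by hand and flag anything that repeats. This is a minimal sketch under stated assumptions: the `tasks_to_script` name, the log format (one free-text task name per entry), and the default threshold of 2 are hypothetical choices for illustration.

```python
from collections import Counter


def tasks_to_script(task_log: list[str], threshold: int = 2) -> list[str]:
    """Return task names seen at least `threshold` times, most frequent first."""
    # Normalize names so "Partner report" and "partner report " count as one task.
    counts = Counter(name.strip().lower() for name in task_log)
    return [name for name, count in counts.most_common() if count >= threshold]
```

Anything this returns is a candidate for a script or scheduled job rather than another manual run.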

Related to Your Work

Your fintech platform likely has recurring operational tasks (processing failed webhooks, generating partner reports, monitoring offer performance) that you handle manually. This framework could structure how you build internal automation tools rather than just handling requests ad-hoc each time they come up.

Thread/Source Worth Reading

No links provided. The tweet contains the complete framework - it's a self-contained methodology rather than pointing to external resources.

@geoffintech Explore Further

Quick Insight

Ramp's Geoff shows concrete numbers on company-wide AI adoption: 99.5% team usage, 84% using coding agents weekly, and non-engineers making 12% of production PRs through their custom coding agent. This isn't strategy fluff; it's an execution playbook with real metrics from a fintech that went all-in on AI tooling.

Actionable Takeaway

Build internal AI tooling that connects to your existing systems rather than just using ChatGPT in tabs. Start tracking AI usage across your team/projects and set clear expectations that AI proficiency is part of the job, not optional.
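Tracking usage can start with a single number: the share of the team that used an AI tool in a given week. The sketch below assumes you can already produce two sets of identifiers, the full team and the people with at least one logged AI-tool session that week; the function name and inputs are hypothetical, and this is not how Ramp computes its figures.

```python
def weekly_adoption_rate(team: set[str], weekly_users: set[str]) -> float:
    """Percentage of team members who used an AI tool this week, to one decimal."""
    if not team:
        return 0.0
    # Only count users who are actually on the team roster.
    active = team & weekly_users
    return round(100 * len(active) / len(team), 1)
```

Intersecting with the roster keeps contractors or service accounts in the usage log from inflating the rate.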

Related to Your Work

Your fintech platform could benefit from Ramp's approach of building custom AI agents that understand your specific systems and workflows. Consider how a coding agent connected to your credit-card-offers platform could accelerate webhook integrations and dashboard development.

Thread/Source Worth Reading

The linked article is definitely worth reading—it's a detailed playbook covering their 4-level AI proficiency framework, internal tooling strategy, and how they modified hiring/performance management. Contains specific tactics like unlimited AI budgets, leaderboards for usage, and treating tools as having week-long shelf lives.