All jobs

Agent Harness Engineer

Location
On-site • Munich
Employment Type
Full-time
Level
Mid-Senior Level

You're the person who makes the product do more things, for more customers, more reliably. You've built agents before - runtime, tools, memory, evals - and you have opinions about what makes them work.

You ship to production the day you write the code, with AI-assisted development as your default because it's how you're fastest.

If you've never built an agent, this isn't the role - we're hiring people who already have the muscle and want to build the agent everyone else will try to copy.

About the Company

We're building the AI coworker: it lives in Slack and Microsoft Teams, connects to thousands of tools, and does real work for real companies across finance, marketing, ops, and engineering. The goal is to replace half the SaaS stack with a single teammate.

What's Actually Going On

Someone in Slack asks the product to reconcile their Stripe payouts against their books. It does it, live, in under a minute. The customer tells their network, and two more teams sign up that week. That's the loop - and your job is to make it happen more often, across more surfaces, more reliably.

We handle 600K+ tool calls a day on a steep volume curve, with connections to thousands of tools - Salesforce, Notion, Stripe, QuickBooks, HubSpot, Shopify, and whatever customers ask for next. Each new integration unlocks a new shape of customer; each reliability win compounds.

What You'll Do

  • Build the agent runtime - sandboxed execution, tool orchestration, memory, and the closed loop where the agent checks its own work, turning a sentence in Slack into a verified action against a real API

  • Ship integrations - OAuth, webhooks, schema mapping, error handling - that work on the customer's actual data on day one, not on a fixture

  • Run the infrastructure - container orchestration, autoscaling, cost per task - so the product stays fast and cheap

  • Ship product features that go directly to users through Slack and Teams

  • Build whatever else needs building - small team, large surface

How You'll Know It's Working

  • You ship to production every day and the changelog has your name on it

  • A feature you built is the reason a customer closed

  • When something breaks at 3am, you can fix it because you understand the system end to end

  • The founders are writing less code because you've taken over surfaces they used to own

  • Week one: ship something to production and take ownership of a live surface

What You'll Bring

  • Personally built or significantly modified an AI agent harness, and can describe the trade-offs you made and why

  • Made an agent reliably close the loop - tests, linters, typecheckers, verification - so it checks its own work instead of hallucinating success

  • Built custom skills, CLIs, or MCP servers to make your own agentic coding faster

  • Agentic coding as your daily workflow, with informed opinions about which models, harnesses, and tools to reach for and when

  • Systems thinking: hard technical trade-offs, understanding how things break at scale, and judgment about what's worth doing well vs. fast

  • Range - from agent runtime to frontend component to deployment script in an afternoon

  • Genuine interest in how AI actually works: you've read the papers, read other people's prompts, and have opinions about evals

This role doesn't work without agentic coding fluency. If AI-assisted development isn't already how you work, you'll be behind on day one.

Even Better If

  • You've been a founder or early-stage builder

  • You've contributed to open-source AI tooling

  • You have deep model understanding and eval infrastructure experience

Why This Role Is Different

  • No layers - you work directly with both founders, and decisions get made in the room, not in a ticket

  • The agent is the product - every reliability win, new tool, and memory improvement shows up in retention the next week

  • Volume that forces you to be good - 600K+ tool calls a day means the lazy answer breaks in production immediately

  • The feedback loop is hours, not quarters

How We Work

Small team, high trust, low process. Ship your first week, talk to users your first day. Everyone owns something real - not a task, but a surface of the company customers depend on. Our infrastructure runs on Modal; you won't need every tool coming in, but you'll need to learn fast.

What's on Offer

  • Meaningful equity

  • The kind of ownership that only exists at this stage

  • Direct work with both founders on the product customers came for

Apply now
Agent Harness Engineer
Apply now