The agent stack

Five layers under every robot we ship.

We document the architecture because the buyers who matter ask. If your CTO wants a deeper read or a custom variant, write us. We will send the whitepaper and a sample repo.

Telemetry & Ops

Health, latency, intent logs, alerts, fleet view, training feedback loop. The dashboard a real on-call engineer watches at 2am when something is off.

Prometheus
Grafana
Sentry
Custom audit log
PagerDuty / Opsgenie hooks
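
As a rough illustration of the custom audit log listed above, the sketch below writes one JSON line per agent action and flags events an on-call hook might care about. Everything in it is an assumption for illustration: the `AuditEvent` fields, the `to_log_line` and `needs_alert` helpers, and the 800 ms latency budget are hypothetical, and it uses only the Python standard library rather than any of the tools named above.

```python
import json
import time
from dataclasses import asdict, dataclass, field


@dataclass
class AuditEvent:
    # All field names are hypothetical, chosen for illustration.
    robot_id: str
    intent: str
    latency_ms: float
    outcome: str  # e.g. "ok", "escalated", "error"
    ts: float = field(default_factory=time.time)


def to_log_line(event: AuditEvent) -> str:
    """Serialize one event as a single JSON line (append-only friendly)."""
    return json.dumps(asdict(event), sort_keys=True)


def needs_alert(event: AuditEvent, latency_budget_ms: float = 800.0) -> bool:
    """Flag errors and slow responses; the budget is an assumed default."""
    return event.outcome == "error" or event.latency_ms > latency_budget_ms
```

A caller would append `to_log_line(event)` to a log file or stream and forward the event to an alerting hook when `needs_alert(event)` is true.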

Agent Layer

Task graph, brand voice, persona, safety envelope, escalation paths, handoff to humans. This is the layer that makes a robot feel like a coworker instead of a parrot.

Custom Python orchestrator
LangGraph patterns where they fit
Pydantic schemas
YAML personas
Eval harness with graded test set
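
The graded test set in the list above could gate a build roughly as follows. This is a minimal sketch, not the harness itself: `EvalCase`, `grade`, `gate`, and the keyword-match grading rule are all assumptions for illustration, and a real grader would likely be richer than a substring check.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class EvalCase:
    prompt: str
    expected_keyword: str  # hypothetical grading rule: reply must contain this


def grade(agent: Callable[[str], str], cases: list[EvalCase]) -> float:
    """Return the fraction of cases whose reply contains the expected keyword."""
    if not cases:
        raise ValueError("eval set is empty")
    passed = sum(
        1 for c in cases if c.expected_keyword.lower() in agent(c.prompt).lower()
    )
    return passed / len(cases)


def gate(score: float, baseline: float, tolerance: float = 0.0) -> bool:
    """True if the new score has not regressed below the stored baseline."""
    return score >= baseline - tolerance
```

In CI, a script would call `grade` on the candidate agent and exit non-zero when `gate` returns false, so a regression stops the build before it reaches a deployment.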

Models

Whatever model wins for the task at hand. Long-context reasoning here, fast tool-calling there, on-device speech for latency, on-device LLM when the network is unreliable.

Claude 4 family for reasoning
GPT-4o for tool calling
Llama 3 8B Q4 on Orin (fallback)
Whisper-small.en (ASR)
Piper / ElevenLabs (TTS)
Moondream 2 (VLM)
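
One way to picture "whatever model wins for the task" is a small routing table with an on-device fallback. The sketch below is an assumption, not the production router: the task names, the `Route` type, and `pick_model` are hypothetical, and the strings are plain labels echoing the list above rather than real API model identifiers.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Route:
    primary: str
    fallback: str  # used when the network is unreliable


# Hypothetical routing table; labels are placeholders, not API model IDs.
ROUTES = {
    "reasoning": Route(primary="claude-4", fallback="llama-3-8b-q4"),
    "tool_call": Route(primary="gpt-4o", fallback="llama-3-8b-q4"),
    "asr": Route(primary="whisper-small.en", fallback="whisper-small.en"),
}


def pick_model(task: str, network_ok: bool) -> str:
    """Choose a model label for a task, preferring on-device when offline."""
    try:
        route = ROUTES[task]
    except KeyError:
        raise ValueError(f"no route configured for task {task!r}") from None
    return route.primary if network_ok else route.fallback
```

Keeping the table as data means a routing change is an edit to `ROUTES`, not to the code that calls `pick_model`.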

Orchestration

The wiring between the SDK and the agent. Message bus, task scheduler, sensor fusion, the bits that ROS 2 does well and the bits we replace because ROS 2 does not.

ROS 2 Humble
micro-ROS bridge
DDS
Custom Python services
Redis for ephemeral state
PostgreSQL for episodic memory

Vendor SDK

The hardware floor. Manufacturer SDK, low-level motor control, sensor I/O, on-robot networking, kill switch, charging dock protocol.

Unitree SDK 2 (low + high level)
Boston Dynamics SDK
Figure / 1X partner APIs
CAN bus + GPIO for retrofits

Operating principles

Five rules we will not break.

Models change. Interfaces should not. The agent layer talks to a model interface, not a vendor SDK. Swapping Claude for a local Llama is a config change, not a refactor.
Every agent has an escalation path. If the agent cannot answer above a set confidence threshold, it hands off to a human with structured context. No silent failures.
Evals are not optional. Every deployment ships with a graded eval set. Regressions break the build, not the customer experience.
Telemetry is a first-class citizen. If we cannot see what happened in production, we did not deploy a system, we deployed a prayer.
The kill switch is hardware, not software. Anything that moves around humans gets a physical, visible, single-action stop. No exceptions.
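
The first two rules can be sketched together: the agent depends on a model interface, and a low-confidence reply becomes a structured handoff instead of a silent failure. This is a minimal sketch under stated assumptions. `Model`, `Reply`, `Handoff`, `build_model`, `respond`, the stub backends, and the 0.7 threshold are all hypothetical names and values, and the stubs stand in for real model calls.

```python
from dataclasses import dataclass
from typing import Protocol, Union


@dataclass(frozen=True)
class Reply:
    text: str
    confidence: float  # assumed to be a score in [0, 1]


@dataclass(frozen=True)
class Handoff:
    """Structured context passed to a human operator."""
    question: str
    draft: str
    confidence: float
    reason: str


class Model(Protocol):
    """The only surface the agent layer sees; vendors live behind it."""
    def answer(self, question: str) -> Reply: ...


class CloudModel:
    def answer(self, question: str) -> Reply:
        # Stand-in for a hosted model call.
        return Reply(text=f"cloud: {question}", confidence=0.9)


class LocalModel:
    def answer(self, question: str) -> Reply:
        # Stand-in for an on-device model call.
        return Reply(text=f"local: {question}", confidence=0.4)


BACKENDS = {"cloud": CloudModel, "local": LocalModel}


def build_model(config: dict) -> Model:
    """Pick a backend from config, so a swap does not touch agent code."""
    return BACKENDS[config["model"]]()


def respond(
    model: Model, question: str, threshold: float = 0.7
) -> Union[Reply, Handoff]:
    """Answer directly, or escalate with context when confidence is low."""
    reply = model.answer(question)
    if reply.confidence < threshold:
        return Handoff(
            question, reply.text, reply.confidence, "below confidence threshold"
        )
    return reply
```

Changing `{"model": "cloud"}` to `{"model": "local"}` swaps the backend without touching `respond`, and the caller always receives either a `Reply` or a `Handoff`, never nothing.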

Want the deeper read?

Send your CTO. We will send the whitepaper.

Request the Whitepaper →