Phi9 — physical AI lab

Physical AI research, data systems, and deployable intelligence.

Phi9 captures real-world behavior, structures it for training, and evaluates what actually transfers — so research can become usable capability.

The loop

From capture to evaluation, the physical AI loop.

One loop. Four stages. The work is turning real-world behavior into signal, training against it, and measuring what survives contact with reality.

001

Capture

Capture real-world demonstrations with synchronized motion, video, and structured task data. Start from real signal: task context, traces, and motion that can be used again.

002

Multiply

Multiply scarce data through retargeting, simulation, augmentation, and better structure. Stretch each capture further without letting intent or task definition drift away.

003

Train

Train policies and research systems on data that preserves intent, motion, and context. The point is not isolated models; it is a training layer that stays close to reality.

004

Evaluate

Evaluate what generalizes through benchmarks, failure analysis, and transfer tests. Treat deployment feedback and failure traces as part of the same loop.
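The four stages above can be sketched as one loop over plain functions. This is a minimal illustration of the cycle's shape, not Phi9's pipeline; every name and data structure here is hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Demo:
    """One captured demonstration: synchronized traces plus task context."""
    task: str
    traces: dict  # modality name -> samples


def capture(task: str) -> Demo:
    # Stand-in for a real rig: record motion, video, and task context together.
    return Demo(task=task, traces={"motion": [0.1, 0.2], "video": ["frame0"]})


def multiply(demos: list) -> list:
    # Stand-in for retargeting/augmentation: stretch each capture
    # without changing the task it was demonstrating.
    copies = [Demo(task=d.task, traces=d.traces) for d in demos]
    return demos + copies


def train(demos: list) -> dict:
    # Stand-in policy: remembers which tasks it has seen data for.
    return {"tasks_seen": {d.task for d in demos}}


def evaluate(policy: dict, held_out_tasks: list) -> dict:
    # Transfer test: which held-out tasks does the policy cover at all?
    return {t: t in policy["tasks_seen"] for t in held_out_tasks}


# One pass around the loop: capture -> multiply -> train -> evaluate.
demos = multiply([capture("open_drawer")])
policy = train(demos)
report = evaluate(policy, ["open_drawer", "pour_cup"])
```

The point of keeping all four stages in one script, even as a toy, is that the evaluation report can feed directly back into what gets captured next.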

  • 4 loop stages
  • 12+ modalities per capture
  • 25+ environment types

Research questions

The bottlenecks we are actively working through.

These are not abstract themes. They are the constraints shaping the systems, experiments, and artifacts we are building now.

  1. Data that carries intent, not just observation.

    Most pipelines record visible motion but lose the task underneath it. We are working on capture that preserves action, context, and what the body was trying to achieve.

  2. Physical data is expensive.

    You cannot scrape physical behavior. Every demonstration needs a rig, a subject, a calibration, and a clean task. The work is making each capture travel further without losing signal.

  3. Benchmarks that predict real-world performance.

    A benchmark score means little if a policy falls apart on an unscripted task. We care about evaluation that predicts transfer, failure, and what survives outside the benchmark.

  4. One loop, not three stages.

    Capture, training, and evaluation still get treated as separate departments. We are trying to wire them into one visible loop so progress does not disappear between stages.

Methods

The concrete systems we are building around the loop.

Methods should feel like work, not philosophy. These are the concrete layers we are building now so demonstration, training, and deployment stay connected.

Capture surfaces

Rigs, sync, task framing, and sensor traces that start with the real world instead of a benchmark-only abstraction.
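As one illustration of what "sync" means at the data level, here is a toy alignment of two sensor streams by nearest timestamp. All names are hypothetical, and a real rig also has to handle clock drift and dropped frames, which this ignores.

```python
def align(reference: list, other: list) -> list:
    """For each reference timestamp, index of the nearest sample in `other`."""
    indices = []
    for t in reference:
        nearest = min(range(len(other)), key=lambda i: abs(other[i] - t))
        indices.append(nearest)
    return indices


# 30 Hz video timestamps aligned against a 100 Hz motion stream.
video_ts = [0.0, 0.033, 0.066]
motion_ts = [i * 0.01 for i in range(10)]
pairing = align(video_ts, motion_ts)
```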

Data structure

Labels, schemas, exports, and task boundaries that keep demonstrations reusable across research, training, and downstream tooling.
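One way to make "labels, schemas, and task boundaries" concrete is a typed record per demonstration. The fields below are illustrative, not Phi9's actual schema.

```python
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class TaskContext:
    name: str         # e.g. "open_drawer"
    goal: str         # what the demonstration was trying to achieve
    environment: str  # e.g. "kitchen_a"


@dataclass(frozen=True)
class Segment:
    start_s: float
    end_s: float
    label: str        # phase label within the task, e.g. "reach"


@dataclass
class Demonstration:
    task: TaskContext
    segments: list    # explicit task boundaries (list of Segment)
    modalities: dict  # modality name -> file path or array reference


demo = Demonstration(
    task=TaskContext("open_drawer", "drawer fully open", "kitchen_a"),
    segments=[Segment(0.0, 1.2, "reach"), Segment(1.2, 3.0, "pull")],
    modalities={"motion": "motion.parquet", "video": "cam0.mp4"},
)

# asdict gives a plain-dict export, so the same record travels to
# training code and downstream tooling unchanged.
export = asdict(demo)
```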

Training systems

Retargeting, augmentation, and policy pipelines that stretch scarce behavior while preserving intent, motion, and context.
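A toy example of "stretching scarce behavior while preserving intent": mirror a motion trace across one axis while copying the task label through untouched. The names and the dict layout are hypothetical.

```python
def mirror_x(trace: list) -> list:
    """Reflect an (x, y, z) trajectory across the y-z plane."""
    return [(-x, y, z) for (x, y, z) in trace]


def augment(demo: dict) -> list:
    """Return the original plus a mirrored copy.

    The task label, the part that carries intent, is never modified;
    only the geometry of the motion changes.
    """
    mirrored = dict(demo, trace=mirror_x(demo["trace"]))
    return [demo, mirrored]


demo = {"task": "open_drawer", "trace": [(0.1, 0.0, 0.2), (0.3, 0.0, 0.2)]}
pair = augment(demo)
```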

Evaluation layer

Benchmark fragments, transfer tests, and failure traces that keep the loop honest about what actually generalizes.
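A minimal shape for a transfer test: per-environment success rates, split into seen versus held-out, so the gap between them is visible. All numbers and environment names below are illustrative.

```python
def transfer_report(results: dict, seen: set) -> dict:
    """Summarize success rates on seen vs held-out environments.

    `results` maps environment -> list of per-episode successes (bools).
    The seen/held-out gap is the number the loop is accountable to.
    """
    def rate(envs):
        episodes = [ok for e in envs for ok in results[e]]
        return sum(episodes) / len(episodes) if episodes else 0.0

    held_out = set(results) - seen
    return {
        "seen": rate(seen),
        "held_out": rate(held_out),
        "gap": rate(seen) - rate(held_out),
    }


results = {
    "kitchen_a": [True, True, True, False],    # seen during training
    "kitchen_b": [True, False, False, False],  # never seen
}
report = transfer_report(results, seen={"kitchen_a"})
```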

Contact

Building in physical AI?

If you are working on data, research systems, or deployment infrastructure for physical intelligence, write to us.

Read the manifesto