When you encounter a Sim2Real gap like this, there are two options. The easy option is to introduce a new reward, telling the robot not to do whatever bad thing it is doing. But the problem is that these rewards are a bit like duct tape on the robot: inelegant, and missing the root causes. They pile up, and they cloud the original objective of the policy with many other terms. The result is a policy that might work, but is not understandable, and behaves unpredictably when composed with new rewards.
The other, harder, option is to take a step back and figure out what it is about the simulation that differs from reality. Agility as a company has always been focused on understanding the physical intuition behind what we do. It’s how we designed our robot, all the way from the actuators to the software.
Our RL approach is no different. We want to understand the why and use that to drive the how. So we began a six-month journey to figure out why our simulated toes don’t do the same thing as our real toes.
It turns out there are a lot of reasons. There were simplifying assumptions in the collision geometry, inaccuracies in how energy propagated through our actuators and transmissions, and instabilities in how constraints were solved in our unique closed-chain kinematics (formed by the connecting rods attached to our toe plates and tarsus). And we’ve been systematically studying, fixing, and eliminating these gaps.
The net result has been a huge step forward in our RL software stack. Instead of a pile of stacked-reward functions over everything from “Stop wiggling your foot” to “Stand up straighter,” we have a handful of rewards around things like energy consumption and symmetry that are not only simpler, but also follow our basic intuitions about how Digit should move.
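To make the contrast concrete, here is a minimal sketch of what a compact reward built from an energy term and a symmetry term could look like. It is an illustration only, not our production reward: the function names, the weights `w_energy` and `w_symmetry`, and the choice of comparing mirrored joint positions at the same instant are all assumptions made for this example (a real gait-symmetry term would likely compare the two legs half a stride apart).

```python
from typing import Sequence


def energy_penalty(torques: Sequence[float],
                   joint_velocities: Sequence[float]) -> float:
    """Sum of absolute mechanical power |torque * velocity| over all joints."""
    return sum(abs(t * w) for t, w in zip(torques, joint_velocities))


def symmetry_penalty(left_joint_pos: Sequence[float],
                     right_joint_pos: Sequence[float]) -> float:
    """Mean squared difference between mirrored left and right joint positions."""
    diffs = [(l - r) ** 2 for l, r in zip(left_joint_pos, right_joint_pos)]
    return sum(diffs) / len(diffs)


def compact_reward(torques: Sequence[float],
                   joint_velocities: Sequence[float],
                   left_joint_pos: Sequence[float],
                   right_joint_pos: Sequence[float],
                   w_energy: float = 0.01,      # hypothetical weight
                   w_symmetry: float = 1.0      # hypothetical weight
                   ) -> float:
    """Negative weighted sum of the two penalties; higher is better."""
    return -(w_energy * energy_penalty(torques, joint_velocities)
             + w_symmetry * symmetry_penalty(left_joint_pos, right_joint_pos))


if __name__ == "__main__":
    # Two joints per leg, perfectly mirrored posture, modest power draw.
    reward = compact_reward(
        torques=[1.0, -2.0],
        joint_velocities=[2.0, 1.0],
        left_joint_pos=[0.1, 0.2],
        right_joint_pos=[0.1, 0.2],
    )
    print(reward)
```

Each term here maps to a physical quantity you can reason about, which is the point: there is no term that exists only to suppress one specific misbehavior.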
Investing the time to understand why the simulation differed has taught us a lot more about why we want Digit to move a certain way in the first place. And most importantly, coupled with fast NVIDIA Isaac Sim, a reference application built on NVIDIA Omniverse for simulating and testing AI-driven robots, it’s enabled us to explore the impact of different physical characteristics that we might want in future generations of Digit.