Giving robots a sense of touch.
LeRobot has made low-cost robot learning widely accessible, but its policies are still blind to contact. We add FlexiTac tactile sensing to the SO-100/SO-101 platform and find that a modest amount of touch meaningfully lifts success rates on contact-rich manipulation across four policy families.
* Equal contribution
Columbia University
The missing modality.
In recent years, open-source robot learning has exploded in popularity. LeRobot now packages datasets, imitation-learning baselines, and low-cost hardware like the SO-ARM100/101 that allows anybody with a laptop and a few hundred dollars worth of parts to access the cutting-edge of manipulation research. The latest vision-language-action models such as Pi0.5 and SmolVLA plug directly into this stack.
Yet almost all of this progress is driven by cameras. The robot sees the scene but does not feel it. For tasks where vision is occluded or touch is essential, such as inserting a tube into a rack, aligning a peg, or finding a pen in a cluttered bag, a camera-only policy is blind to the most informative part of the interaction. When the grasp slips or the object hides behind the gripper, it has nothing left to reason over.
To us, this is a glaring gap in the field. Touch is the obvious missing modality; the open question is how much it actually matters once you add it to the standard LeRobot pipeline.
A drop-in tactile path.
Our setup makes minimal changes to the existing LeRobot pipeline. We replace the stock SO-100 (or SO-101) jaw with a tactile gripper built around the FlexiTac sensor, and we extend LeRobot's dataset and policy interfaces with a single observation.tactile.* stream that all four policies consume through the same schema.
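To make the idea concrete, here is a rough sketch of what such a unified observation dictionary could look like. Only the observation.tactile.* prefix comes from this post; the camera/state key names and every shape below are illustrative assumptions, not the repository's actual schema.

```python
import numpy as np

# Illustrative observation dict in the spirit of LeRobot's flat key layout.
# Key names (other than the observation.tactile.* prefix) and shapes are
# assumptions for this sketch.
obs = {
    "observation.images.top": np.zeros((3, 480, 640), dtype=np.float32),  # RGB frame
    "observation.state": np.zeros(6, dtype=np.float32),                   # joint positions
    "observation.tactile.left": np.zeros((16, 16), dtype=np.float32),     # taxel pressure map
    "observation.tactile.right": np.zeros((16, 16), dtype=np.float32),
}

# Any policy can pick up the tactile stream the same way, by prefix:
tactile_keys = [k for k in obs if k.startswith("observation.tactile.")]
print(tactile_keys)  # ['observation.tactile.left', 'observation.tactile.right']
```

Because the tactile maps live under one predictable prefix, a policy never needs to know how many sensor pads the gripper has; it just consumes every key under the prefix.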
On the policy side, we add one small hook per architecture. ACT, Pi0.5, and SmolVLA receive tactile maps as a handful of extra transformer tokens, while Diffusion Policy folds them into its global conditioning vector. Everything else stays the same, so the change is effectively plug-and-play.
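The two hook styles above can be sketched in a few lines. This is a minimal numpy stand-in, not the actual implementation: the real hooks would use a learned linear layer inside each policy, and the taxel grid size, token count, and embedding width here are assumptions.

```python
import numpy as np

def tactile_to_tokens(tactile_map, n_tokens=4, d_model=256, rng=None):
    """Split a taxel pressure map into n_tokens patches and linearly project
    each patch to the policy's token width (stand-in for a learned nn.Linear).
    These tokens would be concatenated to the transformer's input sequence
    for ACT / Pi0.5 / SmolVLA-style policies."""
    rng = np.random.default_rng(0) if rng is None else rng
    patches = tactile_map.reshape(n_tokens, -1)                # (n_tokens, patch_dim)
    W = rng.standard_normal((patches.shape[1], d_model)) * 0.02  # frozen random proxy weights
    return patches @ W                                         # (n_tokens, d_model)

# Example: a 16x16 taxel grid -> 4 extra transformer tokens.
tokens = tactile_to_tokens(np.zeros((16, 16), dtype=np.float32))
print(tokens.shape)  # (4, 256)

# Diffusion Policy variant: flatten the same tokens into its global
# conditioning vector instead of appending them to a token sequence.
cond = tokens.reshape(-1)
print(cond.shape)  # (1024,)
```

The point of the sketch is that both hooks share one projection; only where the result is attached (token sequence vs. conditioning vector) differs per architecture.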
For each task we collect paired datasets with and without tactile input, train matched policies from the same checkpoints, and evaluate on held-out trials. The documentation has the full command list and hardware guide.
What touch unlocks.
We evaluate three tasks, each with all four policy families. For every task, we report tactile vs. no-tactile success rates, a sped-up long rollout (10× speed), a short real-time clip (1× speed), and a typical no-tactile failure (2× speed).
Takeaways.
Across the three tasks and four policy families, a few things stood out to us:
- Tactile helps most when vision runs out. The largest gains show up on partial-observability tasks like pen retrieval inside a bag or peg alignment at the moment of contact, where the camera simply cannot see the relevant geometry.
- The signal generalizes across architectures. ACT, Diffusion Policy, Pi0.5, and SmolVLA all improve with the same tactile stream, suggesting the bottleneck was the modality, not any one policy's capacity.
- A little tactile goes a long way. Four tactile tokens is usually enough. Pushing the count higher adds noise without improving success.
- Pi0.5 needs full fine-tuning. Action-expert-only and LoRA fine-tunes collapse on tactile tasks; only full fine-tuning at a small learning rate (2.5e-5) converges reliably.
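As a concrete reading of that last point, here is a hedged sketch of the fine-tuning settings it implies. Only the 2.5e-5 learning rate appears in this post; the key names and remaining values are illustrative, not the repository's actual config.

```python
# Hypothetical Pi0.5 fine-tuning settings matching the takeaway above.
# Key names are illustrative; only the learning rate comes from the post.
pi05_finetune = {
    "tune_action_expert_only": False,  # action-expert-only runs collapsed
    "use_lora": False,                 # LoRA runs also collapsed on tactile tasks
    "learning_rate": 2.5e-5,           # small LR needed for reliable convergence
}
print(pi05_finetune["learning_rate"])  # 2.5e-05
```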
Reproduce it.
Cite this work.
Please cite this work as
Naian Tao, Yifan He*, Wesley Maa*, Binghao Huang, Yunzhu Li, "LeFlexiTac: Adding Touch to Low-Cost Robot Learning", Columbia University RoboPIL Blog, 2026.
Or use the BibTeX citation:
@article{tao2026leflexitac,
author = {Tao, Naian and He, Yifan and Maa, Wesley and Huang, Binghao and Li, Yunzhu},
title = {LeFlexiTac: Adding Touch to Low-Cost Robot Learning},
journal = {Columbia University RoboPIL Blog},
year = {2026},
note = {https://github.com/TNA001-AI/lerobot_tactile},
}