In robotic manipulation, precision tasks such as peg insertion are vital for assessing a robot's capability for fine motor control. Our experiment focuses on a bimanual robotic arm setup inserting a peg into a hole. The peg and hole dimensions, as well as their positions, remain fixed throughout the experiment, which allowed us to explore specific metrics under controlled conditions.
Experiment Setup
This experiment utilized a bimanual robot with two manipulators. We equipped the setup with four cameras for visual perception:
Two wrist-mounted cameras, one on each arm, to capture close-up interactions.
Two additional cameras, one positioned above the task space and one positioned below, offering global views of the workspace (a hypothetical layout sketch follows this list).
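For reference, the snippet below summarizes this camera layout as a simple configuration dictionary. The camera names and descriptions are illustrative assumptions, not the exact identifiers used in the released dataset.

# Hypothetical camera layout for the bimanual setup described above.
# Names and descriptions are illustrative assumptions only.
CAMERA_CONFIG = {
    "cam_left_wrist": {"mount": "left wrist", "view": "close-up of gripper interaction"},
    "cam_right_wrist": {"mount": "right wrist", "view": "close-up of gripper interaction"},
    "cam_high": {"mount": "above the task space", "view": "global top view"},
    "cam_low": {"mount": "below the task space", "view": "global lower view"},
}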
We trained the robot on 60 human demonstrations of the peg-insertion task, using the Action Chunking with Transformers (ACT) model to learn the behavior from the demonstrations. The robot's objective was to replicate the actions observed in these demonstrations with high precision and consistency.
The collected training episodes are available on Hugging Face under the TrossenRoboticsCommunity organization, along with the trained models. These models are accessible for further experimentation, allowing researchers to test the trained robot or use the evaluation episodes for additional training.
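For readers who want to inspect the episodes programmatically, here is a minimal sketch. It assumes the lerobot library's LeRobotDataset interface and the evaluation dataset repository linked at the end of this post; the import path and returned keys may differ between lerobot releases.

# Minimal sketch: loading the peg-insertion episodes for inspection.
# Assumes `pip install lerobot`; the import path below matches recent
# lerobot releases but may change between versions.
from lerobot.common.datasets.lerobot_dataset import LeRobotDataset

dataset = LeRobotDataset("TrossenRoboticsCommunity/eval_aloha_static_peg_insertion")
print(f"Total frames: {len(dataset)}")

# Each item is a dict of tensors: camera images, joint states, and actions.
frame = dataset[0]
for key, value in frame.items():
    print(key, getattr(value, "shape", type(value)))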
Importance of the Experiment
The core goal of this experiment was to understand how imitation learning models like ACT perform in a highly controlled setup with fixed parameters. The conditions were deliberately constrained, including:
Fixed starting positions and orientations of the peg and hole,
Consistent dimensions of both the peg and hole,
These constraints help establish the baseline performance of the model. By focusing on fixed variables, we are able to isolate key performance metrics like precision, success rate, and error recovery.
Metrics and Results
We measured the robot's performance based on its ability to complete the peg-insertion task accurately across multiple evaluation runs. After training, the robot achieved an 80% success rate over 30 evaluation rollouts. While this level of success is promising, the experiment also highlights areas for improvement, especially generalization beyond the fixed conditions.
The primary metrics of this experiment include:
Success Rate: The robot completed 80% of the peg insertions successfully across 30 evaluation rollouts. This is a promising result, but it leaves room for improvement, especially when the environment or task parameters are varied (a small aggregation sketch follows this list).
Task Completion Time: Completion time was consistent with the length of the training episodes, approximately 15 seconds per insertion attempt. This indicates that the robot replicated the pace of the human demonstrations without delays or deviations.
Error Recovery: One of the most significant aspects of this experiment was the robot's ability to recover from errors. When disturbances occurred, such as the peg or hole being forcefully removed, the robot adjusted its actions and attempted the task again. If the task could not be completed within the 15-second episode window, the robot completed the insertion in the following episode. This showcases the robot's ability to handle interruptions and recover from disturbances, a critical factor in real-world industrial applications.
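As a concrete illustration of how these numbers are aggregated, the short sketch below computes success rate and mean completion time from per-rollout records. The record structure and values shown are hypothetical placeholders, not our actual evaluation logs.

# Minimal sketch: aggregating evaluation metrics from per-rollout records.
# The records are hypothetical placeholders, not the actual evaluation logs.
rollouts = [
    {"success": True, "completion_time_s": 14.8},
    {"success": True, "completion_time_s": 15.1},
    {"success": False, "completion_time_s": None},  # failed insertion
    # ... one record per evaluation rollout (30 in this experiment)
]

successful = [r for r in rollouts if r["success"]]
success_rate = len(successful) / len(rollouts)
mean_time_s = sum(r["completion_time_s"] for r in successful) / len(successful)

print(f"Success rate: {success_rate:.0%} over {len(rollouts)} rollouts")
print(f"Mean completion time (successful rollouts): {mean_time_s:.1f} s")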
Why This Experiment Matters
This experiment serves as a critical benchmark for understanding the limitations and capabilities of imitation learning models like ACT in handling constrained environments. Peg-insertion is widely used in robotics research because it requires high precision, fine motor control, and the ability to integrate sensor data for decision-making.
Controlled Parameters Help Isolate Performance Factors: By keeping the peg, hole, and positions fixed, we were able to focus on the robot’s ability to mimic human-like precision rather than complicating the experiment with too many variables.
Benchmarking Imitation Learning Models: Since imitation learning relies heavily on human demonstrations, this experiment highlights how well the ACT model can learn and replicate human actions. The high success rate shows promise, but it also reveals that improvements are necessary for more dynamic environments.
Next Steps: Generalization and Adaptability
While the current experiment succeeded within the confines of fixed parameters, generalization remains the key challenge in robotic manipulation. In real-world scenarios, the robot will need to handle:
Variations in peg and hole sizes,
Different starting positions and orientations,
Dynamic environments, where the object or target may move slightly.
Our next goal is to train the robot to handle these variations, making it more robust and adaptable. By loosening these constraints, we will test how well the model generalizes beyond the data it was originally trained on.
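One simple way to loosen the constraints is to randomize the initial conditions at the start of each collected or evaluated episode. The sketch below is purely illustrative; the parameter names and ranges are assumptions, not values from our setup.

# Hypothetical sketch: sampling randomized initial conditions for future
# generalization experiments. Names and ranges are illustrative assumptions.
import random

def sample_initial_conditions():
    return {
        # Offsets (in metres) from the nominal fixed starting positions.
        "peg_xy_offset": (random.uniform(-0.03, 0.03), random.uniform(-0.03, 0.03)),
        "hole_xy_offset": (random.uniform(-0.03, 0.03), random.uniform(-0.03, 0.03)),
        # Rotation about the vertical axis, in radians.
        "peg_yaw": random.uniform(-0.5, 0.5),
        # Scale factor applied to the nominal peg/hole diameter.
        "size_scale": random.uniform(0.9, 1.1),
    }

print(sample_initial_conditions())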
Get Involved
If you are interested in testing the models or exploring the dataset for your own training, the collected episodes are available on Hugging Face. You can also access the trained models and experiment with them in your own environment. Additionally, the evaluation episodes are available for further analysis and comparison. The community is encouraged to build upon this experiment and contribute to refining the models to handle a wider variety of scenarios.
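As a starting point, the sketch below shows one way to download the evaluation episodes locally with the huggingface_hub client; the trained-model repository ID is not listed in this post, so it is left as a placeholder.

# Minimal sketch: downloading the evaluation episodes with huggingface_hub.
# Requires `pip install huggingface_hub`.
from huggingface_hub import snapshot_download

local_dir = snapshot_download(
    repo_id="TrossenRoboticsCommunity/eval_aloha_static_peg_insertion",
    repo_type="dataset",
)
print(f"Evaluation episodes downloaded to: {local_dir}")

# The trained models can be fetched the same way once you know their repo ID:
# snapshot_download(repo_id="<model-repo-id>")  # placeholder, not a real repo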
Feel free to reach out, share your findings, and help us push the boundaries of robotic manipulation!
Evaluation Dataset: https://huggingface.co/datasets/TrossenRoboticsCommunity/eval_aloha_static_peg_insertion
Training Dataset: