Because large datasets have seen limited design and adoption in reinforcement learning and robotics, it is not obvious how to move towards an “ImageNet-scale” dataset for robotics that is useful to the entire research community. Sudeep Dasari, along with fellow researchers at the University of California, Berkeley, therefore proposed collecting data across many different settings, including varying camera viewpoints, varying environments, and even varying robot platforms. Motivated by the success of large-scale data-driven learning, the researchers created RoboNet, an extensible and diverse dataset of robot interaction collected across four different research labs. The collaborative nature of this work allowed them to easily capture diverse data spanning a wide variety of objects, robotic hardware, and camera viewpoints. Finally, they found that pre-training on RoboNet offers substantial performance gains over training from scratch in entirely new environments.
According to BAIR (Berkeley Artificial Intelligence Research), RoboNet consists of 15 million video frames collected by different robots interacting with different objects in a table-top setting. Every frame includes the image recorded by the robot’s camera, the arm pose, force sensor readings, and the gripper state. The collection environment, including the camera view, the appearance of the table or bin, and the objects in front of the robot, is varied between trials. Since collection is entirely autonomous, large amounts of data can be collected cheaply across multiple institutions.
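To make the per-frame contents concrete, here is a minimal sketch of what one such record might look like. The class name, field names, and shapes are illustrative assumptions, not RoboNet's actual schema:

```python
from dataclasses import dataclass
import numpy as np

# Hypothetical sketch of a single RoboNet frame; field names and
# shapes are illustrative, not the dataset's actual schema.
@dataclass
class FrameRecord:
    image: np.ndarray       # RGB image from the robot's camera
    arm_pose: np.ndarray    # end-effector pose, e.g. (x, y, z, yaw)
    force: np.ndarray       # force sensor readings
    gripper_closed: bool    # binary gripper state

def frames_well_formed(frames):
    """Sanity check: each frame carries an image and a 1-D pose."""
    return all(f.image.ndim == 3 and f.arm_pose.ndim == 1
               for f in frames)

# A toy 30-frame trajectory of all-zero records.
frames = [FrameRecord(image=np.zeros((48, 64, 3), dtype=np.uint8),
                      arm_pose=np.zeros(4),
                      force=np.zeros(3),
                      gripper_closed=False)
          for _ in range(30)]
print(frames_well_formed(frames))  # True
```

Bundling image, pose, force, and gripper state per frame is what lets the same record format serve robots from different labs.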
The Usage and Implementation of RoboNet
According to the researchers, after collecting this diverse dataset, they experimentally investigated how it could be used to enable general skill learning that transfers to new environments. First, they pre-trained visual dynamics models on a subset of data from RoboNet, and then fine-tuned them to work in an unseen test environment using a small amount of new data. The constructed test environments all include different lab settings, new cameras and viewpoints, held-out robots, and novel objects purchased after data collection concluded.
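The pre-train-then-fine-tune recipe can be sketched with a toy linear model standing in for the learned video-prediction network. Everything here is synthetic and assumed; only the workflow (fit on a large shared dataset, then continue training the same weights on a few samples from the new environment) mirrors the paper:

```python
import numpy as np

def train(W, X, Y, epochs=100, lr=0.1):
    """Fit targets Y from inputs X by gradient descent on squared error."""
    for _ in range(epochs):
        grad = X.T @ (X @ W - Y) / len(X)
        W = W - lr * grad
    return W

rng = np.random.default_rng(0)
true_W = rng.normal(size=(6, 4))       # shared "dynamics" across envs
X_big = rng.normal(size=(500, 6))      # large "RoboNet" pre-training set
X_small = rng.normal(size=(20, 6))     # few trajectories from the new env

W = np.zeros((6, 4))
W = train(W, X_big, X_big @ true_W)                  # pre-train
W = train(W, X_small, X_small @ true_W, epochs=20)   # fine-tune
print(np.allclose(W, true_W, atol=1e-2))  # True
```

The fine-tuning stage starts from the pre-trained weights rather than from scratch, which is exactly the advantage the researchers measure in their benchmarks.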
After tuning, Sudeep and his fellow researchers deployed the learned dynamics models in the test environment to perform control tasks, such as picking and placing objects, using the visual foresight model-based reinforcement learning algorithm.
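Visual foresight chooses actions by model-predictive control: sample candidate action sequences, roll them out through the learned model, score each rollout against a goal, and refit the sampling distribution with the cross-entropy method. A minimal sketch of that planning loop, where `predict_positions` is a stand-in for the learned visual dynamics model and the cost is a simple distance to goal:

```python
import numpy as np

def predict_positions(state, actions):
    # Stand-in for the learned visual dynamics model: here "state"
    # is a 2-D position and actions are displacements.
    return state + np.cumsum(actions, axis=0)

def cem_plan(state, goal, horizon=5, action_dim=2,
             samples=200, elites=20, iters=3, seed=0):
    """Cross-entropy-method planner in the spirit of visual foresight."""
    rng = np.random.default_rng(seed)
    mean = np.zeros((horizon, action_dim))
    std = np.ones((horizon, action_dim))
    for _ in range(iters):
        candidates = rng.normal(mean, std, (samples, horizon, action_dim))
        # Cost of a candidate: distance of its final predicted state
        # to the goal.
        costs = np.array([
            np.linalg.norm(predict_positions(state, a)[-1] - goal)
            for a in candidates
        ])
        elite = candidates[np.argsort(costs)[:elites]]
        mean, std = elite.mean(axis=0), elite.std(axis=0) + 1e-6
    return mean[0]  # execute only the first action (MPC style)

first_action = cem_plan(np.zeros(2), np.array([1.0, 0.5]))
print(first_action.shape)  # (2,)
```

In the real system the model predicts video frames and the cost compares predicted frames to a goal image; only the sampling-and-refitting loop is shown here.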
They could then numerically evaluate whether pre-trained controllers pick up skills in new environments faster than randomly initialized ones. In each environment, the researchers used a standard set of benchmark tasks to compare the performance of their pre-trained controller against that of a model trained only on data from the new environment. The results show that the fine-tuned model is ~4x more likely to complete the benchmark task than the one trained without RoboNet. Impressively, the pre-trained models can even slightly outperform models trained from scratch on significantly (5-20x) more data from the test environment. This suggests that transfer from RoboNet does indeed offer large performance gains compared to training from scratch!
Researchers Involved in RoboNet
Frederik is a PhD student in Computer Science at UC Berkeley, advised by Prof. Sergey Levine and Prof. Chelsea Finn (Stanford CS department). In his research at the Berkeley Artificial Intelligence Research (BAIR) Lab, he focuses on developing algorithms for robotic manipulation using techniques from deep learning, deep reinforcement learning, and classical robotics. He completed a Bachelor’s degree in mechatronics and information technology and a master’s degree in “Robotics, Cognition, Intelligence” at TU Munich (TUM).
Previously, he worked at the mechatronics institute of the German Aerospace Center (DLR) on the mechanical design and control system of a quadruped robot.
Stephen is a third-year Electrical Engineering and Computer Science student at UC Berkeley.
He is fortunate to work with Prof. Sergey Levine and his mentor Frederik Ebert as an undergraduate researcher in the Robotic AI & Learning Lab. His research interests currently lie in robotics and reinforcement learning. He recently spent an amazing summer working with Dr. Roberto Calandra and others on the Facebook AI Research Robotics team.
Suraj is a PhD student in Computer Science at Stanford University, where he works at the intersection of machine learning, computer vision, and robotics. Specifically, he is interested in problems relating to multi-task learning, hierarchical reinforcement learning, and perception for robotics. He is co-advised by Professors Chelsea Finn and Silvio Savarese, and is funded by the National Science Foundation Graduate Fellowship.
Suraj completed his Bachelor’s in Computer Science at the California Institute of Technology (Caltech), where he worked with Yisong Yue on multi-agent reinforcement learning. In the past he has worked at Google Brain and General Electric Current.
Bernadette is a PhD student in the GRASP Lab at the University of Pennsylvania, advised by Dr. Kostas Daniilidis. Her research interests broadly lie in developing meaningful representations of sensory data in robotic systems for intelligent autonomous decision making. Her current work focuses on neuromorphic approaches to perceptual decision making.
Prior to starting her PhD, Bernadette was a Senior Software Engineer at Lockheed Martin Corporation where she worked from 2014 to 2019. She received an M.A. in Mathematics, M.A. in Economics, and B.S. in Mathematics and Economics from The University of Alabama in 2014.
Karl is a Robotics Master’s student at the University of Pennsylvania. He graduated from the University of Massachusetts Amherst with a major in Computer Science, a concentration in Robotics, and a minor in Mathematics. He completed his undergraduate thesis on Lazy Localization under Professor Roderic A. Grupen at the Laboratory for Perceptual Robotics. Between earning his undergraduate degree and starting his master’s, he spent six months working at MIT Lincoln Laboratory on semantic mapping, visual SLAM, and augmented reality. He is interested in applying artificial intelligence to allow robots to act in complex, changing, and uncertain environments.
Siddharth is a Master’s student in Electrical Engineering at the University of Pennsylvania, working in the domain of general robotics. He believes in a roll-up-your-sleeves, get-it-done approach. He is familiar with many aspects of robotic systems development, including but not limited to path/trajectory planning and tracking, control system design, mapping, and Bayesian filtering techniques. He is also familiar with geometric perception techniques and learning-based vision paradigms. Siddharth is now trying to bridge the gap between planning/control frameworks and vision frameworks.
Sergey received a BS and MS in Computer Science from Stanford University in 2009, and a PhD in Computer Science from Stanford University in 2014. He joined the faculty of the Department of Electrical Engineering and Computer Sciences at UC Berkeley in fall 2016. His work focuses on machine learning for decision making and control, with an emphasis on deep learning and reinforcement learning algorithms. Applications of his work include autonomous robots and vehicles, as well as computer vision and graphics. His research includes developing algorithms for end-to-end training of deep neural network policies that combine perception and control, scalable algorithms for inverse reinforcement learning, deep reinforcement learning algorithms, and more.
Chelsea is an Assistant Professor in Computer Science and Electrical Engineering at Stanford University. Her lab, IRIS, studies intelligence through robotic interaction at scale, and is affiliated with SAIL and the Statistical ML Group. She also spends time at Google as a part of the Google Brain team. Chelsea is interested in the capability of robots and other agents to develop broadly intelligent behavior through learning and interaction. Previously, she completed her PhD in computer science at UC Berkeley and her B.S. in electrical engineering and computer science at MIT.
Sudeep is a PhD student at the Robotics Institute in Carnegie Mellon’s School of Computer Science. He aspires to build scalable robotic learning algorithms, which can parse the visual world and enable autonomous agents to perform complex tasks in diverse environments. He is advised by Professor Abhinav Gupta.
In a prior life, he was an undergraduate student at UC Berkeley, where he worked with Professor Sergey Levine on deep reinforcement learning/machine learning research. He also worked at Los Alamos National Laboratory with Dr. David Mascareñas on cyber-physical systems research.
Sudeep Dasari’s Thoughts on RoboNet
According to Sudeep, “this work takes the first step towards creating learned robotic agents that can operate in a wide range of environments and across different hardware. While our experiments primarily explore model-based reinforcement learning, we hope that RoboNet will inspire the broader robotics and reinforcement learning communities to investigate how to scale model-based or model-free RL algorithms to meet the complexity and diversity of the real world.”
He further added, “since the dataset is extensible, we encourage other researchers to contribute the data generated from their experiments back into RoboNet. After all, any data containing robot telemetry and video could be useful to someone else, so long as it contains the right documentation. In the long term, we believe this process will iteratively strengthen the dataset, and thus allow our algorithms that use it to achieve greater levels of generalization across tasks, environments, robots, and experimental set-ups.”