Key Robotics Datasets and Open-Source Code

Summary

Key robotics datasets and open-source code are resources that provide researchers and engineers with essential information and tools to develop, train, and test robots for tasks like movement, manipulation, and sensing. These datasets typically include recorded robot actions, sensor data, and supporting code, making cutting-edge robotics research more accessible to everyone.

  • Explore diverse datasets: Take advantage of publicly available collections that feature various robot tasks, environments, and sensory inputs to broaden your research or project scope.
  • Utilize open-source tools: Download and apply code libraries and tutorials that accompany the datasets to speed up development and testing without starting from scratch.
  • Join collaborative projects: Engage with research communities sharing new datasets and code to stay updated and contribute to ongoing advancements in robotics.

  • Madhur Behl
    Robotics & AI Professor | Team Principal @Cavalier Autonomous Racing | Amazon Scholar

    Announcing the release of RACECAR, the world's first full-scale, high-speed autonomous racing open dataset! 🏎 The dataset contains 11 racing scenarios across two race tracks, including solo laps, multi-agent laps, overtaking situations, high accelerations, banked tracks, obstacle avoidance, and pit entry and exit at different speeds. Multi-sensor (LIDAR, GNSS, RADAR, and camera) data is available in both Open Robotics #ros2 and nuTonomy #nuScenes formats, providing flexibility for researchers interested in robotics, computer vision, and autonomous driving.

    Six university teams that raced in the Indy Autonomous Challenge during the 2021-22 season contributed to this dataset, and I would like to express my sincere appreciation for their valuable contributions: Cavalier Autonomous Racing, FTM Institute of Automotive Technology TUM, KAIST, MIT-PITT-RW, PoliMOVE Autonomous Racing Team, and TII Unimore Racing.

    A paper authored by Amar Kulkarni, John Chrosniak, Emory Ducote, Florian Sauerbeck, Andrew Saba, Utkarsh Chirimar, John L., Marcello Cellina, and Madhur Behl will be presented at IROS '23, providing further insights into the dataset and its applications, including benchmarking problems in #mapping, #localization, and #objectdetection.

    To access the RACECAR dataset and accompanying tutorials, please visit:
    Data and Code: https://lnkd.in/e723QiH3
    Paper: https://lnkd.in/eDMrBeHp
    Demo Reel: https://lnkd.in/esxMmgcH

    We are also grateful to the Amazon Web Services (AWS) Open Data program for their support in sharing this data. This is a step toward making full-scale autonomous racing accessible to the wider research community. I invite you to explore this groundbreaking dataset and push the boundaries of autonomous racing research! #autonomousvehicles #autonomousracing #ai #computervision #robotics #data #iros23 #research #opendata #technology
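    For readers starting from the nuScenes-format release, below is a minimal sketch of browsing scenes with the nuscenes-devkit. The dataroot path, version string, and the LIDAR_TOP channel name are assumptions, not values from the post; the RACECAR tutorials document the actual layout.

    ```python
    # Minimal sketch: browsing a nuScenes-format dataset with nuscenes-devkit
    # (pip install nuscenes-devkit). Paths, version, and channel names below
    # are placeholders -- check the RACECAR tutorials for the real values.
    from nuscenes.nuscenes import NuScenes

    nusc = NuScenes(version="v1.0-trainval", dataroot="/data/racecar", verbose=True)

    # Each scene corresponds to a recorded run (e.g., a solo lap or an overtake).
    for scene in nusc.scene:
        print(scene["name"], "-", scene["description"])

    # Walk the samples of the first scene and print LIDAR file paths.
    sample = nusc.get("sample", nusc.scene[0]["first_sample_token"])
    while sample["next"]:
        lidar_token = sample["data"].get("LIDAR_TOP")  # channel name assumed
        if lidar_token:
            print(nusc.get("sample_data", lidar_token)["filename"])
        sample = nusc.get("sample", sample["next"])
    ```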

  • Jason Corso
    Toyota Professor of AI at Michigan | Voxel51 Co-Founder and Chief Scientist | Creator, Builder, Writer, Coder, Human

    Mistakes! Mistakes! Mistakes! Physical AI assistants today can tell you "that's wrong," but not what went wrong, when it became irreversible, or where in the frame the mistake lives. That's like a teacher who only marks an X on your paper without any explanation.

    Our latest work, Mistake Attribution (MATT), to appear at CVPR 2026, goes beyond mistake detection in egocentric videos. The challenge is fine-grained understanding: when someone picks up a bolt instead of a hammer, current methods can flag the error, but they can't tell you which part of the instruction was violated, pinpoint the exact frame where recovery became impossible, or localize the mistake region in that frame. This greatly limits the practical value of physically grounded upskilling with Physical AI assistants.

    We solve this with two contributions. First, MisEngine, a data engine that automatically constructs mistake datasets from existing action-recognition corpora, producing datasets two orders of magnitude larger than anything previously available. Second, MisFormer, a unified model that jointly attributes mistakes along semantic, temporal, and spatial dimensions, outperforming task-specific methods across the board as a single model.

    Full information about the work is in the links below, including open-source code, pre-trained models, and our new datasets (Ego4D-M and EPIC-KITCHENS-M) with full attribution annotations.
    📄 Paper: https://lnkd.in/e5ySVbwh
    🌐 Project page: https://lnkd.in/e5nV_Qe9
    💻 Code: https://lnkd.in/eHgbtyt4
    🤗 Dataset and Weights: https://lnkd.in/ethnFkPQ

    Coauthors: Yayuan Li, Aadit J., Filippos Bellos. From my teams at Voxel51 and the University of Michigan College of Engineering, University of Michigan Robotics Department, Electrical and Computer Engineering at the University of Michigan, and Michigan AI Lab.
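    To make the three attribution axes concrete, here is a hypothetical sketch of what a single mistake-attribution record could look like; the field names are illustrative, not the actual MATT/MisEngine schema.

    ```python
    # Hypothetical record covering the three axes the post describes:
    # semantic (which instruction part), temporal (when), spatial (where).
    from dataclasses import dataclass
    from typing import Tuple

    @dataclass
    class MistakeAttribution:
        video_id: str
        instruction: str                         # the full task instruction
        violated_span: Tuple[int, int]           # semantic: violated instruction tokens
        onset_frame: int                         # temporal: where the mistake begins
        irreversible_frame: int                  # temporal: where recovery becomes impossible
        bbox: Tuple[float, float, float, float]  # spatial: (x1, y1, x2, y2), normalized

    # Example: a bolt was grabbed where the instruction said "the hammer".
    record = MistakeAttribution(
        video_id="ego4d_clip_0001",              # hypothetical identifier
        instruction="pick up the hammer and drive the nail",
        violated_span=(2, 4),                    # tokens "the hammer"
        onset_frame=412,
        irreversible_frame=430,
        bbox=(0.42, 0.55, 0.61, 0.78),
    )
    print(record)
    ```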

  • Carmelo (Carlo) Sferrazza
    Incoming Assistant Professor at UT Austin | Sr. Applied Scientist at Amazon FAR | Postdoc UC Berkeley | PhD ETH Robotics | Artificial Intelligence | Humanoids | Tactile Sensing

    Ever wondered what robots 🤖 could achieve if they could not just see, but also feel and hear? We introduce FuSe: a recipe for finetuning large vision-language-action (VLA) models with heterogeneous sensory data such as vision, touch, sound, and more.

    We find that naively finetuning on a small-scale multimodal dataset results in the VLA over-relying on vision and ignoring the much sparser tactile and auditory signals. FuSe addresses this by using language instructions to ground all sensing modalities through two auxiliary losses. With FuSe, pretrained generalist robot policies finetuned on multimodal data consistently outperform baselines finetuned only on vision data. This is particularly evident in tasks with partial visual observability, such as grabbing objects from a shopping bag.

    FuSe policies reason jointly over vision, touch, and sound, enabling tasks such as multimodal disambiguation, generation of object descriptions upon interaction, and compositional cross-modal prompting (e.g., "press the button with the same color as the soft object"). Moreover, the same general recipe applies to generalist policies with diverse architectures, including a large 3B VLA with a PaliGemma vision-language-model backbone.

    We open-source the code and the models, as well as the dataset, which comprises 27k (!) action-labeled robot trajectories with visual, inertial, tactile, and auditory observations.

    This work is the result of an amazing collaboration at Berkeley Artificial Intelligence Research with the other co-leads Joshua Jones and Oier Mees, as well as Kyle Stachowicz, Pieter Abbeel, and Sergey Levine!
    Paper: https://lnkd.in/dDU-HZz9
    Website: https://lnkd.in/d7A76t8e
    Code: https://lnkd.in/d_96t3Du
    Models and dataset: https://lnkd.in/d9Er5Jsx
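    As a rough illustration of the auxiliary-loss idea (not the actual FuSe objectives; the loss forms and weight below are assumptions), a PyTorch-style sketch:

    ```python
    # Schematic: an action loss plus language-grounding auxiliary losses, so
    # the policy cannot get away with ignoring the sparser touch/audio signals.
    import torch.nn.functional as F

    def fuse_style_loss(action_pred, action_gt,
                        tactile_emb, audio_emb, language_emb,
                        aux_weight=0.1):
        # Main imitation objective on predicted actions.
        action_loss = F.mse_loss(action_pred, action_gt)

        # Pull each non-visual modality embedding toward the embedding of the
        # language instruction that describes it (cosine-distance form assumed).
        def grounding(mod_emb):
            mod = F.normalize(mod_emb, dim=-1)
            lang = F.normalize(language_emb, dim=-1)
            return 1.0 - (mod * lang).sum(dim=-1).mean()

        aux_loss = grounding(tactile_emb) + grounding(audio_emb)
        return action_loss + aux_weight * aux_loss
    ```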

  • Aaron Prather
    Director, Robotics & Autonomous Systems Program at ASTM International

    Meet DROID, a large-scale, in-the-wild robot manipulation dataset with input from numerous universities and R&D organizations.

    Creating large, diverse, high-quality robot manipulation datasets is a crucial milestone toward more capable and robust robotic manipulation policies. However, generating such datasets presents significant challenges: collecting robot manipulation data across varied environments entails logistical and safety hurdles and substantial investments in hardware and human resources. Consequently, contemporary robot manipulation policies rely primarily on data from a limited number of environments, resulting in constrained scene and task diversity.

    In their collaborative effort, the authors introduce DROID (Distributed Robot Interaction Dataset), a diverse robot manipulation dataset comprising 76k demonstration trajectories (350 hours of interaction data), collected across 564 scenes and 86 tasks by 50 data collectors in North America, Asia, and Europe over 12 months. They show that training with DROID yields policies with higher performance, greater robustness, and better generalization. The authors also release the entire dataset, the code for policy training, and a comprehensive guide for replicating their robot hardware setup as open-source resources.

    Universities and organizations on the DROID dataset team include: Stanford University; University of California, Berkeley; Toyota Research Institute; Carnegie Mellon University; The University of Texas at Austin; Université de Montréal; The University of Edinburgh; Princeton University; Columbia University; University of Washington; KAIST; UC San Diego; Google DeepMind; University of California, Davis; and University of Pennsylvania.

    📝 Research Paper: https://lnkd.in/gGFFsKYK
    📊 Project Page: https://lnkd.in/gbH8kqfv
    🖥️ Dataset: https://lnkd.in/g5akx89p
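    DROID is distributed in RLDS form, so a minimal loading sketch with tensorflow_datasets could look like the following; the dataset name and bucket path are assumptions based on the project's public TFDS layout, so verify them on the project page.

    ```python
    # Minimal sketch: streaming a few DROID episodes via TFDS/RLDS. The name
    # "droid" and data_dir below are assumptions -- verify on the project page.
    import tensorflow_datasets as tfds

    ds = tfds.load("droid", data_dir="gs://gresearch/robotics", split="train")

    for episode in ds.take(1):
        # RLDS episodes are nested datasets of timesteps.
        for step in episode["steps"].take(3):
            print(list(step["observation"].keys()))
    ```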

  • Sina Pourghodrat (PhD)
    Surgical Robotics Engineer

    🚀 Amazon FAR (Frontier AI & Robotics) introduces OmniRetarget: teaching humanoids to interact with objects and their environment, just like humans do.

    Here's the main idea (simplified): in robotics, teaching humanoids complex skills means showing them how humans move and interact, but simply copying human motions (or using them as kinematic references) doesn't work cleanly. The human body and the robot body are not the same shape, don't have the same joints, and don't share the same kinematics. On top of that, interactions (touching objects, walking on surfaces) are often lost or distorted during retargeting (the process of adapting human motions to robot bodies). OmniRetarget fixes these problems.

    What is OmniRetarget? A system that converts human motion and human scenes into robot-compatible motion while preserving interactions (contacts, spatial relations) with objects and terrain. It uses an interaction mesh to model where contacts happen (a hand touching a box, feet on the ground) and keeps them consistent when mapping to a robot. From one demonstration (a recording of a human performing the task), it can generate many variations: different robots, object positions, and terrains.

    Why is it better than older approaches? Older methods often ignore interaction preservation, leading to artifacts like foot sliding or unrealistic motions. OmniRetarget enforces both robot limits (joints, geometry) and real interactions (which part touches what) at the same time. It produces 8+ hours of high-quality trajectories, beating baselines in realism and consistency. Trained reinforcement learning (RL) policies can now perform long, complex tasks (up to 30 seconds) on a physical humanoid (Unitree G1).

    📖 Open-source contribution: they are releasing the OmniRetarget Dataset, over 8 hours of humanoid loco-manipulation and interaction data, freely available on Hugging Face: https://lnkd.in/eYBn2hfe

    Why does this matter? Robots don't just need to move; they must interact with the world. High-quality, interaction-aware data has been a major bottleneck, and OmniRetarget makes this data available to the community, helping researchers and companies build humanoids that can operate in cluttered, object-rich environments.

    📖 Full paper: https://lnkd.in/ej2But4W
    👩💻 GitHub: https://lnkd.in/ejmUahtr
    👩🔬 Authors: Lujie Yang, Xiaoyu Huang, Zhen Wu, Angjoo Kanazawa, Pieter Abbeel, Carmelo Sferrazza, C. Karen Liu, Rocky Duan, Guanya Shi

    Thank you, Lujie Yang, for giving permission to use the video. Video: the Unitree G1 humanoid carries a chair, climbs, leaps, and rolls, all in real time, using only its own body senses (no vision or LiDAR). A big step toward agile, human-like loco-manipulation.
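    Since the release lives on the Hugging Face Hub, a minimal download sketch is below. The post only gives a shortened link, so the repo_id is a placeholder, not the real dataset name.

    ```python
    # Minimal sketch: fetch the OmniRetarget trajectories from the Hugging Face
    # Hub. repo_id is a placeholder -- substitute the actual dataset repo.
    from huggingface_hub import snapshot_download

    local_dir = snapshot_download(
        repo_id="<org>/omniretarget-dataset",  # hypothetical repo id
        repo_type="dataset",
    )
    print("Downloaded to:", local_dir)
    ```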

  • Adithya Murali
    Staff Research Scientist at NVIDIA | MIT TR35, Prev CMU PhD, Berkeley AI Research

    We have apparently reached "peak data" in training LLMs. But it's the opposite story in Physical AI: researchers still scramble to collect high-quality data on robots. To address this gap, we just released the Physical AI Dataset initiative on Hugging Face at #GTC this year. We released a suite of commercial-grade, pre-validated datasets that we hope will help the research community build next-gen models in robotics, spanning AV development, robotic manipulation, and dexterity.

    We also open-sourced a massive dataset for robotic grasping: over 57 million grasps, computed for the Objaverse XL (LVIS) dataset. These grasps are specific to three common grippers: the Franka Panda, the Robotiq 2F-140 for industrial picking, and a suction gripper.

    Blog: https://lnkd.in/g_5-8iV5
    Dataset: https://lnkd.in/gj98gXxT
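    As a rough sketch of how such grasp records are commonly represented and used (the field names below are illustrative assumptions, not the released schema):

    ```python
    # Illustrative grasp record: an object-frame 6-DoF gripper pose stored as a
    # 4x4 homogeneous transform, plus the gripper type and a validation label.
    import numpy as np

    grasp = {
        "object_id": "objaverse_xl_000042",   # hypothetical identifier
        "gripper": "franka_panda",            # or "robotiq_2f_140", "suction"
        "pose": np.eye(4),                    # gripper pose in the object frame
        "success": True,                      # pre-validated label
    }

    # To execute a grasp, compose the object's world pose (from perception)
    # with the stored object-frame grasp pose.
    T_world_object = np.eye(4)
    T_world_gripper = T_world_object @ grasp["pose"]
    print(T_world_gripper)
    ```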
