Open Datasets Overview

This section collects:

  • Open-source datasets for embodied AI
  • Demonstration data collected via teleoperation for imitation learning
  • Data format conversion and normalization
  • Data visualization and quality-check methods

You can later add a dedicated document for each dataset, for example:

  • oxe.md
  • libero.md
  • droid.md
  • bridgev2.md

Each dataset article should cover:

  • Source of the dataset and the range of tasks it covers
  • Data format and field descriptions
  • License
  • Loading and preprocessing workflow
  • Which models the dataset can be used to reproduce
  • Common pitfalls encountered in practice
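
To keep the per-dataset articles consistent, the checklist above could be turned into page stubs. The following is only a minimal sketch: the file names are the examples listed earlier, while the heading wording, the `TODO` placeholder, and the helper names `render_stub` and `write_stubs` are assumptions made for illustration, not an established convention of this site.

```python
from pathlib import Path

# File names are the examples from this overview; adjust as pages are added.
DATASET_PAGES = ["oxe.md", "libero.md", "droid.md", "bridgev2.md"]

# Heading wording is an assumption derived from the recommended checklist.
SECTIONS = [
    "Source and task coverage",
    "Data format and field descriptions",
    "License",
    "Loading and preprocessing workflow",
    "Models suitable for reproduction",
    "Common pitfalls",
]


def render_stub(filename: str) -> str:
    """Return a markdown skeleton with one heading per recommended section."""
    title = Path(filename).stem
    lines = [f"# {title}", ""]
    for section in SECTIONS:
        lines += [f"## {section}", "", "TODO", ""]
    return "\n".join(lines)


def write_stubs(target_dir: Path) -> list[Path]:
    """Create any missing dataset pages; existing files are left untouched."""
    target_dir.mkdir(parents=True, exist_ok=True)
    created = []
    for name in DATASET_PAGES:
        path = target_dir / name
        if not path.exists():
            path.write_text(render_stub(name), encoding="utf-8")
            created.append(path)
    return created
```

Calling `write_stubs(Path("datasets"))` would create the four example pages in a hypothetical `datasets` directory and skip any that already exist, so rerunning it should not overwrite articles that have been filled in.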