The open-source hub of ML-ready quantum datasets

Harness the power of quantum mechanics data with just one line of code.


from openqdc.datasets import Spice

dataset = Spice(
    energy_unit="kcal/mol",
    distance_unit="ang",
    array_format="torch",
)

first_entry = dataset[0]  # dict of torch tensors for the first entry
for data in dataset.as_iter(atoms=True):
    print(data)  # each item is an Atoms object

What is openQDC?

OpenQDC is a dataset repository and Python library that hosts nearly 40 datasets, covering 1.5 billion geometries across 70 atom species and more than 250 QM methods. Every dataset can be loaded with a single line of code.

Explore the datasets

The quantum methods used in every dataset have been checked and standardized, and each dataset is annotated with metadata such as units and labels.
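
To illustrate why unit metadata matters, here is a minimal sketch of converting an energy label between units once its unit is known. It does not use the openQDC API: the `convert_energy` helper, the `FROM_HARTREE` table, and the sample record are hypothetical and exist only for this example, and the conversion factors are standard physical constants rather than values taken from the library.

```python
# Illustrative sketch only; none of these names come from openQDC.
# Standard conversion factors from one hartree to other energy units.
FROM_HARTREE = {
    "hartree": 1.0,
    "kcal/mol": 627.509474,
    "ev": 27.211386,
}


def convert_energy(value, from_unit, to_unit):
    """Convert an energy between units by passing through hartree."""
    in_hartree = value / FROM_HARTREE[from_unit]
    return in_hartree * FROM_HARTREE[to_unit]


# A made-up record whose unit is stated explicitly in its metadata.
record = {"energy": -76.4, "energy_unit": "hartree"}

energy_kcal = convert_energy(record["energy"], record["energy_unit"], "kcal/mol")
print(f"{energy_kcal:.2f} kcal/mol")
```

Because the unit travels with the data, the same helper can bring energies from differently annotated sources onto one scale before they are combined. In openQDC itself, the target units are chosen through the `energy_unit` and `distance_unit` arguments when a dataset is loaded.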