Machine learning on multiple topological materials datasets

Yuqing He1,2, Pierre-Paul De Breuck1, Hongming Weng2, Matteo Giantomassi1, Gian-Marco Rignanese1*

1 UCLouvain, Institut de la Matière Condensée et des Nanosciences (IMCN), Chemin des Étoiles 8, Louvain-la-Neuve 1348, Belgium

2 Beijing National Laboratory for Condensed Matter Physics and Institute of Physics, Chinese Academy of Sciences, Beijing, China

* Corresponding author email: gian-marco.rignanese@uclouvain.be
DOI: 10.24435/materialscloud:zk-gc [version v2]

Publication date: Feb 26, 2025

How to cite this record

Yuqing He, Pierre-Paul De Breuck, Hongming Weng, Matteo Giantomassi, Gian-Marco Rignanese, Machine learning on multiple topological materials datasets, Materials Cloud Archive 2025.32 (2025), https://doi.org/10.24435/materialscloud:zk-gc

Description

A dataset of 35,608 materials with their topological properties is constructed by combining the density functional theory (DFT) results of Materiae and the Topological Materials Database. Using this combined dataset, machine-learning models are developed to classify materials into five distinct topological types, with the XGBoost model reaching 85.2% classification accuracy. Generalization tests on different sub-datasets reveal differences between the original datasets in terms of topological types, chemical elements, unknown magnetic compounds, and feature space coverage, and the impact of these differences on model performance is analyzed. For the simpler binary classification between trivial insulators and nontrivial topological materials, three different approaches are also tested. Key characteristics influencing material topology are identified, with the maximum packing efficiency and the fraction of p valence electrons highlighted as critical features.
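To illustrate why the binary task is described as simpler than the five-class one, the sketch below collapses multi-class labels into trivial versus nontrivial and compares the two accuracies. It is only an illustration under stated assumptions: the label names (`"trivial"`, `"type_a"`, `"type_b"`) and the toy predictions are hypothetical, the record does not list the five type names here, and this relabeling is not claimed to be one of the three approaches tested by the authors.

```python
# Hypothetical label names; only the trivial/nontrivial split comes from the record.
TRIVIAL_LABEL = "trivial"


def to_binary(labels, trivial_label=TRIVIAL_LABEL):
    """Map every label other than the trivial one to 'nontrivial'."""
    return ["trivial" if label == trivial_label else "nontrivial" for label in labels]


def accuracy(y_true, y_pred):
    """Fraction of positions where the predicted label equals the true label."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("cannot compute accuracy of an empty label list")
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


# Toy data (invented for illustration, not taken from the dataset).
y_true = ["trivial", "type_a", "type_b", "type_a"]
y_pred = ["trivial", "type_b", "type_b", "trivial"]

multi_class_accuracy = accuracy(y_true, y_pred)                   # 0.5
binary_accuracy = accuracy(to_binary(y_true), to_binary(y_pred))  # 0.75

print(multi_class_accuracy, binary_accuracy)
```

In this toy case, confusing one nontrivial type with another counts as an error in the multi-class setting but not in the binary one, so the binary accuracy is higher.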

Materials Cloud sections using this data

No Explore or Discover sections associated with this archive record.

Files

topoclass.ipynb (48.8 KiB)
    Python notebook to analyze the data
    MD5: e97367839d6f923849f6ab81ff7830f8
topoclass.json.gz (515.6 MiB)
    JSON gzipped file with all the data
    MD5: 57066e24446f9f128c63a8e1698880de
topoclass.chemiscope.json.gz (19.8 MiB)
    Chemiscope visualization of the dataset, which can be used interactively on Materials Cloud
    MD5: d9c6901e2781b55382233974c5360bfc
optimade.jsonl (101.8 MiB)
    OPTIMADE file with all the data
    MD5: 2859bad2900c0ad1057e8b3d357f452e
optimade.yaml (237 Bytes)
    OPTIMADE yaml configuration file
    MD5: 2b92ac74ef6a143d4ca43807b007da2a
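
The MD5 digests listed above can be used to check that a download is intact before opening it. The following is a minimal sketch using only the Python standard library. The helper names (`md5_of_file`, `verify_download`, `load_gzipped_json`) are introduced here for illustration and are not part of the record or its notebook, and the internal layout of the JSON data is not described on this page, so the loader simply returns whatever the file contains.

```python
import gzip
import hashlib
import json
from pathlib import Path

# MD5 digests as listed in the Files section of this record.
EXPECTED_MD5 = {
    "topoclass.ipynb": "e97367839d6f923849f6ab81ff7830f8",
    "topoclass.json.gz": "57066e24446f9f128c63a8e1698880de",
    "topoclass.chemiscope.json.gz": "d9c6901e2781b55382233974c5360bfc",
    "optimade.jsonl": "2859bad2900c0ad1057e8b3d357f452e",
    "optimade.yaml": "2b92ac74ef6a143d4ca43807b007da2a",
}


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, read in chunks to bound memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_download(path):
    """Compare a local file against the digest listed for its file name."""
    path = Path(path)
    expected = EXPECTED_MD5[path.name]  # KeyError for names not in this record
    return md5_of_file(path) == expected


def load_gzipped_json(path):
    """Decompress and parse a .json.gz file; the whole document is held in memory."""
    with gzip.open(path, "rt", encoding="utf-8") as handle:
        return json.load(handle)
```

For an intact copy of `topoclass.json.gz` in the working directory, `verify_download("topoclass.json.gz")` should return `True`, after which `load_gzipped_json("topoclass.json.gz")` parses it. Since the compressed file is 515.6 MiB, the parsed data will likely need substantially more memory than that.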

License

Files and data are licensed under the terms of the following license: Creative Commons Attribution 4.0 International.
Metadata, except for email addresses, are licensed under the Creative Commons Attribution Share-Alike 4.0 International license.

External references

Website
Journal reference (Paper describing the work performed)
Yuqing He et al., in preparation

Keywords

topological materials, machine learning, database

Version history:

2025.32 (version v2) [This version] Feb 26, 2025 DOI: 10.24435/materialscloud:zk-gc
2025.27 (version v1) Feb 12, 2025 DOI: 10.24435/materialscloud:xx-xb