The dataset format matches the input-file format of the Chemiscope visualization tool (http://chemiscope.org). For this type of usage, we refer the reader to the documentation (https://chemiscope.org/docs/).

A convenient way to inspect the content of the files directly in Python is the built-in json module. To use it, first unzip the archives and then do, for instance:

    import json

    with open('degenerate_ch4.chemiscope.json', 'r') as f:
        degenerate = json.load(f)

    print(type(degenerate))
    # prints "<class 'dict'>"

The dataset contains the structures themselves:

    print(len(degenerate['structures']))
    # prints "441"

    print(degenerate['structures'][0])
    # prints "{'size': 5, 'names': ['C', 'H', 'H', 'H', 'H'],
    #  'x': [0.67735027, 1.64813429, 1.24974391, 0.10780292, 0.67492029],
    #  'y': [0.67735027, 0.6759466, 1.24253147, 0.09820788, 0.66251502],
    #  'z': [1.1, 1.29564905, 1.67717848, 1.66593051, 0.11004904]}"

and several properties:

    print(degenerate['properties'].keys())
    # prints "dict_keys(['Delta_2', 'OMFP_2', 'sOMFP_2', 'PS_1', 'OMFP_1', 'energy',
    #  'sOMFP_1', 'Delta_1', 'BS_1', 'BS_2', 'PS_2'])"

Delta_i is the perturbation along the i-th (i = 1, 2) singular direction in the space of atomic positions. features_i, where features stands for one of the feature types listed above (OMFP, sOMFP, PS, or BS), is the projection of the features onto the corresponding singular direction in feature space. energy is the DFT energy in hartree, calculated with the PBE functional and the cc-pVDZ basis set using Psi4 (http://www.psicode.org/psi4manual/master/index.html).
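As an illustration of how a single structure entry can be used once loaded, the following minimal sketch rebuilds per-atom positions from the separate 'names', 'x', 'y', and 'z' lists and computes the carbon-hydrogen distances. The structure is copied from the printed output above so that the snippet runs without the data files. The helper names positions and ch_distances are hypothetical and are not part of Chemiscope or of the dataset.

```python
import math

# First structure of the degenerate set, copied from the printed output above.
structure = {
    "size": 5,
    "names": ["C", "H", "H", "H", "H"],
    "x": [0.67735027, 1.64813429, 1.24974391, 0.10780292, 0.67492029],
    "y": [0.67735027, 0.6759466, 1.24253147, 0.09820788, 0.66251502],
    "z": [1.1, 1.29564905, 1.67717848, 1.66593051, 0.11004904],
}


def positions(structure):
    """Return a list of (name, (x, y, z)) tuples, one per atom (hypothetical helper)."""
    coords = zip(structure["x"], structure["y"], structure["z"])
    return list(zip(structure["names"], coords))


def ch_distances(structure):
    """Distances from the first carbon atom to every hydrogen atom (hypothetical helper)."""
    atoms = positions(structure)
    carbon = next(xyz for name, xyz in atoms if name == "C")
    # math.dist requires Python 3.8 or newer.
    return [math.dist(carbon, xyz) for name, xyz in atoms if name == "H"]


atoms = positions(structure)
distances = ch_distances(structure)
print(len(atoms))
print([round(d, 3) for d in distances])
```

With the files unzipped, the same helpers could be applied to every entry, for example [ch_distances(s) for s in degenerate['structures']], assuming all entries share the layout shown for the first one.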