{ "updated": "2024-12-17T10:14:49.705919+00:00", "created": "2024-12-13T08:44:55.390836+00:00", "revision": 4, "id": "2481", "metadata": { "title": "Reproducible HPC software deployments, simulations and workflows", "edited_by": 576, "is_last": true, "doi": "10.24435/materialscloud:b4-ex", "license": "Creative Commons Attribution Share Alike 4.0 International", "publication_date": "Dec 17, 2024, 11:14:49", "mcid": "2024.202", "contributors": [ { "email": "lars.bilke@ufz.de", "affiliations": [ "Helmholtz Centre for Environmental Research \u2013 UFZ, Leipzig, Germany" ], "familyname": "Bilke", "givennames": "Lars" }, { "affiliations": [ "Helmholtz Centre for Environmental Research \u2013 UFZ, Leipzig, Germany" ], "familyname": "Fischer", "givennames": "Thomas" }, { "affiliations": [ "Helmholtz Centre for Environmental Research \u2013 UFZ, Leipzig, Germany" ], "familyname": "Meisel", "givennames": "Tobias" }, { "affiliations": [ "Helmholtz Centre for Environmental Research \u2013 UFZ, Leipzig, Germany" ], "familyname": "Naumov", "givennames": "Dmitri" } ], "status": "published", "owner": 1594, "description": "Reproducibility in running scientific simulations on high-performance computing (HPC) environments is a persistent challenge due to variations in software and hardware stacks. Differences in software versions or hardware-specific optimizations often lead to discrepancies in simulation outputs. While Linux containers are commonly used to standardize software environments, tools like Docker lack reproducibility in image creation, requiring archiving of binary image blobs for future use. This method turns containers into black boxes, preventing verification of how the contained software was built.\n\nIn the linked paper, we demonstrate how we use GNU Guix to build our software stack bit-by-bit reproducibly from a source bootstrap. Our approach incorporates a portable OpenMPI implementation, optimized software builds, and deployment via Apptainer images across three HPC environments. We show that our reproducible software stack facilitates consistent multi-physics simulations and complex workflows on diverse HPC platforms, exemplified by the OpenGeoSys software project. To ensure provenance of our findings, we utilized the AiiDA workflow manager.\n\nThis dataset includes the complete AiiDA provenance database underlying the results presented in the paper. The AiiDA workflow itself is defined in and can be reproduced with this repository: https://gitlab.opengeosys.org/bilke/hpc-container-study.", "_files": [ { "checksum": "md5:3fc102a23a330522dd01dca25ec1ec01", "size": 2696712815, "description": "AiiDA archive", "key": "hpc-container-study-1.aiida" } ], "keywords": [ "OpenGeoSys", "AiiDA", "HPC", "workflows", "reproducibility", "GNU Guix" ], "conceptrecid": "2480", "references": [ { "citation": "L. Bilke et al., 2025: Reproducible HPC software deployments, simulations and workflows (in preparation)", "type": "Preprint" } ], "version": 1, "_oai": { "id": "oai:materialscloud.org:2481" }, "license_addendum": null, "id": "2481" } }
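The "_files" entry of the record lists an MD5 checksum and a size in bytes for the AiiDA archive, so a downloaded copy can be checked against both before it is used. The following is a minimal sketch of such a check, not part of the dataset itself. The function names `md5_of_file` and `verify_download` are hypothetical, and the sketch assumes the file was saved locally under its record key, `hpc-container-study-1.aiida`.

```python
import hashlib
from pathlib import Path

# Values copied from the "_files" entry of the record.
EXPECTED_CHECKSUM = "md5:3fc102a23a330522dd01dca25ec1ec01"
EXPECTED_SIZE = 2696712815  # bytes


def md5_of_file(path, chunk_size=1024 * 1024):
    """Return the hex MD5 digest of a file, read in chunks to limit memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_download(path, expected_checksum=EXPECTED_CHECKSUM,
                    expected_size=EXPECTED_SIZE):
    """Hypothetical helper: check size first (cheap), then the MD5 digest."""
    algorithm, _, expected_hex = expected_checksum.partition(":")
    if algorithm != "md5":
        raise ValueError(f"unsupported checksum algorithm: {algorithm!r}")
    if Path(path).stat().st_size != expected_size:
        return False
    return md5_of_file(path) == expected_hex.lower()


if __name__ == "__main__":
    # Assumed local file name: the "key" field of the record.
    archive = Path("hpc-container-study-1.aiida")
    if archive.exists():
        print("archive intact:", verify_download(archive))
    else:
        print(f"{archive} not found; download it from the record first")
```

Comparing the size before hashing avoids reading a roughly 2.7 GB file when the download is obviously truncated. Once the check passes, the archive would typically be imported into an AiiDA profile, which in recent AiiDA releases is done with `verdi archive import hpc-container-study-1.aiida`; the exact command depends on the installed AiiDA version.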