Xiaogang (Marshall) Ma, a geoinformatics and data science researcher at the University of Idaho, has been leading a team to ensure that Mindat, a powerful mineral database, provides FAIR (findable, accessible, interoperable, and reusable) data.
Ma recently worked with the creator of Mindat, Jolyon Ralph, alongside Anirudh Prabhu and Shaunna Morrison of the Carnegie Institution for Science, to describe in the Geoscience Data Journal how the team is accomplishing this lofty feat.
Their work, which was funded by the National Science Foundation, is documented in an article entitled “OpenMindat: Open and FAIR mineralogy data from the Mindat database.”
While the Mindat website is popular among Earth science researchers and educators, a machine interface for open data had never been established before this project.
“We first cleansed the data within the overall database to improve the quality of each entry,” Ma said. “Next, we built a data sharing platform and machine interface that allows both researchers and search-bots to find applicable data that suits their needs.”
The team now plans to develop software packages in R and Python to make it easier for interested users to query and download datasets from the platform. An initial open-source R package is already available on GitHub. With this package, large sets of records, such as the 5,960 mineral species approved by the International Mineralogical Association, can be retrieved with just a few lines of code.
Over the next two years, the team will organize datathon activities where Earth and planetary scientists can learn to use the open data platform and build the necessary skills in data access and analysis. Several team members will also give presentations about the work at community conferences, such as those organized by the Geological Society of America and the American Geophysical Union.
This work was funded by National Science Foundation award number 2126315.