Permanently storing digital archaeological datasets
It is the end of your archaeological research project, and you may be wondering where to deposit your data. The excavation is over, all of the finds have been drawn, scanned, and digitized, and the database is complete. Perhaps you have also accumulated a lot of data through further scientific analysis of the archaeological remains. Some of the archaeological data will make it into the publication, but what about the rest of this large dataset?
The term 'big data' often describes a dataset resulting from scientific research that ends up being substantial in both quantity and digital file size. Big data are at the core of current archaeological research. Yet only a small percentage of the overall archaeological dataset makes its way into publications. Such a dataset is precious: it is the result of years of work and effort and is, in a sense, irreproducible. It is thus essential to ensure its long-term preservation.
Besides storing and curating the data throughout the project, the researcher must deposit the final research dataset once the project is finished in order to preserve its scientific integrity. "In our faculty, people such as myself are happy to offer help with this process in the form of Data Management consultations," says Kate Mokranova, a data management student assistant at the Faculty of Archaeology.
The Faculty of Archaeology encourages all researchers to deposit their data on Dans-EASY, an online archiving system for depositing and reusing research data, and to link the dataset to their publications through a permanent DOI.
"The Faculty of Archaeology has been using Dans-EASY for depositing datasets for years. However, there still seems to be little general awareness about the steps one has to take before depositing a dataset. That is where the data stewards of our faculty, such as me, get involved and help with the whole procedure of depositing on Dans-EASY, which may be perplexing at first," remarks Kate.
Among the things that researchers must consider before and during the deposition process are the correct Creative Commons license, which determines how openly accessible the dataset will be, and sustainable file formats that ensure the long-term preservation of the dataset. Moreover, depositing the digital files alone is not sufficient; they must be supplemented by metadata that make the dataset understandable to others.
"Metadata will ensure that other archaeologists can use the deposited dataset in the future. Thoroughly documenting the actual context of the dataset is, for this reason, extremely important and helps to prevent the data from being misunderstood or used incorrectly in the future. The researcher creating metadata can do so with the help of the Dans-EASY guidelines. These guidelines help researchers think through the questions of data documentation, such as which codes and variables to use to document the data and how the files should be named," explains Kate.
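To make these documentation questions a little more concrete, here is a minimal Python sketch of what a codebook and a consistent file-naming rule might look like in practice. Everything in it is an illustrative assumption rather than something prescribed by the Dans-EASY guidelines: the variable names, the material codes, the `project_content_YYYYMMDD.ext` naming pattern, and the helper functions are all hypothetical.

```python
from __future__ import annotations

import json
import re
from datetime import date

# Hypothetical codebook: every variable in the dataset, what it means,
# its unit, and the codes it may take. These entries are examples only.
CODEBOOK = {
    "find_id": {"description": "Unique identifier of the find", "unit": None},
    "material": {
        "description": "Material of the find",
        "unit": None,
        "codes": {"CER": "ceramic", "MET": "metal", "BON": "bone"},
    },
    "weight_g": {"description": "Weight of the find", "unit": "gram"},
}


def build_file_name(project: str, content: str, created: date, extension: str) -> str:
    """Build a name of the form project_content_YYYYMMDD.ext (assumed convention)."""

    def slug(text: str) -> str:
        # Lowercase the text and replace runs of other characters with a hyphen.
        return re.sub(r"[^a-z0-9]+", "-", text.lower()).strip("-")

    ext = extension.lower().lstrip(".")
    return f"{slug(project)}_{slug(content)}_{created:%Y%m%d}.{ext}"


def undocumented_columns(columns: list[str], codebook: dict) -> list[str]:
    """Return the column names that have no entry in the codebook."""
    return [name for name in columns if name not in codebook]


def codebook_as_json(codebook: dict) -> str:
    """Serialize the codebook so it can be deposited alongside the data files."""
    return json.dumps(codebook, indent=2, ensure_ascii=False)
```

A researcher could call `undocumented_columns` on the column headers of a finds table before depositing, to spot variables that still lack a description, and save the output of `codebook_as_json` as a plain-text file next to the data.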