There are a variety of reasons for both researchers and society at large to make research data available.
- Reusing data saves money and reduces duplication of effort. Researchers can focus on producing new data instead of collecting data that already exists. It also becomes easier to assess whether a new study is worth conducting.
- Greater impact - citations and downloads of data give recognition to the researcher.
- Shared data benefits society - publicly funded research should be freely available to the public.
- More visibility generates more collaborations - other researchers become aware of your research and data, which can contribute to new contacts and collaboration opportunities.
- Research becomes more transparent; by sharing the data, the possibility of checking and repeating research results increases.
- Sharing helps the data meet the FAIR principles, which are guidelines intended to ensure that data is accessible and reusable for both machines and humans. FAIR stands for Findable, Accessible, Interoperable and Reusable.
- Sharing meets the requirements and recommendations that publishers, funders and educational institutions may have for data sharing.
How you as a researcher should share your research data depends largely on the kind of data involved. For example, video recordings of children require a different procedure than data on soil samples. To get support on how to proceed in your particular case, contact the DAU function at Högskolan Väst.
There are several different ways to publish and make data available. Research data can be published together with a publication or as a separate data set. Sometimes the data is made available on a publisher's website, but more often it is deposited in a so-called data repository.
An important prerequisite for sharing your research data is that the data has been handled correctly, i.e. that it has been documented and stored in a clear and accessible way so that future users can easily access and use it. Planning the data management and establishing a data management plan also makes sharing easier.
As mentioned earlier, each researcher's data needs to be assessed on a case-by-case basis to determine whether it can be shared openly. There may be ethical, legal or copyright reasons not to share and publish research data. This may apply, for example, to data that contains sensitive personal data. Another example is confidential data that contains company secrets.
"As open as possible, as closed as necessary"
The quote above illustrates the approach to making research data available: it should be as open as possible to promote reuse and speed up research, but at the same time as closed as necessary to protect the privacy of research participants.
However, having sensitive research data does not mean that you as a researcher will never have to share it. Most research in Sweden is carried out at universities, and the resulting data is covered by the principle of public disclosure, which means that it can be disclosed to other researchers after an ethics or confidentiality review. Therefore, research participants should not be promised that the data will never be disclosed to others.
So even if it is not possible to share research data completely openly, there is always the option of writing a description of the data, so-called metadata, and making it available.
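As a rough illustration of what such a description can look like, the sketch below builds a small metadata record for a dataset that cannot be shared openly. The field names, the access levels and the example values are assumptions made for illustration only; an actual repository defines its own metadata schema and required fields.

```python
import json

# Hypothetical access levels; real repositories define their own vocabulary.
ACCESS_LEVELS = {"open", "restricted", "closed"}


def build_metadata(title, creators, description, access, license_id=None):
    """Return a minimal, illustrative metadata record for a dataset."""
    if access not in ACCESS_LEVELS:
        raise ValueError(f"unknown access level: {access}")
    record = {
        "title": title,
        "creators": list(creators),
        "description": description,
        "access": access,
    }
    # A license is only recorded when one has been chosen for the data.
    if license_id is not None:
        record["license"] = license_id
    return record


# Made-up example: the data stays restricted, but its description is public.
record = build_metadata(
    title="Interview study on classroom interaction",
    creators=["Example, Researcher"],
    description="Video recordings; not shared openly because they contain personal data.",
    access="restricted",
)
metadata_json = json.dumps(record, indent=2, ensure_ascii=False)
print(metadata_json)
```

The point of the sketch is that the description can be published and made searchable even when the underlying files are not.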
In order to protect research data and make it clear to future users how it may be reused, the author can apply a license to the data. The most common choice is a so-called Creative Commons license (CC license). A CC license can clarify whether and how the data may, for example, be transformed, shared and adapted, and in what contexts. There are several different CC licenses, and many of them require that the author is credited if the data is used.
In this context, an embargo means that a period of time must pass between the publication of the data and the point at which it may be reused. Embargoes are common for articles in connection with publication, but relatively uncommon for research data. An embargo may be applied, for example, when the data may contain a company's trade secrets. In general, embargoes on publicly funded research data are not recommended unless absolutely required.
A data repository is a database that stores and makes available research data or descriptions of the data. When you publish your research data in a repository, the data is given a description (metadata) that makes it interoperable and possible to reuse.
Your research data is also given a so-called persistent identifier (PID), making it easier to find and cite. A persistent identifier is a unique ID number that identifies an object permanently. Examples are ISBN for books and DOI for articles. The latter is also used when identifying research datasets.
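To show how a DOI can be used in practice, here is a minimal sketch that turns a DOI into a resolvable link through the doi.org resolver. The function name `doi_to_url`, the loose validation pattern and the example DOI are assumptions for illustration; the pattern is a simplification and does not cover every valid DOI.

```python
import re

# Loose, illustrative pattern: "10.", a 4-9 digit prefix, "/", then a suffix.
DOI_PATTERN = re.compile(r"^10\.\d{4,9}/\S+$")


def doi_to_url(doi):
    """Return a https://doi.org/ link for a DOI given in a common notation."""
    doi = doi.strip()
    # Accept "doi:10.x/y" and full resolver links as well as the bare DOI.
    for prefix in ("https://doi.org/", "http://doi.org/", "doi:"):
        if doi.lower().startswith(prefix):
            doi = doi[len(prefix):]
            break
    if not DOI_PATTERN.match(doi):
        raise ValueError(f"does not look like a DOI: {doi!r}")
    return "https://doi.org/" + doi


# The DOI below is made up and does not point to a real dataset.
print(doi_to_url("doi:10.1234/example-dataset"))
```

Citing a dataset by such a link rather than by a web address on a project page means the reference keeps working even if the files are moved.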
Examples of repositories
There are many data repositories worldwide where data can be shared and published.
- DORIS - a Swedish national, certified tool for publishing data, provided free of charge by the consortium Svensk Nationell Datatjänst (SND). There, research data can be shared openly, or datasets that for various reasons cannot be shared openly can be made searchable through published descriptions. This is the best option for sharing research data that contains sensitive data such as personal data.
- Zenodo - created by the European research organizations CERN and OpenAIRE. Here you can upload up to 50 GB free of charge.
- Figshare - a general-purpose repository open to all disciplines. It is possible to upload up to 20 GB free of charge.
- Re3data - a registry of both general and subject-specific data repositories. Here you can search and filter by different categories.