Basic concepts of data governance

Answer: Research data is any information collected, observed or generated in the course of a research project that serves as a basis for obtaining research results and drawing conclusions. Research data can be:
  • Quantitative data, such as temperature or pressure measurements in laboratory experiments
  • Textual data, such as interview transcripts or notes from literature analysis
  • Image data, such as microscopy images or geographical maps
  • Video data, such as recordings of experiments or field studies
  • Audio data, such as recordings of interviews with researchers or sound observations
  • Software code, such as data analysis or modelling scripts
Answer: Data that are not directly related to scientific research are not considered research data. Research data are not:
  • Administrative records of the study, such as financial statements or personnel files
  • Commercial or private communications in the study, such as emails or correspondence
  • Legal documents, such as employment contracts or cooperation agreements
  • Marketing material for the study, such as promotional leaflets
These data do not contribute to the scientific analysis or evidence base of the research project and therefore do not qualify as research data.
Answer: Research data management (RDM) is a systematic approach that involves planning, collecting, storing, documenting, sharing and archiving data. It is essential for ensuring the quality of data, its long-term availability and its reusability in other studies. RDM helps researchers comply with legal and ethical requirements, as well as with the requirements of the study funders.
Answer: The FAIR principles are guidelines that help improve the management and sharing of research data. The principles state that research data must be made:
  • Findable: data and its metadata are easily found by other researchers and systems
  • Accessible: the data are available and the conditions for access are clearly stated
  • Interoperable: data are compatible with other systems and datasets
  • Reusable: data have been prepared in such a way that they can be reused in the future
Answer: Open data is publicly available research data that can be freely used, shared and analysed. However, research data does not always need to be made open. Research data should not be released as open data if it contains personal data, such as names of respondents or health information, which must be protected under the GDPR, or commercial information, such as patentable research results protected by intellectual property rights. Similarly, data should not be made public if doing so could harm the research participants, the authors or the public.
Answer: Open data and FAIR data are not synonymous, but they share one goal: making data as accessible as possible. The open data approach focuses on the unrestricted release of data, while FAIR focuses on organising research data efficiently and documenting it sufficiently so that it is findable, accessible, interoperable and reusable.
The FAIR principles do not impose a mandatory opening of data, but ensure that data are easily accessible and can be used as widely as possible, preserving confidentiality where necessary. This means that data can comply with the FAIR principles but not be publicly available if protected by privacy or ownership restrictions.
Answer: A dataset is a structured set of information and data collected according to the purpose and methodology of the study. A dataset is usually organised in tables or other structured forms and consists of a number of data elements or values that have been collected and prepared for analysis. For example, in an epidemiological study, the dataset could include a table with patients’ age, sex, symptoms and treatment outcomes. In a sociological study, the dataset could include tabulated and structured responses of respondents to various survey questions. In the humanities, a dataset can even consist of physical objects, such as a collection of paintings with notes on them.
A well-organised dataset also includes documentary and explanatory information to help navigate the dataset.
In the context of repositories, a dataset is research data and accompanying documentation deposited or self-archived in an online repository, creating a descriptive metadata record.
Answer: To be reusable, data must follow the FAIR principles: findable, accessible, interoperable and reusable. It should be accompanied by detailed and standardised metadata on data structure, format, content and context of acquisition. It is recommended to store the data in a machine-readable format (e.g. CSV or JSON) in a trusted repository with a persistent identifier (e.g. a DOI) and clear licensing (e.g. Creative Commons). Tools such as F-UJI are available to help assess data compliance with the FAIR principles. Regularly checking availability and adherence to standards helps ensure long-term usability.
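The recommendations above can be illustrated with a small Python sketch that writes tabular data as CSV and builds a JSON metadata record to accompany it. This is only an illustration under stated assumptions: the field names (title, creators, license, identifier, access, columns), the helper functions and the sample values are hypothetical and do not follow any particular repository's metadata schema. The identifier shown is a placeholder, not a real DOI.

```python
import csv
import io
import json


def rows_to_csv(rows, fieldnames):
    """Serialise a list of dicts to CSV text (a machine-readable format)."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


def build_metadata(title, creators, license_id, identifier, columns, access="open"):
    """Return a minimal descriptive metadata record as a plain dict.

    The field names are illustrative assumptions, not a formal standard.
    """
    return {
        "title": title,
        "creators": creators,
        "license": license_id,     # e.g. a Creative Commons licence
        "identifier": identifier,  # persistent identifier such as a DOI
        "access": access,          # "open" or "restricted"
        "format": "text/csv",
        "columns": columns,        # describes the structure of the table
    }


def missing_fields(metadata, required=("title", "creators", "license", "identifier")):
    """List required metadata fields that are absent or empty."""
    return [name for name in required if not metadata.get(name)]


if __name__ == "__main__":
    # Made-up sample rows, for illustration only.
    rows = [
        {"age": 34, "sex": "F", "outcome": "recovered"},
        {"age": 51, "sex": "M", "outcome": "ongoing"},
    ]
    metadata = build_metadata(
        title="Example patient outcomes table",
        creators=["Example Researcher"],
        license_id="CC-BY-4.0",
        identifier="doi:10.xxxx/placeholder",  # placeholder, not a real DOI
        columns={"age": "years", "sex": "F/M", "outcome": "treatment outcome"},
        access="restricted",  # data can be FAIR without being open
    )
    print(rows_to_csv(rows, ["age", "sex", "outcome"]))
    print(json.dumps(metadata, indent=2))
    print("Missing fields:", missing_fields(metadata))
```

Keeping the metadata in a separate JSON record means the description can be published and found even when access to the data itself is restricted.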
