Quantitative data processing and analysis
Data review and preparation: document all data revisions, cleaning, transformations and systematisation steps. This can be done in different ways. For example, if a programming language is used for data processing, then the code or a script with descriptive comments can be saved. If the data processing is performed in so-called point-and-click programmes, such as Excel, SPSS and Stata, then the processing steps are preferably documented in a ReadMe file or other form of documentation.
It is important to record:
- Missing values handling: how did you handle missing values, e.g. did you fill them with the arithmetic mean, mark them as NA, etc.?
- Outlier handling: how did you identify and handle values that are illogical or differ significantly from the other values, e.g. did you omit them, transform them, etc.?
- Data transformations: did you perform data transformations such as normalisation or a logarithmic transformation? Why and how?
- Coding and categorisation: describe how you coded or categorised the data, e.g. did you create age groups?
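The preparation steps listed above can be sketched in Python with pandas. This is a minimal illustration with made-up data; the column names (`age`, `income`) and the outlier threshold are assumptions for the example, not prescribed values:

```python
import numpy as np
import pandas as pd

# Hypothetical raw data with one missing value and one implausible income.
raw = pd.DataFrame({
    "age": [25, 41, np.nan, 33, 58],
    "income": [2100.0, 2500.0, 2300.0, 99999.0, 2700.0],
})

# Missing values: fill missing ages with the arithmetic mean
# (one possible choice -- whatever you choose, document it).
raw["age"] = raw["age"].fillna(raw["age"].mean())

# Outliers: here incomes above a made-up plausibility threshold
# are set to NA instead of dropping the whole row.
raw.loc[raw["income"] > 50_000, "income"] = np.nan

# Transformation: log-transform the skewed income variable.
raw["log_income"] = np.log(raw["income"])

# Coding/categorisation: create age groups.
raw["age_group"] = pd.cut(raw["age"], bins=[0, 30, 50, 120],
                          labels=["<30", "30-49", "50+"])
print(raw)
```

Saving a script like this alongside the dataset records every cleaning decision in a repeatable form, which is exactly the documentation the plan asks for.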
Data analysis methods: describe in detail all statistical methods and tests used.
- Descriptive statistics: indicate which descriptive statistics you calculated, e.g. arithmetic mean, median, standard deviation, frequencies.
- Inferential statistics: if you use inferential statistical methods such as the t-test, ANOVA or regression analysis, provide details:
  - Specific tests: name the tests used, e.g. two independent samples t-test, Pearson correlation coefficient.
  - Assumptions: verify and document that the data meet the assumptions of the tests used, e.g. normal distribution, homogeneity of variances.
  - Statistical significance: indicate the p-values and the significance level, e.g. p < 0.05.
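As a sketch of how such an analysis can be documented in a script, the following uses SciPy on two made-up samples (the group names, sample sizes and significance level alpha = 0.05 are assumptions for the example):

```python
import numpy as np
from scipy import stats

# Two hypothetical samples, e.g. test scores of two groups.
rng = np.random.default_rng(42)
group_a = rng.normal(loc=70, scale=8, size=40)
group_b = rng.normal(loc=75, scale=8, size=40)

# Assumption checks: normality (Shapiro-Wilk) and
# homogeneity of variances (Levene's test).
_, p_norm_a = stats.shapiro(group_a)
_, p_norm_b = stats.shapiro(group_b)
_, p_levene = stats.levene(group_a, group_b)

# Two independent samples t-test; report the statistic, the p-value
# and the significance level used.
t_stat, p_value = stats.ttest_ind(group_a, group_b)
print(f"t = {t_stat:.2f}, p = {p_value:.4f}, "
      f"significant at 0.05: {p_value < 0.05}")
```

Recording the assumption checks and the exact test in a script makes it clear later why a particular test was chosen and at what significance level the result was evaluated.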
Software and tools
In your data management plan, indicate the software and tools you use to process and analyse the data. For quantitative data processing and analysis, it is recommended to use tools that can document the processing, transformation and analysis of data as a scripted sequence. Programming languages such as R and Python, and data analysis environments based on them, such as RStudio and JupyterLab, offer many packages for data processing, analysis and visualisation. Save the scripts you create so that the data processing and analysis steps can easily be repeated when needed. It is recommended to publish the scripts together with the dataset to support the principles of open science.
Data processing and analysis tools such as Excel, SPSS and Stata are popular among researchers. If the data are worked with in these applications and no syntax files are created to store the processing and analysis steps, then special care must be taken to document the actions performed on the data. This can be done in a ReadMe file, a codebook or other documentation.
Visualisation and interpretation of results: describe the results of the statistical analysis clearly and concisely. Include tables, graphs and charts to visualise the data. Interpret the results and explain what they mean in the context of the study.
Data visualisation helps to better understand and interpret data. Reproducible data visualisation is possible using programming-language packages such as ggplot2 (an R package) and matplotlib (a Python package). Tools such as Tableau, Power BI and Looker Studio provide interactive graphs and charts, but they are not always free. If these environments support saving scripts, save them and add them to the data; if not, describe the steps taken.
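A minimal sketch of a scripted, reproducible visualisation with matplotlib; the data, the output filename `age_distribution.png` and the bin count are made up for the example:

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the script runs headless
import matplotlib.pyplot as plt

# Hypothetical data: ages of respondents.
ages = [25, 41, 39, 33, 58, 47, 29, 52, 36, 44]

fig, ax = plt.subplots()
ax.hist(ages, bins=5, edgecolor="black")
ax.set_xlabel("Age")
ax.set_ylabel("Frequency")
ax.set_title("Age distribution of respondents")

# Saving both the figure and this script keeps the visualisation
# reproducible: rerunning the script regenerates the exact same chart.
fig.savefig("age_distribution.png", dpi=150)
```

Publishing such a script next to the dataset lets anyone regenerate the figure, which is harder to guarantee with purely interactive point-and-click tools.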