Data distribution analysis – a preliminary approach to quantitative data in biomedical research

Przemysław Guzik; Barbara Więckowska

doi:10.20883/medical.e869

Authors

Przemysław Guzik, Department of Cardiology – Intensive Therapy, Poznan University of Medical Sciences, Poland; University Centre for Sports and Medical Studies, Poznan University of Medical Sciences, Poland. ORCID: https://orcid.org/0000-0001-9052-5027
Barbara Więckowska, Department of Computer Science and Statistics, Poznan University of Medical Sciences, Poland. ORCID: https://orcid.org/0000-0002-1811-2583

Keywords:

statistical analysis, medical research, quantitative data, normal distribution, parametric tests

Abstract

Statistical analysis is an integral part of medical research. It helps transform raw data into meaningful insights, supports hypothesis testing, optimises study design, assesses risk and prognosis, and facilitates evidence-based decision-making. Statistical analysis increases the reliability, validity and generalisability of research findings, ultimately advancing medical knowledge and improving patient care. Without it, meaningful interpretation of the collected data would be impossible, and any conclusions drawn would be unsubstantiated and misleading.

Many health professionals are unfamiliar with statistical analysis and its basic concepts, although the analysis of clinical data is an integral part of medical research. Identifying the data type (continuous, quasi-continuous or discrete) and detecting outliers are the first and most important steps. Both graphical and numerical methods are recommended for assessing whether the data follow a normal distribution. Depending on the type of data distribution, appropriate parametric or non-parametric tests can then be used for further analysis. Data that are not normally distributed can often be brought closer to normality with a mathematical transformation (e.g., square root or logarithm) and subsequently analysed using parametric tests.

This review explains these concepts without complex mathematical or statistical equations, using several graphical examples to illustrate the statistical terms.
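The workflow outlined in the abstract (flag outliers, check normality numerically, transform if needed, then choose a test family) can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions, not the authors' procedure: the function names (`tukey_outliers`, `jarque_bera`, `suggest_test_family`), the 1.5 × IQR outlier rule, the 0.05 threshold and the simulated log-normal sample are all choices made for this example. The Jarque-Bera test is used because it needs only the standard library; it relies on a large-sample approximation, so for small samples another test such as Shapiro-Wilk would usually be preferred, and a graphical check should accompany any numerical test.

```python
import math
import random
import statistics


def tukey_outliers(values, k=1.5):
    """Return the values lying outside the fences Q1 - k*IQR and Q3 + k*IQR."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]


def jarque_bera(values):
    """Return (statistic, p_value) of the Jarque-Bera normality test."""
    n = len(values)
    mean = statistics.fmean(values)
    # Central moments of order 2, 3 and 4 (population form, divided by n).
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    skewness = m3 / m2 ** 1.5
    excess_kurtosis = m4 / m2 ** 2 - 3.0
    stat = n / 6.0 * (skewness ** 2 + excess_kurtosis ** 2 / 4.0)
    # Under normality the statistic is approximately chi-square with 2 degrees
    # of freedom, whose survival function is exactly exp(-x / 2).
    p_value = math.exp(-stat / 2.0)
    return stat, p_value


def suggest_test_family(values, alpha=0.05):
    """Suggest 'parametric' if normality is not rejected at level alpha."""
    _, p_value = jarque_bera(values)
    return "parametric" if p_value >= alpha else "non-parametric"


# Simulated right-skewed (log-normal) measurements; not real patient data.
rng = random.Random(42)
raw = [rng.lognormvariate(0.0, 1.0) for _ in range(500)]
logged = [math.log(v) for v in raw]  # logarithmic transformation

raw_family = suggest_test_family(raw)
logged_family = suggest_test_family(logged)

print("outliers flagged in raw data:", len(tukey_outliers(raw)))
print("raw data ->", raw_family)
print("log-transformed data ->", logged_family)
```

The raw, strongly skewed sample is rejected as non-normal, while its logarithm is drawn from a normal distribution by construction, which is the situation in which a transformation makes parametric tests reasonable. A non-significant normality test does not prove normality, so the suggestion returned here should be read as a starting point rather than a rule.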

