Bartoszek K, Guidotti E, Iacus SM, Okrój M
Nonlinear Dyn - (-) 1-29 [2020-06-26; online 2020-06-26]
As the COVID-19 outbreak is developing the two most frequently reported statistics seem to be the raw confirmed case and case fatalities counts. Focusing on Italy, one of the hardest hit countries, we look at how these two values could be put in perspective to reflect the dynamics of the virus spread. In particular, we find that merely considering the confirmed case counts would be very misleading. The number of daily tests grows, while the daily fraction of confirmed cases to total tests has a change point. It (depending on region) generally increases with strong fluctuations till (around, depending on region) 15-22 March and then decreases linearly after. Combined with the increasing trend of daily performed tests, the raw confirmed case counts are not representative of the situation and are confounded with the sampling effort. This we observe when regressing on time the logged fraction of positive tests and for comparison the logged raw confirmed count. Hence, calibrating model parameters for this virus's dynamics should not be done based only on confirmed case counts (without rescaling by the number of tests), but take also fatalities and hospitalization count under consideration as variables not prone to be distorted by testing efforts. Furthermore, reporting statistics on the national level does not say much about the dynamics of the disease, which are taking place at the regional level. These findings are based on the official data of total death counts up to 15 April 2020 released by ISTAT and up to 10 May 2020 for the number of cases. In this work, we do not fit models but we rather investigate whether this task is possible at all. This work also informs about a new tool to collect and harmonize official statistics coming from different sources in the form of a package for the R statistical environment and presents the "COVID-19 Data Hub."