Showing posts with label data manipulation. Show all posts

Friday, April 17, 2020

17/4/20: Error Runs in COVID19 reported data


In my previous post, I referenced some preliminary stats relating to my analysis of significant outliers in the data on the number of cases reported by different countries (see: https://trueeconomics.blogspot.com/2020/04/17420-covid19-updated-charts-and.html). Here are more detailed results:

By country (listing only countries with > 5,000 cases as of April 17, 2020):


And a summary set of statistics (the top part is for countries with > 5,000 cases and the bottom part is for countries with > 1,000 cases):

By way of explanation, I used the following approach to detect observations of 'suspect quality', or 'outliers':

  • Firstly, I focus only on outliers below the 'normal reporting trend' (potentially under-reported observations);
  • Secondly, I use two criteria (labeled 1. and 2. in the tables above): 1. covers cases where an observation is below the trend and the reported daily count is < 20 cases; 2. covers cases where an observation is below the trend and the reported daily count is < 50 cases.
In all of the above, I only look at 'suspect' observations after the date on which the country in question first reported a daily count of 50 or more. The reason is the commonly accepted view that during the early stages of contagion (small numbers of cases reported daily), the data does not exhibit a trend.

A final note: for the trend, I used power law trends fitted separately for each country.
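For readers who want to replicate the idea, here is a minimal sketch of the screening steps described above. It is not the exact procedure behind the tables: the function names, the choice to fit the power law by least squares on log-log values, the decision to fit only on days from the first 50+ count onward, and the skipping of zero counts in the fit are all my assumptions for illustration.

```python
import math

def fit_power_law(counts):
    """Fit counts ~ a * t**b by least squares on log-log values.

    t is the 1-based day index. Zero counts are skipped in the fit
    (an assumption, since log(0) is undefined).
    """
    pts = [(math.log(t), math.log(c))
           for t, c in enumerate(counts, start=1) if c > 0]
    if len(pts) < 2:
        raise ValueError("need at least two positive observations")
    n = len(pts)
    mean_x = sum(x for x, _ in pts) / n
    mean_y = sum(y for _, y in pts) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in pts)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pts)
    b = sxy / sxx
    a = math.exp(mean_y - b * mean_x)
    return a, b

def suspect_days(daily_counts, threshold, start_level=50):
    """Return indices of days that are below trend AND below threshold.

    Only days from the first daily count >= start_level onward are
    considered; the trend is fitted on that window (an assumption).
    """
    start = next((i for i, c in enumerate(daily_counts) if c >= start_level), None)
    if start is None:
        return []
    window = daily_counts[start:]
    a, b = fit_power_law(window)
    flagged = []
    for t, c in enumerate(window, start=1):
        trend = a * t ** b
        if c < trend and c < threshold:
            flagged.append(start + t - 1)
    return flagged

# Made-up daily counts, purely illustrative (not real country data)
daily = [3, 10, 60, 90, 15, 150, 200, 40, 260, 300]
criterion_1 = suspect_days(daily, threshold=20)  # below trend and < 20 cases
criterion_2 = suspect_days(daily, threshold=50)  # below trend and < 50 cases
```

By construction, every day flagged under criterion 1 is also flagged under criterion 2, so the second list can only be the same length or longer.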

Feel free to draw your own conclusions about different countries, based on this data, but for those interested in my insights:

The highest death rates so far, on a per capita basis, are observed in:
  • Belgium at 425.2 persons per 1 million of population, with decent quality of data (potential error range of 0%-9.5%)
  • Spain at 409.4 persons per 1 million of population, with very strong data quality (0%-2.3% potential error range)
  • Italy at 366.9 persons per 1 million of population, with zero potential error in cases reporting
  • France at 267.5 persons per 1 million of population and zero potential error in reported cases
  • UK at 206.5 persons per 1 million of population and zero to 2.5% potential error in reported cases.
The above results remain broadly unchanged when one controls for the duration of the contagion period (number of days since > 50 cases were reported for the first time).

The highest rates of infection (cases per 1 million of population) have been recorded in:
  • Spain - 3,913
  • Switzerland - 3,129
  • Belgium - 3,048
  • Italy - 2,796
  • Ireland - 2,743
Adjusting for the days since the onset of contagion, the rates of infection are the highest in:
  • Spain - 91 cases per 1 million of population per day in the contagion stage
  • Ireland - 85.45
  • Switzerland - 74.51
  • Belgium - 72.56
  • Portugal - 53.9
Adjusting for the days since the onset of contagion, the rates of infection are the lowest in:
  • India - 0.37
  • China - 0.67
  • Indonesia - 0.69
  • Pakistan - 1.03
  • Japan - 1.72
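
The adjustment for days since the onset of contagion is simple arithmetic: cases per 1 million of population divided by the number of days in the contagion stage. A small sketch follows; the function name is hypothetical, and the 43-day figure for Spain is my assumption, back-derived from the two Spanish numbers above rather than taken from the underlying data.

```python
def infections_per_million_per_day(cases_per_million, days_in_contagion):
    """Cases per 1 million of population, averaged over the contagion stage."""
    if days_in_contagion <= 0:
        raise ValueError("days_in_contagion must be positive")
    return cases_per_million / days_in_contagion

# Spain: 3,913 cases per 1 million; 43 days is an assumed duration
spain_rate = infections_per_million_per_day(3913, 43)  # 91.0
```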