- Dr. Corrie Block
How are States Responding to COVID-19?: Exploratory Questions, Descriptive Statistics & Correlat
On the Coronavirus Tracker, Real Clear Politics provides national and global COVID-19 data for these variables: Deaths, Deaths per 1 million of the population, Tests, Confirmed Cases, Confirmed Case Fatality Rate, Confirmed Cases per 1 million of the population and Seasonal Flu Deaths for a ten year average.
Data analysis suggests statistically significant relationships between many of these variables. Statistical significance matters because it helps us judge whether an outcome reflects a relationship between specific variables or could simply be the result of chance, in which case there may be no relationship between the variables.
Public health does not need to be left to chance as states consider when and how to open up after social distancing and shelter in place orders.
This blog shares the approach taken to explore these data and the relationships that seem to exist between these variables. This data exploration is being shared in a series of blogs.
According to David R. Krathwohl,
We usually don’t think of statistics as useful in an exploratory mode, because their use in the research literature is almost always to provide evidence for or to disconfirm some hypothesis, structure, or model. However, statistics have long been used to check data for unexpected relationships, although that activity is not usually mentioned in the reports. (2009, p. 394).
The Data
The data are constantly being updated, showing moment to moment changes as we face new diagnoses and new fatalities.
It seems appropriate to pause here to offer respect to our fellow human beings impacted by this virus. These data points represent the lives of human beings. It matters to me that we keep human lives in the center of our work with COVID-19.
The data in the following descriptive statistics and analysis were accessed on April 15, 2020 at about 2:53pm EST. First, I used Minitab to draw some pictures of the data and run some descriptive statistics.
Correlations
The questions that guide this exploration are:
· How are states responding to COVID-19?
· What do the data suggest?
· Do variables seem to walk together?
· How do these data help us
o as we consider when to open up?
o as we establish systems to monitor COVID-19 beyond sheltering in place?
Asking if data walk together led me to run some correlations.
I ran Pearson Product-Moment correlations. The Pearson correlation assumes that the relationship between two variables is linear, meaning that a change in one variable is accompanied by a proportional change in the other. When that holds, the variables seem to have a predictive relationship.
Please note, correlations do not allow us to make causal statements. A correlation cannot be used to say that one variable causes another variable because both variables may be influenced by other variables. The symbol r represents the correlation and indicates the extent of the relationship between two variables by a number between +1.00 and -1.00. Correlations are usually reported as a two-digit decimal such as .73. The accuracy of prediction tends to increase as the correlation moves away from zero toward +1.00 or -1.00, and correlations near zero suggest poor prediction. A correlation of zero indicates there is no regular linear relationship between the two scores, so we cannot predict one variable from the other.
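To make the idea of r concrete, here is a minimal Python sketch of the Pearson Product-Moment calculation. It is only an illustration: the analysis in this blog was run in Minitab, the function name pearson_r is my own, and the numbers are made up rather than taken from the Coronavirus Tracker.

```python
import math

def pearson_r(xs, ys):
    """Pearson product-moment correlation of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length with at least 2 values")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products of the deviations from each mean
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sums of squared deviations for each variable
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when a variable is constant")
    return sxy / math.sqrt(sxx * syy)

# Made-up illustrative numbers, NOT the Coronavirus Tracker data
tests = [10, 20, 30, 40, 50]
cases = [1, 2, 3, 4, 5]
print(round(pearson_r(tests, cases), 2))  # perfectly linear, so this prints 1.0
```

Because the made-up cases are exactly one tenth of the made-up tests, the two variables walk together perfectly and r comes out at +1.00. Real data will almost always fall somewhere in between.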
Statistical Significance
We care about the statistical significance of these correlations because statistical significance is used to determine whether an outcome is a result of a relationship between two specific variables or if the outcome is the result of chance, hence no relationship exists between the variables.
As we consider when and how to open up, we want to make decisions that lead to social interactions that do not spread the virus. We care to make decisions based on predictable relationships suggested by the data.
Statistical significance is interpreted with the p-value in the correlation matrix plots below. Generally, a p-value less than or equal to .05 is the standard threshold for calling a correlation statistically significant. We get concerned about p-values that are right around .05 because these may suggest borderline results that could be due to chance. A p-value that is less than or equal to .001 tends to suggest statistical significance that is probably not borderline and most likely not due to chance.
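One way to build intuition for what a p-value says about chance is a permutation test: shuffle one variable many times and count how often the shuffled data produce a correlation at least as large as the one observed. The sketch below is an assumption-laden illustration, not the method Minitab uses to report its p-values, and the function names, shuffle count, seed, and data are all hypothetical.

```python
import math
import random

def pearson_r(xs, ys):
    """Pearson correlation (same calculation as a standard r)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def permutation_p_value(xs, ys, n_shuffles=5000, seed=0):
    """Two-sided permutation p-value for the correlation of xs and ys."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    observed = abs(pearson_r(xs, ys))
    shuffled = list(ys)
    at_least_as_extreme = 0
    for _ in range(n_shuffles):
        # Shuffling breaks any real pairing between the two variables
        rng.shuffle(shuffled)
        if abs(pearson_r(xs, shuffled)) >= observed:
            at_least_as_extreme += 1
    # Add one to numerator and denominator so p is never exactly zero
    return (at_least_as_extreme + 1) / (n_shuffles + 1)

# Made-up illustrative numbers, NOT the Coronavirus Tracker data
x = list(range(20))
y = [2 * value + 1 for value in x]
p = permutation_p_value(x, y)
print("significant at .05" if p <= 0.05 else "not significant at .05")
```

If shuffled data rarely match the observed correlation, the p-value is small and chance is an unlikely explanation. If shuffled data match it often, the p-value is large and we cannot rule chance out.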
Our Statistically Significant Correlations
For the most part, these variables seem to walk together based on the correlations. The correlations seem positive, meaning the two variables vary together: when one is high, the other is also high.
Many of the correlations seem to be strong and statistically significant, meaning the appearance of variables walking together may not be because of chance. The state may be doing something that influences the relationship between these two variables.
Tests
Overall, testing seems to be crucial. These data suggest that there are statistically significant relationships between COVID-19 testing and many of the other variables. For the most part, the correlations with testing are .69 and above.
More exploration is needed to find out what states are doing related to testing.
Tests seem to have a strong statistically significant relationship with
· Confirmed Cases, r=.90 and p<.001
· Deaths, r=.86 and p<.001
· Seasonal Flu Deaths (CDC10Yavg), r=.80 and p<.001
· Deaths per 1M, r=.74 and p<.001
· Confirmed Cases per 1M pop, r=.69 and p<.001
Tests seem to have a statistically significant relationship with
· Confirmed Case Fatality Rate, r=.29 and p<.05
Deaths
Deaths seem to have a strong statistically significant relationship with
· Confirmed Cases, r=.99 and p<.001
· Tests, r=.86 and p<.001
· Seasonal Flu Deaths (CDC10-Year Avg), r=.51 and p<.001
Deaths per 1M
Deaths per 1M human beings seem to have a strong statistically significant relationship with
· Confirmed Cases per 1M pop, r=.97 and p<.001
· Tests, r=.74 and p<.001
Deaths per 1M human beings seem to have a statistically significant relationship with
· Seasonal Flu Deaths (CDC10Yavg), r=.36 and p<.05
Confirmed Cases
Confirmed Cases seem to have a strong statistically significant relationship with
· Deaths, r=.99 and p<.001
· Tests, r=.90 and p<.001
· Seasonal Flu Deaths (CDC10Yavg), r=.58 and p<.001
Confirmed Case Fatality Rate
Of all these correlations, Confirmed Case Fatality Rate fascinates me the most. I have so many questions about what is happening with Confirmed Case Fatality Rate. On the whole, states seem united to stop COVID-19 fatalities. So, what is going on? Why do we have these lower correlations with Confirmed Case Fatality Rates? Could it be that our efforts to fight this virus are working? I will explore this further in upcoming blogs.
Confirmed Case Fatality Rate seems to have a statistically significant relationship with
· Tests, r=.29 and p<.05
· Seasonal Flu Deaths (CDC10Yavg), r=.22 and p<.05
Confirmed Cases per 1M pop
Confirmed Cases per 1M pop seem to have a strong statistically significant relationship with
· Deaths per 1M human beings, r=.97 and p<.001
· Tests, r=.69 and p<.001
Confirmed Cases per 1M pop seem to have a statistically significant relationship with
· Seasonal Flu Deaths (CDC10-Year Avg), r=.30 and p<.05
Seasonal Flu Deaths (CDC10Yavg)
States with more seasonal flu deaths, on a ten year average, may have more COVID-19 cases.
Because the data are in different metrics, a ten year seasonal flu average and real time changing COVID-19 data, we are hesitant to suggest that states use this correlation for action.
The correlation between Seasonal Flu Deaths, for a ten year average, and Confirmed Cases per 1M, r=.30 and p<.05, appears weaker than the correlation between Seasonal Flu Deaths, for a ten year average, and Confirmed Cases, r=.58 and p<.001.
This leads me to ask: when we have a more common metric, do we lose the relationship between Seasonal Flu Deaths and confirmed cases of COVID-19? What is the correlation between Seasonal Flu Deaths per capita and Confirmed Cases per capita? This correlation will be shared in the next blog.
Conclusions cannot be drawn that lead to opening up if we are using these seasonal flu data for a 10 year average with confirmed cases and/or confirmed cases per 1M.
Similar issues arise with Seasonal Flu Deaths, for a ten year average, and Deaths and Deaths per 1M human beings. It seems that Seasonal Flu Deaths, for a ten year average, may have a strong statistically significant relationship with Deaths, r=.51 and p<.001. But when we consider Seasonal Flu Deaths, for a ten year average, with Deaths per 1M human beings, the correlation does not appear as strong, r=.36 and p<.05.
Because these two sets of correlations using Seasonal Flu Deaths changed with a more standard population measure, per 1M human beings, we hesitate to make any conclusions.
It leads me to ask, what are the correlations for Seasonal Flu Deaths, for a ten year average, with Deaths and Confirmed Cases using a common per capita metric? It may be even better to run correlations of raw data for Seasonal Flu Deaths and Confirmed Cases with 2020 data without using a ten year average of Seasonal Flu Deaths for a measure of comparing and contrasting how states are responding to COVID-19.
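The reason a common per capita metric matters can be sketched with a small example. Raw counts in large states tend to be large for almost everything, so two raw counts can walk together simply because both follow population. The Python sketch below uses entirely made-up states and numbers (not Coronavirus Tracker or CDC data), and the helper names are hypothetical.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def per_million(count, population):
    """Convert a raw count to a rate per 1 million people."""
    return count * 1_000_000 / population

# Five hypothetical states with made-up numbers, for illustration only
population = [1_000_000, 5_000_000, 10_000_000, 20_000_000, 40_000_000]
flu_deaths = [50, 150, 400, 400, 1400]
covid_cases = [300, 2500, 2000, 9000, 10000]

# Correlation of the raw counts, which both grow with population
raw_r = pearson_r(flu_deaths, covid_cases)

# Correlation after putting both variables on the same per 1M scale
flu_per_1m = [per_million(c, p) for c, p in zip(flu_deaths, population)]
cases_per_1m = [per_million(c, p) for c, p in zip(covid_cases, population)]
per_capita_r = pearson_r(flu_per_1m, cases_per_1m)

print(f"raw counts: r={raw_r:.2f}")
print(f"per 1M:     r={per_capita_r:.2f}")
```

In this made-up example the raw counts show a sizable positive correlation that does not survive the conversion to per 1M rates. Whether the real data behave this way is exactly the open question above.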
Seasonal Flu Deaths (CDC10Yavg) seem to have a strong statistically significant relationship with
· Tests, r=.80 and p<.001
· Confirmed Cases, r=.58 and p<.001
· Deaths, r=.51 and p<.001
Seasonal Flu Deaths (CDC10Yavg) seem to have a statistically significant relationship with
· Deaths per 1M human beings, r=.36 and p<.05
· Confirmed Cases per 1M, r=.30 and p<.05
· Confirmed Case Fatality Rate, r=.22 and p<.05
Concluding #1 in the Series: How are States Responding to COVID-19?
Thank you to our elected officials, who from the onset, chose the sanctity of human lives with social distancing, shelter in place orders and pauses on money owed for goods and services such as utilities and rent/mortgages.
It is imperative that data analysis continues to be used as our elected officials work together across party lines to determine a good time to open up.
The publicly accessible data from the Coronavirus Tracker seem to suggest that there are statistically significant relationships between the variables Deaths, Deaths/1Mpop, Tests, Confirmed Cases, Confirmed Case Fatality Rate, Confirmed Cases/1Mpop and Seasonal Flu Deaths (CDC10-Year Avg).
This first blog provides descriptive statistics and correlations. Correlations do not allow us to conclude that the relationships we see are causal. Regardless of whether a relationship is causal, a correlation allows for prediction; thus, such relationships are extremely useful (Krathwohl, 2009).
Dr. Corrie Block is a professor of Policy Studies, Measurement and Evaluation at Bellarmine University and a Senior Fellow at Pegasus Institute