COVID-19 ANALYTICS: WHO IS AT RISK? WHO WILL BE NEXT?
We used the pattern mining engine of our software to analyze a prominent data set with about 500 attributes covering demographics, economics, infrastructure, etc. for all of the 3,007 US counties. We wanted to know why some counties experience higher COVID-19 death rates, where the death rate is the number of deaths relative to population size.
Our findings yield a better understanding of this raging pandemic. They can help local authorities predict future COVID-19 death rates, inform health policy on important correlations, guide the allocation of resources such as testing kits and stations, and support targeted community information campaigns.
Pattern analysis -- finding the critical factors that identify counties at risk
Our analysis reveals that it is rarely just one feature that exposes a county to a higher-than-average COVID-19 death rate. Rather, it is usually a combination of features that, when true at the same time, provides a vivid narrative of these fateful circumstances.
In most cases, only a few features are needed to sufficiently describe a pattern. This makes patterns easy to explain, leading to a better understanding of the hidden processes and relations. Essentially, each pattern is a knowledge facet told in the domain’s language.
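To make the idea of a pattern concrete, here is a minimal sketch in Python. It is not our engine, and it does not search for patterns; it only shows what a pattern is once found: a small conjunction of feature ranges that selects a set of counties whose average death rate can be compared with the overall average. The county records, the feature names (`pct_over_65`, `median_income`), the thresholds, and the death rates are all made-up values for illustration.

```python
from statistics import mean

# Toy, made-up county records; feature names and values are hypothetical.
counties = [
    {"name": "A", "pct_over_65": 22.0, "median_income": 31000, "death_rate": 95.0},
    {"name": "B", "pct_over_65": 21.0, "median_income": 29000, "death_rate": 110.0},
    {"name": "C", "pct_over_65": 12.0, "median_income": 30000, "death_rate": 40.0},
    {"name": "D", "pct_over_65": 23.0, "median_income": 62000, "death_rate": 35.0},
    {"name": "E", "pct_over_65": 11.0, "median_income": 70000, "death_rate": 20.0},
]

# A pattern is a conjunction of (feature, low, high) range conditions.
# These two conditions and their thresholds are assumptions for this sketch.
pattern = [
    ("pct_over_65", 20.0, float("inf")),
    ("median_income", 0, 35000),
]


def matches(county, pattern):
    """Return True only if every range condition holds for the county."""
    return all(low <= county[feature] <= high for feature, low, high in pattern)


def pattern_summary(counties, pattern):
    """Return member names, mean death rate inside the pattern, and overall mean."""
    members = [c for c in counties if matches(c, pattern)]
    overall = mean(c["death_rate"] for c in counties)
    inside = mean(c["death_rate"] for c in members) if members else None
    return [c["name"] for c in members], inside, overall


members, inside_rate, overall_rate = pattern_summary(counties, pattern)
print(members, inside_rate, overall_rate)
```

In this toy data, county C meets only the income condition and county D only the age condition, and neither has an elevated death rate. Only the counties that satisfy both conditions at once stand out, which is the situation described above.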
In May, our AI algorithm automatically identified 297 sets of US counties. We found that 985 US counties are at high risk, and Mississippi, Louisiana, and Georgia have the highest density of high-risk counties at a coverage of 80-90%. These numbers have changed somewhat as the pandemic raged on.
OUR COVID-19 RISK EXPLORER DASHBOARD
We designed the COVID-19 Risk Explorer shown on the left to allow users to explore the counties and their risk patterns, if any. This particular image shows the top three socio-economic risk factors for Chickasaw County, MS and its death rate curve on the top right. We see that this county participates in many risk patterns, which can be selected in the bottom panel of the dashboard and viewed in the middle panel on the right.
Other counties that have the selected risk pattern are also shaded (according to death rate). This enables users to make predictions and draw analogies for possible interventions. To start exploring the risk patterns of your county simply click here and a new tab or window with our dashboard will open.
DETAILED ANALYSES OF SOME EXAMPLE PATTERNS
Counties at high risk: set #1
This is a map of US counties that all have one thing in common: on average, they have a higher COVID-19 death rate than the average across all US counties. Follow this link to learn why these counties are at high risk. What else do they have in common?
Counties at high risk: set #2
There is more than one set of features that could put a county at higher risk. On the left is another map of US counties for which our software has identified a feature pattern that is associated with a higher-than-usual COVID-19 death rate on average. Follow this link to learn why these counties are at high risk. What are their common properties?
Counties at high risk: set #3
Here is a third map of counties unified by a set of features that indicate an unusually high COVID-19 death rate on average. Click this link to learn what these critical features and their value ranges are.
Correlation analysis -- finding factors that amplify risk
Counties where risk correlates with a factor: set #4
The scatterplot on the left suggests that there is no apparent correlation between severe housing cost burden and COVID-19 death rate; the correlation is a mere 10%. But in fact there are county patterns where such a correlation holds. Click this link to find out what they are.
Counties where risk correlates with a factor: set #5
Similarly, per the scatterplot on the left, there is no apparent correlation between a county’s unemployment rate and its COVID-19 death rate; the correlation factor is just 13%. But do not rush to this conclusion; we found that there IS a correlation but only for counties that fulfill certain population criteria, as elaborated on this page.
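The effect described in the two examples above, a weak correlation over all counties that hides a strong one inside a subset, can be reproduced with a small Python sketch. The numbers below are deliberately constructed toy values, not our data, and the population cutoff of 50,000 is a hypothetical criterion chosen only for this illustration.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


# Made-up records: (population, unemployment rate in %, death rate).
records = [
    (12000, 3.0, 20.0), (18000, 4.0, 30.0), (25000, 5.0, 40.0), (40000, 6.0, 50.0),
    (300000, 3.0, 45.0), (450000, 4.0, 45.0), (600000, 5.0, 25.0), (900000, 6.0, 25.0),
]

# Correlation over all counties.
overall_r = pearson([r[1] for r in records], [r[2] for r in records])

# Correlation restricted to counties meeting a hypothetical population criterion.
small = [r for r in records if r[0] < 50000]
subset_r = pearson([r[1] for r in small], [r[2] for r in small])

print(f"all counties: r = {overall_r:.2f}")
print(f"population < 50,000: r = {subset_r:.2f}")
```

In this constructed example the overall coefficient is close to 0.1, while the coefficient for the small counties alone is close to 1, because the larger counties trend the other way and mask the relationship when everything is pooled.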
Predictive analysis -- identifying counties that will be impacted soon
Our patterns can predict COVID-19 death rate!
This shows our software’s capability to predict a county’s future fate in the ongoing pandemic. The three sets highlighted here were randomly selected from the patterns we found; we did not ‘cherry pick’ the best results. For all patterns, the average COVID-19 death rate increased 2-3 times the US average during this time frame. Click here to learn more.
The bigger picture
We can think of every US county as an (observational) experiment; each has certain characteristics that make it unique and, at the same time, similar to some others. Our pattern mining engine looks for regions in this feature space that are occupied by similar counties that all respond in a similar way to a given target variable of interest, here the COVID-19 death rate.
The criteria that determine ‘similarity’ are grounded in sophisticated statistical pattern mining — a core technology we market. It can be applied to any domain, not just to predict the outcomes for a pandemic disease. Contact us to find out how we can help you to find important features in your data. Chances are high that we can help you.
Interested in learning more about our overall approach to pattern mining?
If you would like to know more about our approach to pattern mining in high-dimensional feature spaces, please visit this page. It offers a gentle introduction to the subject. For a more rigorous treatment, we plan to post a few technical briefs on our technology in the near future.