Exploring Pandemic Data

March 24, 2020 Livestream​
Christopher Wolfram, Stephen Wolfram, Bob Nachbar, etc.

Basic Data Exploration

Here is the epidemiological data from the Wolfram Data Repository:
rawCovidData=ResourceData["Epidemic Data for Novel Coronavirus COVID-19"];
In[]:=
Show the data available for each region:
rawCovidData[1/*Keys]
In[]:=
AdministrativeDivision
Country
GeoPosition
ConfirmedCases
RecoveredCases
Deaths
Out[]=
Show which countries have not reported cases so far:
GeoListPlot[Complement[EntityList["Country"],Normal@Keys@rawCovidData[GroupBy["Country"],Total,"ConfirmedCases"]]]
In[]:=
Out[]=
Generate the time series of confirmed cases, totaled by country:
countryTs=Normal[rawCovidData[GroupBy["Country"],Total,"ConfirmedCases"]];
In[]:=
Make a log-log plot of the latest reported number of cases as a function of population:
ListLogLogPlot[AssociationThread[Keys[countryTs],Transpose[{EntityValue[Keys[countryTs],"Population"],#["LastValue"]&/@Values[countryTs]}]],PlotRangeAll]
In[]:=
Out[]=

Basic Time Series

Find the raw numbers of confirmed cases for each country:
countryValues=#["Values"]&/@countryTs;
In[]:=
Show growth for the 25 countries with the largest current number of reported cases:
DateListPlot[TakeLargestBy[countryTs,#["LastValue"]&,25],PlotRangeAll]
In[]:=
China
Italy
United States
Spain
Germany
Iran
France
South Korea
Switzerland
United Kingdom
Netherlands
Austria
Belgium
Norway
Canada
Out[]=
Make a log plot of the number of cases as a function of time, in each case starting when the number of cases first exceeded 100:
(Countries are indicated by tooltips)
ListLogPlot[KeyValueMap[Tooltip[#2,#1]&,DeleteCases[Select[GreaterThan[100]]/@N@countryValues,{}]],JoinedTrue]
In[]:=
Out[]=

Daily Growth Rates

Show the ratio of cases for each country on successive days (starting when each country first identified more than 100 cases):
ListLinePlot[DeleteCases[Ratios/@Select[GreaterThan[100]]/@N@countryValues,{}],PlotRange{{0,30},{1,All}}]
In[]:=
China
Italy
Spain
Germany
Iran
United States
France
South Korea
Switzerland
United Kingdom
Netherlands
Austria
Belgium
Norway
Portugal
Out[]=
Show daily ratios for countries with more than 10000 cases, smoothed with a radius of 2 days:
ListLinePlot[KeyValueMap[Callout[#2,#1]&,DeleteCases[MeanFilter[Ratios[#],2]&/@Select[GreaterThan[100]]/@N@Select[countryValues,Max[#]>10000&],{}]],PlotRange{{0,30},{1,All}},PlotStyleThickness[0.007]]
In[]:=
Out[]=
Include all countries with more than 5000 cases:
ListLinePlot[KeyValueMap[Legended[#2,#1]&,DeleteCases[MeanFilter[Ratios[#],2]&/@Select[GreaterThan[100]]/@N@Select[countryValues,Max[#]>5000&],{}]],PlotRange{{0,30},{1,All}},PlotTheme"DashedLines",PlotStyleThickness[0.007]]
In[]:=
China
Italy
Spain
Germany
Iran
United States
France
South Korea
Switzerland
United Kingdom
Out[]=
Find the mean daily ratio across countries with more than 5000 cases:
ListLinePlot
[◼]
RaggedMeanAround
[Values@DeleteCases[MeanFilter[Ratios[#],2]&/@Select[GreaterThan[100]]/@N@Select[countryValues,Max[#]>5000&],{}]]
In[]:=
Out[]=
Find the mean daily ratio across all countries reporting cases:
ListLinePlot
[◼]
RaggedMeanAround
[Values@DeleteCases[MeanFilter[Ratios[#],2]&/@Select[GreaterThan[100]]/@N@countryValues,{}]]
In[]:=
Out[]=
Find the mean across all countries, with the values for each country starting with the country first reported more than 50 cases:
ListLinePlot
[◼]
RaggedMeanAround
[Values@DeleteCases[MeanFilter[Ratios[#],2]&/@Select[GreaterThan[50]]/@N@countryValues,{}]]
In[]:=
Out[]=
Average over countries with more than 5000 cases, but do not take the mean across days:
ListLinePlot
[◼]
RaggedMeanAround
[Values@DeleteCases[Ratios/@Select[GreaterThan[100]]/@N@Select[countryValues,Max[#]>5000&],{}]]
In[]:=
Out[]=
Average over all countries reporting cases, but do not take the mean across days:
ListLinePlot
[◼]
RaggedMeanAround
[Values@DeleteCases[Ratios/@Select[GreaterThan[100]]/@N@countryValues,{}]]
In[]:=
Out[]=

Investigating Results

We wanted to understand the seemingly linear decrease in average daily ratios.
Find the linear term in a fit of the first 30 days of the data:
fitSlope=m/.FindFit
[◼]
RaggedMeanAround
[Values[DeleteCases[Ratios/@Select[GreaterThan[100]]/@N@countryValues,{}]][[All,;;UpTo[30]]]],m*x+b,{m,b},x
In[]:=
-0.0090±0.0005
Out[]=
This corresponds to change of average daily ratio with a slope of about 1 in 111 days:
1/fitSlope
In[]:=
-111.±6.
Out[]=

Summarizing Country Daily Ratio Data

Show daily ratio by country, together with the average over all reporting countries:
Show​​ListLinePlot[DeleteCases[MeanFilter[Ratios[#],2]&/@Select[GreaterThan[100]]/@N@countryValues,{}],PlotRange{{0,30},{1,All}},PlotTheme"DashedLines"],​​ListLinePlot
[◼]
RaggedMeanAround
[Values@DeleteCases[Ratios/@Select[GreaterThan[100]]/@N@countryValues,{}]],PlotStyleDirective[Thick,Red]​​
In[]:=
China
Italy
Spain
Germany
Iran
United States
France
South Korea
Switzerland
United Kingdom
Netherlands
Austria
Belgium
Norway
Portugal
Out[]=
Include only countries with more than 5000 cases reported:
Show​​ListLinePlot[KeyValueMap[Legended[#2,#1]&,DeleteCases[MeanFilter[Ratios[#],2]&/@Select[GreaterThan[100]]/@N@Select[countryValues,Max[#]>5000&],{}]],PlotRange{{0,30},{1,All}},PlotTheme"DashedLines"],​​ListLinePlot
[◼]
RaggedMeanAround
[Values@DeleteCases[Ratios/@Select[GreaterThan[100]]/@N@Select[countryValues,Max[#]>5000&],{}]],PlotStyleDirective[Thick,Red]​​
In[]:=
China
Italy
Spain
Germany
Iran
United States
France
South Korea
Switzerland
United Kingdom
Out[]=
Compare results for all countries, and countries with more than 5000 cases:
ListLinePlot​​
[◼]
RaggedMeanAround
[Values@DeleteCases[Ratios/@Select[GreaterThan[100]]/@N@Select[countryValues,Max[#]>5000&],{}]],​​
[◼]
RaggedMeanAround
[Values@DeleteCases[Ratios/@Select[GreaterThan[100]]/@N@countryValues,{}]]​​
In[]:=
Out[]=
Show results successively dropping certain countries:
ListLinePlot
[◼]
RaggedMeanAround
[Values@DeleteCases[Ratios/@Select[GreaterThan[100]]/@#,{}]]&/@N@FoldListKeyDrop,Select[countryValues,Max[#]>5000&],
China
COUNTRY
,
South Korea
COUNTRY
,
Singapore
COUNTRY
,
Japan
COUNTRY

In[]:=
Out[]=
Show how many countries are included in the averages for each day, including only countries with more than 5000 current cases:
ListLinePlot@Total@PadRight[Sign[Values@DeleteCases[Ratios/@Select[GreaterThan[100]]/@N@Select[countryValues,Max[#]>5000&],{}]]]
In[]:=
Out[]=
Show how many countries are included in the averages for each day, including all countries reporting cases:
ListLinePlot@Total@PadRight[Sign[Values@DeleteCases[Ratios/@Select[GreaterThan[100]]/@N@countryValues,{}]]]
In[]:=
Out[]=

Possible Model for Results

In the standard SIR continuum epidemiological model, the number of infected people is i[t], and there is a “force of infection” β.
Solve assuming an infinite supply of susceptible people, and fixed force of infection; the result is a pure exponential:
DSolve[i'[t]β*i[t],i[t],t]
In[]:=
{{i[t]
tβ


1
}}
Out[]=
Solve assuming a force of infection that varies linearly with time:
DSolve[i'[t](m*t+b)*i[t],i[t],t]
In[]:=
i[t]
bt+
m
2
t
2


1

Out[]=
DLog
bt+
m
2
t
2


1
,t
In[]:=
b+mt
Out[]=
A typical model is that the distribution of times between becoming infected and showing symptoms is an exponential distribution.
Show the PDF for an exponential distribution:
Plot[PDF[ExponentialDistribution[1/7],t],{t,0,30}]
In[]:=
Out[]=
The ratio of successive values will be given roughly by the log of the PDF:
Plot[Log[PDF[ExponentialDistribution[1/7],x]],{x,0,30}]
In[]:=
Out[]=
More accurately, it is the ratio of PDF to CDF:
Plot[Log[PDF[ExponentialDistribution[1/7],x]/CDF[ExponentialDistribution[1/7],x]],{x,0,30}]
In[]:=
Out[]=

Network-Based Modeling

The continuum SIR model does not accurately represent human contacts, especially when they are limited by social distancing. It is better to consider a network, although it is not clear what the correct network should be.
Generate a typical example of network that models certain features of human networks:
RandomGraph[WattsStrogatzGraphDistribution[500,0.1,5]]
In[]:=
Out[]=
At larger scales, human networks will tend to reflect actual geographical (i.e. spatial) relations, and so will have features of random planar networks.
Generate an example of a random planar network:
MeshConnectivityGraph[DelaunayMesh[RandomPoint[Disk[],1000]]]
In[]:=
Out[]=
Make larger examples of these types of graphs:
g1=RandomGraph[WattsStrogatzGraphDistribution[10000,0.1,5]]
In[]:=
Graph
Vertex count: 10000
Edge count: 50000

Out[]=
g2=MeshConnectivityGraph[DelaunayMesh[RandomPoint[Disk[],100000]]]
In[]:=
Graph
Vertex count: 100000
Edge count: 299826

Out[]=
For the model human network, the graph diameter is still quite small:
GraphDiameter[g1]
In[]:=
10
Out[]=
Starting from one node in the graph (i.e. one person) this shows the number of nodes reached after n steps in the random planar graph:
ListPlot@Table[MeanAround@N@Table[Length@VertexOutComponent[g2,RandomChoice@VertexList[g2],n],100],{n,10}]
In[]:=
Out[]=

Data from Actual Contagion Networks

Singapore has carefully tracked cases, and generated a network giving information on contagion.
Import the data:
rawGraph=Import["https://co.vid19.sg/api/cases/network_graph","RawJSON"];
In[]:=
Show the network from this data:
g=VertexDelete[Graph[rawGraph["nodes"][[All,"id"]],#from#to&/@rawGraph["edges"]],"unknown"]
In[]:=
Out[]=
Find cases of person-to-person transmission, giving the case numbers involved:
transmissions=Map[ToExpression,StringDrop[#,5]&/@Select[DeleteMissing[Lookup[{"from","to"}]/@rawGraph["edges"],1,2],AllTrue[StringMatchQ["Case "~~___]]]]
In[]:=
Plot case numbers for transmission pairs:
ListPlot[transmissions,Epilog{Red,InfiniteLine[{0,0},{1,1}]}]
In[]:=
Out[]=
This data seems to indicate that most transmissions are found by backtracing. The analysis should be repeated with actual report times included.

Analysis of Government Responses

Import a dataset of measures implemented by governments:
governmentMeasures=ResourceData[ResourceObject[CloudObject["https://www.wolframcloud.com/obj/christianp/DeployedResources/Data/Coronavirus-COVID-19-Pandemic-Government-Measures"]]];
In[]:=
Show counts of measures implemented:
governmentMeasures[Counts/*ReverseSort,"Measure"]
In[]:=
Introduction of quarantine policies
229
Travel restrictions
191
Limit public gatherings
183
Health screenings in airports and border crossings
122
Schools closure
117
Flights suspension
116
Border closure
116
Economic measures
80
Public services closure
73
Strengthening the public health system
71
General recommendations
66
Awareness campaigns
65
Visa restrictions
53
General lockdown
49
State of emergency declared
46
Emergency administrative structures activated or established
39
Additional health/documents requirements upon arrival
34
Border checks
23
Arrests for noncompliance
9
Fines for noncompliance
9
rows 1–20 of 32
Out[]=
Show a word cloud of measures taken:
governmentMeasures[WordCloud,"Measure"]
In[]:=
Out[]=
Show a date histogram of when measures were implemented:
governmentMeasures[DateHistogram,"ImplementationDate"]
In[]:=
Out[]=
minImplementationTime=DateObject/@Select[Normal[governmentMeasure