# Exploring Pandemic Data

Exploring Pandemic Data

## Basic Data Exploration

Basic Data Exploration

rawCovidData=ResourceData["Epidemic Data for Novel Coronavirus COVID-19"];

In[]:=

Show the data available for each region:

rawCovidData[1/*Keys]

In[]:=

Out[]=

Show which countries have not reported cases so far:

GeoListPlot[Complement[EntityList["Country"],Normal@Keys@rawCovidData[GroupBy["Country"],Total,"ConfirmedCases"]]]

In[]:=

Out[]=

Generate the time series of confirmed cases, totaled by country:

countryTs=Normal[rawCovidData[GroupBy["Country"],Total,"ConfirmedCases"]];

In[]:=

Make a log-log plot of the latest reported number of cases as a function of population:

ListLogLogPlot[AssociationThread[Keys[countryTs],Transpose[{EntityValue[Keys[countryTs],"Population"],#["LastValue"]&/@Values[countryTs]}]],PlotRangeAll]

In[]:=

Out[]=

## Basic Time Series

Basic Time Series

Find the raw numbers of confirmed cases for each country:

countryValues=#["Values"]&/@countryTs;

In[]:=

Show growth for the 25 countries with the largest current number of reported cases:

DateListPlot[TakeLargestBy[countryTs,#["LastValue"]&,25],PlotRangeAll]

In[]:=

Out[]=

Make a log plot of the number of cases as a function of time, in each case starting when the number of cases first exceeded 100:

(Countries are indicated by tooltips)

(Countries are indicated by tooltips)

ListLogPlot[KeyValueMap[Tooltip[#2,#1]&,DeleteCases[Select[GreaterThan[100]]/@N@countryValues,{}]],JoinedTrue]

In[]:=

Out[]=

### Daily Growth Rates

Daily Growth Rates

Show the ratio of cases for each country on successive days (starting when each country first identified more than 100 cases):

ListLinePlot[DeleteCases[Ratios/@Select[GreaterThan[100]]/@N@countryValues,{}],PlotRange{{0,30},{1,All}}]

In[]:=

Out[]=

Show daily ratios for countries with more than 10000 cases, smoothed with a radius of 2 days:

ListLinePlot[KeyValueMap[Callout[#2,#1]&,DeleteCases[MeanFilter[Ratios[#],2]&/@Select[GreaterThan[100]]/@N@Select[countryValues,Max[#]>10000&],{}]],PlotRange{{0,30},{1,All}},PlotStyleThickness[0.007]]

In[]:=

Out[]=

Include all countries with more than 5000 cases:

ListLinePlot[KeyValueMap[Legended[#2,#1]&,DeleteCases[MeanFilter[Ratios[#],2]&/@Select[GreaterThan[100]]/@N@Select[countryValues,Max[#]>5000&],{}]],PlotRange{{0,30},{1,All}},PlotTheme"DashedLines",PlotStyleThickness[0.007]]

In[]:=

Out[]=

Find the mean daily ratio across countries with more than 5000 cases:

ListLinePlot[Values@DeleteCases[MeanFilter[Ratios[#],2]&/@Select[GreaterThan[100]]/@N@Select[countryValues,Max[#]>5000&],{}]]

In[]:=

Out[]=

Find the mean daily ratio across all countries reporting cases:

ListLinePlot[Values@DeleteCases[MeanFilter[Ratios[#],2]&/@Select[GreaterThan[100]]/@N@countryValues,{}]]

In[]:=

Out[]=

Find the mean across all countries, with the values for each country starting with the country first reported more than 50 cases:

ListLinePlot[Values@DeleteCases[MeanFilter[Ratios[#],2]&/@Select[GreaterThan[50]]/@N@countryValues,{}]]

In[]:=

Out[]=

Average over countries with more than 5000 cases, but do not take the mean across days:

ListLinePlot[Values@DeleteCases[Ratios/@Select[GreaterThan[100]]/@N@Select[countryValues,Max[#]>5000&],{}]]

In[]:=

Out[]=

Average over all countries reporting cases, but do not take the mean across days:

ListLinePlot[Values@DeleteCases[Ratios/@Select[GreaterThan[100]]/@N@countryValues,{}]]

In[]:=

Out[]=

### Investigating Results

Investigating Results

We wanted to understand the seemingly linear decrease in average daily ratios.

Find the linear term in a fit of the first 30 days of the data:

fitSlope=m/.FindFit[Values[DeleteCases[Ratios/@Select[GreaterThan[100]]/@N@countryValues,{}]][[All,;;UpTo[30]]]],m*x+b,{m,b},x

In[]:=

-0.0090±0.0005

Out[]=

This corresponds to change of average daily ratio with a slope of about 1 in 111 days:

1/fitSlope

In[]:=

-111.±6.

Out[]=

### Summarizing Country Daily Ratio Data

Summarizing Country Daily Ratio Data

Show daily ratio by country, together with the average over all reporting countries:

ShowListLinePlot[DeleteCases[MeanFilter[Ratios[#],2]&/@Select[GreaterThan[100]]/@N@countryValues,{}],PlotRange{{0,30},{1,All}},PlotTheme"DashedLines"],ListLinePlot[Values@DeleteCases[Ratios/@Select[GreaterThan[100]]/@N@countryValues,{}]],PlotStyleDirective[Thick,Red]

In[]:=

Out[]=

Include only countries with more than 5000 cases reported:

ShowListLinePlot[KeyValueMap[Legended[#2,#1]&,DeleteCases[MeanFilter[Ratios[#],2]&/@Select[GreaterThan[100]]/@N@Select[countryValues,Max[#]>5000&],{}]],PlotRange{{0,30},{1,All}},PlotTheme"DashedLines"],ListLinePlot[Values@DeleteCases[Ratios/@Select[GreaterThan[100]]/@N@Select[countryValues,Max[#]>5000&],{}]],PlotStyleDirective[Thick,Red]

In[]:=

Out[]=

Compare results for all countries, and countries with more than 5000 cases:

ListLinePlot[Values@DeleteCases[Ratios/@Select[GreaterThan[100]]/@N@Select[countryValues,Max[#]>5000&],{}]],[Values@DeleteCases[Ratios/@Select[GreaterThan[100]]/@N@countryValues,{}]]

In[]:=

Out[]=

Show results successively dropping certain countries:

ListLinePlot[Values@DeleteCases[Ratios/@Select[GreaterThan[100]]/@#,{}]]&/@N@FoldListKeyDrop,Select[countryValues,Max[#]>5000&],,,,

In[]:=

Out[]=

Show how many countries are included in the averages for each day, including only countries with more than 5000 current cases:

ListLinePlot@Total@PadRight[Sign[Values@DeleteCases[Ratios/@Select[GreaterThan[100]]/@N@Select[countryValues,Max[#]>5000&],{}]]]

In[]:=

Out[]=

Show how many countries are included in the averages for each day, including all countries reporting cases:

ListLinePlot@Total@PadRight[Sign[Values@DeleteCases[Ratios/@Select[GreaterThan[100]]/@N@countryValues,{}]]]

In[]:=

Out[]=

### Possible Model for Results

Possible Model for Results

In the standard SIR continuum epidemiological model, the number of infected people is i[t], and there is a “force of infection” β.

Solve assuming an infinite supply of susceptible people, and fixed force of infection; the result is a pure exponential:

DSolve[i'[t]β*i[t],i[t],t]

In[]:=

{{i[t]}}

tβ

1

Out[]=

Solve assuming a force of infection that varies linearly with time:

DSolve[i'[t](m*t+b)*i[t],i[t],t]

In[]:=

i[t]

bt+

m

2

t

2

1

Out[]=

DLog,t

bt+

m

2

t

2

1

In[]:=

b+mt

Out[]=

A typical model is that the distribution of times between becoming infected and showing symptoms is an exponential distribution.

Show the PDF for an exponential distribution:

Plot[PDF[ExponentialDistribution[1/7],t],{t,0,30}]

In[]:=

Out[]=

The ratio of successive values will be given roughly by the log of the PDF:

Plot[Log[PDF[ExponentialDistribution[1/7],x]],{x,0,30}]

In[]:=

Out[]=

More accurately, it is the ratio of PDF to CDF:

Plot[Log[PDF[ExponentialDistribution[1/7],x]/CDF[ExponentialDistribution[1/7],x]],{x,0,30}]

In[]:=

Out[]=

### Network-Based Modeling

Network-Based Modeling

The continuum SIR model does not accurately represent human contacts, especially when they are limited by social distancing. It is better to consider a network, although it is not clear what the correct network should be.

Generate a typical example of network that models certain features of human networks:

RandomGraph[WattsStrogatzGraphDistribution[500,0.1,5]]

In[]:=

Out[]=

At larger scales, human networks will tend to reflect actual geographical (i.e. spatial) relations, and so will have features of random planar networks.

Generate an example of a random planar network:

MeshConnectivityGraph[DelaunayMesh[RandomPoint[Disk[],1000]]]

In[]:=

Out[]=

Make larger examples of these types of graphs:

g1=RandomGraph[WattsStrogatzGraphDistribution[10000,0.1,5]]

In[]:=

Graph

Out[]=

g2=MeshConnectivityGraph[DelaunayMesh[RandomPoint[Disk[],100000]]]

In[]:=

Graph

Out[]=

For the model human network, the graph diameter is still quite small:

GraphDiameter[g1]

In[]:=

10

Out[]=

Starting from one node in the graph (i.e. one person) this shows the number of nodes reached after n steps in the random planar graph:

ListPlot@Table[MeanAround@N@Table[Length@VertexOutComponent[g2,RandomChoice@VertexList[g2],n],100],{n,10}]

In[]:=

Out[]=

## Data from Actual Contagion Networks

Data from Actual Contagion Networks

Singapore has carefully tracked cases, and generated a network giving information on contagion.

Import the data:

rawGraph=Import["https://co.vid19.sg/api/cases/network_graph","RawJSON"];

In[]:=

Show the network from this data:

g=VertexDelete[Graph[rawGraph["nodes"][[All,"id"]],#from#to&/@rawGraph["edges"]],"unknown"]

In[]:=

Out[]=

Find cases of person-to-person transmission, giving the case numbers involved:

transmissions=Map[ToExpression,StringDrop[#,5]&/@Select[DeleteMissing[Lookup[{"from","to"}]/@rawGraph["edges"],1,2],AllTrue[StringMatchQ["Case "~~___]]]]

In[]:=

Plot case numbers for transmission pairs:

ListPlot[transmissions,Epilog{Red,InfiniteLine[{0,0},{1,1}]}]

In[]:=

Out[]=

This data seems to indicate that most transmissions are found by backtracing. The analysis should be repeated with actual report times included.

## Analysis of Government Responses

Analysis of Government Responses

Import a dataset of measures implemented by governments:

governmentMeasures=ResourceData[ResourceObject[CloudObject["https://www.wolframcloud.com/obj/christianp/DeployedResources/Data/Coronavirus-COVID-19-Pandemic-Government-Measures"]]];

In[]:=

Show counts of measures implemented:

governmentMeasures[Counts/*ReverseSort,"Measure"]

In[]:=

Out[]=

Show a word cloud of measures taken:

governmentMeasures[WordCloud,"Measure"]

In[]:=

Out[]=

Show a date histogram of when measures were implemented:

governmentMeasures[DateHistogram,"ImplementationDate"]

In[]:=

Out[]=

minImplementationTime=DateObject/@Select[Normal[governmentMeasures[G