First, we have already discussed that confidence intervals calculated from serially dependent data will be narrower than the true confidence intervals. See the discussion at the end of this post: https://community.wolfram.com/groups/-/m/t/1887823
Nevertheless, I think the confidence intervals are useful because they show when the model has little predictive value given the data. When you calculate confidence intervals, the general assumption is that the model is correct and that your data have random measurement errors. When you add a constraint, you may get a parameter value that does not exactly fit the model. Consequently, the fit residuals calculated from that parameter value, which are used to calculate the standard error, may no longer be correct. Hence the warning. The constraint applies only to the parameter fitting, not to the method used to calculate the confidence intervals. The method used to calculate the fit residuals is given in the Possible Issues part of the NonlinearModelFit documentation.
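To see which quantities feed the confidence intervals, here is a minimal sketch. It assumes synthetic data (the `demo` series, the starting values and the constraint `k > 0` are illustrative and not from this thread) and uses only standard fitted-model properties. Requesting the interval table from a constrained fit is the step where I would expect the warning to appear.

```wolfram
(* Synthetic, illustrative data only: cumulative counts built from noisy daily increments *)
SeedRandom[42];
daily = Table[20. Exp[0.25 t] (1 + RandomReal[{-0.2, 0.2}]), {t, 1, 20}];
demo = Transpose[{Range[20], Accumulate[daily]}];

(* Exponential fit with a hypothetical constraint k > 0 on the growth rate *)
fit = NonlinearModelFit[demo, {p0 Exp[k t], k > 0}, {{p0, 50}, {k, 0.25}}, t];

fit["ParameterConfidenceIntervalTable"]  (* intervals assume independent, random errors *)
res = fit["FitResiduals"];               (* observed minus fitted values *)
fit["EstimatedVariance"]                 (* residual variance behind the standard errors *)

(* Lag-1 autocorrelation of the residuals; values far from 0 suggest serial dependence *)
CorrelationFunction[res, 1]
```

Because cumulative counts carry each day's error forward, the residuals in a sketch like this tend to be correlated, which is the situation in which the intervals come out too narrow.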
So a careful scientist should trust the data and question the model. I have had a similar problem with the Italian case data, so I will share what is wrong with the model. Recall the logistic differential equation:
$$\frac{\partial f(t)}{\partial t} = k\, f(t) \left(1 - \frac{f(t)}{L}\right)$$
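As a quick check, the logistic curve used in the fits later in this post can be substituted into the differential equation. This is a sketch that assumes the growth rate is called `k` and the midpoint `t0`, as in those fits; the names `f`, `g` and `g0` are only for illustration.

```wolfram
(* Logistic curve with limiting size L, growth rate k and midpoint t0 *)
f[t_] := L/(1 + Exp[-k (t - t0)]);

(* Residual of the logistic ODE f'(t) == k f(t) (1 - f(t)/L); it vanishes for a solution *)
Simplify[f'[t] - k f[t] (1 - f[t]/L)]

(* Without the bracketed term the equation reduces to plain exponential growth *)
DSolve[{g'[t] == k g[t], g[0] == g0}, g[t], t]
```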
If you remove the term $\left(1 - \frac{f(t)}{L}\right)$, you have the differential equation for exponential growth. The reason the logistic model works for the spread of a novel virus is that at onset virtually everyone is susceptible and there is no treatment. The only method of control is quarantine, which attempts to confine the infection to a limiting population of size L. If the quarantine measures don’t succeed and L keeps growing such that L ≅ r f(t), with r roughly constant, then the f(t) terms cancel and you are back to exponential growth. The most likely explanation for the data problem is that the people who died became sick before the quarantine measures were effective at limiting the exposed population, so the cases were still growing exponentially. On the log plot below, the data look linear, consistent with an exponential growth model.
Out[]=
[Plot: death data on a log scale]
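The cancellation can be written out explicitly. This is a sketch under two assumptions: that the ratio r between the limiting population and the current count stays roughly constant with r > 1, and that k denotes the growth rate of the logistic equation (the symbol used in the fits).

```latex
\frac{df}{dt} = k\,f(t)\left(1-\frac{f(t)}{L}\right)
\quad\text{with } L \approx r\,f(t)\quad\Longrightarrow\quad
\frac{df}{dt} \approx k\,f(t)\left(1-\frac{1}{r}\right) = k'\,f(t),
\qquad k' = k\,\frac{r-1}{r},
\qquad f(t) \approx f(0)\,e^{k' t}.
```

So a limit that keeps receding in proportion to the cases leaves exponential growth, only with a reduced rate k'.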
The code below compares the exponential and logistic fits. I initially encountered a similar problem with the Italian case data, but those data have since broken away from the exponential curve, and the logistic model has become clearly superior.
In[]:=
cases=deathCases;
In[]:=
exNLM = NonlinearModelFit[cases, P0 E^(k (t - t0)), {k, {P0, 2}, {t0, -5}}, t];
exNLM@"BestFitParameters"
Quantity[Log[2]/k /. % (* doubling time *), "Day"]
exNLM@"RSquared"
Out[]=
{k -> 0.293694, P0 -> 6.17496, t0 -> -5.70601}
Out[]=
[doubling time; Log[2]/k ≈ 2.36 days for the k above]
Out[]=
0.999129
In[]:=
logNLM = NonlinearModelFit[cases, L/(1 + E^(-k (t - t0))), {{k, 0.2}, {L, 5000}, {t0, 20}}, t];
logNLM@"BestFitParameters"
Quantity[Log[2]/k /. % (* doubling time *), "Day"]
logNLM@"RSquared"
Out[]=
{k -> 0.315875, L -> 6221.53, t0 -> 16.9309}
Out[]=
[doubling time; Log[2]/k ≈ 2.19 days for the k above]
Out[]=
0.999248
The RSquared values for the two models are nearly the same (0.999129 for the exponential fit versus 0.999248 for the logistic fit).
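When RSquared barely separates two fits, an information criterion can serve as a second check, since it also penalizes the extra parameter. This is a minimal sketch on synthetic data (the `demo` series, its parameters, and the names `exFit` and `logFit` are illustrative, not the Italian data), using the "AIC" and "BIC" properties of the fitted models.

```wolfram
(* Synthetic, illustrative data only: early part of a logistic curve with 5% noise *)
SeedRandom[1];
demo = Table[{t, 6000./(1 + Exp[-0.3 (t - 17)]) (1 + RandomReal[{-0.05, 0.05}])}, {t, 1, 15}];

(* Two-parameter exponential and three-parameter logistic fits to the same points *)
exFit = NonlinearModelFit[demo, p0 Exp[k t], {{p0, 30}, {k, 0.3}}, t];
logFit = NonlinearModelFit[demo, L/(1 + Exp[-k (t - t0)]), {{k, 0.3}, {L, 5000}, {t0, 17}}, t];

(* Lower AIC/BIC indicates the better trade-off between fit quality and parameter count *)
Grid[{{"model", "RSquared", "AIC", "BIC"},
  {"exponential", exFit["RSquared"], exFit["AIC"], exFit["BIC"]},
  {"logistic", logFit["RSquared"], logFit["AIC"], logFit["BIC"]}}, Frame -> All]
```

The same three properties can be requested from `exNLM` and `logNLM` directly.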
In[]:=
Show[
 Plot[{exNLM[t], logNLM[t]}, {t, cases[[1, 1]], cases[[-1, 1]]},
  GridLines -> Automatic, PlotRange -> All, PlotLegends -> "Expressions",
  PlotLabel -> "Exponential and Logistic Models"],
 ListPlot[cases, PlotRange -> All, PlotStyle -> Darker@Red]]
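Out[]=
[Plot: exponential and logistic model curves with the data points, titled "Exponential and Logistic Models"]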
The confidence interval bands, however, are clearly better for the exponential model. Its 90% single-prediction bands are plotted below.
In[]:=
deathBands[t_] = exNLM["SinglePredictionBands", ConfidenceLevel -> 0.9];
dth0 = Plot[{exNLM[x], deathBands[x]}, {x, 1, 15}, Filling -> {2 -> {1}},
   PlotStyle -> {Darker@Red, Darker@Orange, Darker@Orange}, ImageSize -> 600,
   GridLines -> Automatic, PlotLabel -> "Italy COVID-19 Deaths"];
dth1 = ListPlot[deathCases, PlotRange -> All];
Show[dth0, dth1]
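Out[]=
[Plot: exponential fit with 90% single-prediction bands and the death data, titled "Italy COVID-19 Deaths"]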
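One way to put a number on how the bands compare is to evaluate their width for both models at the same days. This is a sketch on synthetic data (the `demo` series and the names `exFit`, `logFit` and `width` are illustrative, not from this post); it relies on "SinglePredictionBands" returning a {lower, upper} pair of expressions in t.

```wolfram
(* Synthetic, illustrative data only (not the Italian series) *)
SeedRandom[7];
demo = Table[{t, 6000./(1 + Exp[-0.3 (t - 17)]) (1 + RandomReal[{-0.05, 0.05}])}, {t, 1, 15}];
exFit = NonlinearModelFit[demo, p0 Exp[k t], {{p0, 30}, {k, 0.3}}, t];
logFit = NonlinearModelFit[demo, L/(1 + Exp[-k (t - t0)]), {{k, 0.3}, {L, 5000}, {t0, 17}}, t];

(* 90% single-prediction bands as {lower, upper} expressions in t *)
exBand = exFit["SinglePredictionBands", ConfidenceLevel -> 0.9];
logBand = logFit["SinglePredictionBands", ConfidenceLevel -> 0.9];

(* Band width (upper minus lower) at the last observed day and one week beyond it *)
width[band_, day_] := Subtract @@ Reverse[band /. t -> day];
Table[{day, width[exBand, day], width[logBand, day]}, {day, {15, 22}}]
```

Applying `width` to the bands of `exNLM` and `logNLM` would give the corresponding comparison for the actual data.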