Acceleration by Stepsize Hedging

This notebook verifies the algebraic identities in Section 5 of the paper
​“Acceleration by Stepsize Hedging I: Multi-Step Descent and the Silver Stepsize Schedule
​by Jason M. Altschuler and Pablo A. Parrilo [arXiv:2309.07879]

Definitions and relations

Normalizing transform for the Silver Stepsize and its inverse (equation 3.5)
In[]:=
ztob=z(1+kz)/(1+z);​​btoz=b(b-1)/(k-b);
​
Defining equations for splitting the Normalized Silver Stepsizes (equation 3.1)
In[]:=
rec={y2nz2n==zn^2,z2n-y2n==2(zn-zn^2)};
​
Silver Convergence Rate (equation 3.9)
In[]:=
tn=((1-zn)/(1+zn))^2;​​t2n=((1-z2n)/(1+z2n))^2;
​
Silver Stepsizes (equation 3.4)
In[]:=
an=ztob[yn];​​bn=ztob[zn];​​a2n=ztob[y2n];​​b2n=ztob[z2n];
​
Quantities from Definition B.2
In[]:=
r=(1-zn)/(1-z2n);​​c=t2n/tn(r+(1+r)(z2n+zn)/(z2n-zn));​​phi=t2nk/(k-2)(z2n+zn)/(z2n-zn);
​
Quantities from Lemma B.3
In[]:=
d12=t2n/(1-zn)(z2n+zn)/(z2n-zn);​​d21=ky2nt2n/(1-zn)(z2n+zn)/(z2n-zn);​​d31=t2n/(1-zn)(1+ky2n)-t2n(1+(k-1)zn+kzn^2)/(1-zn)^2;​​d32=t2n(1+(k-1)z2n+kz2n^2)/(1-z2n)^2-ctn(1+(k-1)zn+kzn^2)/(1-zn)^2;​​d13=-2kznt2n/(1-zn)^2;​​d23=2kz2nt2n/(1-z2n)^2-c2kzntn/(1-zn)^2;
​
Main Matrices
The four 4x4 matrices in Lemma 5.3 (definitions in Appendix B.4)
In[]:=
EE={{t2n/tn-ctn,ctna2n-t2nbn/tn,0,0},​​{0,t2nbn^2/tn-ctna2n^2,0,0},​​{0,0,c-1,b2n-cbn},​​{0,0,0,cbn^2-b2n^2}};​​S={{-1/k(d12+d13+d21+d31),d12/k+d21+d13/k+d31,(d12+d21)/k,-d12-d21/k},​​{0,-(d12+d13+d21+d31),-d21-d12/k,d12+d21},​​{0,0,-1/k(d21+d12+d23+d32),d21/k+d12+d23/k+d32},​​{0,0,0,-(d21+d12+d23+d32)}};​​L1=(k-2)/k/(1-zn){{zn-2+1/k,a2n(1-zn)+(k-1)/k,1-1/k,0},​​{0,2a2n(zn-1)-(kzn-1),-1+1/k,0},​​{0,0,0,0},​​{0,0,0,0}};​​L2=(k-2)/k/(1-zn){{0,0,-1+zn,1-zn},​​{0,0,a2n(1-zn),-a2n(1-zn)},​​{0,0,2-zn-1/k,-1+zn},​​{0,0,0,1-kzn}};​​L=phi(L1+rL2);
​
VerifyproofofTheorem5
Check the quadratic form in the iterates and gradients (equation 5.6)
In[]:=
J=S+L-EE;​​Simplify[J,rec]
Out[]=
{{0,0,0,0},{0,0,0,0},{0,0,0,0},{0,0,0,0}}
​
Check the linear form in function values (equation 5.9)
In[]:=
W=t2n(kzn-1)(z2n+zn)/(1-zn)/(z2n-zn);​​f1=d12+d13-d21-d31+W;​​f2=d21+d23-d12-d32+rW;​​f3=d31+d32-d13-d23-W(1+r);
In[]:=
FullSimplify[{f1,f2,f3},rec]
Out[]=
{0,0,0}