In[]:=
CompoundExpression[
]
​​deploy
Wed 22 Mar 2023 15:26:20

Fast powers of diagonal + rank1 matrices

Goal: ​
t
(I-(A-B))
≃ exp(-t(A-B))​For diagonal A, symmetric rank1 B, can we approximate the exponential quantity above quickly and accurately?Mathoverflow discussion.​Top-level notes: Diagonal + rank1 section​​

Code

Background:
​
Gives diagonal, rank1 matrices from eq Eq 5. from Bordelon “structured” paper​
generateDiagRank1 returns DPR1 matrix A-B such that exp(-t(A-B)) is the state transition matrix at time t
​
{a0,b0}=generateDiagRank1[d];
A=DiagonalMatrix[a0];
B={b0}.{b0};
​
Related notebooks:
- forum-velikanov-berthier-example.nb, simulations for p=1, p=1.1, p=1.4
- gd-vs-sgd.nb utilities for plotting loss curves, finding step times
​
Notation:
b -- batch size b
p -- power-law decay constant 1<p<2
d -- dimensions
1<d<
10
10
​
t -- timestep
1<t<
2
d
(important values, t=2, t=d,
t=
2
d
)
​
​
​
(batch-size-formulas.graffle / bertier), https://www.dropbox.com/s/cwyhmkzsavj9pdh/batch-size-formulas.graffle?dl=0
In[]:=
(*​​A-B is DPR1 matrix for A=diag(a),B=b.b ​​​​exp(-t(A-B)) gives the state transition matrix where​​{a,b}=generateDiagRank1[d], then​​​​ ()​​ *)​​generateDiagRank1[d_]:=Module{},​​(* Largest value of step size contractive in Schatten-1 norm *)​​p=1.1;​​h=Table[
-p
i
,{i,1,d}];​​h=hTr[h];​​α=
1
2Max[h]+Tr[h]
;​​a0=2 α h(1-α h);​​b0=α h;​​​​(* For small sizes, run sanity checks *)​​If[d<=100,​​On[Assert];​​A=DiagonalMatrix[a0];​​B={b0}.{b0};​​​​Assert[And@@Thread[{Norm[A],Norm[B],Norm[A-B]}<=1]];​​Assert[And@@(PositiveSemidefiniteMatrixQ/@{A,B,A-B})];​​​​(* check that matrices are non-negative *)​​nnMatrixQ[mat_]:=And@@Thread[Flatten[mat]>=0];​​Assert[nnMatrixQ[A]];​​Assert[nnMatrixQ[B]];​​];​​{a0,b0}​​;​​​​(* Faster version of A.x DPR1 matrix A stored globally as dd,u *)​​fastMatmul[x_]:=x-dd*x+u*u.x;​​(* {{"Exact value: ",ones.MatrixPower[ii-(A-B),t,ones]},​​{"Exact fast: ",ones.Nest[fastMatmul,ones,t]}}//TableForm *)​​​​d=100; (* dimensions *)​​{a0,b0}=generateDiagRank1[d];​​A=DiagonalMatrix[a0];​​B={b0}.{b0};​​​​ones=ConstantArray[1.,d];​​ii=IdentityMatrix[d];​​​​exact[t_]:=ones.MatrixPower[ii-(A-B),t,ones];​​exp[t_]:=ones.MatrixExp[-t(A-B)].ones;​​​​On[Assert];​​Assert[exact[d]==61.54928435838899`]​​Assert[exp[d]==61.62225463274567`]​​

Tests

​

Evaluation

In[]:=
d=100;(*dimensions*)​​{a0,b0}=generateDiagRank1[d];​​​​A=DiagonalMatrix[a0];​​B={b0}.{b0};​​​​ii=IdentityMatrix[d];​​ones=ConstantArray[1.,d];​​t=d;​​​​TableForm[​​{{"Exact value: ",ones.MatrixPower[ii-(A-B),t,ones]},​​{"exp approximation: ",ones.MatrixExp[-t(A-B)].ones},​​{"cheaper exp approximation: ",ones.MatrixExp[-t(A)].ones}}​​]​​​​
Out[]//TableForm=
Exact value:
61.5493
exp approximation:
61.6223
cheaper exp approximation:
55.54
In[]:=
ones.MatrixPower[ii-(A-B),t,ones]