In[]:=
deploy
Tue 21 May 2019 11:14:18

Util


Problem setup: Quadratic

Least-squares fit of y = w.x using SGD on w with shape {1, 2}.

Globals: dsize, X, Y, w0, err, gradList, lossList, fullGradList, loss, gradients, Cmat
In[]:=
SeedRandom[0];
dsize = 1000; (* number of data points *)
{X, Y} = generateXY[.1, dsize]; (* X: (2, dsize), examples as columns; Y: (1, dsize) *)
err[w_] := w.X - Y; (* residuals, (1, dsize) *)
err[w_, i_] := {err[w][[All, i]]}; (* residual for example i, (1, 1) *)
grad[w_] := err[w].Transpose[X]; (* full gradient, (1, 2) *)
grad[w_, i_] := v2r[err[w][[1, i]] Transpose[X][[i, All]]]; (* gradient for one example, (1, 2) *)
loss[w_] := toscalar[err[w].Transpose[err[w]]/2];
loss[w_, i_] := toscalar[err[w, i].err[w, i]/2];

(* Matrix of all gradients, (dsize, 2); the i'th row is the gradient for the i'th example *)
gradients[w_] := (DiagonalMatrix@Flatten@err[w]).Transpose[X];
(* Empirical Fisher matrix at the current point, estimated from the whole dataset *)
Cmat[w_] := Transpose[gradients[w]].gradients[w]/dsize;
(* Empirical Fisher matrix at the current point, estimated from example i *)
Cmat[w_, i_] := Transpose[grad[w, i]].grad[w, i];

maxIters = 10000;
(* sequence of examples to sample *)
indices = RandomChoice[Range[dsize], maxIters];

optimizeSgd[lr_, w0_, iters_] := Module[{g, w, iter},
  {pointList, gradList, fullGradList, lossList} = {{}, {}, {}, {}};
  w = w0;
  For[iter = 1, iter ≤ iters, iter++,
   g = grad[w, indices[[iter]]];
   pointList = pointList~Append~w;
   gradList = gradList~Append~g;
   fullGradList = fullGradList~Append~grad[w];
   lossList = lossList~Append~loss[w];
   w = w - lr*g;
   ]
  ];

ListPlot[Transpose@X, AspectRatio -> 1, PlotLabel -> "Distribution of x"]
Out[]=

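For readers following along outside Mathematica, the setup above can be sketched in NumPy. generateXY, v2r, and toscalar live in the Util section and are not shown in this excerpt, so the data generation below (x ~ N(0, I), a chosen w_true, and Gaussian noise at the 0.1 scale passed to generateXY) is an assumption, not the notebook's actual helper:

```python
import numpy as np

rng = np.random.default_rng(0)

dsize = 1000                     # number of data points
noise = 0.1
w_true = np.array([[1.0, 0.5]])  # assumed true parameter (generateXY is not shown)

# Mirror the Mathematica shapes: X is (2, dsize) with examples as columns, Y is (1, dsize)
X = rng.standard_normal((2, dsize))
Y = w_true @ X + noise * rng.standard_normal((1, dsize))

def err(w):        # residuals, (1, dsize)
    return w @ X - Y

def grad(w):       # full gradient, (1, 2)
    return err(w) @ X.T

def grad_i(w, i):  # gradient for one example, (1, 2)
    return err(w)[:, i:i+1] * X[:, i][None, :]

def loss(w):       # scalar least-squares loss
    return (err(w) @ err(w).T).item() / 2

def gradients(w):  # (dsize, 2); row i is the gradient for example i
    return err(w).T * X.T

def Cmat(w):       # empirical Fisher from the whole dataset, (2, 2)
    return gradients(w).T @ gradients(w) / dsize

def Cmat_i(w, i):  # empirical Fisher from example i, (2, 2)
    g = grad_i(w, i)
    return g.T @ g
```

The full gradient is the row sum of `gradients`, and `Cmat` is the average of the per-example `Cmat_i`, which is what makes it an empirical Fisher estimate.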
Non-stationary

In[]:=
w0 = {{1, 2}};
numSteps = 100;
η = 0.05;
optimizeSgd[η, w0, numSteps]
ListPlot@lossList

bound = 2;
plt1 = ContourPlot[loss[{{x, y}}], {x, 0, bound}, {y, 0, bound}, Contours -> 100, ContourShading -> None];
plotPoints = Flatten /@ pointList;
plt3 = Graphics[{Red, PointSize[0.01], Point[plotPoints]}];
plt4 = Graphics[{Blue, Line[plotPoints]}];
Show[{plt1, plt3, plt4}]

(* Empirical Fisher matrix at the current point, estimated from example i *)
Cmat[w_, i_] := Transpose[grad[w, i]].grad[w, i];

ol = MapThread[Flatten[#1].Flatten[#2] &, {gradList, pointList}];
or = MapThread[.5 η Tr[Cmat[#1, #2]] &, {pointList, indices[[;; numSteps]]}];
ListPlot[{MovingAverage[ol, 100], or}]
Out[]=
20
40
60
80
100
550
600
650
Out[]=
Out[]=
20
40
60
80
100
0.05
0.10
0.15
0.20
0.25
0.30
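One way to read the comparison of ol and or (a sketch, under the assumptions that SGD has reached its stationary distribution and that w* denotes the minimizer; using pointList itself on the left-hand side implicitly takes w* = 0): expand one SGD step w' = w - η g,

```latex
% One SGD step: w' = w - \eta g
\lVert w' - w^* \rVert^2
  = \lVert w - w^* \rVert^2
  - 2\eta \,\langle g,\, w - w^* \rangle
  + \eta^2 \lVert g \rVert^2 .
% At stationarity E||w' - w*||^2 = E||w - w*||^2, hence
\mathbb{E}\,\langle g,\, w - w^* \rangle
  = \frac{\eta}{2}\,\mathbb{E}\,\lVert g \rVert^2
  = \frac{\eta}{2}\,\mathbb{E}\,\operatorname{Tr} C(w, i),
```

where the last step uses that the trace of the per-example Fisher, the outer product of grad[w, i] with itself, is exactly ||grad[w, i]||^2. The moving averages of ol and or are the two sides of this identity.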

Stationary

In[]:=
w0 = {{1, 1}};

numSteps = 1000;
η = 0.05;
optimizeSgd[η, w0, numSteps]
ListPlot@lossList

bound = 2;
plt1 = ContourPlot[loss[{{x, y}}], {x, 0, bound}, {y, 0, bound}, Contours -> 100, ContourShading -> None];
plotPoints = Flatten /@ pointList;
plt3 = Graphics[{Red, PointSize[0.01], Point[plotPoints]}];
plt4 = Graphics[{Blue, Line[plotPoints]}];
Show[{plt1, plt3, plt4}]

(* Empirical Fisher matrix at the current point, estimated from example i *)
Cmat[w_, i_] := Transpose[grad[w, i]].grad[w, i];

ol = MapThread[Flatten[#1].Flatten[#2] &, {gradList, pointList}];
or = MapThread[.5 η Tr[Cmat[#1, #2]] &, {pointList, indices[[;; numSteps]]}];
ListLinePlot[{movingAvg[ol, 100], movingAvg[or, 100]}]
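The quantity being compared here — the running average of g.(w - w*) against (η/2)·Tr C — should agree once the iterates are stationary. A minimal NumPy check of that identity on a synthetic 1-D least-squares problem (the data generation is an assumption for illustration, not the notebook's generateXY):

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic 1-D least-squares problem (assumed; not the notebook's data)
n = 1000
x = rng.standard_normal(n)
y = 2.0 * x + 0.1 * rng.standard_normal(n)
w_star = (x @ y) / (x @ x)         # full-batch least-squares minimizer

eta = 0.05
w = 1.0
lhs, rhs = [], []                  # g*(w - w*) and (eta/2)*g^2 per step
steps, burn_in = 200_000, 20_000

for t in range(steps):
    i = rng.integers(n)            # sample one example, as in optimizeSgd
    g = (w * x[i] - y[i]) * x[i]   # per-example gradient
    lhs.append(g * (w - w_star))
    rhs.append(0.5 * eta * g * g)  # trace of the 1-D per-example Fisher is g^2
    w -= eta * g

# After burn-in, the average of g*(w - w*) should match (eta/2)*E[g^2]
lhs_mean = float(np.mean(lhs[burn_in:]))
rhs_mean = float(np.mean(rhs[burn_in:]))
print(lhs_mean, rhs_mean)
```

The two means agree to within sampling noise, matching the stationary behavior seen in the moving-average plot above.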

Stationary Isotropic

Isotropic high-dimensional case