Bank statements analysis in Mathematica

In this file I am going to investigate my own outgoing expenses from my online banking. First I will import the first six months of 2017 as a comma-separated value (CSV) file. Second I will use some Mathematica built-in functions in order to classify my personal outgoing expenses. Such a classification will allows me to obtain the trend of each class.
​
Note that this analysis is basically the same one for every class of my personal outgoing expenses. Thus, I will comment only the first class, i.e. Supermarket expenses.

Importing CSV file and Visualization of my Dataset

Importing CSV file:
fileImp=Import["f5.csv"];
Dimension of data frame:
Dimensions[fileImp]
{193,4}
Data frame visualization:
Dataset[fileImp]
Buchungstag
Beguenstigter/Zahlungspflichtiger
Betrag
0
02.06.17
GA NR00002112 BLZ26050001 0
-15.
1
02.06.17
PayPal Europe S.a.r.l. et Cie S.C.A
-14.24
2
02.06.17
GA NR00002257 BLZ26050001 0
-15.
3
01.06.17
FITNESS FIRST GERMANY GMBH
-25.
4
01.06.17
-0.7
5
01.06.17
-3.9
6
30.05.17
PayPal Europe S.a.r.l. et Cie S.C.A
-38.9
7
29.05.17
GA NR00002218 BLZ26050001 0
-30.
8
29.05.17
GA NR00002257 BLZ26050001 0
-15.
9
26.05.17
tegut... gute Lebens
-8.74
10
23.05.17
GA NR00002120 BLZ26050001 0
-20.
11
23.05.17
ALDI GmbH + Co. KG HANN.MUENDEN
-11.99
12
22.05.17
GA NR00002257 BLZ26050001 0
-20.
13
22.05.17
GA NR00002214 BLZ26050001 0
-15.
14
22.05.17
GA NR00002214 BLZ26050001 0
-10.
15
19.05.17
GA NR00002112 BLZ26050001 0
-10.
16
18.05.17
GA NR00002170 BLZ26050001 0
-10.
17
17.05.17
tegut... gute Lebens
-4.24
18
17.05.17
NETTO MARKEN-DISCOU.
-4.08
showing 1–20 of 193
Dimensions[Dataset[fileImp]]
{193,4}
Transforming our data frame into a list of associations and padding the associations with zeros where we have no values:
named=Map[AssociationThread[First[fileImp]PadRight[#,4]]&,Rest[fileImp]];

Selecting specific columns and changing date format

Selecting interested columns:
selectedColumns=named[[All,{"Buchungstag","Beguenstigter/Zahlungspflichtiger","Betrag"}]];
Dataset[selectedColumns]
Buchungstag
Beguenstigter/Zahlungspflichtiger
Betrag
02.06.17
GA NR00002112 BLZ26050001 0
-15.
02.06.17
PayPal Europe S.a.r.l. et Cie S.C.A
-14.24
02.06.17
GA NR00002257 BLZ26050001 0
-15.
01.06.17
FITNESS FIRST GERMANY GMBH
-25.
01.06.17
-0.7
01.06.17
-3.9
30.05.17
PayPal Europe S.a.r.l. et Cie S.C.A
-38.9
29.05.17
GA NR00002218 BLZ26050001 0
-30.
29.05.17
GA NR00002257 BLZ26050001 0
-15.
26.05.17
tegut... gute Lebens
-8.74
23.05.17
GA NR00002120 BLZ26050001 0
-20.
23.05.17
ALDI GmbH + Co. KG HANN.MUENDEN
-11.99
22.05.17
GA NR00002257 BLZ26050001 0
-20.
22.05.17
GA NR00002214 BLZ26050001 0
-15.
22.05.17
GA NR00002214 BLZ26050001 0
-10.
19.05.17
GA NR00002112 BLZ26050001 0
-10.
18.05.17
GA NR00002170 BLZ26050001 0
-10.
17.05.17
tegut... gute Lebens
-4.24
17.05.17
NETTO MARKEN-DISCOU.
-4.08
16.05.17
GA NR00002112 BLZ26050001 0
-15.
showing 1–20 of 192
Changing date format from “Day.Month.Year” into “Day,Month,Year” in the date column:
Bstag=named[[All,{"Buchungstag"}]];
InterpDataE=Interpreter["StructuredDate",DateFormat{"Day",".","Month",".","YearShort"}];
InterpDataE[Bstag];
Selecting the month and year from each element of date column and creating a new column in the data frame for each of them:
prova=MapAt[InterpDataE,selectedColumns,{1;;,1}];
InterpDataE2=Map[With[{d=#Buchungstag},<|"month"->d[[1,2]],"year"->d[[1,1]],#|>]&,prova];
Dataset[InterpDataE2]
month
year
Buchungstag
Beguenstigter/Zahlungspflichtiger
Betrag
6
2017
2 Jun 2017
GA NR00002112 BLZ26050001 0
-15.
6
2017
2 Jun 2017
PayPal Europe S.a.r.l. et Cie S.C.A
-14.24
6
2017
2 Jun 2017
GA NR00002257 BLZ26050001 0
-15.
6
2017
1 Jun 2017
FITNESS FIRST GERMANY GMBH
-25.
6
2017
1 Jun 2017
-0.7
6
2017
1 Jun 2017
-3.9
5
2017
30 May 2017
PayPal Europe S.a.r.l. et Cie S.C.A
-38.9
5
2017
29 May 2017
GA NR00002218 BLZ26050001 0
-30.
5
2017
29 May 2017
GA NR00002257 BLZ26050001 0
-15.
5
2017
26 May 2017
tegut... gute Lebens
-8.74
5
2017
23 May 2017
GA NR00002120 BLZ26050001 0
-20.
5
2017
23 May 2017
ALDI GmbH + Co. KG HANN.MUENDEN
-11.99
5
2017
22 May 2017
GA NR00002257 BLZ26050001 0
-20.
5
2017
22 May 2017
GA NR00002214 BLZ26050001 0
-15.
5
2017
22 May 2017
GA NR00002214 BLZ26050001 0
-10.
5
2017
19 May 2017
GA NR00002112 BLZ26050001 0
-10.
5
2017
18 May 2017
GA NR00002170 BLZ26050001 0
-10.
5
2017
17 May 2017
tegut... gute Lebens
-4.24
5
2017
17 May 2017
NETTO MARKEN-DISCOU.
-4.08
5
2017
16 May 2017
GA NR00002112 BLZ26050001 0
-15.
showing 1–20 of 192

List of Month

In order to plot the following results I need to have a list of the first of six months
Months={1,2,3,4,5,6};

Supermarket expenses

Let us focus on my supermarket expenses. This analysis takes into consideration the following four supermarkets: Rewe, Tegut, Karstadt, Netto. Such supermarkets are extracted by using the following command:
Supermarkets=Select[InterpDataE2,#[[4]]"ALDI GmbH + Co. KG HANN.MUENDEN"||#[[4]]"NETTO MARKEN-DISCOU."||#[[4]]"tegut... gute Lebens"||#[[4]]"REWE-Markt Moewes oHG"||#[[4]]"KARSTADT WARENHAUS GMHB"&];
Dataset[Supermarkets]
month
year
Buchungstag
Beguenstigter/Zahlungspflichtiger
Betrag
5
2017
26 May 2017
tegut... gute Lebens
-8.74
5
2017
23 May 2017
ALDI GmbH + Co. KG HANN.MUENDEN
-11.99
5
2017
17 May 2017
tegut... gute Lebens
-4.24
5
2017
17 May 2017
NETTO MARKEN-DISCOU.
-4.08
5
2017
8 May 2017
REWE-Markt Moewes oHG
-10.39
4
2017
26 Apr 2017
ALDI GmbH + Co. KG HANN.MUENDEN
-10.29
4
2017
19 Apr 2017
tegut... gute Lebens
-6.18
4
2017
18 Apr 2017
REWE-Markt Moewes oHG
-7.77
4
2017
11 Apr 2017
tegut... gute Lebens
-11.9
4
2017
5 Apr 2017
tegut... gute Lebens
-2.79
4
2017
5 Apr 2017
ALDI GmbH + Co. KG HANN.MUENDEN
-3.08
4
2017
4 Apr 2017
NETTO MARKEN-DISCOU.
-5.87
3
2017
27 Mar 2017
ALDI GmbH + Co. KG HANN.MUENDEN
-15.
3
2017
27 Mar 2017
NETTO MARKEN-DISCOU.
-4.57
3
2017
23 Mar 2017
tegut... gute Lebens
-2.09
3
2017
23 Mar 2017
NETTO MARKEN-DISCOU.
-2.49
3
2017
13 Mar 2017
REWE-Markt Moewes oHG
-15.89
3
2017
10 Mar 2017
tegut... gute Lebens
-3.68
3
2017
8 Mar 2017
tegut... gute Lebens
-2.69
3
2017
1 Mar 2017
NETTO MARKEN-DISCOU.
-5.11
showing 1–20 of 31
Number of lines in this new association:
M=Supermarkets[[All,"month"]];
Length[M]
31
Extracting relevant data for each month:
MM6=Select[Supermarkets,#[[1]]6&];​​MM5=Select[Supermarkets,#[[1]]5&];​​MM4=Select[Supermarkets,#[[1]]4&];​​MM3=Select[Supermarkets,#[[1]]3&];​​MM2=Select[Supermarkets,#[[1]]2&];​​MM1=Select[Supermarkets,#[[1]]1&];
Computing the total amount of the column “Betrag” for each month:
SMM6=Total[MM6[[All,"Betrag"]]];​​SMM5=Total[MM5[[All,"Betrag"]]];​​SMM4=Total[MM4[[All,"Betrag"]]];​​SMM3=Total[MM3[[All,"Betrag"]]];​​SMM2=Total[MM2[[All,"Betrag"]]];​​SMM1=Total[MM1[[All,"Betrag"]]];
Making a list of the sums above:
SMMN={SMM1,SMM2,SMM3,SMM4,SMM5,SMM6};
In order to plot “month vs total amount” for my supermarket expenses, I will use the built-in function “timeseries” on the two lists: SMMN, Months.
tsSupermarketsM=TimeSeries[SMMN,{Months}];
Here we plot our data by using listplot:

Rent expenses

Let us focus on my rent expenses. This analysis takes into consideration all rents in the studied period of time.

Visa card expenses

Let us focus on my visa card expenses. This analysis takes into consideration all uses of my visa card online or in direct payments in the studied period of time.

All together

Further Explorations

A further exploration would be cleaning this code and using many more built-in functions in order to make it more efficient and beautiful.

Authorship information

Federico Dradi,
Friday, June 23rd 2017
federico.dradi@gmail.com
​