In[]:=
Clear["Global`*"];​
​(* a simple function to detect which environment we're in -- the local machine or the cloud *)​
​detectCloud[] := StringContainsQ[NotebookDirectory[], "wolframcloud"];

Data Collection: Decennial Population and Apportionment

Sources

◼
  • State Abbreviations: U.S. Postal Service
  • ◼
  • State Decennial Census Populations: Census Department
  • ◼
  • State Decennial Apportionment Data: Census Department
  • We’ll start with the state abbreviations from the Post Office

    In[]:=
    table = Import["http://about.usps.com/who-we-are/postal-history/state-abbreviations.htm", {"HTML", "Data",2}][[2]];​
    ​Print[Grid[table[[2;;6]], FrameAll]];
    1831
    1874
    1943
    1963
    Alabama
    Al.
    Ala.
    Ala.
    AL
    Alaska
    --
    Alaska
    Alaska
    AK
    Arizona
    --
    Ariz.
    Ariz.
    AZ
    Arkansas
    Ar. T.
    Ark.
    Ark.
    AR

    Let's toss in entities to the output, since we can be sure they’re uniform across different sources, spellings and punctuations

    In[]:=
    data = { StringTrim[First@#], StringTrim[Last@#] }& /@ Rest@table;​
    ​data = Select[data, First@# ≠ "" && First@# ≠ "Puerto Rico"&];​
    ​states = <| "name"  First@#, "abbr"  Last@#, "entity"  Interpreter["USState"][First@#] |>& /@ data;

    And make an association of the associations

    In[]:=
    statesAssociation = AssociationThread[#["name"]& /@ states, states];

    We’ll need to correct Nebraska:
    ​

    In[]:=
    statesAssociation["Nebraska"]["abbr"] = "NE";

    Now let’s get decennial Census data.

    While Mathematica has population conveniently built in to state entities, let’s be extra careful and use the same figures the U.S. used for apportionment.

    We can get the table data from the URL at the top of the document directly as a CSV to save some time.

    In[]:=
    csv = Import["https://www2.census.gov/programs-surveys/decennial/tables/2010/2010-apportionment/pop_change.csv", "CSV"];​
    ​data = Most@Take[csv, -52];​
    ​header = csv[[3]]
    Out[]=
    {STATE_OR_REGION,1910_POPULATION,1920_POPULATION,1930_POPULATION,1940_POPULATION,1950_POPULATION,
    1960_POPULATION,1970_POPULATION,1980_POPULATION,1990_POPULATION,2000_POPULATION,2010_POPULATION,
    1910_CHANGE,1920_CHANGE,1930_CHANGE,1940_CHANGE,1950_CHANGE,1960_CHANGE,1970_CHANGE,1980_CHANGE,
    1990_CHANGE,2000_CHANGE,2010_CHANGE}

    Let’s clean up the header, reducing to just to the populations -- we can calculate the change ourselves if we ever need it.

    In[]:=
    shortHeader = { "name", "1910", "1920", "1930", "1940", "1950", "1960", "1970", "1980", "1990", "2000", "2010" };

    And map the header to the data objects

    In[]:=
    population = Map[AssociationThread[shortHeader, #[[1;;12]]]&, data];​
    ​Grid[Prepend[Values /@ population[[1;;4]], shortHeader], FrameAll, Spacings{0.6,1}]
    Out[]=
    name
    1910
    1920
    1930
    1940
    1950
    1960
    1970
    1980
    1990
    2000
    2010
    Alabama
    2138093
    2348174
    2646248
    2832961
    3061743
    3266740
    3444165
    3893888
    4040587
    4447100
    4779736
    Alaska
    64356
    55036
    59278
    72524
    128643
    226167
    300382
    401851
    550043
    626932
    710231
    Arizona
    204354
    334162
    435573
    499261
    749587
    1302161
    1770900
    2718215
    3665228
    5130632
    6392017
    Arkansas
    1574449
    1752204
    1854482
    1949387
    1909511
    1786272
    1923295
    2286435
    2350725
    2673400
    2915918

    Looks good. Let’s connect to the object from the Post Office

    In[]:=
    statesAssociation = AssociationThread[​
    ​ Keys@statesAssociation,​
    ​ Append[statesAssociation[#["name"]], "population_decennial"  Rest@#]& /@ population​
    ​];
    In[]:=
    Print[statesAssociation["Alaska"]];​
    ​Print[statesAssociation["Arkansas"]];
    nameAlaska,abbrAK,entity
    Alaska, United States
    ,population_decennial191064356,192055036,193059
    278,194072524,1950128643,1960226167,1970300382,1980401851,1990550043,2000626932,2010710
    231
    nameArkansas,abbrAR,entity
    Arkansas, United States
    ,population_decennial19101574449,19201752204,
    19301854482,19401949387,19501909511,19601786272,19701923295,19802286435,19902350725,
    20002673400,20102915918

    And now the apportionment data