Population density maps are among the most valuable datasets, e.g. when investigating the outbreak of an infectious disease. The Humanitarian Data Exchange (HDX) seems to have the world’s most detailed population density datasets. These high-resolution maps, created by Facebook, estimate the number of people living within 30-meter grid tiles. In this short computational essay, I will discuss how to convert these maps into computable datasets in the Wolfram Language, with a desired size and shape of grid tiles. As an example, I shall focus on Qatar and show how to get the desired maps using GeoHistogram, a very powerful built-in function in the Wolfram Language.
The HDX high-resolution population density maps can be obtained through the HDX website. Each dataset is provided in two formats, GeoTIFF and CSV, and covers the distribution of various populations. I will download and process the CSV file of the overall population density. The same methods can be applied to any other population dataset. Since Qatar’s dataset is small, I shall import it directly. For larger datasets, I suggest downloading the file first and then importing it.
Import zipped data from HDX and get the CSV file:
In[]:=
qatar=Import["…","population_qat_2019-07-01.csv"];
Return the number of rows in the original data (one row per tile, plus the header row):
In[]:=
Length[qatar]
Out[]=
307646
Return the headings of the original data:
In[]:=
qatar〚1〛
Out[]=
{Lat,Lon,Population}
Split the data into two lists, the geo positions and the corresponding populations:
In[]:=
{geo,ppl}={qatar〚2;;,;;2〛,qatar〚2;;,-1〛};
Generate a list of rules of the form geoposition->population:
In[]:=
data=Thread[(GeoPosition/@geo)->ppl];
Let’s look at the data using the default bins and counts of GeoHistogram.
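A minimal sketch of that default call, assuming data is the list of geoposition->population rules built above and that the right-hand side of each rule is treated as a weight. With no further arguments, the bin shape, bin size and height function are all chosen automatically:
In[]:=
(* default bins and counts; the output is a map overlaid with colored bins *)
GeoHistogram[data]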
A closer look shows that the polygon points in the GeoHistogram output are also projected using GeoGridPosition. Accordingly, in order to extract polygons with correct geo positions, we need to include the projection using GeoGridPosition too.
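The round trip between geographic and projected coordinates can be sketched as follows. The sample point and the "Mercator" projection are assumptions made only for illustration; the projection actually used should be read from the GeoHistogram output itself.
In[]:=
pos=GeoPosition[{25.3,51.5}]; (* hypothetical point, given as {lat, lon} *)
grid=GeoGridPosition[pos,"Mercator"]; (* project to planar grid coordinates; "Mercator" is an assumption *)
GeoPosition[grid] (* convert the projected point back to latitude and longitude *)
Applying the same conversion to every vertex of a projected polygon would give its corners as ordinary geo positions.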
Given this result, we now arrange the data as a list of sublists of the form {polygon points, center of polygon, population density} and export it as a CSV file (which might be useful for users of other programming languages).
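One possible way to write such a list to disk is sketched below. The variable tiles, its single row of values and the file name are hypothetical placeholders, not results from the HDX data. Each field is converted to a string so that a nested list of polygon points occupies a single CSV cell.
In[]:=
(* tiles is a hypothetical stand-in with one row: {polygon points, center, population} *)
tiles={{{{25.0,51.0},{25.0,51.1},{25.1,51.1},{25.1,51.0}},{25.05,51.05},120.}};
(* convert every field (level 2) to a string and add a header row *)
rows=Prepend[Map[ToString,tiles,{2}],{"Polygon","Center","Population"}];
(* the file name and location are assumptions *)
Export[FileNameJoin[{$TemporaryDirectory,"qatar_tiles.csv"}],rows,"CSV"]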
Summary
Given the data from HDX, generate grid tiles of population density for a given shape and size of tiles:
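As a hedged sketch of this step, the bin specification of GeoHistogram can carry both a shape and a size. The rectangular shape and the 1-kilometer size below are arbitrary choices for illustration, and data is assumed to be the list of geoposition->population rules built earlier:
In[]:=
(* rectangular tiles roughly 1 km across; shape and size are illustrative choices *)
GeoHistogram[data,{"Rectangle",Quantity[1,"Kilometers"]}]
Other shapes such as "Hexagon" or "Triangle", or a different Quantity, can be substituted in the same position.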
Define colors, bins and legends for the GeoGraphics function:
Visualize each tile with a color specified by cfunc, defined above:
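A minimal sketch of drawing one colored tile with GeoGraphics is given below. The tile coordinates, the population value, the 0 to 500 range and the helper cfuncSketch are all hypothetical; cfuncSketch only stands in for cfunc and is not the original definition.
In[]:=
(* hypothetical tile: corner coordinates as {lat, lon} and a population value *)
tile={{25.0,51.0},{25.0,51.1},{25.1,51.1},{25.1,51.0}};
pop=120.;
(* stand-in color function: rescale the population to 0..1 and look up a built-in gradient *)
cfuncSketch[p_]:=ColorData["TemperatureMap"][Rescale[p,{0,500}]];
(* draw the tile on a map, filled with its color and without an outline *)
GeoGraphics[{EdgeForm[None],cfuncSketch[pop],Polygon[GeoPosition[tile]]}]
For many tiles, the same list of directives could be generated once per tile and passed to a single GeoGraphics call.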
Acknowledgment
I’d like to thank my colleagues, Alan Joyce (Wolfram|Alpha team) and José Martín-García (Algorithms R&D team), for their valuable input in helping with datasets and important suggestions on code.