Navigating the Entity Framework

Navigating the Wolfram Entity Framework can be confusing, and in my experience it is sometimes hit or miss. We often start with Ctrl-= to get a semantic interpretation, but where do we go from there? With all the special Entity expressions out there, the framework offers many ways to achieve the same thing. I have always wanted to get Entity information right from the start with the fewest expressions possible.

So over the last few days I took a deep dive into the framework to find expressions that get me information quickly and reasonably consistently.
In this post, I would like to share my findings as a quick tutorial.

EntityValue is Key

The key expression for navigating the entity framework is EntityValue. It is key because it gets you lists of entities, entity classes, properties, property classes and property metadata.

At the highest level of abstraction (grouping) there are Entity Types.
You can evaluate EntityValue[ ] to get a list of the CanonicalNames of all the Entity Types.

Entity Types

In[]:=
EntityValue[]//Short
Out[]//Short=
{AdministrativeDivision,Aircraft,Airline,Airport,AirportType,Alphabet,<<332>>,WritingScriptWhiteSpace,YogaPose,YogaPosition,YogaProp,YogaSequence,ZIPCode}

Entities

Next we find the CanonicalName of the Entity Type we are interested in, e.g. “Country”.
To retrieve all entities we can use EntityList, but some entity types contain many entities, which take some time to load.
So I suggest you first check how many entities there are. We will use “EntityCount” (which is also fast to evaluate) as so:
In[]:=
EntityValue["Country","EntityCount"]
Out[]=
250
As expected for countries, this is a manageable number of Entity objects, so we proceed with EntityList to produce a list of 250 country Entity objects:
In[]:=
EntityList@EntityValue["Country"]//Shallow
Out[]//Shallow=

Afghanistan
,
Aland Islands
,
Albania
,
Algeria
,
American Samoa
,
Andorra
,
Angola
,
Anguilla
,
Antarctica
,
Antigua and Barbuda
,240
For this quick tutorial we are interested in the United States, which is this entity:
In[]:=
United States
COUNTRY
//InputForm
Out[]//InputForm=
Entity[Country, UnitedStates]
The country entity obviously has geographic information enclosed in the Entity object, so we can plot the entity directly, e.g. with GeoListPlot:
In[]:=
GeoListPlot[Entity["Country","UnitedStates"]]
Out[]=
[Map with the United States highlighted]
Note: if the entity list is too unwieldy, transform the entities into a list of CanonicalNames and “search” it as so:
In[]:=
CanonicalName/@EntityList@EntityValue["Country"]//Short
Select[StringMatchQ[___~~"United"~~___]]@%
Out[]//Short=
{Afghanistan,AlandIslands,Albania,Algeria,AmericanSamoa,Andorra,Angola,<<236>>,Vietnam,WallisFutuna,WestBank,WesternSahara,Yemen,Zambia,Zimbabwe}
Out[]=
{UnitedArabEmirates,UnitedKingdom,UnitedStates,UnitedStatesMinorOutlyingIslands,UnitedStatesVirginIslands}
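
The same search can be written a little more compactly. This is only a sketch: it assumes the documented one-argument form EntityList["Country"] and StringContainsQ in operator form, and the variable name countryNames is purely illustrative.

```wolfram
(* Canonical names of all country entities; this needs access to the Wolfram Knowledgebase *)
countryNames = CanonicalName /@ EntityList["Country"];

(* Keep only the names that contain "United" *)
Select[countryNames, StringContainsQ["United"]]

(* Turn a canonical name found this way back into an Entity object *)
Entity["Country", "UnitedStates"]
```

Storing the names in a variable avoids relying on %, which only works when the cells are evaluated in order.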

Entity Properties

So what properties can we find for the United States entity?
Entity properties are defined at the highest level, i.e. at the level of the Entity Type, but we can find them at the Entity-object level in the same way, i.e. by asking for “Properties”, as so:
In[]:=
EntityValue["Country","Properties"]//Shallow​​
United States
COUNTRY
["Properties"]//Shallow
Out[]//Shallow=

adjusted net national income
,
seasonal bank borrowings from Fed, plus adjustments
,
regions
,
adult population
,
obese adults
,
number of aggravated assaults
,
rate of aggravated assault
,
aggregate home value
,
aggregate home value, householder 15 to 24 years
,
aggregate home value, householder 25 to 34 years
,741
Out[]//Shallow=

adjusted net national income
,
seasonal bank borrowings from Fed, plus adjustments
,
regions
,
adult population
,
obese adults
,
number of aggravated assaults
,
rate of aggravated assault
,
aggregate home value
,
aggregate home value, householder 15 to 24 years
,
aggregate home value, householder 25 to 34 years
,741
The result is a list of EntityProperty objects, which can be long; in this case the list comprises 751 property objects:
In[]:=
Entity["Country","UnitedStates"]["PropertyCount"]
Out[]=
751
Note: we can also produce the entity properties as a list of CanonicalNames in the same way we did before.
In this example, we are interested in the population size:
In[]:=
EntityProperty["Country","Population"]//InputForm
Out[]//InputForm=
EntityProperty["Country", "Population"]
Now the Entity property quantity can be easily retrieved in operator form as so:
In[]:=
United States
COUNTRY

population

Out[]=
331449281
people
Note: You can also use the CanonicalName instead of the population EntityProperty object.
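
As a sketch of the CanonicalName form, and of asking for the same property for several entities at once: the three-argument EntityValue call with "EntityAssociation" is assumed from the documentation, and the variable names are illustrative only.

```wolfram
(* Same lookup as above, using the property's CanonicalName instead of the EntityProperty object *)
Entity["Country", "UnitedStates"]["Population"]

(* Canonical names taken from the "United" search earlier in this post *)
countries = {Entity["Country", "UnitedStates"], Entity["Country", "UnitedKingdom"]};

(* One call for several entities; the result is an association entity -> Quantity *)
populations = EntityValue[countries, "Population", "EntityAssociation"]

(* Strip the units to get plain numbers *)
QuantityMagnitude /@ populations
```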

Annotations

There are additional properties we can retrieve, namely the annotations “Source” and “Date”, which give the reference source and the date of the value.
We can get those data through the Entity object as so:
In[]:=
Entity["Country","UnitedStates"][EntityProperty["Country","Population"],"Source"]
Entity["Country","UnitedStates"][EntityProperty["Country","Population"],"Date"]
Out[]=
UNdata
Out[]=
Year: 2020

Property Metadata

There is even more information we can retrieve, enclosed in so-called metadata, for which we go back to our EntityValue expression as so:
In[]:=
EntityValue[EntityProperty["Country","Population"],"Qualifiers"]
Out[]=
{Age,CitizenshipStatus,Date,Gender,HispanicOrigin,MarginOfError,Percent,Race,TwoOrMore,UrbanRural}
Note there are a number of metadata properties you can try out, e.g.:
Out[]=
{Qualifiers,QualifierValues,Label,Definition,Source,PhysicalQuantity,Unit}
The result is a list of property metadata.

We are particularly interested in the QualifierValues, which give a list of “rules” for the qualified values we can retrieve:
In[]:=
EntityValue
population
,"QualifierValues"
Out[]=
{Age{Adult,MiddleAge,PreSchool,SchoolAge,Senior,Young,YoungAdult},CitizenshipStatus{BornInPuertoRico,BornInUS,BornToAmericanParents,NaturalizedCitizen,NotCitizen,TotalCitizens},Date{},Gender{Female,Male},HispanicOrigin{Argentinean,Bolivian,CentralAmerican,Chilean,Colombian,CostaRican,Cuban,Dominican,Ecuadorian,Guatemalan,Hispanic,HispanicOrLatinoAllOther,Honduran,Mexican,Nicaraguan,NotHispanic,OtherCentralAmerican,OtherHispanicOrLatino,OtherSouthAmerican,Panamanian,Paraguayan,Peruvian,PuertoRican,Salvadoran,SouthAmerican,Spaniard,Spanish,SpanishAmerican,Uruguayan,Venezuelan},MarginOfError{MarginOfError,StandardError},Percent{Main},Race{AmericanIndian,Asian,Black,NativeHawaiian,Other,TwoOrMore,White,{All,Hispanic}},TwoOrMore{ThreeOrMore,TwoIncludingOther},UrbanRural{Rural,Urban}}
We can retrieve the qualified values one at a time by going back to the UnitedStates Entity and the CanonicalName of the entity property, as so:
In[]:=
Entity["Country","UnitedStates"]["Population","Gender"->"Female"]
Out[]=
164810876 people
In[]:=
Entity["Country","UnitedStates"]["Population","CitizenshipStatus"->"BornToAmericanParents"]
Out[]=
3129487 people
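
Qualifier rules can also be attached to the property itself. The following is a sketch under the assumption that the three-argument form EntityProperty[type, name, {qualifier rules}] behaves as documented; the names usa and femalePopulation are illustrative only.

```wolfram
usa = Entity["Country", "UnitedStates"];

(* A qualified property: population restricted to the "Gender" -> "Female" qualifier *)
femalePopulation = EntityProperty["Country", "Population", {"Gender" -> "Female"}];
usa[femalePopulation]

(* Loop over the values of one qualifier, taken from the "QualifierValues" output above *)
AssociationMap[usa["Population", "Gender" -> #] &, {"Female", "Male"}]
```

A qualified EntityProperty can be stored in a variable and reused, which is convenient when the same qualifier is applied to many entities.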

Dated metadata

Some entities have TimeSeries information under the “Date” qualifier, which can be retrieved as so:
In[]:=
Entity["Country","UnitedStates"]["Population","Date"->{DateObject[{1910,1,1}],DateObject[{2022,1,1}]}]
DateListPlot@%
Out[]=
TimeSeries
Time: 01 Jan 1910 to 01 Jan 2022
Data points: 113

Out[]=
[Plot of the time series]

Differences between entities

For country entities, we used a list with a begin date and an end date in DateObject format to retrieve a TimeSeries object.
The way to retrieve dated data may vary per entity type. Here are two other examples, where we have to use a DateRange or an Interval.
Planet Entity properties, e.g. Venus’ distance from Earth, can only be retrieved one date at a time, so if we need a time series we need a few intermediate steps:
In[]:=
Function[a,{a,Entity["Planet","Venus"]["DistanceFromEarth","Date"->a]}]/@DateRange[DateObject[{2014,1,1}],DateObject[{2019,8,1}],"Month"]//TimeSeries
DateListPlot@%
Out[]=
TimeSeries
Time: 01 Jan 2014 to 01 Aug 2019
Data points: 68

Out[]=
[Plot of the time series]

This example uses the entity framework as a source of financial data (in conjunction with FinancialData, see below).
TimeSeries data can be retrieved directly, using an Interval (not a DateInterval, by the way) as so (the Capgemini financial entity is written here by its display label):
In[]:=
(Capgemini, financial entity)["Volatility20Day","Date"->Interval@{DateObject[{2014,1,1}],DateObject[{2018,1,1}]}]
DateListPlot@%
Out[]=
TimeSeries
Time: 31 Dec 2013 to 29 Dec 2017
Data points: 1013

Out[]=
[Plot of the time series]

Using the Dated Expression on Entity

There is another way to retrieve dated Entity data, i.e. by applying the Dated[ ] expression directly to the Entity object.
For e.g. country entities, we can use it with an Interval of two dates:
In[]:=
Dated[Entity["Country","UnitedStates"],Interval@{DateObject[{1998,7,1}],Today}]["Population"]
Out[]=
TimeSeries
Time: 01 Jan 1998 to 01 Jan 2022
Data points: 25

Note that, again, DateInterval would have been a more logical choice for a date interval.
With the Dated expression we can also retrieve all dated information by using All as the second argument, as so:
In[]:=
Dated[Entity["Country","UnitedStates"],All]["Population"]
DateListPlot@%
Out[]=
TimeSeries
Time: 01 Jan 1600 to 01 Jan 2050
Data points: 181

Out[]=
[Plot of the time series]
For commodity prices we can retrieve the complete Time Series as so:
In[]:=
Dated[Entity["Element","Gold"],All]["Price"]​​DateListPlot@%
Out[]=
TimeSeries

Time: 01 Jul 1900 to 07 May 2022
Data points: 14199
Data not saved. Save now

Out[]=
Again, Dated is still limited to one date at a time for Planet entities, so if we want a TimeSeries we have to use this instead:
In[]:=
Function[a,{a,Dated[Entity["Planet","Venus"],a]["DistanceFromEarth"]}]/@DateRange[DateObject[{2014,1,1}],DateObject[{2019,8,1}],"Month"]//TimeSeries
DateListPlot@%
Out[]=
TimeSeries
Time: 01 Jan 2014 to 01 Aug 2019
Data points: 68

Out[]=
[Plot of the time series]

Entity Properties as a Dataset

So far I have focussed on single properties per Entity, but we can also retrieve all properties of an Entity at once, e.g. as a Dataset, directly from the Entity:
In[]:=
United States
COUNTRY
["Dataset"]//DeleteMissing
Out[]=
adjusted net national income
$
1.81958×
13
10
per year
seasonal bank borrowings from Fed, plus adjustments
$2.7×
7
10
regions
{
…
51
}
adult population
2.14156×
8
10
people
obese adults
31.1%
number of aggravated assaults
821182
crimes
/yr
rate of aggravated assault
0.002502
crimes
/(personyr)
aggregate home value
$23694653539900
aggregate home value, householder 15 to 24 years
$115468621800
aggregate home value, householder 25 to 34 years
$1739775936100
aggregate home value, householder 35 to 64 years
$14471046106900
aggregate home value, householder 65 years and over
$7368362875000
aggregate household income
$
1.06998×
13
10
per year
aggregate weekly hours index
119.6
agricultural irrigated land fraction
5.78552%
agricultural land fraction
44.3634%
agricultural production index
105.1
agricultural production per capita index
98.9
agricultural products
{
…
12
}
arrivals by air
3.0325×
7
10
people/yr
rows 1–20 of
675

Entity Classes

Some Entity Types contain EntityClasses, which are groups of entities within that type. We can retrieve them using EntityClassList as so:
In[]:=
EntityClassList@EntityValue["Country"]//Shallow
Out[]//Shallow=

Africa
,
African Caribbean and Pacific Group
,
African Development Bank
,
African Union
,
Agency for the Prohibition of Nuclear Weapons in Latin America and the Caribbean
,
Agency of Cultural and Technical Cooperation
,
Alliance of Small Island States
,
The Americas
,
Andean Community of Nations
,
Antarctica
,349
The Country entity classes comprise countries grouped by continent, union, etc.
For example: which countries are part of the European Union?
In[]:=
EntityList@EntityClass["Country","EuropeanUnion"]
Out[]=

Austria
,
Belgium
,
Bulgaria
,
Croatia
,
Cyprus
,
Czech Republic
,
Denmark
,
Estonia
,
Finland
,
France
,
Germany
,
Greece
,
Hungary
,
Ireland
,
Italy
,
Latvia
,
Lithuania
,
Luxembourg
,
Malta
,
Netherlands
,
Poland
,
Portugal
,
Romania
,
Slovakia
,
Slovenia
,
Spain
,
Sweden
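
An EntityClass can also be passed to EntityValue directly. As a sketch, assuming the "EntityAssociation" output format from the documentation (the variable names are illustrative only):

```wolfram
eu = EntityClass["Country", "EuropeanUnion"];

(* Number of member entities in the class *)
Length[EntityList[eu]]

(* Population of every member, as an association entity -> Quantity *)
euPopulations = EntityValue[eu, "Population", "EntityAssociation"];

(* The three most populous members *)
TakeLargest[euPopulations, 3]
```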


Property Classes

Some Entity properties are grouped into property classes, similar to entity classes.
There is no such expression as EntityPropertyClassList, but property classes are retrieved in a way similar to how we retrieve properties, i.e.:
In[]:=
EntityValue["Country","PropertyClasses"]
Out[]=

economic properties

We can apply the Entity to the property class (written below by its display label, economic properties) to get all the relevant data in one go:
In[]:=
Entity["Country","UnitedStates"][(economic properties)]
Out[]=
{0.414, $19278194000000 per year, $63543.6 per person per year, $2.09366×10^13 per year, $20936600000000 per year, -3.48614% per year, 1.23358% per year}

We can retrieve the list of properties in this property class in a way similar to how we get Entity properties, i.e. by asking for “Properties”, as so:
In[]:=
(economic properties)["Properties"]
Out[]=
{Gini index,
 GDP (real),
 GDP (nominal|per capita),
 GDP (nominal|Default),
 GDP (nominal|PPP),
 GDP (real|Default|annual growth rate),
 inflation rate (consumer prices)}

Now we can retrieve any property of the property class from the Entity in our example:
In[]:=
United States
COUNTRY

GDP
real

Out[]=
$
19278194000000
per year
Note we can also get PropertyClass properties from EntityClasses, e.g.:
In[]:=
EntityClass["Country","EuropeanUnion"][(economic properties)][[1;;2]]
Out[]=
{{0.308, $3.88884×10^11 per year, $48327.6 per person per year, $4.30947×10^11 per year, $4.91315×10^11 per year, -6.25903% per year, 1.38191% per year},
 {0.272, $4.61772×10^11 per year, $44594.4 per person per year, $5.15332×10^11 per year, $6.00544×10^11 per year, -6.28393% per year, 0.740792% per year}}

Specialized Entity Types

Among the Entity Types, there are a few specially curated ones. These are named <XXX>Data and you can find them as follows:
In[]:=
?*Data
I will not go into detail here about the specialized entity types, e.g. CountryData (where we can find data on the UnitedStates), but you can experiment with them.

Summary

Your key expression to navigate the Wolfram Entity Framework is EntityValue. You can use it to retrieve entity types, entities, properties, entity classes, property classes and finally property metadata. Once you’ve found the entities and properties of interest, you can use the Entity object to get the property quantities and metadata. To sum up the expressions I use most:

EntityValue[ ] -> List of Entity Type CanonicalNames

EntityValue[ Entity Type CanonicalName, "Properties" ] -> List of EntityProperty objects
EntityValue[ Entity Type CanonicalName, "PropertyClasses" ] -> List of EntityPropertyClass objects
EntityValue[ EntityProperty object, "metadata" ] -> List of Property metadata rules

EntityList@ EntityValue[ Entity Type CanonicalName ] -> List of Entity objects
EntityClassList@ EntityValue[ Entity Type CanonicalName ] -> List of EntityClass objects
EntityList@ EntityClass object -> List of Entity objects

PropertyClass object [ "Properties" ] -> List of EntityProperty objects

Entity object [ "Dataset" ] -> Dataset with all Entity Properties
Entity object [ EntityProperty object ] -> Quantity
Entity object [ Property CanonicalName, metadata rule ] -> Quantity, Data, TimeSeries etc.

Of course this is my own selection of methods to navigate the framework, and there are many, many ways to achieve the same results.
You can play with the framework to discover all the equivalent forms available, but I like to keep it simple: EntityValue to get property objects/names, and the Entity object to get the data quantities.

I hope this is helpful. Have fun navigating the entity framework.

Cheers,

Dave