Historical popularity of subjects based on word frequency
by Audrey Shin
In this project, WordData was used to return the narrower terms (sub-words) associated with a given word, and WordFrequencyData was used to return the mention frequency of each term. Used together, they give the mention frequency of every sub-word of a subject over time. The results were then plotted using DateListPlot and Manipulate.

Finding Sub-Words Using WordData

Different subject categories fall in and out of fashion over time; one way to track this is by looking at the frequency with which associated words are mentioned. For example, information science, as a field, is far more prevalent than it was, say, 100 years ago (perhaps because it had not even been invented at the time). This project aims to track the associated terms of a given word to attempt to gain an understanding of its popularity over time.
To begin with, the sub-words of the given word needed to be found. This was done using the "NarrowerTerms" property of WordData, which returns the narrower (more specific) terms of the given word, grouped by word sense.
WordData was used to find related words:
In[]:=
WordData["art","NarrowerTerms"]
Out[]=
{{art,Noun,CreativeActivity}{arts and crafts,carving,ceramics,decalcomania,decoupage,drafting,draftsmanship,drawing,gastronomy,glyptography,origami,painting,perfumery,printmaking,sculpture,topiary},{art,Noun,FineArt}{artificial flower,commercial art,cyberart,dance,decoupage,diptych,gem,genre,graphic art,grotesque,kitsch,mosaic,plastic art,treasure,triptych,work of art},{art,Noun,SuperiorSkill}{airmanship,aviation,enology,eristic,falconry,fortification,homiletics,horology,minstrelsy,musicianship,oenology,puppetry,taxidermy,telescopy,ventriloquism,ventriloquy},{art,Noun,VisualCommunication}{drawing,illustration}}
Because terms containing spaces were treated as unusable with WordFrequencyData, the spaces were removed from each term, and the whole process of finding the sub-words of a given word was wrapped in a function.
The final function for obtaining the sub-categories and narrower terms of a given field:
In[]:=
(* Map each word sense (last element of the key) to its narrower terms, with spaces removed *)
getsubwords[subj_String]:=With[{word=WordData[subj,"NarrowerTerms"]},
 AssociationThread[Keys[word][[All,-1]],
  Table[Map[StringDelete[#," "]&,Values[word][[i]]],{i,1,Length[word]}]]]
(* Drop senses that have no narrower terms *)
removemissing[keywords_]:=KeyDrop[keywords,Keys[Select[keywords,Length[#]==0&]]]
cleangetsubwords[subj_String]:=removemissing[getsubwords[subj]]
cleangetsubwords["art"]
Out[]=
CreativeActivity{artsandcrafts,carving,ceramics,decalcomania,decoupage,drafting,draftsmanship,drawing,gastronomy,glyptography,origami,painting,perfumery,printmaking,sculpture,topiary},FineArt{artificialflower,commercialart,cyberart,dance,decoupage,diptych,gem,genre,graphicart,grotesque,kitsch,mosaic,plasticart,treasure,triptych,workofart},SuperiorSkill{airmanship,aviation,enology,eristic,falconry,fortification,homiletics,horology,minstrelsy,musicianship,oenology,puppetry,taxidermy,telescopy,ventriloquism,ventriloquy},VisualCommunication{drawing,illustration}

Tracking Mention Frequency By WordFrequencyData

After finding the sub-words of the given word, the mention frequency of each was found using the property "TimeSeries" of the function WordFrequencyData. This gives the mention frequency of the given word from approximately 1700 to modern day, using the Google Books English n-gram public dataset.
Passing "TimeSeries" as the second argument of WordFrequencyData returns a TimeSeries of the word's mention frequency:
In[]:=
WordFrequencyData["art","TimeSeries"]
Out[]=
TimeSeries
Time: 01 Jan 1700 to 01 Jan 2019
Data points: 320

Before wrapping this in a function, one more case was handled: terms whose result came back as "Missing" for unforeseen reasons are deleted in advance so that they do not interfere with plotting.
The final function for getting the mention frequency for each sub-word:
In[]:=
wordfreqts[subj_String]:=With[{words=cleangetsubwords[subj]},AssociationThread[Keys[words]->Map[DeleteMissing,Table[WordFrequencyData[words[[i]],"TimeSeries"],{i,1,Length[words]}]]]]
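The effect of DeleteMissing in the function above can be seen on a small stand-in association. This sketch assumes that an unavailable term shows up as a Missing[...] value next to the real series; the data and the names `raw` and `kept` are made up for illustration.

```wolfram
(* Hypothetical result for two terms: one real series, one unavailable term *)
raw = <|
   "drawing" -> TimeSeries[{{1900, 0.1}, {1950, 0.3}, {2000, 0.2}}],
   "madeupterm" -> Missing["NotAvailable"]
   |>;

(* DeleteMissing removes every entry whose value has head Missing *)
kept = DeleteMissing[raw];

Keys[kept]
(* {"drawing"} *)
```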

Plotting Data Using DateListPlot and Manipulate

Using DateListPlot to plot WordFrequencyData of narrower terms:
In[]:=
DateListPlotWordFrequencyData["carving","TimeSeries"],

Out[]=
By mapping DateListPlot over the data, the WordFrequencyData of the narrower terms in each sub-category can be displayed individually or together for comparison.
The following function creates a Manipulate that displays each narrower term's WordFrequencyData as a DateListPlot:
In[]:=
Createsubjectmanipulate[subj_String]:=With[{finaldata=wordfreqts[subj]},
 Manipulate[
  Map[DateListPlot[#]&,Dynamic[KeyTake[finaldata[Subject],Field]]],
  {Subject,Keys[finaldata],PopupMenu},
  {Field,Keys[finaldata[Subject]],TogglerBar,Appearance->"Row"}]]
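The selection step that the Field toggler performs can be tried on its own, outside Manipulate. The sketch below uses made-up numbers rather than real frequencies, stores each series as plain {date, value} pairs instead of a TimeSeries, and introduces the hypothetical names `demo` and `selected`.

```wolfram
(* Hypothetical stand-in data: term -> list of {date, frequency} pairs *)
demo = <|
   "drawing" -> {{{1900, 1, 1}, 0.10}, {{1950, 1, 1}, 0.30}, {{2000, 1, 1}, 0.20}},
   "carving" -> {{{1900, 1, 1}, 0.05}, {{1950, 1, 1}, 0.04}, {{2000, 1, 1}, 0.02}},
   "origami" -> {{{1900, 1, 1}, 0.00}, {{1950, 1, 1}, 0.01}, {{2000, 1, 1}, 0.03}}
   |>;

(* Keep only the terms that would be toggled on *)
selected = KeyTake[demo, {"drawing", "carving"}];

(* Plot the selected series together, with one legend entry per term *)
DateListPlot[Values[selected], PlotLegends -> Keys[selected]]
```

KeyTake keeps the entries in the order of the requested keys, so the legend order follows the order in which the terms are listed.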


Conclusion

In conclusion, WordFrequencyData made it simple and interesting to track word usage throughout history. One finding was that, within the arts, terms like drawing were far more prevalent than others through most of history and especially in recent times, even when considering the rise of digital art and assisted drawing such as drafting. An extension of this project could look for specific texts from the periods when a word was most frequently used.
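As a possible first step toward that extension, the time of peak usage could be read off a series. This is a minimal sketch on made-up data that uses plain years as times; `peakdemo`, `peaktime`, and `peakvalue` are hypothetical names. For a series returned by WordFrequencyData, the times are expected to be absolute times, which DateObject can convert to dates.

```wolfram
(* Hypothetical series using plain years as times, for illustration only *)
peakdemo = TimeSeries[{{1900, 0.10}, {1950, 0.50}, {2000, 0.20}}];

(* "Path" gives the {time, value} pairs; pick the pair with the largest value *)
{peaktime, peakvalue} = First[MaximalBy[peakdemo["Path"], Last]]
```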

The Final Graphs

One Manipulate graph was created for each word of the following list: art, mathematics, and philosophy.

Art Subject WordFrequencyData Visualization

In[]:=
Createsubjectmanipulate["art"]
Out[]=
​
Subject: CreativeActivity
Field: artsandcrafts, carving, ceramics, decalcomania, decoupage, drafting, draftsmanship, drawing, gastronomy, glyptography, origami, painting, perfumery, printmaking, sculpture, topiary

Mathematics Subject WordFrequencyData Visualization

In[]:=
Createsubjectmanipulate["mathematics"]
Out[]=
​
Subject: Noun
Field: appliedmathematics, puremathematics

Philosophy Subject WordFrequencyData Visualization

In[]:=
Createsubjectmanipulate["philosophy"]
Out[]=
​
Subject: PhilosophicalSystem
Field: abolitionism, absolutism, amoralism, animalism, animism, antiestablishmentarianism, antiestablishmentism, asceticism, Cabalism, churchdoctrine, commandment, contextualism, creationism, credo, creed, democracy, descriptivism, divineright, dogma, dualism, dynamism, egalitarianism, epicureanism, equalitarianism, establishmentarianism, establishmentism, ethicism, expansionism, feminism, formalism, freethinking, functionalism, Girondism, gospel, gymnosophy, humanism, humanitarianism, imitation, individualism, internationalism, irredentism, irridentism, Kabbalism, laissezfaire, literalism, majorityrule, millennium, monism, multiculturalism, nationalism, nihilism, nucleardeterrence, pacificism, pacifism, passivism, phenomenology, pluralism, populism, precept, prescriptivism, presentism, rationalism, reformism, reincarnationism, religiousdoctrine, secessionism, secularhumanism, secularism, states'rights, teaching, unilateralism, utilitarianism
  • Keywords: Language, History, Frequency