Alphabetic and numerical order of numbers across languages

last revised 2023-12-31T09:03:07

Context

Daniel Litt tweeted a fun observation: a plot of the numbers 10–100 in numerical order against the alphabetical order of their names, in a few different languages. Inspired by this, I took a stab at plotting 1–100 for a slightly wider variety of languages. Unsurprisingly, a few subtleties (and bugs in Mathematica) prompted revisions. This notebook captures how I generated the data and plots, for anyone generous enough to take a closer look. There is also a Google Sheet here that lists the plotted data.

Feedback welcome at @aresnick or alec@powderhouse.org

Finding supported languages

In[]:=
(* List all languages Wolfram Language knows about, convert their names to strings (seems most reliable for `IntegerName` usage), and clean up *)
languages = (EntityValue[#, "Name"] & /@ (Flatten[EntityClassList["Language"]] // DeleteMissing)) // Flatten // DeleteDuplicates // DeleteMissing // AlphabeticSort;
In[]:=
(* Select only those languages for which `IntegerName` supports 1–100 *)
filterNoError[list_, f_] := Parallelize[Select[list, Quiet[Check[f[#], False]] =!= False &]]
supportedLanguages = filterNoError[languages, lang |-> Scan[IntegerName[#, Language -> lang] &, Range[100]]]
Out[]=
{Afrikaans, Albanian, Amharic, Arabic, Armenian, Azerbaijani, Bosnian, Bulgarian, Catalan, Chinese, Croatian, Czech, Danish, Dutch, English, Esperanto, Estonian, Ewe, Faroese, Filipino, Finnish, French, Georgian, German, Hebrew, Hindi, Hungarian, Icelandic, Indonesian, Irish, Italian, Japanese, Khmer, Kirghiz, Korean, Lao, Latvian, Lithuanian, Macedonian, Malay, Maltese, Persian, Polish, Portuguese, Romanian, Russian, Serbian, Slovak, Slovenian, Spanish, Swedish, Tamil, Thai, Turkish, Ukrainian, Vietnamese, Welsh}
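
The filter above relies on `Check` returning a fallback value when evaluation emits a message, with `Quiet` suppressing the message itself. A minimal sketch of that pattern on a toy function follows; `reciprocal` is a hypothetical stand-in for the `IntegerName` call and is not part of the notebook.

```wolfram
(* Hypothetical stand-in for a call that may emit a message: 1/0 triggers Power::infy *)
reciprocal[x_] := 1/x

(* Check returns its second argument if any message was generated; Quiet hides the message *)
Quiet[Check[reciprocal[0], False]]   (* False *)
Quiet[Check[reciprocal[4], False]]   (* 1/4 *)

(* The same predicate shape as filterNoError, applied to a small list *)
Select[{0, 1, 2}, Quiet[Check[reciprocal[#], False]] =!= False &]   (* {1, 2} *)
```

Because the test is `=!= False`, a function that returns `Null` on success (as `Scan` does) still passes the filter.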

Plotting numerical v. alphabetical order

In[]:=
calculateNumberWords[language_, range_ : 100] := Block[
  {nums, words, wordOrder, data},
  nums = Range[range];
  words = Parallelize[IntegerName[#, Language -> language] & /@ nums];
  (* Return the positions of each word in the appropriately alphabetized list *)
  wordOrder = OrderingBy[
    words,
    # &, All,
    If[
      MemberQ[{
        "Malay" (* IntegerName supports Malay, but AlphabeticOrder does not; however, Malay uses a Latin alphabet and ordering, so we default to that. *)
      }, language],
      AlphabeticOrder,
      AlphabeticOrder[language] (* For most languages, use Wolfram's alphabetic ordering for that language *)
    ]
  ];
  (* Construct the set of points to plot: x is the number, y is the alphabetized position *)
  data = <|"points" -> Partition[Riffle[nums, wordOrder], 2], "full" -> Partition[(Riffle[Partition[Riffle[nums, words], 2], wordOrder] // Flatten), 3]|>
]
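
One subtlety worth keeping in mind when reading the plots: functions in the `Ordering` family return the permutation that sorts a list, which is the inverse of each element's rank in the sorted list. The sketch below shows the two side by side. It uses a hand-typed list of English words and the default string ordering, both assumptions for illustration rather than the notebook's `IntegerName` and `AlphabeticOrder` calls.

```wolfram
(* Hand-typed example words (an assumption for illustration; the notebook uses IntegerName) *)
words = {"one", "two", "three", "four", "five"};

(* Sort permutation: entry k is the index of the word that comes k-th alphabetically *)
sortPermutation = Ordering[words]          (* {5, 4, 1, 3, 2} *)

(* Rank: entry n is the alphabetical position of the n-th word, i.e. the inverse permutation *)
ranks = Ordering[sortPermutation]          (* {3, 5, 4, 2, 1} *)

(* Points pairing each number with its word's alphabetical rank *)
Transpose[{Range[Length[words]], ranks}]   (* {{1, 3}, {2, 5}, {3, 4}, {4, 2}, {5, 1}} *)
```

Plotting one permutation instead of the other reflects the picture about the diagonal, so it is worth checking which of the two a given plot shows.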
In[]:=
(* Given a language and range, return a plot *)
plotNumberWords[language_, range_ : 100] := Block[{data},
  data = calculateNumberWords[language, range]["points"];
  ListPlot[data, Axes -> None, Frame -> None, PlotLabel -> language, PlotStyle -> {Black, PointSize[If[range < 500, Small, Tiny]]}]
]
In[]:=
languagePlots100 = AssociationMap[plotNumberWords[#, 100] &, supportedLanguages];
languagePlots1000 = AssociationMap[plotNumberWords[#, 1000] &, supportedLanguages];
In[]:=
plotLayout[plotAssociation_] := GraphicsGrid[Partition[plotAssociation[#] & /@ Sort[Keys@plotAssociation], Ceiling[Sqrt[Length@plotAssociation]]], Frame -> All, ImageSize -> Full]
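
A note on the grid arithmetic in `plotLayout`: the two-argument form of `Partition` silently drops any trailing elements that do not fill a complete row. With 57 entries, for instance, the column count is 8 and the plain form yields 7 full rows (56 items). The sketch below illustrates this on a small list of integers (a stand-in for the plots) and shows the `UpTo` variant, which keeps the remainder as a shorter final row. Whether `GraphicsGrid` lays out such a ragged row as desired is not verified here.

```wolfram
(* Ten placeholder items standing in for plots (an assumption for illustration) *)
items = Range[10];
cols = Ceiling[Sqrt[Length[items]]]   (* 4 *)

(* Plain form: only complete rows are kept, so 9 and 10 are dropped *)
Partition[items, cols]                (* {{1, 2, 3, 4}, {5, 6, 7, 8}} *)

(* UpTo form: the remainder is kept as a shorter last row *)
Partition[items, UpTo[cols]]          (* {{1, 2, 3, 4}, {5, 6, 7, 8}, {9, 10}} *)
```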
In[]:=
plotLayout[languagePlots100]
Out[]=
In[]:=
plotLayout[languagePlots1000]
Out[]=
In[]:=
exportXLSX[languages_, range_, path_ : "/Users/aresnick/Desktop/"] := Block[
  {data},
  (* For each language, a header row followed by {numeral, word, alphabetized order} rows *)
  data = AssociationMap[
    Prepend[calculateNumberWords[#, range]["full"], {"Numeral", "Word", "Alphabetized Order"}] &,
    languages
  ];
  Export[FileNameJoin[{path, ToString[range] <> "_" <> DateString["ISODateTime"] <> ".xlsx"}], Normal[data], "XLSX"]
]
In[]:=
(* Export an XLSX file to make a Google Sheet easy for others to inspect. For some reason Mathematica fails when trying to export a range beyond ~50 *)
exportXLSX[supportedLanguages, 100]
Out[]=
/Users/aresnick/Desktop/100_2023-12-31T09:01:04.xlsx
In[]:=
DateString["ISODateTime"]
Out[]=
2023-12-31T09:03:07