Edit Distance Algorithm: Categorification
Categories for the Working Programmers
Part 2: NNG graph of words with Edit Distance form a Path Category
(fully coded)
Dara O Shayda
dara@compclassnotes.com
In[]:=
DateObject[]
Out[]=
Mon 8 Jun 2026 18:13:59GMT+1
Abstract
In part 1 of Edit Distance we learned how to code the distance metric and the corresponding NNG (Nearest Neighbor Graph) and conceptualized this graph as a space for the words to reside within. Now in part 2, learn how to code to extend this NNG into a Path Category NNG and construct codexplanations for the properties of a category. Finally we learn to construct an EndoFunctor called StringTrim, from the new word-Path category to itself, such that its Identity Natural Transformation is actually a single familiar computational function which trims off repeated tokens from the beginning and the end of a word. We hope the programmers can code a Eureka moment to understand that all that which we program are actually natural transformations between some categories.
​
​codexplain: code explain as in explain something by its construction code.
Live Notebooks:
​Part 2: https://www.wolframcloud.com/obj/ccn2/Published/edit_distance_note2.nb
​Part 1: https://www.wolframcloud.com/obj/ccn2/Published/edit_distance_note1.nb
​
​CCN NODE:
​Part 2: https://discuss.compclassnotes.com/t/edit-distance-algorithm-categorification/274
​Part 1: https://discuss.compclassnotes.com/t/edit-distance-algorithm/268
© 2012-Present CCN Studios​​Creative Commons Attribution-NonCommercial-ShareAlike 4.0​​​
 To find hidden scripts, click on either vertical bars to toggle to view or hide the code. See below.
In[]:=
Clear[editDistance];​​myEditDistance[s0_,t0_]:=Module[{m,n,s,t,d,d2,substitutionCost},​​(*foralliandj,d[i,j]willholdthedistancebetween​​thefirsticharactersofsandthefirstjcharactersoft​​notethatdhas(m+1)*(n+1)values​​*)​​m=StringLength[s0];​​n=StringLength[t0];​​d=ConstantArray[0,{m+1,n+1}];​​​​(*1-indexedarrays*)​​s=StringSplit[s0,""];​​t=StringSplit[t0,""];​​​​(*sourceprefixescanbetransformedintoemptystringby​​droppingallcharacters*)​​(*1-indexedmatrix*)​​Do[d[[i,1]]=i-1,{i,2,m+1}];​​​​(*targetprefixescanbereachedfromemptysourceprefix​​byinsertingeverycharacter​​*)​​Do[d[[1,j]]=j-1,{j,2,n+1}];​​​​substitutionCost=0;​​​​Do[​​If[​​s[[i-1]]==t[[j-1]],substitutionCost=0,substitutionCost=1​​];​​​​d[[i,j]]=Min@{​​(d[[i-1,j]]+1),(*deletion*)​​(d[[i,j-1]]+1),(*insertion*)​​(d[[i-1,j-1]]+substitutionCost)(*substitution*)​​},​​​​{j,2,n+1},{i,2,m+1}];​​​​d2=Prepend[Transpose@d,Join[{""},StringSplit[s0,""]]];​​d2=Prepend[Transpose@d2,Join[{"",""},StringSplit[t0,""]]];​​d2[[Dimensions[d2][[1]],Dimensions[d2][[2]]]]=(Highlighted@d2[[Dimensions[d2][[1]],Dimensions[d2][[2]]]]);​​Clear[];​​=d2//MatrixForm;​​​​d[[m+1,n+1]]​​​​];
NearestNeighborGraph[ ]
Words, and their similars reside in graphs called the Nearest Neighbor Graphs (NNG) [3]. Let us construct one for one single base word namely
"mesh"
:
(*CommentouttheSeedRandomtovariatethewordselections*)​​SeedRandom[ToString@myEditDistance];​​dict=DeleteDuplicates@Join[RandomSample[Nearest[DictionaryLookup[]]["mesh",1000],10],{"mesh","me"}];​​nng=NearestNeighborGraph[dict,{3,3},VertexLabels->"Name",DistanceFunction->myEditDistance,ImageSize->500,DirectedEdgesTrue,GraphLayout->{"SpringEmbedding","Rotation"0Degree}]
Out[]=
In[]:=
edges=EdgeList@nng
Out[]=
{dimesekes,dimesmesh,dimesme,beyseff,beysheed,beysbean,effbeys,effheed,effgas,heedbeys,heedmend,heedheap,gasbeys,gaseff,gasheap,mendheed,mendmesh,mendme,heapbeys,heapheed,heapbean,mushedmesh,beanbeys,beaneff,beanheap,ekesdimes,ekesbeys,ekeseff,meshdimes,meshmend,meshme,medimes,memend,memesh}
In[]:=
nnglables=Table[edges[[i]]Subscript["→",i],{i,1,Length@edges}]
Out[]=
{dimesekes
→
1
,dimesmesh
→
2
,dimesme
→
3
,beyseff
→
4
,beysheed
→
5
,beysbean
→
6
,effbeys
→
7
,effheed
→
8
,effgas
→
9
,heedbeys
→
10
,heedmend
→
11
,heedheap
→
12
,gasbeys
→
13
,gaseff
→
14
,gasheap
→
15
,mendheed
→
16
,mendmesh
→
17
,mendme
→
18
,heapbeys
→
19
,heapheed
→
20
,heapbean
→
21
,mushedmesh
→
22
,beanbeys
→
23
,beaneff
→
24
,beanheap
→
25
,ekesdimes
→
26
,ekesbeys
→
27
,ekeseff
→
28
,meshdimes
→
29
,meshmend
→
30
,meshme
→
31
,medimes
→
32
,memend
→
33
,memesh
→
34
}
In[]:=
nng2=Graph[DeleteDuplicates@Flatten@Map[{#[[1]],#[[2]]}&,edges],edges,VertexLabels->"Name",EdgeLabelsnnglables,GraphLayout->{"SpringEmbedding","Rotation"0Degree},ImageSize->550]
Out[]=
Q: Why
"mushed"
has no incoming edges/arrows?
Write a tiny bit of code to codexplanation the answer this question:
In[]:=
VertexInComponent[nng,{"mushed"},{1}]
Out[]=
{}
codexplain: code explain as in explain something with its construction code.
In[]:=
VertexOutComponent[nng,{"mushed"},{1}]
Out[]=
{mesh}
● Recall the definition of an edge between two vertices in a NNG graph:
{1} here indicates the distance in the number of edges in nng.
​
and the particular implementation of the NNG used these 3 and discarded the rest.
Visualizations: Subgraphs
Visualizations: Paths
Visualizations: All Path between two vertices
Categorification
Digression: T, ξ
Example
The usual bracketed expressions:
Why? Move away from Function Paradigm
This is how a Piecewise [ ] function is used; try to use it for hundreds to thousands of terms e.g. solutions to FEM equations:
Vector Dot Product Paradigm
Compute two vectors 1 and v2:
Compute their dot product and that computes an algebraic product like version of the piecewise function!
Monad メカ (Mechanation)
Mecha Ref[13].
Let us codexplain a few examples.
ξ : Expressions
ξ : Strings
● You already know these functions in Wolfram programming language namely for Expressions or Strings:
​
T is Level[ ] or StringSplit[ ]
ξ is Total[ ] or StringJoin[ ]
Write a bit of code to study ϕ :
● For strings we introduce a bit of tease which surprisingly is not provided in the Wolfram language! As it should have been:
Treat String like an Array
NNG Categorification: Path Category
Example
Sample Morphisms (paths)
Composition
Bird’s Eye View
Corollary: myEditDistance [ ] metric and corresponding NNG graph form a Path Category.
​
​● This indicates that these particular Path Categories are similar structures appearing independent of how the strings and their metrics were defined.
​
​● Post construction, the sets actually do not play any roles.
​
​We call this categorification of words.
Functors
● A Functor, between two categories, is a tuple of two maps which combined serve similar to a homomorphism, preserving the similarities between two categories
​
​● An EndoFunctor, from a category to itself, preserves the self-similarity or the category’s inner-symmetries
Definition
● The object function F which assigns to each object X of C an object F(X) of D
η Natural Transformations: StringTrim[ ]
Identity Natural Transformation
You might note the path or morphism is shrunken
​
References
[1] https://en.wikipedia.org/wiki/Wagner%E2%80%93Fischer_algorithm
​
[2] https://en.wikipedia.org/wiki/K-nearest_neighbors_algorithm
​
[3] https://en.wikipedia.org/wiki/Nearest_neighbor_graph
​
[4] https://en.wikipedia.org/wiki/Categorification
​
[5] Francis Borceux, Handbook of categorical Algebra vol. 1 , Cambridge UP 1994.
​
[6] https://en.wikipedia.org/wiki/Category_theory
​
[7] https://en.wikipedia.org/wiki/Monomial
​
[8] Saunders Mac Lane, Categories for the Working Mathematician
Second Edition
​
[9] Francis Borceux, Handbook of categorical Algebra vol. 2 , Cambridge UP 1994.
​
[10] https://en.wikipedia.org/wiki/Natural_transformation
​
[11] https://math.stackexchange.com/questions/1075000/functors-that-has-a-natural-transformation-from-identity#1075027
​
[12] Monads: Categories for the Working Programmers (fully coded) Part 1-α14.0 December 2025
DOI: 10.13140/RG.2.2.15034.25286
https://www.researchgate.net/publication/398327914_Monads_Categories_for_the_Working_Programmers_fully_coded_Part_1-a140
​
[13] https://en.wikipedia.org/wiki/Mecha
​