Random Character Sequences Do Not Follow Zipf's Law
This Demonstration shows that word frequencies [1] in random character sequences and real texts behave differently from the point of view of Zipf's law. (For random character sequences, a word means the smallest unit separated by blanks.) Data exhibiting Zipf-like behavior shows a roughly linear relationship between frequency and rank on a log-log plot.
We consider only one random sequence model: all characters, including the blank (space), are equally likely. This model is specified by a single parameter, N, the number of characters other than the space. The values N ∈ {2, 4, 6, 26} were used in [2] (as mentioned in [1]). In this Demonstration, you can select N between 2 and 26.
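The random sequence model can be illustrated with a minimal Python sketch. It is not the Demonstration's own code; the function names `random_text` and `rank_frequency`, the sequence length, and the seed are assumptions chosen for illustration. It draws characters uniformly from N letters plus the space, splits the result on blanks, and lists word frequencies by rank.

```python
import random
from collections import Counter

def random_text(n_letters, length, seed=0):
    """Draw `length` characters uniformly from n_letters letters plus the space."""
    rng = random.Random(seed)  # fixed seed so runs are repeatable (an assumption)
    alphabet = "abcdefghijklmnopqrstuvwxyz"[:n_letters] + " "
    return "".join(rng.choice(alphabet) for _ in range(length))

def rank_frequency(text):
    """Return word counts in decreasing order, so index 0 is rank 1."""
    # str.split() with no argument splits on runs of blanks and drops empty words.
    counts = Counter(text.split())
    return sorted(counts.values(), reverse=True)

if __name__ == "__main__":
    text = random_text(n_letters=4, length=100_000)
    freqs = rank_frequency(text)
    # Print the first few (rank, frequency) pairs.
    for rank, freq in enumerate(freqs[:10], start=1):
        print(rank, freq)
```

Plotting the logarithm of each frequency against the logarithm of its rank gives the log-log view described above, which can then be compared with the same plot for a real text.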