Square-free word

1

In combinatorics, a square-free word is a word (a sequence of symbols) that does not contain any squares. A square is a word of the form XX, where X is not empty. Thus, a square-free word can also be defined as a word that avoids the pattern XX.

Finite square-free words

Binary alphabet

Over a binary alphabet {0,1}, the only square-free words are the empty word, and 101.

Ternary alphabet

Over a ternary alphabet {0,1,2}, there are infinitely many square-free words. It is possible to count the number c(n) of ternary square-free words of length n. This number is bounded by, where. The upper bound on \alpha can be found via Fekete's Lemma and approximation by automata. The lower bound can be found by finding a substitution that preserves square-freeness.

Alphabet with more than three letters

Since there are infinitely many square-free words over three-letter alphabets, this implies there are also infinitely many square-free words over an alphabet with more than three letters. The following table shows the exact growth rate of the k-ary square-free words, rounded off to 7 digits after the decimal point, for k in the range from 4 to 15:

2-dimensional words

Consider a map \textbf{w} from to A, where A is an alphabet and \textbf{w} is called a 2-dimensional word. Let w_{m,n} be the entry. A word \textbf{x} is a line of \textbf{w} if there exists such that, and for. Carpi proves that there exists a 2-dimensional word \textbf{w} over a 16-letter alphabet such that every line of \textbf{w} is square-free. A computer search shows that there are no 2-dimensional words \textbf{w}over a 7-letter alphabet, such that every line of \textbf{w} is square-free.

Generating finite square-free words

Shur proposes an algorithm called R2F (random-t(w)o-free) that can generate a square-free word of length n over any alphabet with three or more letters. This algorithm is based on a modification of entropy compression: it randomly selects letters from a k-letter alphabet to generate a (k+1)-ary square-free word. algorithm R2F is input: alphabet size k \ge 2, word length n > 1 output: a (k+1)-ary square-free word wof length n. choose w[1] in uniformly at random set \chi_w to w[1] followed by all other letters of in increasing order set the number N of iterations to 0 while |w| < n do choose j in \Sigma_{k} uniformly at random append to the end of w update \chi_w shifting the first j elements to the right and setting increment N by 1 if w ends with a square of rank r then delete the last r letters of w return w Every (k+1)-ary square-free word can be the output of Algorithm R2F, because on each iteration it can append any letter except for the last letter of w. The expected number of random k-ary letters used by Algorithm R2F to construct a (k+1)-ary square-free word of length n isNote that there exists an algorithm that can verify the square-freeness of a word of length n in O(n \log n) time. Apostolico and Preparata give an algorithm using suffix trees. Crochemore uses partitioning in his algorithm. Main and Lorentz provide an algorithm based on the divide-and-conquer method. A naive implementation may require O(n^2) time to verify the square-freeness of a word of length n.

Infinite square-free words

There exist infinitely long square-free words in any alphabet with three or more letters, as proved by Axel Thue.

Examples

First difference of the Thue–Morse sequence

One example of an infinite square-free word over an alphabet of size 3 is the word over the alphabet {-1,0,+1} obtained by taking the first difference of the Thue–Morse sequence. That is, from the Thue–Morse sequence one forms a new sequence in which each term is the difference of two consecutive terms of the Thue–Morse sequence. The resulting square-free word is

Leech's morphism

Another example found by John Leech is defined recursively over the alphabet {0,1,2}. Let w_1 be any square-free word starting with the letter 0. Define the words recursively as follows: the word w_{i+1} is obtained from w_i by replacing each 0 in w_i with 121,021,201,210, each 1 with 1,202,102,012,021, and each 2 with 2,010,210,120,102. It is possible to prove that the sequence converges to the infinite square-free word 0121021201210120210201202120102101201021202102012021...

Generating infinite square-free words

Infinite square-free words can be generated by square-free morphism. A morphism is called square-free if the image of every square-free word is square-free. A morphism is called k–square-free if the image of every square-free word of length k is square-free. Crochemore proves that a uniform morphism h is square-free if and only if it is 3-square-free. In other words, h is square-free if and only if h(w) is square-free for all square-free w of length 3. It is possible to find a square-free morphism by brute-force search. algorithm square-free_morphism is output: a square-free morphism with the lowest possible rank k. set k = 3 while True do set k_sf_words to the list of all square-free words of length k over a ternary alphabet for each h(0) in k_sf_words do for each h(1) in k_sf_words do for each h(2) in k_sf_words do if h(1) = h(2) then break from the current loop (advance to next h(1)) if and then if h(w) is square-free for all square-free w of length 3 then return increment k by 1 Over a ternary alphabet, there are exactly 144 uniform square-free morphisms of rank 11 and no uniform square-free morphisms with a lower rank than 11. To obtain an infinite square-free words, start with any square-free word such as 0, and successively apply a square-free morphism h to it. The resulting words preserve the property of square-freeness. For example, let h be a square-free morphism, then as, h^{w}(0) is an infinite square-free word. Note that, if a morphism over a ternary alphabet is not uniform, then this morphism is square-free if and only if it is 5-square-free.

Letter combinations in square-free words

Avoid two-letter combinations

Over a ternary alphabet, a square-free word of length more than 13 contains all the square-free two-letter combinations. This can be proved by constructing a square-free word without the two-letter combination ab. As a result, bcbacbcacbaca is the longest square-free word without the combination ab and its length is equal to 13. Note that over a more than three-letter alphabet there are square-free words of any length without an arbitrary two-letter combination.

Avoid three-letter combinations

Over a ternary alphabet, a square-free word of length more than 36 contains all the square-free three-letter combinations. However, there are square-free words of any length without the three-letter combination aba. Note that over a more than three-letter alphabet there are square-free words of any length without an arbitrary three-letter combination.

Density of a letter

The density of a letter a in a finite word w is defined as where |w|_a is the number of occurrences of a in w and |w| is the length of the word. The density of a letter a in an infinite word is where w_l is the prefix of the word w of length l. The minimal density of a letter a in an infinite ternary square-free word is equal to. The maximum density of a letter a in an infinite ternary square-free word is equal to.

This article is derived from Wikipedia and licensed under CC BY-SA 4.0. View the original article.

Wikipedia® is a registered trademark of the Wikimedia Foundation, Inc.
Bliptext is not affiliated with or endorsed by Wikipedia or the Wikimedia Foundation.

Edit article