|
The extended rational expressions over an alphabet A are defined recursively as follows.
Theorem 1 (Schützenberger 1965) A rational language is star-free if and only if its syntactic monoid is aperiodic.
Corollary 2 There is an algorithm to decide if a given rational language has star-height 0.
The complexity of this algorithm was analyzed by Stern (1985) and later by Cho and Huynh (1991). Given a finite deterministic automaton A, deciding whether A recognize a star-free set can be solved in polynomial space. It is also the complement of an NP-hard problem.
Schützenberger's theorem shows that there exist some languages of star-height 1, for instance (aa)* (it is not difficult to verify that the syntactic monoid of this language is not aperiodic). Schützenberger's Theorem also suggested to study the star-height problem through properties of the syntactic monoid. In this direction, Henneman has studied the languages whose syntactic monoids are groups. The case of commutative groups is especially interesting.
Theorem 3 (Henneman 1971) A language recognized by a finite commutative group is of star-height ≤ 1.
It is then easy to deduce the following consequence.
Corollary 4 A language recognized by a finite commutative monoid is of star-height ≤ 1.
For instance, if a is a letter of A, if B = A \ {a} and if 0 ≤ k ≤ n, then the language (B*a)k((B*aB*)n)* is star-free.
In the quest for a language of star-height > 1, the next level of complexity was nilpotent groups of class 2, and such candidates were actually proposed in Brzozowski (1980). However
Theorem 5 (Pin, Straubing, Thérien 1989) A language recognized by a finite nilpotent group of class ≤ 2 is of star-height ≤ 1.
The combinatorial structure of the languages recognized by nilpotent groups is related to the generalized binomial coefficients introduced by Eilenberg (1976), which count the number of times that a word u appears as a subword (in the sense of subsequence) of a word v. A precise description, involving the class of nilpotency of the group, is given in Thérien (1983). Let L(u,k,n) denote the set of words w in which the number of appearances of u as a subword is congruent to k mod n. Then a language is recognized by a nilpotent group of class c if and only if it is a boolean combination of languages L(u,k,n) where |u| ≤ c. Thus, the problem of finding the star-height of languages recognized by nilpotent groups reduces to finding the star-height of the languages L(u,k,n). Henneman's result mentionned above is a consequence of the fact that the star-height of L(u,k,n) is 1 if u is a word of length 1. Here we show that the star-height of L(u,k,n) is at most 1 if u is a word of length ≤ 2; this corresponds to the case of nilpotent groups of class 2. We were not able to treat completely the case of words of length 3. However, we prove that the star-height of L(u,k,n) is at most 1 if u is a word of length ≤ 3 and n is a square-free integer. This covers in particular the case of L(abc,0,2), which was proposed as a candidate for having star-height 2 in Brzozowski (1980). Other more recent candidates have been eliminated in Robson.
Non-trivial examples of languages of star-height 1 were given by Thomas. Let A = {a,b} and, for each n ≥ 0, put xn = anb. Then the set X = { xn | n ≥ 0 } is a prefix code such that X* = A*b U {1}. In particular, every word of X* admits a unique factorization as a product of words of X. Now, let W(h,k,r,m) be the set of words w of X* such that, in the factorization of w, the number of factors xn with n congruent to r mod m is congruent to h mod k. Then we have
Theorem 6 (Thomas 1981) For every h, k, r, m, the languages W(h,k,r,m) are of star-height at most 1.
It is also shown in (Pin, Straubing, Thérien 1989) that the groups that divide a semidirect product of a commutative group by (\Z/2\Z)n, and the monoids that divide a wreath product of the form M o (G o N), where M and N are aperiodic monoids, and G is a commutative group. In particular,
Theorem 7 Every language recognized by a group of order less than 12 is of star-height at most 1.
It is also known (Pin, Straubing, Thérien 1989) that the languages of star height ≤ n, for a given n are closed under boolean operations, concatenation, left and right quotients, star-free injective substitutions and inverses of length-preserving morphisms. On the other hand, every rational language is the inverse image, under some morphism between free monoids, of a language of (restricted) star-height 1. In particular, if the languages of star-height ≤ 1 are closed under inverse morphisms, every rational language is of star-height ≤ 1.