Wednesday, September 3, 2014

A surprising property of power strings

[This came about trying to come up with a linear time algorithm to determine if a string is a power string]

Definition of power string.

A string $x$ is called a power string, if and only if there is some string $z$, such that $$x = z^m\quad \text{and}\quad m \gt 1$$

In other words, $x$ is $z$ concatenated to itself $m$ times.

For example, $nananana$ is the power string $(na)^4$, while  $nanananananabatman$ is not a power string.

[If you want to try finding a linear time algorithm for the detection of power strings, please stop reading]

The surprising property of power string is given by the following:

Proposition: Power strings are the only strings which are a cyclic shift of themselves.

We ignore the trivial shifts, by the length of the string which rotates every string into itself.

That power strings are cyclic shifts of themselves is easy to see. We show two proofs of the converse here.

Proof 1)

[Even though this proof seems verbose, it is not really difficult]

Suppose \(x\) is a string of length \(n\), over the alphabet \(A\), which is a cyclic shift of itself. You can basically treat \(x\) as a function \(f:\{0,1,2,\dots, n-1\}\to A\).

If \(x\) is a cyclic shift of itself, then for some \(1 \le k \le n-1\), \(f\) satisfies $$f(x+k \mod n) = f(x), \forall x \in \{0,1,2,\dots, n-1\}$$
In order to show that \(x\) is a power string, we need to show that there is a \(p\) such that \(p\) divides \(n\) and \(f(x+p \mod n) = f(x), \forall x \in \{0,1,2,\dots, n-1\}\)
 
Now if \(n = qk - k_1\) where \(q \gt 1\) and \(0 \le k_1 \lt k \), the we must have that

$$ f(x) = f(x + k \mod n) = f(x+ qk \mod n)$$
$$ = f(x + n + k_1 \mod n) = f(x+ k_1 \mod n)$$
Thus we get a sequence: \(k = k_0 \gt k_1 \gt k_2 \gt \dots \) such that $$f(x+k_i) = f(x), \forall x \in \{0,1,2,\dots, n-1\}$$
 Since \(k_i\) is a strictly decreasing sequence, we must end up at \(0\) at some point, say \(k_s = 0\). Then we can choose \(p = k_{s-1}\).

Proof 2)

We use Lyndon's theorem which says that two strings \(x,y\) commute (\(xy = yx\)) iff they are power strings of the same string, i.e. \(x = z^r\) and \(y = z^s\).

If \(x\) is a cyclic shift of itself, then we have that \(x = uv\) (concatenation of strings \(u\) and \(v\)) such that \( uv = vu\).

Linear Time Algorithm

This property allows you to determine if a string is a power string in \(O(n)\) time, using any linear time string matching algorithm like Boyer-Moore. All you need to do is check if \(x\) is a sub-string of \(xx\) with the first and last letters removed from \(xx\).

No comments:

Post a Comment