Category Archives: NSF-DMS-0901020

Deterministic automaton complexity of strings

The deterministic automaton complexity of a finite binary string \(x=x_1\ldots x_k\) is defined to be the least number \(A(x)\) of states of a finite state automaton \(M\) such that \(x\) is the only string of length \(k\) in the language accepted by \(M\). It was introduced by Shallit and Wang (2011).

In a Discrete Mathematics class in Spring 2009, students Jason Axelson, Chris Ho and Aaron Kondo wrote a C program that calculated the complexity of all strings of length at most 7.

\[A()=1,\quad A(0)=2,\]
\[A(x_1\ldots x_k)\le k \quad\text{ for } k\ge 2,\]
\[A(x_1\ldots x_k)\le k-1 \quad\text{ for } k\ge 5.\]
\[A(x)\le 5 \quad\text{ for } k=7.\]

The string \(011100\) is the only string (starting with a 0) of length 6 or less with the property that its complexity is increased when read backwards:
\[A(011100)=4\ne 5=A(001110).\]

String 011100
Automaton witnessing the complexity of the string 011100. (For simplicity, the trash state where any missing arrow goes is not shown.)

A table of complexities is available.

Research statement

The primary focus of my research is algorithmic randomness, especially its relation to randomness extraction and to probability and statistics. More generally I have worked in various areas of computability theory (recursion theory).

The basic idea of algorithmic randomness is to declare an object to be algorithmically random if it belongs to no “computable” set of measure zero. Since only computable sets can truly be understood by humans, an algorithmically random object will be as random as one can desire. This way one can formulate questions such as: can one compute a random object from an object that is almost random? Can one compute anything useful from a random object? etc. So algorithmic randomness is a natural setting for randomness extraction (although not the only natural setting).

Martin-Löf (ML-) randomness is a notion of algorithmic randomness, arguably the most natural one. It basically corresponds to the class of all Turing machines, whereas other randomness notions correspond to those Turing machines that always halt, say. It turns out that the collection of all Martin-Löf random sets is simpler in terms of definability than rival notions. It is a \(\Sigma^0_2\) class, whereas other notions tend to give \(\Pi^0_3\) classes or worse. It is an \(F_\sigma\) set rather than \(F_{\sigma\delta}\). Thus each ML-random set belongs to a closed set containing nothing but ML-random sets, and this is a basic tool in the area.

Open problem: Let \(\text{MLR}\) denote the collection of all Martin-Löf random subsets of \(\mathbb N\) and let \(\le_T\) denote Turing reducibility.

Suppose \(A\subseteq B\subseteq C\), \(A\in\text{MLR}\), and \(C\in\text{MLR}\). Then is it true that \((\exists X\in\text{MLR})(X\le_T B)\)?

Note: Hertling (2005) showed that it is not necessarily true that \(B\in\text{MLR}\).

This is one of many questions of the form: if a set is close to being random, then can we extract randomness from it? I have worked primarily on two notions of being close to random: being close in asymptotic Hamming distance, and being an infinite subset.

The current state of knowledge about infinite subsets is that each Schnorr 2-random set \(X\) has an infinite subset \(Y\) such that no Martin-Löf random set is Turing reducible to \(Y\). This is the main result of the paper A strong law of computationally weak subsets. The main idea of the proof is to combine the Extinction Criterion for Galton-Watson processes with a construction from the paper Comparing DNR and WWKL.

In the case of asymptotic Hamming distance, the paper Recovering randomness from an asymptotic Hamming distance has some results. One can define asymptotic Hamming distance in two ways: either require that the Hamming distance of the prefixes of \(X\) and \(Y\) of length \(n\) is small for all \(n\), or merely for infinitely many \(n\). It turned out to be most natural, in a way, to require small distance on an infinite computable set of values of \(n\). This is related to Michel Weber’s strengthening of the Law of the Iterated Logarithm. This law has a somewhat strange form, involving the logarithm of the logarithm function. However, when one looks at what happens at an infinite sparse set of times, this strangeness disappears. In terms of this notion of asymptotic Hamming distance I obtained the following type of result:

For each Turing reduction \(\Phi\) and each sufficiently random set \(X\), there is a set \(Y\) which is close to \(X\), such that \(\Phi^Y\) is not random.

This has the corollary that from a set \(Y\) that is complex in the sense of Kjos-Hanssen, Merkle, and Stephan (TAMS, 2011), and simultaneously has positive effective packing dimension, one cannot uniformly compute any set \(Z\) that is even Mises-Wald-Church stochastic. A goal is to strengthen this so that \(Y\) has positive effective Hausdorff dimension. Another possible way it can be strengthened is to deal with all Turing reductions at once. Conceivably there is a route here to prove that for each sufficiently random set \(X\), there is a set \(Y\) that is so close to \(X\) as to force \(Y\) to have effective Hausdorff dimension 1; and such that no 1-random set is Turing reducible to \(Y\). This would strengthen a result of Greenberg and Miller which asserts that such sets \(Y\) exist, if we remove the assumption of being close to a random set.

Excerpts of a talk from Luminy, 2009

Randomness with respect to a measure

Martin-Löf randomness for arbitrary measures on $2^\omega$:

A test for $\mu$-randomness is a uniformly $\Sigma^0_1(\mu)$ sequence $\{U_n\}_{n\in\omega}$ with $\mu(U_n)\le 2^{-n}$.

The test $\{U_n\}_{n\in\mathbb{N}}$ defines a null set $\cap_n U_n$. $X$ passes the test for randomness $\{U_n\}_{n\in\mathbb{N}}$ if $X\not\in \cap_n U_n$. If $X$ passes all $\mu$-randomness tests then $X$ is $\mu$-random.

Bernoulli measures

For each $n\in\omega$,
$$\mu_p(\{X: X(n)=1\})=p$$
and $X(0),X(1),X(2),\ldots$ are mutually independent random variables.

The Strong Law of Large Numbers for $\mu_p$ states that for almost all $X$ according to the measure $\mu_p$, we have
$$\forall\varepsilon>0\,\exists N\,\forall n>N\, \left| \frac{\text{the $\#$ of 1s up to $n$ in $X$}}{n}-p\right| <\varepsilon.$$
Suppose $X$ does not satisfy the SLLN, as witnessed by a number $\varepsilon_0$. Let
$$U_N=\{Z: \exists n>N \, \left| \frac{\text{the $\#$ of 1s up to $n$ in $X$}}{n}-p\right| \ge \varepsilon_0\}.$$
Then $U_N$ is open and in fact $\Sigma^0_1(\mu_p)$; and $\mu(U_N)$ goes effectively to $0$. Thus, $X$ is not Martin-Löf random with respect to $\mu_p$.

A set $Y\subseteq \mathbb{N}$ is feeble if it is infinite, and no Martin-Löf random set is computable from $Y$.

Theorem[Law of Feeble Subsets]
Almost every $X\subseteq\mathbb{N}$, according to $\mu:=\mu_{1/2}$, has a feeble subset $Y\subseteq X$.

(Passing from $X$ to $Y$ we suffer a “loss of randomness beyond algorithmic repair.”)

Let $X$ be Martin-Löf random and let $Y$ be a computably chosen subset of $X$. Say,
$$Y=\langle X(0),0,X(2),0,X(4),0,\ldots\rangle.$$
Then $Y$ is an infinite subset of $X$, but $Y$ does compute a Martin-Löf random set, namely
$$Z=\langle X(2),X(4),X(6),X(8),\ldots\rangle$$
so $Y$ is not feeble.

Generally, if $X$ is Martin-Löf random and $R$ is computable ($R\le_T\emptyset$) then $X\cap R$ is not feeble.

If there a Martin-Löf random set $X$ and a set $R\le_T X$ such that $R\cap X$ is feeble?

Partial negative result: such an $X$ can not be hyperimmune-free.

von Neumann’s randomness extractor

Let $X$ be Martin-Löf random let $Y$ be a “randomly chosen” subset of $X$. That, is each 1 in $X$ is converted to a 0 with probability $\frac{1}{2}$. Then $Y$ does compute a Martin-Löf random set. Namely, let $Z$ be obtained from $X$ by making the following replacements:
$$\langle X(2n),X(2n+1)\rangle \mapsto Z(n)$$
$$\langle 0,0\rangle \mapsto \langle \rangle$$
$$\langle 1,1\rangle \mapsto \langle \rangle$$
$$\langle 0,1\rangle \mapsto \langle 0\rangle$$
$$\langle 1,0\rangle \mapsto \langle 1\rangle$$

How dense can a feeble subset $Y\subseteq X$ be?

The principal function of a set $Y$ is given by $p(n)=$ the smallest element of $Y$ that is $>p(n-1)$.

  • K., 2007: principal function bounded by $4^n$.
  • Miller, 2008: principal function bounded by $n^{2+\varepsilon}$.

Can a Martin-Löf random set $X$ have a feeble subset $Y$ of linear density, i.e. principal function bounded by $c\cdot n$?

One way to approach this question is to start with stronger reducibilities.

Let $r$ be a reducibility. $Y$ is $r$-feeble if $Y$ is infinite and $Y$ does not $r$-compute any ML-random set.

$c$, $d$, and $btt$-feeble subsets of linear density

Take $Y$ to be a $\mu_p$-random subset of $X$. Then $Y$ has principal function bounded by $({p}^{-1}+\varepsilon)n$ ($Y$ has linear density) and $Y$ is $c$-feeble, $d$-feeble, and $btt$-feeble.

However, using von Neumann’s extractor one can show that $Y$ is not $tt$-feeble. By using different algorithms one can show that $Y$ is also not $l$-feeble nor $p$-feeble.

Is there an $r$-feeble subset of $X$ of linear density, for $r\in\{p,l\}$? Or even for $r\in\{tt,T\}$?

The 7 clones in Post’s lattice containing the constant functions correspond to reducibilities $m$, $btt(1)$, $c$, $d$, $p$, $l$, $tt$ and out of these 7 a $\mu_p$-random set is feeble for exactly 4.

Reducibilities for which a $\mu_p$-random set is feeble:

  • $m$: $\{0,1\}$
  • $btt(1)$: $\{0,\neg\}$
  • $c$: $\{0,1,\wedge\}$ (conjunctive)
  • $d$: $\{0,1,\vee\}$ (disjunctive)

and those for which it is not:

  • $p$: $\{0,1,\wedge,\vee\}$ (positive)
  • $\ell$: $\{0,\iff\}$
  • $tt$: $\{\neg,\vee\}$

When is a measure “close enough” to uniform?

Theorem[Kakutani 1948]
Let $\mu$ and $\nu$ be strongly positive (i.e. the $p_i$, $q_i$ are bounded away from $0$ and $1$) generalized Bernoulli measures of parameters $\{p_i\}_{i\in\mathbb{N}}$ and $\{q_i\}_{i\in\mathbb{N}}$, respectively. If $\sum_i (p_i-q_i)^2<\infty$ then $\mu$ and $\nu$ are equivalent (i.e. have the same null sets).

Theorem[Vovk 1987]
Let $\mu$ and $\nu$ be computable strongly positive generalized Bernoulli measures of parameters $\{p_i\}_{i\in\mathbb{N}}$ and $\{q_i\}_{i\in\mathbb{N}}$, respectively. If $\sum_i (p_i-q_i)^2<\infty$ then $\mu$MLR=$\nu$MLR.

Linear reducibility

$Z$ is linearly reducible to $Y$ ($Z\le_\ell Y$) if the truth of the equation $Z(n)=1$ can effectively be reduced to the truth of a linear equation in the variables $Y(0), Y(1), \ldots$ over the finite field $\mathbb Z/2\mathbb Z = (\{0,1\},+)$:

$$Z(43)=1\Leftrightarrow Y(3)+Y(5)=0$$
$$Z(44)=1\Leftrightarrow Y(1)+Y(6)+Y(8)=1$$
One way to state this:

$Z\le_{\ell}Y$ if there is a recursive function $f$ such that for all $n$,
$$n\in Z\iff |D_{f(n)}\cap Y|\equiv 1(\text{mod }2).$$

An improved von Neumann extractor (in a sense)

A $\mu_p$-random set is not $\ell$-feeble.

Suppose $Y$ is $\mu_p$-random. The $\mu_p$-probability that $|D_{f(n)}\cap Y|\equiv 1(\text{mod }2)$ is computable from $p$ and converges effectively to $1/2$ as $|D_{f(n)}|\to\infty$. \footnote{\tiny{The reason is that if $D_{f(n)}$ is very large, then it is likely to contain a large amount of numbers from $Y$, and this amount is about equally likely to be odd or even. }}

Let $D_{f(n)}$ be disjoint finite sets that are so large that if $p_i:=\mu_p(\{Y:|D_{f(i)}\cap Y|\equiv 1(\text{mod }2)\})$ then $\sum_i (p_i-\frac{1}{2})^2<\infty$.

If $Z$ is reduced to $Y$ via $D_{f(n)}$, then $Z$ is ML-random.

Positive reducibility

$Z\le_p Y$ iff there is a recursive function $f$ such that for all $n$, $n\in Z$ iff $\exists i\le k_i$ $D_{f(n,i)}\subseteq Y$.

We can rewrite this:
\tag{*}\neg \forall i\le k_i [D_{f(n,i)}\not\subseteq Y]

If the $D_{f(n,i)}$ are disjoint then the events $[D_{f(n,i)}\not\subseteq Y]$ are independent and so we can calculate the probability of (*).

By taking $|D_{f(n,i)}|$ large enough and taking $k_i$ large enough we can get this probability as close to $1/2$ as desired. Thus,

Theorem[A second improved von Neumann extractor]
Let $0<q<1$. A $\mu_q$-random set is not $p$-feeble.

Cannot reduce meet to join

Here is an argument from Culver’s MA thesis from 2010.

Suppose the sequence $(A_0\wedge A_1,A_2\wedge A_3,\ldots)$ is $\vee$-reducible to $A$.
Let’s look at the use of the $\vee$-truth-table reduction for computing $A_8\wedge A_9$.
We claim that after seeing only $A_0,\ldots,A_7$ we can rule out one possibility for $(A_8,A_9)$, thereby making money on $A$ and thus showing $A$ is not random.

If the use is only $A_0,\ldots,A_7$ then let’s say the reduction predicts $A_8\wedge A_9=1$. Then we can rule out the pair $(0,1)$. If the reduction predicts $0$ we can rule out $(1,1)$.

If the use is only $A_0,\ldots,A_8$ then based on us looking at $A_0,\ldots,A_7$ we can determine that the reduction predicts $1\wedge A_9=1$ or we can determine that the reduction predicts $1\wedge A_9=0$. Then we can rule out the pair $(1,0)$, or $(1,1)$, respectively.

Finally, if the reduction uses all of $A_0,\ldots,A_9$ then based on us looking at $A_0,\ldots,A_7$ we can determine that the reduction predicts $1\wedge 1=1$ and we can determine that the reduction predicts $1\wedge 0=1$ and… the reduction cannot correctly predict the whole $\wedge$ function or else $\wedge$ would be an element of the clone $\{\vee\}$.

Special Session on Computability and Complexity

Cameron Freer and I organized a special session at the AMS meeting at UH during March 3-4, 2012.



Logan Axon (Gonzaga University, Washington)Axon’s slides
Jennifer Chubb Reimann (University of San Francisco)
Denis Hirschfeldt (University of Chicago)Hirschfeldt’s slides
John Hitchcock (University of Wyoming)
Greg Igusa (UC Berkeley)Igusa’s slides
Joseph S. Miller (University of Wisconsin, Madison)Miller’s slides
Mia Minnes (UCSD) Minnes’ slides
Pavel Semukhin (Pacific Institute for Mathematical Sciences (PIMS), University of Regina, Saskatchewan, Canada)Semukhin’s slides

View Special Session speakers in a larger map


1078-03-400 Algorithmic Randomness and Computational 13-dec-2011
John M Hitchcock*,

1078-03-399 Cupping with random sets. 13-dec-2011
Joseph S. Miller*,

1078-03-377 Algorithmic randomness and complexity 13-dec-2011
via automata.
Mia Minnes*,

1078-03-363 The reverse mathematics of homogeneous 13-dec-2011
Denis R. Hirschfeldt*,

1078-03-354 Effective properties of approximable 13-dec-2011
Jennifer Chubb Reimann*,

1078-03-313 Martin-L\”of random closed sets. 12-dec-2011
Logan Axon*,

1078-03-237 Nonexistence of Minimal Pairs for 10-dec-2011
Generic Computation.
Gregory Igusa*,

1078-03-127 Automatic models of first-order theories. 01-dec-2011
Pavel Semukhin*,
Frank Stephan

Computability menagerie

The Computability Menagerie houses animals of varying degrees of non-computability. The project started during my Marie Curie fellowship in Heidelberg, where I was working on a huge diagram of downward closed classes of Turing degrees, using JFig.

Mushfeq Khan and Joe Miller have written a web app version of the menagerie.

Related research on André Nies’ home page:
Logic Blog

Tamer (computable) creatures are on display at the Complexity Zoo.