# Few paths, fewer words, and the maximum probability of writing 001 in a two-state Hidden Markov Model being $8/27$

The aim of this note is to give the simplest possible non-trivial calculation of the parameters of a HMM that maximize the probability of emitting a certain string.

Let $\{0,1\}$ be our alphabet.
Let $p$ be the probability of emitting 1 in state $s_0$.
Let $q$ be the probability of emitting 1 in state $s_1$.
Let $\epsilon$ be the probability of transitioning from $s_0$ to $s_1$.
Let $\delta$ be the probability of transitioning from $s_1$ to $s_0$.
Let $S(t)$ be the state after transitioning $t$ times, a random variable.
The probability of emitting the string 001 when starting in state $s_0$ is then
$$f(p,q,\epsilon,\delta)=\Pr(001; S(1)=s_0=S(2))+\Pr(001; S(1)=s_0, S(2)=s_1)$$
$$+\Pr(001; S(1)=s_1, S(2)=s_0)+\Pr(001; S(1)=s_1=S(2))$$
$$=\overline p^2 p \overline\epsilon^2 + \overline p^2q\overline\epsilon\epsilon + \overline p\overline q p\epsilon\delta + \overline p\overline q q \epsilon\overline\delta.$$
Which choice of parameters $p, q, \epsilon, \delta$ will maximize this probability?
To answer this we first note that
$$\frac{\partial f}{\partial\delta}=0\iff p=1, q=1, \epsilon=0\text{ or }p=q.$$
Going through these possibilities we keep finding values of $f$ bounded above by $\overline p^2p\le 4/27$:

1. $p=1$ immediately gives $f=0$.
2. $q=1$ gives $f=\overline p^2 p \overline\epsilon^2 + \overline p^2\overline\epsilon\epsilon=\overline p^2p\overline\epsilon.$
3. $\epsilon=0$ gives $f=\overline p^2 p.$
4. $p=q$ gives $f=\overline p^2 p \overline\epsilon^2 + \overline p^2p\overline\epsilon\epsilon + \overline p^2 p\epsilon\delta + \overline p^2 p (\epsilon\overline\delta)=\overline p^2p(\overline\epsilon^2 + \overline\epsilon\epsilon + \epsilon\delta + \epsilon\overline\delta)=\overline p^2p.$

We next consider boundary values for $\delta$.

1. $\delta=0$. We may assume $p=0$, since there is no use in considering a positive probability of emitting a 1 in state $s_0$ if there is no chance of ever returning to that state. Then upon calculation of $\partial f/\partial q$ we find that $\epsilon=2\overline q$, which gives $f=2q^2\overline q$. This is maximized at $q=2/3$ which corresponds to $\epsilon=2/3$ as well, and gives a value $f=8/27>1/4$.

This $8/27$ is decomposable as a sum of two disjoint scenarios of probability $4/27$:

1. One is that after writing the first 0 we stay in state $s_0$, write another 0, and then transition to state $s_1$ to write a 1.
2. The other is that after writing the first 0 we move to state $s_1$, write the 2nd zero there, and stay there to write the 3rd letter, 1.
2. $\delta=1$. Then $f=\overline p^2 p \overline\epsilon^2 + \overline p^2q\overline\epsilon\epsilon + \overline p\overline q p\epsilon =\overline p(\overline p p \overline\epsilon^2 + \overline pq\overline\epsilon\epsilon +\overline q p\epsilon)$.
Then $0=\partial f/\partial q = \overline p\epsilon(\overline p\overline\epsilon – p)$ if $p=\frac{\overline\epsilon}{1+\overline\epsilon}$. This gives $f=\frac{\overline\epsilon}{(1+\overline\epsilon)^3}$ (turns out not to depend on $q$) which is maximized for $\epsilon=1/2$ with value $f=4/27$. So we consider boundary values for $q$:

1. $q=0$. Then $f=\overline pp(\overline p\overline\epsilon^2+\epsilon)\le \frac14\cdot 1$.
2. $q=1$. Then $f=\overline p^2p\overline\epsilon^2+\overline p^2\overline\epsilon\epsilon$ and $\partial f/\partial\epsilon=\overline p^2(1-2\epsilon-2p\overline\epsilon)=0$ if $\overline p=1/(2\overline\epsilon)$, which gives $f=\overline p/4\le 1/4$.

Now note how much easier this is if we only consider a single path. Then clearly $1/4$ is the maximum possible, via 3 different paths, because of the presence of terms of the form $a\overline a$.
Replacing such occurrences by $1/4$ we upper bound $f$ by
$$\frac14\left(\overline p \overline\epsilon^2 + \overline p^2q + \overline q\epsilon\delta + \overline p \epsilon\overline\delta\right).$$