# The Tukey test: starting with a known variance

Consider normal random variables $Y_i$ with means $\mu_i$, $1\le i\le k$.
The Tukey test often seems overly complicated, but it becomes clearer if we first assume that the common variance $\sigma^2$ of the $Y_i$ is known.

Let’s draw $r$ samples from each, called $Y_{i,j}$, $1\le j\le r$.
Denote the sample averages by $\overline Y_{i+}$.
Let $W_i = \overline Y_{i+}-\mu_i$.
Let $Q=R$, the range of the $W_i$, be defined by
$$R = \max_i W_i - \min_i W_i = \max_{i,j} |W_i-W_j|.$$
Suppose we understand the distribution of $R$ well enough to find a number $Q_\alpha$ such that
$$\Pr(R\le Q_\alpha) = 1-\alpha.$$
Define the intervals
$$I_{i,j} = (\overline Y_{i+}-\overline Y_{j+} - Q_\alpha,\ \overline Y_{i+}-\overline Y_{j+} + Q_\alpha)$$
Then it is easily seen that
$$\Pr(\mu_i-\mu_j \in I_{i,j}\ \text{for all}\ i, j) = 1-\alpha,$$
and hence for all $i$, $j$,
$$\Pr(\mu_i-\mu_j\in I_{i,j}) \ge 1-\alpha.$$
(Larsen and Marx make a mistake here, mixing up the last two equations.)
Thus, the hypothesis that $\mu_i=\mu_j$ can be rejected if we observe $0\not\in I_{i,j}$.
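For a concrete check, here is a minimal Monte Carlo sketch (the parameter values and variable names are my own illustrative choices, not from any particular source): it estimates $Q_\alpha$ from the distribution of the range of the $W_i$ and then verifies that the simultaneous coverage of the intervals $I_{i,j}$ is about $1-\alpha$.

```python
import numpy as np

rng = np.random.default_rng(0)
k, r, sigma, alpha = 4, 5, 2.0, 0.05

# W_i = Ybar_i - mu_i are iid N(0, sigma^2/r); estimate Q_alpha,
# the (1-alpha)-quantile of the range R = max_i W_i - min_i W_i.
W = rng.normal(0.0, sigma / np.sqrt(r), size=(100_000, k))
Q_alpha = np.quantile(W.max(axis=1) - W.min(axis=1), 1 - alpha)

# mu_i - mu_j lies in I_{i,j} for ALL i, j exactly when
# the range of the W_i is at most Q_alpha.
mu = np.array([0.0, 1.0, 1.0, 3.0])
trials, covered = 10_000, 0
for _ in range(trials):
    Ybar = rng.normal(mu, sigma / np.sqrt(r))  # sample means of r observations
    Wi = Ybar - mu
    covered += (Wi.max() - Wi.min() <= Q_alpha)
print(covered / trials)  # should be close to 1 - alpha = 0.95
```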

Now, if the variance is unknown, we instead define $Q=R/S$, where $S$ is a suitable estimator of the standard deviation of the $W_i$.
An important point is that we want to understand the joint distribution of $R$ and $S$, which is easiest if they are independent.
We do have an estimator of $\sigma^2$ that’s independent of the $W_i$, namely the residual sum of squares (a.k.a. sum of squares for error),
$$\mathrm{SSE} = \sum_i \sum_j (Y_{i,j}-\overline Y_{i+})^2$$
So we take $S^2$ to be a suitable constant times $\mathrm{SSE}$.
Namely, we want $S^2$ to be an unbiased estimator of $\sigma^2/r$, the variance of each $W_i$. And we know that $\mathrm{SSE}/\sigma^2$ is $\chi^2(rk-k)$ distributed, so that $\mathbb E(\mathrm{SSE}) = (rk-k)\sigma^2$. Hence we take
$$S^2 = \mathrm{MSE}/r,$$
where $\mathrm{MSE} = \mathrm{SSE}/(rk-k)$ is unbiased for $\sigma^2$.
A point here is that $\mathrm{SSE}/\sigma^2$ is a sum of $k$ independent $\chi^2(r-1)$ random variables (one for each $i$), hence is itself $\chi^2(rk-k)$.
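In code, the unknown-variance version looks like the following sketch. It leans on `scipy.stats.studentized_range` (available in SciPy 1.7 and later) for the quantile of $Q=R/S$; the helper name `tukey_intervals` and the sample data are my own illustrative choices.

```python
import numpy as np
from scipy.stats import studentized_range

def tukey_intervals(Y, alpha=0.05):
    """Y has shape (k, r): r observations from each of k groups.
    Returns simultaneous (1-alpha) confidence intervals for mu_i - mu_j."""
    k, r = Y.shape
    Ybar = Y.mean(axis=1)
    SSE = ((Y - Ybar[:, None]) ** 2).sum()
    MSE = SSE / (r * k - k)          # unbiased for sigma^2
    S = np.sqrt(MSE / r)             # estimates the sd of each W_i
    Q = studentized_range.ppf(1 - alpha, k, r * k - k)
    return {(i, j): (Ybar[i] - Ybar[j] - Q * S, Ybar[i] - Ybar[j] + Q * S)
            for i in range(k) for j in range(k) if i < j}

# Three groups of eight observations; the third group's mean is shifted.
Y = np.random.default_rng(1).normal([[0.0], [0.0], [1.5]], 1.0, size=(3, 8))
for (i, j), (lo, hi) in tukey_intervals(Y).items():
    print(i, j, (lo, hi), "reject mu_i = mu_j" if lo > 0 or hi < 0 else "")
```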

The Department of Mathematics at University of Hawaii at Manoa has long had an informal graduate program in logic, lattice theory, and universal algebra going back to Alfred Tarski’s student William Hanf.
Starting in 2016, things are getting a little more formal.

We intend the following course rotation (repeating after two years):

| Semester | Course number | Course title |
|---|---|---|
| Fall 2015 | MATH 649B | Graduate Seminar |
| Spring 2016 | MATH 649* | Applied Model Theory |
| Fall 2016 | MATH 654* | Graduate Introduction to Logic |
| Spring 2017 | MATH 657 | Computability and Complexity |

*Actual course numbers may vary.

#### Faculty who may teach in the program

- David A. Ross, Professor
- Bjørn Kjos-Hanssen, Professor
- Mushfeq Khan, Temporary Assistant Professor, 2014–2017
- Achilles Beros, Temporary Assistant Professor, 2015–2017

# The noncentral $t$ distribution is an elementary function relative to erf

In this note we show that for each fixed number of degrees of freedom $\nu$ (a positive integer, so that the binomial expansion below terminates), the noncentral $t$ distribution with noncentrality parameter (= the mean of the numerator) $\mu$ is an elementary function relative to erf.

The pdf is, per Wikipedia,
$$f(x) =\frac{\nu^{\frac{\nu}{2}} \exp\left (-\frac{\nu\mu^2}{2(x^2+\nu)} \right )}{\sqrt{\pi}\Gamma(\frac{\nu}{2})2^{\frac{\nu-1}{2}}(x^2+\nu)^{\frac{\nu+1}{2}}} \int_0^\infty y^\nu\exp\left (-\frac{1}{2}\left(y-\frac{\mu x}{\sqrt{x^2+\nu}}\right)^2\right ) dy.$$
The non-elementary part is the integral
$$\int_0^\infty y^\nu\exp\left (-\frac{1}{2}\left(y-\frac{\mu x}{\sqrt{x^2+\nu}}\right)^2\right ) dy = g\left(\frac{\mu x}{\sqrt{x^2+\nu}}\right)$$
where
$$g(a)=\int_0^\infty y^\nu\exp\left (-\frac{1}{2}\left(y-a\right)^2\right ) dy.$$
We claim that $g$ is elementary relative to the error function erf, or equivalently relative to the standard normal cdf $\Phi$.

Namely, substituting $x=y-a$ and expanding $(x+a)^\nu$ by the binomial theorem,
$$g(a)=\sum_{k=0}^\nu a^{\nu-k} {\nu\choose k} \alpha_k$$
where
$$\alpha_k = \int_{-a}^\infty x^k e^{-x^2/2}\,dx.$$
Next, using $u=x^{k-1}$, $dv=x e^{-x^2/2}\,dx$ and integration by parts, we can show a recurrence relation for $\alpha_k$:
$$\alpha_k = (-a)^{k-1}e^{-a^2/2} + (k-1)\alpha_{k-2}.$$
Finally, $\alpha_0=\sqrt{2\pi}\,\Phi(a)$ and $\alpha_1 = e^{-a^2/2}$, and we are done.
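As a sanity check on the recurrence, here is a short sketch of my own (function names illustrative) that evaluates $g$ both by the recurrence above and by direct numerical quadrature of the defining integral; for positive integer $\nu$ the two should agree.

```python
import math
from scipy.integrate import quad

def g_closed(a, nu):
    """g(a) via the recurrence; nu must be a positive integer."""
    e = math.exp(-a * a / 2)
    Phi = 0.5 * (1 + math.erf(a / math.sqrt(2)))  # standard normal cdf at a
    alpha = [math.sqrt(2 * math.pi) * Phi, e]     # alpha_0, alpha_1
    for k in range(2, nu + 1):
        alpha.append((-a) ** (k - 1) * e + (k - 1) * alpha[k - 2])
    return sum(math.comb(nu, k) * a ** (nu - k) * alpha[k]
               for k in range(nu + 1))

def g_quad(a, nu):
    """Direct numerical evaluation of the defining integral."""
    return quad(lambda y: y ** nu * math.exp(-0.5 * (y - a) ** 2),
                0, math.inf)[0]

for a, nu in [(0.7, 3), (-1.2, 5)]:
    print(g_closed(a, nu), g_quad(a, nu))  # should agree
```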

# Mathematical logic

### Syllabus for Math 455

Mathematical logic, Spring 2013

The last time this course was offered was Spring 2009. (Will the next time be Spring 2017?)

Textbook: A Mathematical Introduction to Logic, 2nd edition (Enderton)
Times: TuTh 12-1:15

See the Calendar tab for homework lists.

The top all-but-two homework scores count 20% in total. The better of the two midterms counts 40%. The final (either an exam or, optionally, a project consisting of a presentation and a paper of at least two pages) counts 40%.

### Flyer for Math 455

Math 455, Mathematical Logic, will be offered Spring 2013. It is an undergraduate course at the junior/senior level, but may also be of interest to graduate students.

What is the course about? Kurt Gödel proved in the 1930s that in any reasonable mathematical system there will always be true but unprovable statements about arithmetic.
This was very upsetting to David Hilbert, who had hoped to be able to answer mathematical questions definitively and systematically.
Because of this work, Gödel was one of only two mathematicians included in TIME magazine’s 100 persons of the 20th century.

The prerequisites for the course are Math 454, or Math 321, or instructor’s consent.

# Almost sure convergence in stochastic integrals

Consider a mesh $\{t_j^{(n)}\}$ for $1\le j\le k(n)$.

Mörters and Peres (Brownian motion, Remark 7.7) give a condition that turns out to be equivalent to
$$\tag{1}\label{1} \sum_{n=1}^\infty \sum_{j=1}^{k(n)} (t^{(n)}_{j+1} - t^{(n)}_j)^2 <\infty$$
(which is used explicitly in Exercise 1.16) and show that it guarantees that almost surely
$$\sum g(B_{t_j})\Delta B_{t_j} \rightarrow \int g(B_s)dB_s$$
where $\Delta B_{t_j} = B_{t_{j+1}} - B_{t_j}$ and $\Delta t_j = t_{j+1}-t_j$. In other words, this sequence of random variables almost surely converges to its $L^2$ limit.

(This is not guaranteed in general. Consider an independent sequence where $X_n=1$ with probability $1/n$ and $X_n=0$ otherwise. Then $\mathbb E(X_n^2)=1/n$, so $X_n$ converges to 0 in $L^2$. But $X_n$ does not converge to 0 almost surely, by the second Borel–Cantelli lemma.)
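A quick simulation (my own illustrative sketch) makes the counterexample tangible: the second moments shrink, yet ones keep appearing.

```python
import numpy as np

# X_n = 1 with probability 1/n, independently; E(X_n^2) = 1/n -> 0,
# so X_n -> 0 in L^2, yet by Borel-Cantelli II ones never stop occurring.
rng = np.random.default_rng(0)
n = np.arange(1, 1_000_001)
X = rng.random(n.size) < 1.0 / n
print("last index with X_n = 1:", n[X][-1])  # typically near the end of the range
print("number of ones:", X.sum())            # grows like log(n), never stops
```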

Instead of giving the details on that*, we will give details on something Mörters and Peres do not give as much detail on but which will also reveal how to do the above: what condition on the mesh is required for almost sure convergence of
$$\sum g(B_{t_j})(\Delta B_{t_j})^2 \rightarrow \int g(B_s)ds,$$
or equivalently, since $\int g(B_s)ds$ is a non-stochastic (but random) integral where the mesh does not matter,
$$b_n:= \sum g(B_{t_j})[(\Delta B_{t_j})^2 - \Delta t_j] \rightarrow 0.$$
They show that such convergence happens in $L^2$ without any assumption on the mesh (as most authors do) but then they are a little vague about the rest in Theorem 7.13. So as an exercise we can go back to Remark 7.7 and see if a similar condition can be obtained in this case.

By second-semester calculus it suffices to show that $\sum b_n^2<\infty$ almost surely:
$$\{\omega: \sum b_n(\omega)^2<\infty\}\subseteq\{\omega: b_n(\omega)\rightarrow 0\}.$$
For this it suffices to show that $\sum \mathbb E (b_n^2)<\infty$.

(Indeed, if $\sum \mathbb E (b_n^2)<\infty$ then [by Monotone Convergence] we have $\sum \mathbb E (b_n^2) = \mathbb E (\sum b_n^2)$ and so $\sum b_n^2$ is a nonnegative random variable with finite mean and hence $\sum b_n^2$ must be finite almost surely.

To prove that a nonnegative random variable with finite mean must be finite almost surely, note that for any random variable $X$, if $\mathbb P(X^2\ge M) \ge N/M$ then
$$\mathbb E(X^2)\ge \mathbb E(X^2 | X^2\ge M) \mathbb P(X^2\ge M) \ge M \mathbb P(X^2\ge M) \ge N.$$
Thus, if $\mathbb E(X^2) < N$, where $N$ is a given constant, then $\mathbb P(X^2\ge M) < N/M$ for every $M$.
Letting $M\to\infty$, we conclude that $\mathbb P(X^2=\infty)=0$.

Let $U_M = \{\omega: \sum b_n(\omega)^2\ge M\}$; then $\mathbb P(U_M) < N/M$. We may note that $U_M$ is computably enumerable only relative to the mesh. And indeed by an exercise in MP, the mesh must be computable in order to work for all ML-random $\omega$. With a nonstandard analysis construction of the Ito integral, however, we can sidestep this requirement.)

Calculating exactly as on the bottom of page 47 in Øksendal (6th edition)’s proof sketch for Ito’s lemma, with his $a_j$ being our $g(B_{t_j})$, we have
$$\mathbb E(b_n^2) = 2\sum_{j=1}^{k(n)} \mathbb E(g(B_{t_j})^2) (\Delta t_j)^2.$$
Here the factors $\mathbb E(g(B_{t_j})^2)$ will be bounded by a constant $c$, assuming we are integrating over a bounded interval $[S,T]$.

The bound, and hence the mesh, will depend on $g$ in general. Consider for example $g(b)=e^{b^2/2}$ at time $t=1$; then $\mathbb E(g(B_1)^2)=\mathbb E(e^{B_1^2})=+\infty$.

So our condition is
$$\sum_{n=1}^\infty 2\sum_{j=1}^{k(n)} \mathbb E(g(B_{t_j})^2) (\Delta t_j)^2 \le \sum_{n=1}^\infty 2\sum_{j=1}^{k(n)} c (\Delta t_j)^2 <\infty$$
which is exactly the same as condition (\ref{1}).
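As a numerical illustration (a minimal sketch of my own, not from Mörters and Peres): the dyadic meshes $k(n)=2^n$ satisfy condition (\ref{1}), and we can generate a single Brownian path at the finest level, coarsen it, and watch $b_n$ shrink along that one path, here with $g=\sin$ on $[0,1]$.

```python
import numpy as np

rng = np.random.default_rng(0)
N = 20                                    # finest dyadic level: 2^N intervals on [0, 1]
dB_fine = rng.normal(0.0, np.sqrt(1.0 / 2**N), size=2**N)
B_fine = np.concatenate(([0.0], np.cumsum(dB_fine)))  # one Brownian path, fixed once

def b(n, g):
    """b_n on the dyadic mesh k(n) = 2^n, computed from the single fixed path."""
    step = 2 ** (N - n)
    B = B_fine[::step]                    # the path sampled at the coarser mesh
    dB = np.diff(B)
    return np.sum(g(B[:-1]) * (dB**2 - 1.0 / 2**n))

for n in [4, 8, 12, 16, 20]:
    print(n, b(n, np.sin))                # shrinks toward 0 along this one path
```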

*Actually here are some details. We seek to relate the condition
$$\sum_{n=1}^\infty \mathbb E \int_0^t (H_n(s)-H(s))^2 ds <\infty$$
to the mesh. Here $H(s)=g(B_s)$ and $H_n(s)$ is an elementary approximation. The $n$th term is a sum over $j$ of:
$$\mathbb E \int_{t_j}^{t_{j+1}} \left(g(B_s) - \sum_{i=1}^{k(n)} g(B_{t_i}) \chi_{[t_i,t_{i+1})}(s)\right)^2 ds$$
$$= \mathbb E\int_{t_j}^{t_{j+1}} (g(B_s) - g(B_{t_j}))^2 ds = \int_{t_j}^{t_{j+1}} \mathbb E ((g(B_s) - g(B_{t_j}))^2)\, ds$$
(by the Fubini–Tonelli theorem, the integrand being nonnegative)
$$\le c \int_{t_j}^{t_{j+1}} \mathbb E ((B_s - B_{t_j})^2)\, ds$$
(assuming $g$ is Lipschitz with constant at most $\sqrt c$, so that $(g(B_s)-g(B_{t_j}))^2 \le c\,(B_s-B_{t_j})^2$ pointwise)

$$= c \int_{t_j}^{t_{j+1}} \mathbb E (B_{s - t_j}^2)\, ds = c \int_{t_j}^{t_{j+1}} (s - t_j)\, ds$$
$$= c \int_0^{\Delta t_j} u du = c (\Delta t_j)^2/2$$
so we get
$$\frac{1}{2} \sum_{n=1}^\infty \sum_{j=1}^{k(n)} (t_{j+1}^{(n)}-t_j^{(n)})^2 < \infty$$
which is the same as condition (\ref{1}).

In the case where each $t_{j+1}^{(n)}-t_{j}^{(n)} = 1/k(n)$, this becomes
$$\sum_{n=1}^\infty 1/k(n) < \infty$$
so we can take $k(n)=n^2$ or, like Paley-Wiener, $k(n)=2^n$.