Date: Tue, 27 May 97 13:40:26 HST
Subject: Re: Proof


>I was browsing thru your home page and I stumbled across your anatomy of a
>proof for your 311 class. I was wondering, does this "outline" or "method"
>work for all if / then style theorems? For example, how would you go about
>(using the outline detailed on your home page) proving this theorem: If S
>is a linearly independent set of vectors in a finite-dimensional vector
>space V, then there is a basis T for V, which contains S.
>It seems to me that almost every theorem I have come across in Linear Algebra is
>of this if / then nature. It would be great if some systematic way of
>proving these were made available.

Yes, the paradigm I gave is the standard one for proving any implication, although occasionally another pattern may be more useful.

In your example, this paradigm applies to the overall statement you give, since it has an "If ... then ..." form. "Suppose that S is linearly independent and V is finite dimensional. Then ..."

However the theorem you cite has a much more complicated logical structure. Notice that the conclusion is an existence statement. Proving it will involve essentially constructing a set T and then proving that T is both linearly independent and spans. (There are several ways for proving existence theorems, however the most straightforward --- but not always easiest --- way of proving that a set with a desired property exists is to point to, or construct, such a set.)

At this point, you can no longer "follow your nose," but have to do a little real thinking. At the core of the theorem you mention, there is the following implication: "If S is linearly independent and does not span, then there exists some vector v in V which can be adjoined to S so that the resulting larger set is still linearly independent."

Okay, so now apply the pattern for implications. "Suppose that S (which we assume is linearly independent) does not span V. Then..."

At this point, we are back to "following our nose;" namely, we ask, "What is the technical definition of spanning? This gives us

"Suppose that S (linearly independent) does not span V. Then there exists a vector v in V which is not a linear combination of vectors in S."

Is this v the additional one we're looking for? Experience in doing proofs suggests that almost for sure it will be. So take a chance, and continue what we have so far by writing

"Suppose that S does not span V. Then there exists a vector v which is not a linear combination of vectors in S. Then the set S' obtained by adjoining v to S will be linearly independent because ..."

Ah, this doesn't look so easy. So we have to think about what "linearly independent" means. This is another implication, so we have to use the pattern for proving implications. This is probably too long to fit after "because," so we go back and change the sentence structure slightly.

"Then we claim that the set S' obtained by adjoining v to S will be linearly independent. In fact, suppose that c1 v1 + ... + cn vn + cv = 0, where v1 , ... vn are distinct elements in S and c, ci are scalars."

According to the paradigm for proving linear independence, we are now trying to establish that all the c's are 0. Hm... Doesn't seem so obvious. But part of the knack here is having confidence in yourself, knowing that since this is a simple theorem and the logical structure fits the general patterns, that we must be going in the right direction. There's got to be a reason for these c's to all be 0.

Why should these c's all be 0? Well, whenever you don't see where to go next, that's a good signal to look back at the hypothesis of the theorem and anything else we know so far.

We know: S doesn't span V. (Hm... This doesn't seem like what we need.) S is linearly independent. (This is sort of what we need, but it doesn't quite apply.)

What about v? What do we know about v? Go back and look. Where did v come from?

Oh. We know that v is not a linear combination of vectors in S. That's got to be the most crucial fact, because v is the vector at issue here, and this is the only thing that's known about v.

So how does that apply to seeing that all ci are 0?

Okay, so we now should remember one of the most essential principles in linear algebra, namely:

What's the significance of a scalar being non-zero? Answer: To say that a scalar is non-zero is the same thing as saying that division by that scalar is possible. So for practical purposes, any individual non-zero scalar can be replaced by 1.

At this point, an indirect proof becomes easier. "Suppose that S' is not linearly independent. Then there is a linear relation c1 v1 + ... + cn vn + cv = 0," where not all the c's are 0.

If in fact the last coeffient c is non-zero, then there actually will be an equation

c1 v1 + ... cn vn + v = 0.

On the other hand, we know that there is not an equation

v = d1 v1 + ... dn vn
because v is not a linear combination of vectors in S.

We're looking for a contradiction. It's pretty easy to see that we have one, but only on the assumption that c != 0.

At this point, it's tempting to resort to wishful thinking and just flat out assert that c != 0. But we have to do better than this. We have to give a reason.

We know that at least one of the coefficients c1, ..., cn, c must be non-zero. But why should it be the last one?

Once again, since we're stuck it's time to think back over what we know. One thing we know is that S is linearly independent.

Etc. etc. You can see that once you break this theorem down into the micro logical level, the proof seems amazingly complicated. And yet each step can be found by a perfectly systematic process.

Whether the outline I give is useful to students trying to learn the process of proving theorems is useful or not is another question. I have found that proving any statement that boils down to an implication is immensely confusing to students.

Another example is proving that a function is one-to-one. This boils down to the implication "If x != y then f(x) != f(y)." Some of my students seem to find it less confusing for some reason to do an indirect proof in such cases, although I find this personally quite distasteful. ("Suppose by way of contradiction that the function f is not one-to-one. Then there exist two elements x and y such that x != y but f(x)=f(y). But ....") Instead of the indirect proof, it is preferable to prove the contrapositive of the implication, namely,

"If f(x) = f(y) then x=y."
Proving this statement is the standard way to prove that a function is one-to-one. However students find it extraordinarily difficult to learn this paradigm.



[ Top of Page | Teaching and Learning Mathematics | HOME ]