Although this may not mean much at the moment, let's work our way towards understanding it.
Take two alleles of a gene, one of which is dominant (A) and the other is recessive (a). (The full theory actually doesn't require us to have one dominant and one recessive allele, or even to limit ourselves to two alleles. But it's much simpler to understand if we follow this explanatory convention.)
OK, so far so good; nothing too interesting there. But now we ask ourselves a second question: if we happen to know the frequency of the A allele in the pool, can we predict what the odds of getting any one of the 4 particular genotypes will be? In other words, if there are 100 balls (alleles) in the pool, and we know that 90 are 'A', can we predict the odds of getting (scooping up), say, AA with our two balls?
Both Hardy and Weinberg independently saw that this was indeed quite possible - and useful. Following the conventional notation, let's call A's frequency 'p'. Similarly, we'll call a's frequency 'q'. Using the above figures, p would be 0.9 (or 90/100), and q would be 0.1 (or 10/100). Notice, by the way, that the frequencies of the two alleles, no matter what they are, must total 1, since every ball in the pool is either A or a. Try it with different figures if you're not convinced. The algebraic representation of this would be:
(frequency of A) + (frequency of a) = 1,
Which is to say:
p + q = 1
This equation will come in handy later. But back to our example. What are the odds of the first ball I scoop up out of the pool at random being A? Clearly 0.9, or p. The same goes for the second ball, when considered in isolation. Now comes the only slightly tricky bit of algebra: what are the odds of both balls being A - which would correspond to the genotype AA? The equation is:
0.9 x 0.9 = 0.81
or, in our general algebraic notation, for any value of p,
p x p = p²
The odds of getting genotype Aa would, in identical fashion, be:
p x q = pq
And, to complete the story, aA's odds would be q x p, or qp, and aa's odds would be q x q, or q².
But, come to think of it, the genotype Aa is exactly equivalent to aA, and pq is obviously the same as qp, so we can combine the two terms:
pq + qp = 2pq
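To make the arithmetic concrete, here is a minimal sketch in Python that works through the 90-out-of-100 example. It assumes, as the text does, that the second scoop has the same odds as the first (in other words, the pool is large enough that removing one ball doesn't matter). The variable names are purely illustrative.

```python
from fractions import Fraction
from itertools import product

# Allele frequencies from the example: 90 of the 100 balls are A.
p = Fraction(90, 100)   # frequency of A
q = 1 - p               # frequency of a

freq = {"A": p, "a": q}

# Odds of every ordered pair of scoops (first ball, second ball).
# Assumption: both scoops have the same odds, independent of each other.
ordered = {first + second: freq[first] * freq[second]
           for first, second in product("Aa", repeat=2)}

# Aa and aA are the same genotype, so their odds are added together.
heterozygous = ordered["Aa"] + ordered["aA"]

for genotype, odds in ordered.items():
    print(genotype, float(odds))
print("Aa or aA:", float(heterozygous))
```

With these figures AA comes out at 0.81, Aa and aA at 0.09 each (0.18 combined), and aa at 0.01.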
We're nearly there now. Let's collect all the possible genotype frequencies. We have p², 2pq and q² as our options. Since together the genotypes that these symbols represent must make up 100% of the total genotypes, the frequencies must add up to 1. In other words:
p² + 2pq + q² = 1
...which is, of course, the Hardy-Weinberg equation.
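If you'd rather see this happen than take the algebra on trust, the rough simulation below scoops a large number of ball pairs and compares the observed genotype frequencies with the predicted ones. It is only a sketch: the function name `simulate_genotypes`, the number of pairs and the fixed seed are all my own illustrative choices, and each scoop is assumed to be independent with the same odds of being A.

```python
import random

def simulate_genotypes(p, pairs=200_000, seed=0):
    """Scoop `pairs` pairs of alleles and return the observed genotype frequencies."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    counts = {"AA": 0, "Aa": 0, "aa": 0}
    for _ in range(pairs):
        # Each scoop is A with probability p, independently of the other.
        n_dominant = (rng.random() < p) + (rng.random() < p)
        # 0 A's -> aa, 1 A -> Aa (Aa and aA pooled), 2 A's -> AA
        genotype = ("aa", "Aa", "AA")[n_dominant]
        counts[genotype] += 1
    return {g: c / pairs for g, c in counts.items()}

p = 0.9
q = 1 - p
observed = simulate_genotypes(p)
expected = {"AA": p * p, "Aa": 2 * p * q, "aa": q * q}

for g in expected:
    print(g, "observed:", round(observed[g], 3), "predicted:", round(expected[g], 3))
```

The observed figures should land close to the predicted ones, and the three predicted frequencies always sum to 1 because p² + 2pq + q² is just (p + q)² written out, and p + q = 1.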
So let's summarise so far. This is the take-home message:
- The homozygous dominant genotype (AA in this example) is represented as p² (i.e. frequency of the A allele squared).
- The heterozygous genotype (Aa in this example) is represented as 2pq.
- The homozygous recessive genotype (aa in this example) is represented as q² (i.e. frequency of the a allele squared).
That'll be the topic of the next post.