What is the P-value for in ANOVA?
In the analysis of variance (ANOVA), we compare the variation between groups with the variation within groups by means of their respective mean squares (MS), each calculated by dividing a sum of squares by its associated degrees of freedom. Although termed a mean square, the result is actually a variance, that is, a squared standard deviation.
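As a minimal sketch of the calculation just described, the Python snippet below computes both mean squares for three small groups. The measurements are made-up illustrative numbers, and the helper name mean_squares is hypothetical; only the standard library is used.

```python
from statistics import mean, variance

# Hypothetical measurements for three groups (illustrative numbers only).
groups = [
    [10.1, 9.8, 10.3, 10.0],
    [10.6, 10.9, 10.4, 10.7],
    [9.9, 10.2, 10.0, 10.3],
]

def mean_squares(groups):
    """Return (MS_between, MS_within) for a one-way layout."""
    all_values = [x for g in groups for x in g]
    grand_mean = mean(all_values)
    k = len(groups)        # number of groups
    n = len(all_values)    # total number of observations

    ss_between = 0.0
    ss_within = 0.0
    for g in groups:
        group_mean = mean(g)
        # Between: group size times squared distance from the grand mean.
        ss_between += len(g) * (group_mean - grand_mean) ** 2
        # Within: squared distance of each value from its own group mean.
        ss_within += sum((x - group_mean) ** 2 for x in g)

    ms_between = ss_between / (k - 1)   # df between = k - 1
    ms_within = ss_within / (n - k)     # df within = n - k
    return ms_between, ms_within

ms_between, ms_within = mean_squares(groups)

# With equal group sizes, MS(within) is the average of the group variances,
# which is why a mean square can be read as a variance.
pooled = mean(variance(g) for g in groups)

print(f"MS(between) = {ms_between:.4f}")
print(f"MS(within)  = {ms_within:.4f} (average group variance = {pooled:.4f})")
```

The comparison with the average group variance only holds exactly when all groups have the same size, as they do in this made-up data set.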
The F-ratio is then obtained by dividing MS(between) by MS(within). Even if the population means are all equal to one another, you may get an F-ratio which is substantially larger than 1.0, simply because sampling error produces a large variation between the samples (groups). Such an F-value may even exceed the critical F-value taken from the F-distribution for the degrees of freedom associated with the two mean squares at the chosen significance (Type I error, alpha) level.
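A small simulation can make this point concrete. The sketch below assumes three groups of ten values, all drawn from one and the same normal population (so the null hypothesis is true by construction), and counts how often the F-ratio exceeds 1.0 and how often it exceeds 3.354, the rounded tabulated critical value for 2 and 27 degrees of freedom at alpha = 0.05. The population mean, standard deviation, seed and number of trials are arbitrary choices for illustration.

```python
import random

def f_ratio(groups):
    """F = MS(between) / MS(within) for a one-way layout."""
    k = len(groups)
    n = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n
    ss_between = 0.0
    ss_within = 0.0
    for g in groups:
        group_mean = sum(g) / len(g)
        ss_between += len(g) * (group_mean - grand_mean) ** 2
        ss_within += sum((x - group_mean) ** 2 for x in g)
    return (ss_between / (k - 1)) / (ss_within / (n - k))

rng = random.Random(1)       # fixed seed so the run is repeatable
K, M, TRIALS = 3, 10, 5000   # 3 groups of 10 values: df = (2, 27)
F_CRITICAL = 3.354           # tabulated F for alpha = 0.05, df = (2, 27), rounded

f_values = []
for _ in range(TRIALS):
    # Every group comes from the same population, so the null hypothesis is true.
    groups = [[rng.gauss(50.0, 2.0) for _ in range(M)] for _ in range(K)]
    f_values.append(f_ratio(groups))

share_above_one = sum(f > 1.0 for f in f_values) / TRIALS
share_above_critical = sum(f > F_CRITICAL for f in f_values) / TRIALS

print(f"largest F-ratio seen:         {max(f_values):.2f}")
print(f"share of F-ratios above 1.0:  {share_above_one:.3f}")
print(f"share above F-critical:       {share_above_critical:.3f}")
```

Under these assumptions, the share of simulated F-ratios above the critical value should come out close to the alpha level of 0.05, even though no real difference between the groups exists.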
Indeed, by referring to the F-distribution with the appropriate degrees of freedom, you can determine the probability of observing an F-ratio as large as the one you calculated even if the populations have the same mean values.
So, the P-value is the probability of obtaining an F-ratio as large as or larger than the one observed, assuming that the null hypothesis of no difference amongst group means is true.
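This definition can be illustrated by simulation rather than by an F-table lookup. The sketch below approximates the P-value for a made-up data set by generating many data sets for which the null hypothesis is true (normal populations with equal means and equal variances, an assumption of this sketch) and counting how often the simulated F-ratio is at least as large as the observed one. It is a Monte Carlo approximation, not the exact F-distribution calculation, and all data values, the seed and the number of trials are illustrative.

```python
import random

def f_ratio(groups):
    """F = MS(between) / MS(within) for a one-way layout."""
    k = len(groups)
    n = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n
    ss_between = 0.0
    ss_within = 0.0
    for g in groups:
        group_mean = sum(g) / len(g)
        ss_between += len(g) * (group_mean - grand_mean) ** 2
        ss_within += sum((x - group_mean) ** 2 for x in g)
    return (ss_between / (k - 1)) / (ss_within / (n - k))

# Hypothetical observed data: three groups of four measurements.
observed = [
    [10.1, 9.8, 10.3, 10.0],
    [10.6, 10.9, 10.4, 10.7],
    [9.9, 10.2, 10.0, 10.3],
]
f_obs = f_ratio(observed)

rng = random.Random(7)   # fixed seed so the run is repeatable
TRIALS = 20000
sizes = [len(g) for g in observed]

hits = 0
for _ in range(TRIALS):
    # Null hypothesis: all groups share one population. The F-ratio does not
    # depend on the common mean or variance, so a standard normal is enough.
    null_groups = [[rng.gauss(0.0, 1.0) for _ in range(m)] for m in sizes]
    if f_ratio(null_groups) >= f_obs:
        hits += 1

p_value = hits / TRIALS
print(f"observed F-ratio = {f_obs:.2f}")
print(f"simulated P-value = {p_value:.4f}")
```

With more trials the simulated proportion settles closer to the value an F-table or statistics package would give for the same degrees of freedom.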
However, under the ground rules that inferential statistics has followed for many years, we reject the null hypothesis only if this probability is equal to, or smaller than, the significance (Type I error, alpha) level that we established at the start of the experiment; this alpha level is normally set at 0.05 (or 5%) for test laboratories. Using this level of significance, there is, on average, a 1 in 20 chance that we shall reject the null hypothesis when it is in fact true.
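Hence, if we were to analyze a set of data by ANOVA and the calculated P-value was 0.008, which is much smaller than the alpha value of 0.05, we would reject the null hypothesis of no difference among the group means. The P-value tells us that, if the null hypothesis were true, an F-ratio as large as or larger than the one observed would turn up in only about 0.8% of such experiments. Note that 0.008 is not the probability that the null hypothesis is true, so it should not be read as being 99.2% confident in the rejection; the long-run risk of wrongly rejecting a true null hypothesis is fixed by the alpha level of 5% that we chose in advance.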