Jon May

Tel: +44 114 222 6561
Fax: +44 114 276 6515
Email: jon.may@sheffield.ac.uk

Statistical concepts

These are listed in a sensible order, so you can read through them. Click on hyperlinked terms to jump to their definition.
Or use this alphabetic list to locate a term of interest to you: alpha level - confidence interval - difference - effect size - null hypothesis - p-value - population - power - sample - standard deviation - standard error - type I error - type II error - variance

Difference

The difference between the value of a variable measured from an experimentally manipulated or selected group and another, specified value such as zero, or the value of the same variable from an unmanipulated control group or another manipulated contrast group.
In general, difference (or effect) = observed mean - null hypothesis mean

Sample

The set of observations, of size N, that you have obtained, and from which you can obtain a mean to estimate the true mean within a population.

Population

The total set of entities (people, rats) that could be sampled from, and where every variable has a real value that can be estimated from your sample.
Since there are many constraints upon who (or what) is available to be sampled, every experiment takes place upon a different statistical population, even if the intent is to discover things about all people (or rats, or other things) in general.

Variance

A measure of the spread of scores, computed by finding the mean of the scores, squaring the difference between this mean and each score, adding the squares up (to produce the Sum of Squares, or 'SS'), and dividing by the number of scores minus one, to get the Mean Square, or 'MS'. Properly abbreviated as s-squared when it is computed like this, from a sample, or as sigma-squared (the Greek lower-case sigma, squared) when it is a value for a population.
NB: The denominator is only N-1 (the number of scores minus one) if you want to estimate the variance of the population from which the sample is drawn, which is what you usually do want to do, but just N if you want the exact variance of your sample.
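As a rough illustration of the steps just described, the following Python sketch computes the Sum of Squares and then the variance with either denominator. The function names and the scores are made up for the example.

```python
def sum_of_squares(scores):
    """Sum of squared differences between each score and the mean (SS)."""
    mean = sum(scores) / len(scores)
    return sum((x - mean) ** 2 for x in scores)


def variance(scores, estimate_population=True):
    """SS / (N - 1) to estimate the population variance, SS / N for the sample itself."""
    n = len(scores)
    denominator = n - 1 if estimate_population else n
    return sum_of_squares(scores) / denominator


scores = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical scores, mean = 5
print("SS =", sum_of_squares(scores))
print("variance with N-1 denominator =", variance(scores))
print("variance with N denominator   =", variance(scores, estimate_population=False))
```

With so few scores the two denominators give noticeably different answers; the gap shrinks as N grows.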

Standard Deviation

The square root of the variance, abbreviated properly as s, but also as SD or stdev.
It is useful for describing the spread of a distribution, since if the distribution is normal, approximately 68% of scores lie within one standard deviation either side of the mean, and approximately 95% lie within 2 standard deviations of the mean.
NB: As with the variance, this is based on a denominator of N-1 (the number of scores minus one) if you want to estimate the Standard Deviation of the population from which the sample is drawn, which is what you usually do want to do, but just N if you want the exact standard deviation of your sample.
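The sketch below, using only the Python standard library, shows the two denominators and checks the proportions of a normal distribution that lie within 1, 2 and 3 standard deviations of the mean. The scores are made up, and proportion_within is a helper name invented for this example.

```python
import math
import statistics


def proportion_within(k):
    """Proportion of a normal distribution within k standard deviations of the mean."""
    return math.erf(k / math.sqrt(2))


scores = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical scores

# stdev uses the N-1 denominator (estimate of the population SD);
# pstdev uses N (exact SD of these scores).
print("SD with N-1 denominator:", statistics.stdev(scores))
print("SD with N denominator:  ", statistics.pstdev(scores))

for k in (1, 2, 3):
    print(f"within {k} SD of the mean: {proportion_within(k):.1%}")
```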

Standard Error

If you repeatedly sampled n observations from a population, the resulting set of means would have a distribution with its own standard deviation, and that standard deviation is called the standard error (of the mean, to be exact).
Abbreviated as SE, it is useful for judging how precisely your sample mean estimates the population mean: if the experiment were repeated many times, approximately 68% of the sample means would fall within +/- 1 SE of the population mean, and approximately 95% within +/- 2 SE. The range of +/- 2 SE around your observed mean is, approximately, the 95% Confidence Interval.
Divide the Standard Deviation (estimated from the sample, using the N-1 denominator) by the square root of the number of scores to obtain the Standard Error:
SQRT(SS/(N-1))/SQRT(N)
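The formula can be written out step by step in Python as a minimal sketch; the function name and the scores are assumptions made for the example.

```python
import math


def standard_error(scores):
    """Standard error of the mean: SQRT(SS / (N - 1)) / SQRT(N)."""
    n = len(scores)
    mean = sum(scores) / n
    ss = sum((x - mean) ** 2 for x in scores)  # Sum of Squares
    sd = math.sqrt(ss / (n - 1))               # SD with the N-1 denominator
    return sd / math.sqrt(n)


scores = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical scores
print("SE of the mean =", standard_error(scores))
```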

p-value

The probability that a difference at least as large as that observed in your study would be found by chance if there were no such difference within the population (i.e., if the null hypothesis were true).
It is not the inverse probability that your observed difference reflects a true difference in the population, nor a measure of the importance of your observed difference, although these two errors are common.
Traditional practice is to only report criterial values, such as 'n.s.', p<0.05, p<0.01 or p<0.001, but this is now discouraged by the APA, since it encourages the above errors.
The APA recommend that every inferential test statistic should be accompanied by its exact p-value, as well as a statement of its relationship to the alpha level (i.e., statistically significant or statistically non-significant) and its effect size, or confidence interval.
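To make the definition concrete, here is a sketch of a two-sided p-value for an observed mean, reported in the recommended way. It uses a large-sample normal approximation rather than an exact t distribution, which is a simplification chosen to keep the example within the standard library; the function name and all the numbers are hypothetical.

```python
from statistics import NormalDist


def two_sided_p(observed_mean, null_mean, standard_error):
    """Chance of a difference at least this large, in either direction, if the null is true.

    Normal approximation: only reasonable when the sample is large.
    """
    z = (observed_mean - null_mean) / standard_error
    return 2 * (1 - NormalDist().cdf(abs(z)))


alpha = 0.05  # chosen before the data were collected
p = two_sided_p(observed_mean=103.0, null_mean=100.0, standard_error=1.5)
verdict = "statistically significant" if p < alpha else "statistically non-significant"
print(f"p = {p:.3f} ({verdict} at alpha = {alpha})")
```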

alpha level

The p-value that an experimenter is prepared to accept as being statistically significant (typically, 0.05 or 5%). This represents the acceptable chance of making a Type I error, or a false rejection of the null hypothesis.
The APA recommend that this value should be identified before data is collected, and stated at the beginning of the Results section. The exact p-values of inferential test statistics can then be explicitly compared to this alpha level.

Null Hypothesis

That there is no true difference within the population. It is abbreviated as Ho.
The Null Hypothesis is said to be 'rejected' if an inferential test statistic has a p-value that is smaller than the experimenter's alpha level, and this implies that there is a true difference to be explained. However, with any alpha level, there is a chance that a true Ho has been rejected in error; when the Ho is true, that chance is equal to alpha (not to the p-value, please note!).
If the p-value is not smaller than the alpha level, then the Ho cannot be rejected. This does not necessarily mean that there is no true difference, though, because the power of the experiment may not have been sufficient to allow one to be observed.

Type I error

A false rejection of the Null Hypothesis, claiming there to be a difference in the population due to the observation of a difference in the experimental sample that was due to chance.
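A small simulation can show this. The sketch below repeatedly samples from a population in which the null hypothesis really is true and counts how often a test rejects it anyway. It assumes a z-test with a known population standard deviation of 1, and the function name, sample size and seed are arbitrary choices for the illustration.

```python
import math
import random
from statistics import NormalDist


def type_i_error_rate(alpha=0.05, n=20, n_experiments=4000, seed=1):
    """Proportion of simulated experiments that reject a null hypothesis that is true."""
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)  # two-tailed critical value
    se = 1.0 / math.sqrt(n)                       # known population SD of 1
    false_rejections = 0
    for _ in range(n_experiments):
        sample = [rng.gauss(0.0, 1.0) for _ in range(n)]  # true mean is 0
        z = (sum(sample) / n) / se
        if abs(z) > z_crit:
            false_rejections += 1
    return false_rejections / n_experiments


print("proportion of false rejections:", type_i_error_rate())
```

The proportion should come out close to the alpha level, whatever sample size is used.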

Type II error

A false acceptance of the Null Hypothesis (more precisely, a failure to reject it when it is false), claiming there to be no difference in the population when there is one. It typically arises because the observed difference in the experimental sample is small or the variance in the sample is large.

Power

The probability of correctly rejecting a false null hypothesis.
The smaller the alpha level, the sample size, or the effect size, the lower the power. Low power studies are unlikely to find 'statistically significant' results. If they do, they are unlikely to be replicable, however small the observed p-value.
Computing power values for estimated effect sizes, population variances, and alpha levels can allow an experimenter to identify the sample size needed.
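As a sketch of such a calculation, the code below approximates the power of a two-sided, one-sample z-test and searches for the smallest sample size that reaches a target power. The normal approximation, the target of 0.8 and the function names are assumptions for the example; exact calculations for t-tests need the noncentral t distribution, which the standard library does not provide.

```python
import math
from statistics import NormalDist


def power_z_test(effect_size, n, alpha=0.05):
    """Approximate power of a two-sided one-sample z-test for a standardised effect size."""
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha / 2)
    shift = effect_size * math.sqrt(n)  # expected value of z when the effect is real
    return (1 - std_normal.cdf(z_crit - shift)) + std_normal.cdf(-z_crit - shift)


def sample_size_for_power(effect_size, target_power=0.8, alpha=0.05):
    """Smallest n whose approximate power reaches the target."""
    if effect_size == 0:
        raise ValueError("no sample size gives power above alpha for a zero effect")
    n = 2
    while power_z_test(effect_size, n, alpha) < target_power:
        n += 1
    return n


for d in (0.2, 0.5, 0.8):
    print(f"effect size {d}: n needed for 80% power = {sample_size_for_power(d)}")
```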

Effect Size

The magnitude of the observed difference expressed in units of the standard deviation of the measure within the population, which is in turn estimated by the standard deviation within the sample.
Cohen (1994) argues that effect sizes of 0.2, 0.5 and 0.8 can be regarded respectively as small, medium and large.
Usually abbreviated as d, so d = difference / standard deviation.
It can also easily be obtained from t for a one-sample or paired-samples test, since d = t/sqrt(n). For F ratios, divide the SS effect by the SS total to get Eta-squared, which corresponds to the proportion of variance in the sample 'explained' by the effect, and which is equivalent to r-squared in regression equations.
Beware of the 'partial eta-squared' reported by SPSS, whose values can sum to more than 1 over an analysis, which would seem to imply that the effects explain more than the observed variance.
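These calculations are short enough to sketch directly. The code below assumes the conventional definition of d (the difference divided by the standard deviation) and a one-sample or paired design for the conversion from t; the function names and numbers are invented for the illustration.

```python
import math


def cohens_d(difference, standard_deviation):
    """Effect size d: the difference in standard deviation units."""
    return difference / standard_deviation


def d_from_t(t, n):
    """d recovered from a one-sample or paired-samples t with n scores."""
    return t / math.sqrt(n)


def eta_squared(ss_effect, ss_total):
    """Proportion of the total Sum of Squares accounted for by the effect."""
    return ss_effect / ss_total


# Hypothetical study: difference of 3 points, SD of 6, 16 participants.
difference, sd, n = 3.0, 6.0, 16
t = difference / (sd / math.sqrt(n))  # one-sample t = difference / SE
print("d from the difference:", cohens_d(difference, sd))
print("d from t:             ", d_from_t(t, n))
print("eta-squared:          ", eta_squared(ss_effect=20.0, ss_total=80.0))
```

Both routes to d give the same value, because t is simply the difference divided by SD/sqrt(n).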

Confidence Interval

The range of values within which the true value of a measure or parameter is expected to lie, with a stated level of confidence, abbreviated as CI.
95%CI=1.23 to 1.89 means that if the study were repeated many times with the same sample sizes, and a CI computed each time, then 95 times out of 100 the interval would contain the true value of the measure. If this measure were a mean, and the value expected by the Null Hypothesis were outside this range, then this is equivalent to a p-value smaller than 0.05. If the measure is a difference, then the Null Hypothesis is that the difference is zero, so a CI that excludes zero is again equivalent to a p-value less than 0.05.
The APA recommend that CIs are stated for each mean or inferential test statistic that you report, either alongside or instead of an exact p-value. As well as showing whether the value expected by the Null Hypothesis lies within it, the CI gives useful information through its width about the precision of the estimated measure or parameter, and the likely replicability of the result.
Computing CIs for various plausible effect sizes, population variances, and sample sizes before conducting an experiment can give more information about the likely outcomes than computing a power value.
Given a mean and its standard error when n is large, the 95%CI = mean +/- 1.96 x SE, and the 99%CI = mean +/- 2.58 x SE.
To be more precise, 95%CI = mean +/- ( t x s/sqrt(n) )
where
s = the standard deviation (the square root of the variance),
n = the sample size, and
t = the critical value of t with n-1 degrees of freedom at the 0.025 level, i.e. the 5% value split into two tails.
As n increases, the t value asymptotes at 1.96 for .025 and at 2.58 for .005, hence the values above.
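A minimal sketch of the large-n version in Python follows. It uses the normal critical value in place of t, so it is only appropriate when n is large; for small samples the critical value of t with n-1 degrees of freedom would be needed, for instance from a statistics library such as SciPy. The function name and the scores are made up for the example.

```python
import math
from statistics import NormalDist, mean, stdev


def confidence_interval(scores, level=0.95):
    """Large-sample CI for the mean: mean +/- z x s/sqrt(n)."""
    n = len(scores)
    se = stdev(scores) / math.sqrt(n)              # s uses the N-1 denominator
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)  # about 1.96 for 95%
    m = mean(scores)
    return m - z * se, m + z * se


scores = list(range(1, 101))  # 100 hypothetical scores
low95, high95 = confidence_interval(scores, 0.95)
low99, high99 = confidence_interval(scores, 0.99)
print(f"95%CI = {low95:.2f} to {high95:.2f}")
print(f"99%CI = {low99:.2f} to {high99:.2f}")
```

The 99% interval is wider than the 95% interval, since greater confidence requires a larger range.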