Statistical Analysis of Data

Statistical procedures are powerful research tools for organizing and understanding data. They provide ways to represent and describe groups, summarize results, and evaluate data. There are two major types of statistical procedures: (1) descriptive statistics, which are used to simplify and organize data, and (2) inferential statistics, which are used to draw inferences from the data.

Statistical procedures depend on variability in responses among participants. All participant variables studied in psychology show individual differences. People differ from one another on virtually all measures. Therefore, when we find that groups in an experiment differ, it is important to know if the observed difference is due to the experimental procedure or whether it is due only to the natural variation among individuals.

Two important groups of descriptive procedures are frequency distributions and the graphical representation of data.

**Nominal and ordinal data**. For most kinds of nominal or ordinal data, statistical simplification involves counting the frequency of participants who fall into each category. These frequencies can then be reported in a table. It is often useful to convert the frequencies into percentages. In some research, participants are categorized on two or more variables at once, a procedure called cross-tabulation. Cross-tabulation can help us to see relationships among nominal measures.

**Score data**. The simplest way to organize a set of score data is to create a frequency distribution. A grouped frequency distribution shortens a large table to a more manageable size by grouping the scores into equal-sized intervals.
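The counting steps above can be sketched in Python; the survey categories and participant pairs here are hypothetical:

```python
from collections import Counter

# Hypothetical nominal data: each participant's preferred study method
responses = ["flashcards", "rereading", "flashcards", "practice tests",
             "flashcards", "practice tests", "rereading", "flashcards"]

# Frequency distribution: count the participants in each category
freq = Counter(responses)

# Convert frequencies to percentages of the total sample
total = len(responses)
percentages = {category: 100 * count / total for category, count in freq.items()}

# Cross-tabulation: categorize participants on two variables at once,
# so each key is a (gender, method) pair
participants = [("female", "flashcards"), ("male", "rereading"),
                ("female", "practice tests"), ("male", "flashcards")]
crosstab = Counter(participants)
```

A frequency table or cross-tabulation table is then just a matter of printing these counts in rows and columns.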

Histograms and frequency polygons are used to represent frequency
or grouped frequency distributions. Both the histogram and the
frequency polygon represent data on a two-dimensional graph, in
which the horizontal axis (*X* axis or abscissa) represents the
range of scores and the vertical axis (*Y* axis or ordinate)
represents the frequency of the scores. The histogram represents the
frequency of a given score by the height of a bar, while the
frequency polygon represents the frequency of a score by the height
of a dot. The dots for adjacent scores are then connected to form
the polygon. An advantage of frequency polygons is that two
distributions can be compared by putting them on the same graph.

Pie charts provide an easy way to see how a group breaks down into subgroups. For example, a pie chart might indicate by the size of the slice of the pie how many people who voted in an election came from each of several age groups.

With small group sizes, the frequency polygon appears jagged; as the group size increases, it tends to look more like a smooth curve. A smooth, symmetric, bell-shaped curve is a normal curve or normal distribution. In a skewed distribution, the scores tend to pile up on one end of the distribution. In a positively skewed distribution, the scores cluster at the bottom of the scale; in a negatively skewed distribution, the scores cluster at the top of the scale.

Descriptive statistics serve two purposes: (1) to use just one or two numbers in describing a data set, and (2) to provide a basis for later statistical analyses with inferential statistics.

There are three measures of central tendency: the mode, the median, and the mean, all of which describe what the typical or average score is like in the distribution. The mode is the most frequently occurring score. The median is the middle score when the scores in a distribution are arranged in order from highest to lowest. Half of the scores fall below the median and half fall above it. The mean is the most commonly used measure of central tendency; it is the sum of all the scores divided by the number of scores.
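All three measures are available in Python's standard library; the quiz scores here are made up for illustration:

```python
import statistics

scores = [3, 5, 5, 6, 7, 8, 9]  # hypothetical quiz scores

mode = statistics.mode(scores)      # most frequently occurring score
median = statistics.median(scores)  # middle score when ordered
mean = statistics.mean(scores)      # sum of scores / number of scores
```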

In addition to measures of central tendency, it is also important to determine how variable the scores are. Variability refers to the dispersion or spread of the scores. It can be measured in several ways: range, variance, and standard deviation.

The range is the difference between the lowest and the highest score. Although very simple to compute, it is unstable because it depends on only two scores: the highest and the lowest.

A better measure of variability is the variance. It is more stable than the range because it utilizes all of the scores in its computation and is useful in later statistical procedures. In computing the variance, we take the difference between each score and the mean, square each difference (to get rid of the negative signs), and sum the squared differences. We then divide the sum of the squared differences by the number of scores minus one, a quantity called the degrees of freedom.

The variance is an excellent measure of variability and is used in many inferential statistics. However, the variance is expressed in squared units. A measure called the standard deviation can be computed to transform the measure of variability into the same units as the original scores. The standard deviation is the square root of the variance.
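The range, variance, and standard deviation can be computed directly from their definitions; the scores below are hypothetical:

```python
import math

scores = [4, 6, 7, 9, 14]

# Range: difference between the highest and lowest score
score_range = max(scores) - min(scores)

mean = sum(scores) / len(scores)

# Variance: sum of squared deviations from the mean, divided by
# the degrees of freedom (number of scores minus one)
squared_devs = [(x - mean) ** 2 for x in scores]
variance = sum(squared_devs) / (len(scores) - 1)

# Standard deviation: square root of the variance, which puts the
# measure back in the same units as the original scores
std_dev = math.sqrt(variance)
```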

The relationship between two or more variables is indexed with a
correlation coefficient. There are several types of correlation
coefficients. The Pearson product-moment correlation (Pearson *r*)
is used with score data. If either or both variables are ordinal and
neither is nominal, the appropriate correlation is the Spearman
rank-order correlation (Spearman *r*). Correlations can range
from +1.00 to -1.00 (perfect positive and perfect negative
correlations, respectively). A correlation of zero means that there
is no relationship between the variables.
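For score data, the Pearson *r* can be computed from its definition, the covariance of the two variables divided by the product of their standard deviations; the paired scores here are invented:

```python
import math

x = [1, 2, 3, 4, 5]  # e.g., hours studied (hypothetical)
y = [2, 4, 5, 4, 5]  # e.g., quiz scores (hypothetical)

n = len(x)
mean_x = sum(x) / n
mean_y = sum(y) / n

# Sum of cross-products of deviations, and sums of squared deviations
cov = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
ss_x = sum((xi - mean_x) ** 2 for xi in x)
ss_y = sum((yi - mean_y) ** 2 for yi in y)

# Pearson r falls between -1.00 and +1.00
r = cov / math.sqrt(ss_x * ss_y)
```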

Phi is a way of measuring the strength of the relationship between two nominal variables.

A correlation is also useful in the prediction of the value of one variable from the value of another variable. This prediction is called regression. The stronger the correlation, the better the prediction.
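A minimal sketch of simple linear regression, using the least-squares slope and intercept to predict one variable from another (same hypothetical data as above):

```python
x = [1, 2, 3, 4, 5]  # predictor variable (hypothetical)
y = [2, 4, 5, 4, 5]  # criterion variable (hypothetical)

n = len(x)
mean_x = sum(x) / n
mean_y = sum(y) / n

# Least-squares regression line: y_hat = intercept + slope * x
slope = (sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
         / sum((xi - mean_x) ** 2 for xi in x))
intercept = mean_y - slope * mean_x

# Predict the criterion for a new predictor value
predicted = intercept + slope * 6
```

The stronger the correlation between the two variables, the closer these predictions will be to the observed values.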

Correlation coefficients are used to quantify many types of reliability. These reliability indices can range from -1.00 to +1.00, but negative reliability indices are unlikely. Coefficient alpha is a measure of internal consistency, that is, how intercorrelated the items of the measure are.

The standard score (also called the *Z*-score) is calculated
by subtracting the mean from the score and dividing the difference
by the standard deviation of that distribution. Standard scores are
useful when comparing scores from different distributions. A
standard score can be converted into a percentile rank, which tells
us what percentage of the sample scored below the score.
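The *Z*-score formula and a simple percentile rank can be sketched as follows, on an invented distribution of test scores:

```python
import statistics

scores = [55, 60, 65, 70, 75, 80, 85]  # hypothetical distribution
score = 80                             # the score to be standardized

mean = statistics.mean(scores)
std_dev = statistics.stdev(scores)  # sample standard deviation

# Standard (Z) score: distance from the mean in standard-deviation units
z = (score - mean) / std_dev

# Percentile rank: percentage of the sample scoring below this score
percentile = 100 * sum(s < score for s in scores) / len(scores)
```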

Inferential statistics are used to draw inferences about populations on the basis of information from samples that are drawn from the populations.

Although we are interested in populations of participants, it is seldom possible to test whole populations. Therefore, we obtain data on samples from populations. A population is the group of all the organisms of interest from which the sample is selected. The sample is a subset of organisms drawn from the population.

The null hypothesis is the hypothesis that there are no differences between two populations; it can be applied to many situations.

Inferential statistics are used to compute the probability of obtaining the observed data if the null hypothesis is true. If that probability is very small, then it is unlikely that the null hypothesis is true. We would therefore conclude that the null hypothesis is false. An arbitrary cutoff point, called the alpha level, is used for making this decision. Traditionally, we set alpha to a small value, such as .05 or .01.

A Type I error occurs when the researcher rejects the null hypothesis when it is actually true. A Type II error occurs when the researcher fails to reject the null hypothesis when the null hypothesis is actually false.

Inferential statistics are used on data drawn from a particular group of participants (the sample) to draw conclusions about a larger group of people (the population).

Inferential statistics are used most frequently to evaluate mean
differences between groups. Two of the most frequently used
inferential statistics are the *t*-test and analysis of
variance. There are several variations of each of these tests.
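One common variation, the independent-samples *t*-test with pooled variance, can be sketched by hand; the two groups of scores below are made up:

```python
import math
import statistics

# Hypothetical scores for an experimental group and a control group
group1 = [7, 8, 9, 10, 11]
group2 = [5, 6, 7, 8, 9]

n1, n2 = len(group1), len(group2)
m1, m2 = statistics.mean(group1), statistics.mean(group2)
v1, v2 = statistics.variance(group1), statistics.variance(group2)

# Pooled variance: weighted average of the two sample variances
pooled = ((n1 - 1) * v1 + (n2 - 1) * v2) / (n1 + n2 - 2)

# t statistic for the difference between the two group means
t = (m1 - m2) / math.sqrt(pooled * (1 / n1 + 1 / n2))
```

The resulting *t* value is then compared against a critical value (from a *t* table or statistical software) at the chosen alpha level, with n1 + n2 - 2 degrees of freedom, to decide whether to reject the null hypothesis.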

The terms power and statistical power refer to the sensitivity of a statistical procedure to detect the differences being sought. Power depends not only on the sensitivity of the statistical procedures but also on the precision of the research design, the accuracy of the measurements, and the size of the sample. Any improvement in research design that increases its sensitivity will enhance its power. Through power analysis we can compute the size of the sample needed to achieve a specific level of power.

The effect size is the difference between the groups expressed in standard deviation units.
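One common way to express this is Cohen's *d*, the mean difference divided by the pooled standard deviation; the groups here are the same hypothetical data as in the *t*-test sketch:

```python
import math
import statistics

group1 = [7, 8, 9, 10, 11]  # hypothetical experimental group
group2 = [5, 6, 7, 8, 9]    # hypothetical control group

n1, n2 = len(group1), len(group2)

# Pooled standard deviation of the two groups
pooled_var = ((n1 - 1) * statistics.variance(group1) +
              (n2 - 1) * statistics.variance(group2)) / (n1 + n2 - 2)
pooled_sd = math.sqrt(pooled_var)

# Effect size: difference between the group means in
# standard-deviation units
d = (statistics.mean(group1) - statistics.mean(group2)) / pooled_sd
```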

Scientists have an ethical obligation to use statistical procedures to provide an even-handed review of the data instead of selecting data that might support a pet theory.