Reliability and Validity
Both of these
have been mentioned during the year, particularly ‘validity’ as
in ‘ecological validity’ or ‘experimental validity.’ However,
you now need to fully understand what both of them mean, how
they can be increased and most importantly how to remember
which is which!
Reliability
Reliability is akin to consistency.
If you use a metre rule to measure the length of your classroom today, and
you repeat the procedure next week, you will expect to get the
same result. The metre rule is consistent in its measurement or,
as we say, it is reliable!
Reliability in Psychology
This can be
measured in a number of ways depending upon the circumstances.
However, each time we are looking for consistency of
measurement:
Reliability of observations
This year some
of the students have observed aggressive acts in men’s and
women’s football to see if the men’s game is really more
aggressive. (Personally I never realised that real men played
football but that’s a different issue).
Inter-rater reliability
One way of tackling this problem would be for one person to
watch a game played by each gender, look for various aggressive
acts and score them accordingly. However, you would then only have one
person’s opinion. Better would be to get two or three people to
do it independently and compare scores afterwards. To ensure
that results were reliable, the raters would sit down beforehand
and decide on the criteria to use and how to apply them. For
example, they would decide exactly what was meant by a ‘dirty tackle’ (no
jokes please) or an ‘aggressive act.’ This would ensure inter-rater
reliability. Or, in plain English, it would ensure consistency of
measurement between the observers. All singing from the same
hymn sheet, in politico-speak.
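To put a number on how well two observers agree, one common option (not mentioned in these notes, so treat it as an assumption) is Cohen's kappa, which corrects the raw agreement for the agreement you would expect by chance. The sketch below uses made-up codes for ten incidents; the function name `cohens_kappa` and the categories ‘foul’ and ‘clean’ are purely illustrative.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Agreement between two raters, corrected for chance agreement."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must code the same, non-empty set of incidents")
    n = len(rater_a)
    # Proportion of incidents on which the two raters gave the same code.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Agreement expected by chance, from how often each rater used each code.
    counts_a = Counter(rater_a)
    counts_b = Counter(rater_b)
    expected = sum(counts_a[code] * counts_b[code] for code in counts_a) / (n * n)
    if expected == 1:
        return 1.0  # both raters used one single code throughout
    return (observed - expected) / (1 - expected)


# Hypothetical data: how each observer coded the same ten incidents.
rater_a = ["foul", "foul", "clean", "clean", "foul",
           "clean", "clean", "foul", "clean", "clean"]
rater_b = ["foul", "clean", "clean", "clean", "foul",
           "clean", "clean", "foul", "clean", "foul"]

kappa = cohens_kappa(rater_a, rater_b)
print(f"kappa = {kappa:.2f}")
```

A kappa of 1 means perfect agreement and 0 means no better than chance. Agreeing the criteria beforehand, as described above, is what should push the value up.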
Reliability of tests
If you measure
someone’s IQ today you would expect to get a similar result if
you used the same test to assess the same person in a few weeks’
time. If the results were the same each time (i.e. if the results
were consistent, that word again), you could assume the test
was reliable!
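In practice ‘similar results’ is usually checked with a correlation between the two sets of scores. The sketch below assumes Pearson's correlation coefficient is the statistic used (the notes do not name one) and uses invented IQ scores for five people; `pearson_r`, `first_sitting` and `second_sitting` are illustrative names only.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length score lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length lists with at least two scores")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    dev_x = [x - mean_x for x in xs]
    dev_y = [y - mean_y for y in ys]
    covariation = sum(dx * dy for dx, dy in zip(dev_x, dev_y))
    # Note: this divides by zero if every score in one list is identical.
    spread = sqrt(sum(dx * dx for dx in dev_x) * sum(dy * dy for dy in dev_y))
    return covariation / spread


# Hypothetical IQ scores for the same five people, a few weeks apart.
first_sitting = [100, 110, 95, 120, 105]
second_sitting = [102, 108, 97, 118, 106]

retest_r = pearson_r(first_sitting, second_sitting)
print(f"test-retest correlation: {retest_r:.2f}")
```

With these made-up scores the correlation comes out very close to +1, which is what a consistent (reliable) test should give.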
Split-half reliability
Rather than
waiting a few weeks to try the test again, it is possible to use
split-half reliability. For example, with an IQ test, split it
in half, give both halves to the participant and compare their
score on each separate half. If scores on each half are similar,
psychologists assume the test to be reliable.
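As a rough sketch of how the two halves might be compared: total each person's score on the odd-numbered and even-numbered items, then correlate the two sets of totals. The odd/even split, the Spearman-Brown step-up (a standard adjustment for the fact that each half is only half as long as the full test) and all the data below are assumptions added for illustration, not something taken from these notes.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length score lists."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    dev_x = [x - mean_x for x in xs]
    dev_y = [y - mean_y for y in ys]
    covariation = sum(dx * dy for dx, dy in zip(dev_x, dev_y))
    return covariation / sqrt(sum(d * d for d in dev_x) * sum(d * d for d in dev_y))


def split_half_reliability(responses):
    """Return (correlation between halves, Spearman-Brown estimate for full test)."""
    # Odd-numbered items are at positions 0, 2, 4, ...; even-numbered at 1, 3, 5, ...
    odd_totals = [sum(person[0::2]) for person in responses]
    even_totals = [sum(person[1::2]) for person in responses]
    half_r = pearson_r(odd_totals, even_totals)
    # Spearman-Brown: estimated reliability of the full-length test.
    full_r = 2 * half_r / (1 + half_r)
    return half_r, full_r


# Hypothetical answers: one row per participant, 1 = correct, 0 = wrong.
responses = [
    [1, 1, 1, 1, 1, 1, 1, 0],
    [1, 1, 1, 0, 1, 1, 0, 0],
    [1, 0, 1, 1, 0, 0, 0, 0],
    [1, 1, 0, 0, 0, 1, 0, 0],
    [0, 0, 0, 0, 1, 0, 0, 0],
]

half_r, full_r = split_half_reliability(responses)
print(f"half-test correlation: {half_r:.2f}, stepped-up estimate: {full_r:.2f}")
```

Splitting by odd and even items rather than first half versus second half is a design choice: it stops tiredness or harder questions towards the end of the test from making the two halves look different.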
Validity
Does the test
or the experiment measure what it’s supposed to be measuring?
We have
mentioned this word ‘validity’ on a number of occasions, usually
in relation to ‘ecological validity.’ However, there are a
number of different types of validity; here we’ll concentrate on
‘internal’ and ‘external’ validity, the latter sometimes referred to as
‘ecological.’
Internal (or experimental) Validity
Are the
effects that have been observed actually due to the independent
variable? For example, if we’ve found that coffee (the I.V.)
does increase speed of reaction (the D.V.), can we be certain
that this increase is really due to the coffee, or could it be
due to a confounding variable such as the time of day, or just
faster reactions in the second group?
Basically any
of the confounding variables mentioned so far such as IQ,
gender, age, time of day, lighting conditions etc., can cause
false conclusions to be drawn.
External validity (like ecological validity)
How much can
the results obtained tell us about real life or, put another
way, can we generalise our findings to the real world?
Coolican
(1994) points out 4 major issues:
- Population: Can we generalise from our small sample, probably all
students, to the population as a whole?
- Location, location, location (Coolican only said it once): Can the
results that we’ve obtained in a laboratory setting really tell us
how people will behave in real life? Think back to memory experiments,
most of which were carried out in laboratories, or to ‘Stan
the Man’ Milgram’s experiment in the labs of Yale
University. Would people really behave this way in real
life?
- Measures: If we use the Eysenck Personality Questionnaire (EPQ) and
measure a person as very extrovert and slightly neurotic,
can we be sure that they are really like this in real life
or in social situations? Similarly, when we measure IQ, is
the test we are using telling us anything real about the
person?
- Times: Can experiments carried out 40 or 50 years ago, such as
those of Asch and Milgram, still tell us anything about people today? I
have mentioned how, for example, conformity changes over
time. Wars tend to bring populations together
and make us more conformist, as was measured following the
Falklands Conflict of 1982.
Methods of checking validity
Clearly it is
useful for a psychologist to have some idea of whether or not
tests are valid. There are a number of ways this can be done:
Meta-analyses:
data can be collected from lots of different studies in
different parts of the world to see if the results are similar.
For example, Bouchard & McGue compared findings for IQ tests
between MZ (identical) twins and found similar levels of correlation between
them all.
Concurrent validity:
if we are measuring IQ we could compare the scores obtained to
school tests in maths and English, or we could compare the
results of personality tests with assessments by a person’s
friends and family.
Predictive validity:
a test should be able to predict later performance, behaviour or
personality. So again, a high score on an IQ test should be
able to predict later success at school etc. In school you sit
YELLIS and ALIS tests which are used by teachers as predictors
of your future performance.