Introduction
What follows is meant only as a brief overview of this topic area. It is essential that a combination of class exercises and/or texts is used with the notes to provide a fuller understanding of the issues covered. Easily the best way of learning research methods is a combination of reading followed by practice. Read a section, e.g. on levels of data, and then practise what you’ve just learned by answering questions on the topic.
Questions on the paper will require short response answers.
Apologies in
advance if this document lacks the usual ‘humour’ but time is pressing and
it is very late in the term so my sense of humour appears to have
undergone a temporary by-pass!
An overview
Basically the
topic follows the following format:
1. Research Methods
- Experiments (Laboratory, Field, Natural, Quasi)
- Correlations
- Observations
- Interviews
2. Research Design and implementation
- Hypotheses
- Research design
- Variables
- Sampling
- Demand characteristics, bias etc.
3. Data analysis
- Analysis of qualitative data
- Analysis of quantitative data
- Measures of central tendency
- Measures of dispersion
- Correlations
- Illustrating and summarising
Government
Health warning
The following
information does contain sums and other material likely to cause offence
to the squeamish. However, I’ll endeavour to keep the aforementioned to
an absolute minimum and will, wherever possible avoid the gratuitous use
of things like numbers!
This topic is
brief, but is nevertheless useful, particularly when next term you will be
called upon to complete a piece of coursework! Good luck and ENJOY!
Research
Methods
The
Experiment
- Definition of an experiment
- Advantages and disadvantages of the experimental method
- Types of experiment
Definition
In an
experiment a variable is manipulated to see what effect it will
have on another.
For example if
we wanted to know whether caffeine affected reaction times:
We could take
two groups, give one group coffee (experimental group) and compare them to
another group without coffee (control group). We would then set them a
task designed to measure their reaction times.
In experiments
there are 2 variables:
- Independent variable (the one we alter or manipulate), in this case whether or not the person has had coffee.
- Dependent variable (the one that alters as a result of what we do), in this case reaction time. The dependent variable is usually the one we measure or record.
Not rocket
science, BUT the problem is always remembering which is which. The way I
do it is simply to think of the dependent variable as the one that
depends on what we do!
In this case
reaction time (dependent variable) depends on whether or not the
participant has had a cup of coffee.
In your text
read the blue box on p. 241 for a succinct summary.
Other
variables
The I.V. and
D.V. as we psychologists refer to them, are not the only variables to
worry about.
In the coffee experiment suppose we find that the coffee group have faster reaction times. Can we be certain the coffee has caused this? Other possible reasons:
- The experimental group (on coffee) might just by chance contain people with faster reactions.
- Perhaps we measured one group in the morning, the other in the afternoon.
- Perhaps those in the control group had a hangover etc.
Confounding
variables
These are
variables that get in the way of our results or make our results difficult
to interpret.
Think of Brady’s executive monkeys. Brady assumed that being in control had caused the stress that led to the ulcers, control being the IV and ulcers being the DV. In fact it was more likely to be the activity levels of the monkeys that caused the results. This is an example of a confounding variable.
Common confounding variables include:
- Intelligence of participants
- Personality of participants
- Gender of participants
- Time of day
- Weather
- Noise levels
- Temperature…
Obviously in
an experiment we take steps to minimise these, for example we could ensure
that the procedure is carried out at the same time of day, in the same
room, with similar temperature settings etc… More on this later.
| Advantages of experiments | Disadvantages of experiments |
| Cause and effect: We can usually see that the IV has caused the alteration in the DV. Provided we have controlled our experiment we should be able to show that it was the coffee that was responsible for the faster reaction times. | Lacks ecological validity: As we’ve seen so many times (e.g. in memory and in Milgram), experiments, especially those in laboratories, are very artificial. Can they really tell us how people will behave in real life situations? |
| Replication: Provided care has been taken in conducting and reporting the procedure, another person should be able to repeat your procedure to see if they get the same results. | Demand characteristics: New one for you; this refers to participants behaving differently because they know they’re being watched. We saw this in Milgram. It could be that they guess what you want and try to please the experimenter, e.g. by obeying! |
At least two
of these terms should be nauseatingly familiar to you. I could only guess
at the number of times cause and effect and ecological validity have been
mentioned in the past year. Note however, that in this case experiments
can show cause and effect. Correlations are the ones we’ve
criticised for not being able to show cause and effect. More on these
later.
Other types of
experiment
Not all
experiments are carried out in laboratories
Field
experiments
Not, as the
name implies, experiments necessarily conducted in fields, although they
could be!
More likely
settings would include the work place, school, the street etc. Basically
the same rules apply: an independent variable is manipulated to see how it
affects a dependent variable. Confounding variables can still get in the
way, and cause and effect can still be determined. However, the setting
is more natural. Bickman’s litter experiment would be an example. The IV
(the way the person was dressed), the DV (whether or not participants
obeyed).
| Advantages of field experiments | Disadvantages of field experiments |
| Realism: because the settings are more natural it is assumed that people will behave more naturally, so field experiments should have greater ecological validity. | Less control of variables: the experimenter has less control over the environment so more variables may affect the outcome, e.g. a patient may have stopped one of the nurses. |
| Demand characteristics: these can be less since participants may not be aware that they are in an experiment, as was the case with Hofling! | Ethics: If participants are unaware of the study how can they consent to take part or withdraw from the experiment? |
| | Replication: It is difficult to repeat the procedure exactly as it was the first time. |
Quasi-experiments
Not, as the name implies, experiments where you run around in the dark pretending to shoot each other with lasers! But they could be if you were looking for age or sex differences.
In a real
experiment you can manipulate the IV and you can decide who goes in
which group. In your study on coffee you decide which participants go in
which group. However supposing you wanted to see if 40-somethings had
faster reactions than teenagers: you never know it could happen!
In one group
your participants will have to be teenagers and the other group
will have to comprise 40-somethings. You are unable to randomly
allocate your participants to the different groups. Similarly with sex
differences; by definition the boys are going to be in one group and the
girls in the other.

Natural
experiments
Not, as the
name implies, experiments carried out in the buff! These are similar to
and often confused with quasi-experiments, but there is one crucial
difference. Natural experiments take advantage of a naturally
occurring event. The effect of the eruption of Mount St Helens on stress
related illnesses is the one all the texts prefer to mention. In this
case the IV was the eruption, a naturally occurring event. A better example, and one that we’ve studied, is Hodges & Tizard’s study of institutional care, which examined the effect of different types and duration of care on the children’s subsequent behaviour and development.
IV is the type
and duration of care (in this case not controlled by the researchers, it
happened anyway).
DV is the
effect this has on subsequent development (and which can be measured using
various tests).
| Advantages of natural experiments | Disadvantages of natural experiments |
| Demand characteristics: it is often the case that the experimenter isn’t even present when the event occurs, thankfully in the case of Mt St Helens! As a result participants are not trying to please the researchers. | Lack of control: the researchers have no control at all over the variables and there may be lots of confounding variables. In the Mt St Helens case, ill health caused by smoke, or stress due to loss of house etc. |
| Research opportunities: it is possible to research events that it would be unethical to study any other way or that may be impossible to set up. | Replication: in some cases clearly impossible, in others very difficult. As a result it may be impossible to check the validity of research. |
| | Cause and effect: following on from lack of control, it may be impossible to decide if the IV is causing the change in the DV. |
Types of
experiment
Non
experimental methods
These include:
- Correlational analysis
- Naturalistic observation
- Interviews
- Questionnaires
- Case studies
Correlational
analysis
In the past
year we’ve seen lots of examples of this. For example whenever I’ve
criticised a study because it doesn’t show cause and effect it’s probably
been a correlational study.
An example:
For example we
could look for a correlation between IQ and performance at GCSE or
A-level. Common sense would perhaps tell us that students that have
higher IQs are more likely to perform well at GCSE.
In year 13 we
look at the controversial area of IQ and find that there is a high
correlation between the IQs of MZ twins. If one twin has a high IQ it is
likely the other does too. This is taken as evidence for the nature or
genetic determination of IQ. However, as we will see there are a host of
other reasons why this might be the case.
Types of
correlation
Positive:
the most common; as one variable increases so does the other, e.g. IQ and
GCSE score in the example above.
[Figure: two graphs, labelled ‘Positive’ and ‘Negative’]

Negative:
as one variable increases the other decreases, e.g. it might be fair to
assume that the higher your stress levels the lower your life expectancy.
Again we are unable to show cause and effect. As mentioned frequently in
‘Stress,’ illnesses could be due to secondary habits such as smoking, poor
diet etc.
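The direction of a correlation can be checked by working out a correlation coefficient, which runs from +1 (perfect positive) through 0 (no link) to -1 (perfect negative). What follows is only a rough sketch: the function name `pearson_r` and every score in it are invented for illustration and do not come from any real study.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient for two equal-length lists of scores."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of the products of each score's deviation from its own mean
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# Invented scores for six imaginary people (assumptions, not real data)
iq = [95, 100, 105, 110, 120, 130]
gcse_points = [38, 42, 41, 50, 55, 60]       # tends to rise with IQ
stress = [2, 3, 5, 6, 8, 9]
life_expectancy = [84, 82, 80, 77, 74, 70]   # tends to fall as stress rises

r_positive = pearson_r(iq, gcse_points)
r_negative = pearson_r(stress, life_expectancy)
print(round(r_positive, 2), round(r_negative, 2))
```

A coefficient above zero matches the positive case and one below zero matches the negative case. Neither figure says anything about cause and effect.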
| Advantages of correlations | Disadvantages of correlations |
| Correlations allow us to study links between variables that could not be studied in any other way. We could not inflict so much stress on a person that we endanger their life. However, we can use a correlational analysis to show a possible link between the two occurring naturally. | Cause and effect: do I really need to explain this one? A correlation shows a possible link between 2 variables; it does not prove that one causes the other, e.g. smoking and heart disease. |
| Economical and fast: large amounts of data can be compared quickly and cheaply, e.g. by using a questionnaire to collect data. | Correlations can disadvantage certain people in society if misused. For example, it was reported long ago that black people scored lower on average on IQ tests than white people. This finding was misinterpreted as evidence of white superiority. |
Naturalistic
Observation
This is an
easy one to explain. People or animals are observed in their natural
environment, without any sort of intervention or manipulation of variables
and without their knowledge.
Examples
include:
- Seyfarth & Cheney’s research on the calls of the vervet monkey (more on this fascinating subject in year 13)
- Sylva’s study of play in young children.
- Much of the work carried out by Konrad Lorenz
Ethologists
specialise in studying animals in their natural environment.
Konrad Lorenz: Was it the tickly beard or the alluring aroma of his rough shag? Either way Lorenz was still attracting the birds into his 70s.
| Advantages of naturalistic observation | Disadvantages of naturalistic observation |
| Ecological validity: since the setting is natural and there are no demand characteristics it is safe to assume that this is how people really behave! | Reliability: there is the issue of bias. For example if a researcher is looking at aggressive acts in a football game and assumes that boys are going to be more aggressive, the results may inadvertently be interpreted in this way. |
| Sometimes this is the only possible way of doing research, especially if people are unwilling or unable to complete questionnaires or interviews. | Replication: in many cases it would be impossible to recreate exactly the same situation so that someone else could verify your findings. |
Ethical issues
raised by observations
These were
covered in more detail in the social topic (conformity and obedience). If
participants are unaware of being watched then clearly they are unable to
give their consent and unable to withdraw from the research.
Interviews
There are a
number of species of interview each with their own advantages and
disadvantages. I’ll consider the main ones only:
Informal
interviews
The
interviewer has an aim in mind at the outset but is willing to be flexible
about getting answers. The interviewer tries not to direct the
interviewee but instead listens and lets the interview take its natural
course.
| Advantages | Disadvantages |
| Lots of information can be gathered | Difficult to analyse, especially if different participants discuss different issues |
| Interviewee made to feel relaxed | Low reliability |
Clinical
interview
These were
made popular by Freud and in particular Piaget and are a type of informal
interview. Piaget for example would read ‘moral stories’ to a child and
start off by asking the same questions to all the children, for example
‘who is the naughtier boy in the stories?’ However, follow up questions
would be informal and vary from child to child.
Structured or
formal interviews
These follow a
set pattern with the interviewer having prepared a set of questions in
advance that are asked in a particular order.
Note:
sometimes the questions may be open and allow the interviewee to respond
how they like, for example ‘how did you feel when Freddie ate your pet
hamster?’ Or they can be closed and allow only a ‘yes’ or ‘no’ response.
For example ‘were you upset when Freddie ate your pet hamster?’
| Advantages | Disadvantages |
| Easily replicated | Little flexibility so important points may be missed |
| Data is easier to analyse | Questions may be ambiguous (think of the SRRS for determining stress levels). |
| Data is less likely to be influenced by the interviewer | This format may encourage brief answers |
Limitation of
interviews in general:
Social
desirability bias
We all like to
create a favourable impression. When faced with an interviewer we are
less likely to be honest than when filling out an anonymous
questionnaire. For example people being questioned about their love life
are likely to exaggerate in face to face interviews.
Lie scales can
be introduced to assess how honest answers may be. For example if people
were being questioned about their childhood a ‘lie question’ might be; ‘As
a child did you always do as you were told first time and without
moaning?’ A response of ‘yes’ would be assumed to be a fib and indicate
that perhaps the interviewee’s answers may not be reliable.
Questionnaires
We all know
what they are and have all filled lots of them in. Basically a
questionnaire is a list of written questions that is able to gather lots
of relevant information relatively quickly and cheaply.
The biggest
problem is wording of the questions. Again there is the issue of ‘open’
or ‘closed’, but more importantly, as we saw in EWT, the issue of leading
questions. These are a favourite of politicians or of newspapers that
want to find support or criticism of a particular issue. For example
imagine you wanted to find out if people wanted more money spent on the
NHS, a relatively neutral question might be
‘Should more
money be spent on the NHS?’
The Mirror (presumably wanting a ‘yes’ response) might get their pollsters to ask:
‘Should
extra money be provided to the NHS to take care of Britain’s sick and
elderly?’
Whereas the
Telegraph (being very stereotypical here) may get their pollsters to ask:
‘Would you
be happy to pay more taxes to fund bureaucracy in the NHS?’
Rather extreme examples admittedly; real surveys produced by Alastair Campbell (or Tony’s new head of spin) would be far more subtle, but you get the idea!
It is always a
good idea to test your questionnaire in a pilot study first to make sure
it doesn’t take hours to complete and that participants understand the
questions. Feedback like this may provide ideas for follow up questions
to be asked in the real study.
| Advantages | Disadvantages |
| Lots of people can be tested quickly | Lots of questionnaires will not be returned! |
| This allows more reliable generalisation to the overall population | People may tell fibs. Even in anonymous questionnaires this may be an issue. Again lie questions may be included, e.g. in Eysenck’s Personality Questionnaire (EPQ). |
| Data can often be analysed easily | |
Typical questions on Research Methods
Describe two disadvantages of investigations using correlational analysis
(2 + 2 marks)
Identify the research method used in this study and explain one advantage
and one disadvantage of this method.
(2 + 2 marks)
Give one advantage of using a questionnaire in this study. (2 marks)
Following the survey it was decided to carry out an observational study
into under-age drinking. Outline procedures for carrying out such an
observation. (6 marks)
Research
Design and Implementation
Aims and
Hypotheses
Aims
When carrying
out a piece of research it is essential that you have an aim in mind.
This needs to be reasonably precise, for example ‘I’m gonna study memory’
would not be sufficiently precise. However the aims are broader, or less
precise than the hypotheses. A suitable aim for memory might be ‘to see
if age affects the duration of STM.’
Miller’s aim
was to discover the capacity of STM.
Milgram’s aim
was to see if normal people would obey when told to kill someone!

Hypotheses
These are more
precise and should be operationalised, i.e. give some clue as to how the
research will be carried out. You must remember that two hypotheses are
included:
a.
Experimental or alternative hypothesis
This makes
your prediction, for example:
‘As age
increases the duration of STM decreases.’
b. Null
hypothesis
This might at
first glance seem redundant, since what you are saying is that you will
not find what you’re expecting. A suitable null hypothesis for the
experiment above could be:
‘Age will have no effect on the duration of STM.’
The two
examples above are simplified to give you the overall idea. When deciding
on an experimental hypothesis you need to give some indication of the
method to be used. For the above experiment it might be:
‘Duration of
STM, as measured by the Brown-Peterson technique, will decrease with
age.’
The null
hypothesis would normally read:
‘Age will have no effect on duration of STM. Any effect found could be due to chance.’
Why do we have
a null hypothesis?
It is easier to disprove a hypothesis than to prove one. For example, suppose we were trying to show that all MZ twins had the same voting intentions. Our hypothesis might be:
‘Pairs of MZ twins will always vote in the same direction in the coming General election.’
The null hypothesis might be:
‘Twin status will have no effect on direction of voting. Any similarity found may be due to chance.’
Suppose we test 50 pairs of twins and both members of each pair are intending to vote in the same direction. Have we proved that all twins will vote in the same direction? No: it could be that the next pair we test won’t. On the other hand, all we need to do is find one pair that have different intentions to disprove our experimental hypothesis. This is why we set up a null hypothesis and see whether the evidence lets us reject it.
When we
finally get round to testing the results with a stats test it will be the
null hypothesis that we’re testing.
One
tailed or two tailed?
Having decided
on your hypothesis and aims you need to decide on the direction. In the
examples above I’ve already done this.
When we say
that we expect ‘duration of STM to decrease as age increases’ we
are making a definite prediction. That prediction has direction. Compare
this to the statement that ‘duration of STM will be affected by an
increase in age.’ Will duration increase or decrease? The hypothesis
doesn’t say. It could go either way.
One tailed
If the
hypothesis has a direction we say it is ‘directional’ or one-tailed. In
the first example we are saying that duration of STM will decrease.
Two tailed
If we are not prepared to commit ourselves and simply say there will be an effect then this is non directional or two tailed.
Try the
exercises for practice and further illumination.
In year 13
coursework you will almost certainly be copying (sorry replicating)
someone else’s work, for example Peterson and Peterson’s. In this case
your hypothesis will be based on their research. If you were to replicate
Milgram’s (don’t even think about it), you would choose a one tailed
hypothesis such as:
‘Participants
will follow instructions and fry an innocent person with 450V of
electricity when told to do so by a man in a white coat.’
You could
predict this with some confidence since all past research suggests that
this is the case.
Note: this
business of one or two tailed does not apply to the null hypothesis. This
will always read, “not be affected” or “no correlation” etc.
Research
Design
Here we decide
how we are going to sort or group our participants. Do we use the same
people in all conditions or groups, or do we choose different people for
different conditions or groups? In some cases, as we’ll see the decision
is made for us. In others the solution isn’t so obvious and there may be
pros and cons for each.
Repeated
Measures Design
Here we use
the same participants in each group or condition.
For example,
returning to the earlier experiment on coffee and reaction times.
In a repeated
measures design we could give our group of participants the test on day
one with no coffee and record their reaction times.
The next day
we could repeat the procedure, with the same group of people, but
this time give them coffee before the experiment began.
Advantages
The two groups
have the same age, sex, personality, ideas, past experiences, IQ, reaction
times (crucially for this one) etc. They are the same people.
Disadvantage
Order effects: Assuming, as we expect, that the group do better on the second day, can we be sure that this increase in performance is due to the coffee? It could be that they’ve had the chance to practise the task the day before! It’s not surprising they’re better the second time around. This is called an order or practice effect.
Boredom:
Of
course, on some tasks it could work the other way, and a task done the
second time shows a deterioration because they’re fed up with doing it.
Extra
materials:
For example if you use the same participants for two memory experiments
you will need two lists of words etc. for them to recall

Independent Measures Design
You guessed
it. If we used the same people in each group last time, this time we use
different people in each group. Clearly this overcomes practice and
boredom effects ‘cos they only do it the once
Each
participant is randomly allocated to one group or the other, so in our
coffee experiment
One group,
comprising one set of participants do the test with coffee
The other
group, comprising a different set of participants do the test without
coffee.
Sorted, no
problems with practice or repeat effects or with boredom or tiredness
effects.
BUT
Can we be
certain that the likely faster reactions of the first group are down to
the coffee?
It could be
that the participants that we’ve randomly assigned to that condition have
naturally faster reactions. They may be younger, or some of them may
engage in activities that require fast reactions.
In other
experiments, sex, personality, age, IQ etc. could all be an issue because
the participants are going to differ on all of these.
There are some
occasions when independent measures design has to be used:
Sex
differences
Age
differences
By definition
the two conditions are different. You couldn’t have someone in the male
condition and the female condition, or in the under 30 condition
and the over 30 condition!
Advantages
-
No order or
practice effects
-
Can use the
same stimulus material (such as word lists in memory) for each group
Disadvantages
-
Participants
are not matched in terms of IQ, personality, age etc.
-
You will
need twice as many participants.
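Random allocation to the two conditions of an independent measures design can be sketched in a few lines. This is only an illustration: the function name `allocate_randomly`, the participant labels and the group size of 20 are all made up for the example.

```python
import random

def allocate_randomly(participants, seed=None):
    """Shuffle the participants and split them into two equal-sized groups."""
    rng = random.Random(seed)      # a fixed seed makes the allocation repeatable
    shuffled = list(participants)  # copy so the original list is left alone
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]

# Hypothetical participant labels P1 to P20
participants = [f"P{i}" for i in range(1, 21)]
coffee_group, control_group = allocate_randomly(participants, seed=1)
print(len(coffee_group), len(control_group))   # prints 10 10
```

Chance alone decides who ends up in which group, which is exactly why one group could still turn out to have naturally faster reactions.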
Matched
Pairs Design
This is the
ideal compromise. In the reactions experiment you would have different
people in each condition, i.e. some would have the coffee and others not.
However, the two sets would be matched in terms of IQ or whatever
characteristics are relevant, in this case reaction times, age etc.
Advantages
-
No order
effects since each participant only does the task once
-
You can use
the same material twice
-
Groups are
similar in terms of individual characteristics
Disadvantages
-
Very time
consuming and difficult to match all of your participants in this way
-
It is
impossible to match people for all characteristics even if you were to
use MZ twins between the two groups!
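One possible way of forming matched pairs, assuming each participant has first been given a baseline reaction-time test, is sketched here. The function name `matched_pairs`, the names and the baseline times are invented for the example, and matching on a single characteristic is a simplification.

```python
import random

def matched_pairs(baseline_times, seed=None):
    """Pair participants with similar baseline scores, then split each pair at random.

    baseline_times maps a participant label to a pre-test reaction time in ms.
    With an odd number of participants the slowest one is left unpaired.
    """
    rng = random.Random(seed)
    # Rank by baseline so that neighbours in the list are the closest matches
    ranked = sorted(baseline_times, key=baseline_times.get)
    coffee, control = [], []
    for i in range(0, len(ranked) - 1, 2):
        pair = [ranked[i], ranked[i + 1]]
        rng.shuffle(pair)          # chance decides which member gets the coffee
        coffee.append(pair[0])
        control.append(pair[1])
    return coffee, control

# Invented baseline reaction times in milliseconds
baseline = {"Ann": 250, "Bob": 310, "Cat": 255, "Dan": 305, "Eve": 280, "Fay": 285}
coffee_group, control_group = matched_pairs(baseline, seed=3)
print(coffee_group, control_group)
```

Each person in the coffee group has a partner in the control group with a very similar baseline, so any difference found is less likely to be down to naturally faster reactions.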
Selecting your
victims (sorry participants)
Having decided on your method (experiment, correlation etc.) and your design (repeated measures, independent measures or matched pairs) you now need to decide how you will choose the people who will be assigned to your conditions or groups.
When asked, everyone replies in unison: ‘RANDOMLY.’ WRONG!!!
Random
Sample
It is
practically impossible to get a truly random sample. In a random sample
every member of your target population would have an equal chance of being
selected. So for example if you wanted a random sample of primary school
children in the Market Harborough area you would need to obtain all
of their names, put them in a hat and draw your sample out. In actual
fact that would be the easy bit. The difficult task would be finding them
and persuading their parents to let your chosen ones take part!
The main
disadvantages of this method are:
-
Time
consuming
-
Inevitably
some of those selected will not take part
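Drawing names out of a hat can be imitated on a computer. This sketch assumes you already hold a complete list of the target population, which in practice is the hard part; the register of 200 children and the function name `random_sample` are invented for the example.

```python
import random

def random_sample(target_population, sample_size, seed=None):
    """Draw names 'out of the hat': every member has an equal chance of selection."""
    rng = random.Random(seed)
    # random.Random.sample picks without replacement, so nobody is chosen twice
    return rng.sample(target_population, sample_size)

# Hypothetical register of 200 children
population = [f"Child {i}" for i in range(1, 201)]
sample = random_sample(population, 20, seed=42)
print(sample[:5])
```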
Other, more
realistic methods of obtaining a representative sample:
Systematic sample
(similar to
random, with the same disadvantages)
This could be
done by visiting the target schools and selecting every 5th
child in the register. This would still be time consuming. If your
target was people in MH, you could select every 20th street and
then visit every 10th house in those streets etc. However, it
cannot be claimed that every person in MH has an equal chance of being
selected!
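Selecting every 5th child in a register amounts to stepping through an ordered list at a fixed interval. A minimal sketch, using an invented register of 30 names and a made-up function name `systematic_sample`:

```python
def systematic_sample(register, interval, start=0):
    """Select every `interval`-th entry from an ordered list, beginning at index `start`."""
    return register[start::interval]

# Hypothetical class register of 30 names
register = [f"Child {i}" for i in range(1, 31)]
# start=4 is the fifth name, because list positions are counted from 0
every_fifth = systematic_sample(register, 5, start=4)
print(every_fifth)
```

Once the starting point is fixed, children who do not fall on the interval have no chance of being picked, which is why this is not a truly random sample.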
Stratified sample
Here each
variable affecting the outcome of the procedure needs to be considered.
For example if you were investigating voting intentions you would want to
select on grounds of: gender, occupation, age, education, home ownership
etc. So because the male:female ratio is about 50:50 your sample would be
50:50. Because about 65% of the adult population are home owners then 65%
of your sample would be too. Etc., etc.
Disadvantages
Time consuming
Not a truly
representative sample
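The arithmetic behind a stratified sample is just a matter of applying the population proportions to your sample size. In this sketch the proportions are the ones quoted above; the sample size of 200 and the function name `stratum_quotas` are assumptions made for the example.

```python
def stratum_quotas(proportions, sample_size):
    """Turn population proportions into the number of people needed from each stratum."""
    # Rounding can leave the quotas one or two out from the intended total
    return {group: round(share * sample_size) for group, share in proportions.items()}

# About 50:50 male to female, and about 65% home owners, as in the text
gender_quota = stratum_quotas({"male": 0.5, "female": 0.5}, 200)
home_quota = stratum_quotas({"owner": 0.65, "non-owner": 0.35}, 200)
print(gender_quota, home_quota)
```

Filling every quota at once (gender and home ownership and age and so on) is what makes the method so time consuming.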
Opportunity sample
Now we’ve hit
rock bottom! This is probably the least effective way since it involves
selecting whoever happens to be available and willing to take part!
Next year in
your search for victims, chances are you’ll go to the sixth form centre
and pick out a few friends or non-threatening strangers. Valentine (1982)
estimates that 75% of all American and British psychology research is
conducted on students, and the majority of these will have been selected
in this way!
Disadvantages
A very poor
representative or cross-sectional sample!
Sample size
How many
people are going to be part of your opportunity sample?
Things to
consider:
Large samples
can be expensive and are definitely time consuming
Small samples
make it difficult to get a significant result (20 is about the minimum for
most statistical tests).
Generally, the
larger the sample the better since bias is likely to be reduced.
Reliability and Validity
Both of these
have been mentioned during the year, particularly ‘validity’ as in
‘ecological validity’ or ‘experimental validity.’ However, you now need
to fully understand what both of them mean, how they can be increased and
most importantly how to remember which is which!
Reliability
Reliability is
akin to consistency
If you use a metre rule to measure the length of your classroom today, and you repeat the procedure next week, you will expect to get the same result. The metre rule is consistent in its measurement, or we say it is reliable.
Reliability in
Psychology
This can be
measured in a number of ways depending upon the circumstances. However,
each time we are looking for consistency of measurement:
Reliability of
observations
This year some
of the students have observed aggressive acts in men’s and women’s
football to see if the men’s game is really more aggressive. (Personally
I never realised that real men played football but that’s a different
issue).
Inter-rater
reliability
One way of
tackling this problem would be for one person to watch a game played by
each gender, look for various aggressive acts and score them accordingly.
However, you only have one person’s opinion. Better would be to get two
or three people to do it independently and compare scores afterwards. To
ensure that results were reliable the raters would sit down beforehand and
decide on the criteria to use and how to apply these. For example decide
exactly what was meant by ‘dirty tackle’ (no jokes please) or an
‘aggressive act.’ This would ensure inter-rater reliability. Or in
English it would ensure consistency in measurement between the observers.
All singing from the same hymn sheet in politico-speak.
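Comparing the observers’ scores afterwards can be done by counting how often they agree. The sketch below uses invented codes for ten incidents. It also works out Cohen’s kappa, a statistic not covered in these notes, which is added here only as one common way of allowing for agreement that would happen by chance.

```python
def percent_agreement(rater_a, rater_b):
    """Share of observations on which two raters gave the same code."""
    matches = sum(a == b for a, b in zip(rater_a, rater_b))
    return matches / len(rater_a)

def cohens_kappa(rater_a, rater_b):
    """Agreement corrected for the agreement expected by chance alone."""
    n = len(rater_a)
    observed = percent_agreement(rater_a, rater_b)
    categories = set(rater_a) | set(rater_b)
    # Chance agreement: for each code, multiply the two raters' proportions, then sum
    expected = sum((rater_a.count(c) / n) * (rater_b.count(c) / n) for c in categories)
    return (observed - expected) / (1 - expected)

# Invented codes for ten incidents: A = aggressive act, N = not aggressive
rater_1 = ["A", "A", "N", "N", "A", "N", "N", "A", "N", "N"]
rater_2 = ["A", "A", "N", "N", "N", "N", "N", "A", "N", "A"]
agreement = percent_agreement(rater_1, rater_2)
kappa = cohens_kappa(rater_1, rater_2)
print(agreement, round(kappa, 2))   # prints 0.8 0.58
```

The closer either figure is to 1, the more consistently the observers are applying the agreed criteria.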
Reliability of
tests
If you measure someone’s IQ today you would expect to get a similar result if you used the same test to assess the same person in a few weeks’ time. If the results were the same each time (i.e. if the results were consistent (that word again)), you could assume the test was reliable!
Split test
reliability
Rather than
waiting a few weeks to try the test again it is possible to use split test
reliability. For example with an IQ test, split it in half give both
halves to the participant and compare their score on each separate half.
If scores on each half are similar psychologists assume the test to be
reliable.
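Checking whether scores on the two halves are similar (often called split-half reliability) can be sketched by correlating them. Two assumptions are made here: the test is split into odd-numbered and even-numbered items, which is only one way of halving it, and the results for five imaginary participants are invented.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient for two equal-length lists of scores."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

def split_half_scores(item_scores):
    """Score the odd-numbered and even-numbered items separately for one person."""
    odd_half = sum(item_scores[0::2])    # items 1, 3, 5 ...
    even_half = sum(item_scores[1::2])   # items 2, 4, 6 ...
    return odd_half, even_half

# Invented results: one row per participant, 1 = correct, 0 = wrong, 8-item test
results = [
    [1, 1, 1, 1, 1, 1, 1, 0],
    [1, 1, 1, 0, 1, 1, 0, 0],
    [1, 0, 1, 1, 0, 0, 0, 0],
    [1, 1, 0, 0, 0, 0, 0, 0],
    [1, 1, 1, 1, 1, 1, 1, 1],
]
halves = [split_half_scores(row) for row in results]
odd_totals = [h[0] for h in halves]
even_totals = [h[1] for h in halves]
reliability = pearson_r(odd_totals, even_totals)
print(odd_totals, even_totals, round(reliability, 2))
```

A coefficient close to +1 suggests the two halves are measuring consistently; a low one suggests the test is unreliable.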
Validity
Does the test
or the experiment measure what it’s s’pose to be measuring?
We have
mentioned this word ‘validity’ on a number of occasions, usually in
relation to ‘ecological validity.’ However, there are a number of
different types of validity; here we’ll concentrate on ‘internal’ and
‘external’, sometimes referred to as ‘ecological.’
Internal
Validity
Are the
effects that have been caused actually due to the independent variable?
For example if we’ve found that coffee (the I.V.) does increase speed of
reaction (the D.V.), can we be certain that this increase is really due to
the coffee or could it be due to a confounding variable such as the time
of day or just faster reactions of the second group etc.
Basically any
of the confounding variables mentioned so far such as IQ, gender, age,
time of day, lighting conditions etc., can cause false conclusions to be
drawn.
External validity (like ecological validity)
How much can
the results obtained tell us about real life, or put another way; can we
generalise our findings to the real world?
Coolican
(1994) points out 4 major issues:
- Population: Can we generalise from our small sample, probably all
students, to the population as a whole?
- Location, location, location (Coolican only said it once): Can the
results that we’ve obtained in a laboratory setting really tell us how
people will behave in real life? Think back to memory experiments, most of
which were carried out in laboratories, or to ‘Stan the Man’ Milgram’s
experiment in the labs of Yale University. Would people really behave this
way in real life?
- Measures: If we use the Eysenck Personality Questionnaire (EPQ) and
measure a person as very extrovert and slightly neurotic, can we be sure
that they are really like this in real life or in social situations?
Similarly, when we measure IQ, is the test we are using telling us anything
real about the person?
- Times: Can experiments carried out 40 or 50 years ago, such as those of
Asch, Milgram etc., still tell us anything about people today? I have
mentioned how, for example, conformity changes over time. Wars, for
example, tend to bring populations together and make us more conformist, as
was measured following the Falklands Conflict of 1982.
Methods of checking validity
Clearly it is useful for a psychologist to have some idea of whether or not
tests are valid. There are a number of ways this can be done:
Meta analyses: data can be collected from lots of different studies in
different parts of the world to see if results are similar. For example,
Bouchard & McGue compared findings for IQ tests between MZ twins and found
similar levels of correlation between them all.
Concurrent validity: if we are measuring IQ we could compare the scores
obtained to school tests in maths and English, or we could compare the
results of personality tests with assessments by a person’s friends and
family.
Predictive validity: a test should be able to predict later performance,
behaviour or personality. So again, a high score on an IQ test should be
able to predict later success at school etc. In school you sit YELLIS and
ALIS tests which are used by teachers as predictors of your future
performance.
Relationship between researcher and participant
As we’ve already seen this can cause problems, particularly in the
experimental method. In the Milgram evaluation I touched on demand
characteristics, the idea that simply because the participant was taking
part in an experiment, his behaviour would be affected (all Milgram’s
participants were ‘he’s).
Possible effects
Participant reactivity
Put simply, participants will behave differently or unnaturally because
they know they are being watched. This doesn’t just apply under
experimental conditions but in any walk of life! The classic example, which
is well worth a read but not necessary for the exam, is to be found on page
272 and is known as the ‘Hawthorne Effect.’
Demand characteristics
The idea that participants will behave the way they believe you want them
to behave. It could be that participants guess what the experiment is
about, or at least think they’ve guessed, and this will influence their
behaviour accordingly. This was a criticism of the Milgram procedure. In
Asch’s study on conformity, some of the participants said afterwards that
they conformed because they didn’t want to mess up the experiment!
Orne (1962) persuaded participants to do strange, if not very foolish,
things. This argument is often used in the debate over hypnosis. Orne, for
example, persuaded his participants to put their hands into a tank
containing a supposedly very venomous snake. His most famous ‘experiment’
was to persuade participants to spend hours adding up random numbers and
then to tear up all their hard work!
Sometimes, of
course, the reverse may be true, and for whatever reason, e.g. having been
conned in previous studies, participants may deliberately seek to mess up
your experiment by behaving counter to how they think you want them to
behave.
