AS Psychology: Research Methods

 




Introduction

What follows is meant as a summary or brief overview only of this topic area.  It is essential that a combination of class exercises and/or texts is used with the notes to provide a fuller understanding of the issues covered.  Easily the best way of learning research methods is a combination of reading followed by practice.  Read a section, e.g. on levels of data, and then practise what you’ve just learned by answering questions on the topic.  Questions on the paper will require short response answers.

Apologies in advance if this document lacks the usual ‘humour’ but time is pressing and it is very late in the term so my sense of humour appears to have undergone a temporary by-pass!

An overview

Basically the topic is organised as follows:

 

1. Research Methods

          Experiments

                   Laboratory

                   Field

                   Natural

                   Quasi

          Correlations

          Observations

          Questionnaires

          Interviews

 

2. Research Design and implementation

          Hypotheses

          Research design

          Variables

          Sampling

          Demand characteristics, bias etc.

 

3. Data analysis

          Analysis of qualitative data

          Analysis of quantitative data

                   Measures of central tendency

                   Measures of dispersion

          Correlations

          Illustrating and summarising

 

Government Health warning

The following information does contain sums and other material likely to cause offence to the squeamish.  However, I’ll endeavour to keep the aforementioned to an absolute minimum and will, wherever possible avoid the gratuitous use of things like numbers!

 

This topic is brief, but is nevertheless useful, particularly when next term you will be called upon to complete a piece of coursework!  Good luck and ENJOY!


 

Research Methods

 

The Experiment

  • Definition of an experiment
  • Advantages and disadvantages of the experimental method
  • Types of experiment

Definition

In an experiment a variable is manipulated to see what effect it will have on another.

For example if we wanted to know whether caffeine affected reaction times:

We could take two groups, give one group coffee (experimental group) and compare them to another group without coffee (control group).  We would then set them a task designed to measure their reaction times.

In experiments there are 2 variables:

  • Independent variable (the one we alter or manipulate) in this case whether or not the person has had coffee.
  • Dependent variable (the one that alters as a result of what we do), in this case reaction time.  The dependent variable is usually the one we measure or record.

Not rocket science, BUT the problem is always remembering which is which.  The way I do it is simply to think of the dependent variable as the one that depends on what we do!

In this case reaction time (dependent variable) depends on whether or not the participant has had a cup of coffee.
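
To make the IV/DV distinction concrete, here is a minimal sketch in Python of how the coffee results might be summarised.  The reaction times are invented purely for illustration, and the names `reaction_times` and `group_means` are my own assumptions rather than anything from your text.

```python
from statistics import mean

# Hypothetical reaction times in milliseconds (invented for illustration).
# The dictionary keys are the two levels of the IV; the numbers are the DV.
reaction_times = {
    "coffee": [310, 295, 320, 305],       # experimental group
    "no_coffee": [340, 335, 350, 325],    # control group
}


def group_means(data):
    """Return the mean of the DV for each level of the IV."""
    return {condition: mean(times) for condition, times in data.items()}


means = group_means(reaction_times)
# A positive difference means the coffee group reacted faster on average.
difference = means["no_coffee"] - means["coffee"]
print(means, difference)
```

A difference between the means is only a description of these made-up scores; on its own it does not rule out chance or the other variables discussed in these notes.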

In your text read the blue box on p. 241 for a succinct summary.

 

Other variables

The I.V. and D.V. as we psychologists refer to them, are not the only variables to worry about. 

In the coffee experiment, suppose we find that the coffee group have faster reaction times.  Can we be certain the coffee has caused this?  Other possible reasons:

  • The experimental group (on coffee) might just by chance contain people with faster reactions
  • Perhaps we measured one group in the morning, the other in the afternoon.
  • Perhaps those in the control group had a hangover etc.

 

Confounding variables

These are variables that get in the way of our results or make our results difficult to interpret.

Think of Brady’s executive monkeys.  Brady assumed that being in control had caused the stress that led to the ulcers, with control being the IV and ulcers being the DV.  In fact it was more likely to be the activity levels of the monkeys that caused the results.  This is an example of a confounding variable.

Common confounding variables include:

  • Intelligence of participants
  • Personality of participants
  • Gender of participants
  • Time of day
  • Weather
  • Noise levels
  • Temperature…

Obviously in an experiment we take steps to minimise these, for example we could ensure that the procedure is carried out at the same time of day, in the same room, with similar temperature settings etc…  More on this later.

 

Advantages of experiments

  • Cause and effect: We can usually see that the IV has caused the alteration in the DV.  Provided we have controlled our experiment we should be able to show that it was the coffee that was responsible for the faster reaction times.
  • Replication: Provided care has been taken in conducting and reporting the procedure, another person should be able to repeat your procedure to see if they get the same results.

Disadvantages of experiments

  • Lacks ecological validity: As we’ve seen so many times (e.g. in memory and in Milgram), experiments, especially those in laboratories, are very artificial.  Can they really tell us how people will behave in real life situations?
  • Demand characteristics: New one for you; this refers to participants behaving differently because they know they’re being watched.  We saw this in Milgram.  It could be that they guess what you want and try to please the experimenter, e.g. by obeying!

At least two of these terms should be nauseatingly familiar to you.  I could only guess at the number of times cause and effect and ecological validity have been mentioned in the past year.  Note however, that in this case experiments can show cause and effect.  Correlations are the ones we’ve criticised for not being able to show cause and effect.  More on these later.

 

Other types of experiment

Not all experiments are carried out in laboratories

 

Field experiments

Not, as the name implies, experiments necessarily conducted in fields, although they could be!

More likely settings would include the work place, school, the street etc.  Basically the same rules apply: an independent variable is manipulated to see how it affects a dependent variable.  Confounding variables can still get in the way, and cause and effect can still be determined.  However, the setting is more natural.  Bickman’s litter experiment would be an example: the IV was the way the person was dressed and the DV was whether or not participants obeyed.

 

Advantages of field experiments

  • Realism: because the settings are more natural it is assumed that people will behave more naturally, so field experiments should have greater ecological validity.
  • Demand characteristics: these can be less since participants may not be aware that they are in an experiment, as was the case with Hofling!

Disadvantages of field experiments

  • Less control of variables: the experimenter has less control over the environment so more variables may affect the outcome, e.g. in Hofling’s study a patient may have stopped one of the nurses.
  • Ethics: If participants are unaware of the study how can they consent to take part or withdraw from the experiment?
  • Replication: It is difficult to repeat the procedure exactly as it was the first time.

 

Quasi-experiments

Not, as the name implies, experiments where you run around in the dark pretending to shoot each other with lasers!  But they could be if you were looking for age or sex differences.

In a real experiment you can manipulate the IV and you can decide who goes in which group.  In your study on coffee you decide which participants go in which group.  However supposing you wanted to see if 40-somethings had faster reactions than teenagers: you never know it could happen! 

In one group your participants will have to be teenagers and the other group will have to comprise 40-somethings.  You are unable to randomly allocate your participants to the different groups.  Similarly with sex differences; by definition the boys are going to be in one group and the girls in the other.

 

 

Natural experiments

Not, as the name implies, experiments carried out in the buff!  These are similar to and often confused with quasi-experiments, but there is one crucial difference.  Natural experiments take advantage of a naturally occurring event.  The effect of the eruption of Mount St Helens on stress related illnesses is the one all the texts prefer to mention.  In this case the IV was the eruption, a naturally occurring event.  A better example, and one that we’ve studied, is Hodges & Tizard’s study of institutional care, which examined the effect of different types and duration of care on the children’s subsequent behaviour and development.

IV is the type and duration of care (in this case not controlled by the researchers, it happened anyway).

DV is the effect this has on subsequent development (and which can be measured using various tests).

 

Advantages of natural experiments

  • Demand characteristics: it is often the case that the experimenter isn’t even present when the event occurs, thankfully in the case of Mt St Helens!  As a result participants are not trying to please the researchers.
  • Research opportunities: it is possible to research events that it would be unethical to study any other way or that may be impossible to set up.

Disadvantages of natural experiments

  • Lack of control: the researchers have no control at all over the variables and there may be lots of confounding variables.  In the Mt St Helens case, for example, ill health may have been caused by smoke, or stress by the loss of a house, rather than by the eruption itself.
  • Replication: in some cases clearly impossible, in others very difficult.  As a result it may be impossible to check the validity of research.
  • Cause and effect: following on from lack of control, it may be impossible to decide if the IV is causing the change in the DV.

 

[Figure: Types of experiment]

 

 

 

Non experimental methods

These include:                                                                                   

  • Correlational analysis
  • Naturalistic observation
  • Interviews
  • Questionnaires
  • Case studies

 

Correlational analysis

In the past year we’ve seen lots of examples of this.  For example whenever I’ve criticised a study because it doesn’t show cause and effect it’s probably been a correlational study.

An example:

We could look for a correlation between IQ and performance at GCSE or A-level.  Common sense would perhaps tell us that students who have higher IQs are more likely to perform well at GCSE.

In year 13 we look at the controversial area of IQ and find that there is a high correlation between the IQs of MZ twins.  If one twin has a high IQ it is likely the other does too.  This is taken as evidence for the nature or genetic determination of IQ.  However, as we will see there are a host of other reasons why this might be the case. 

 

Types of correlation

 

Positive:  the most common; as one variable increases so does the other, e.g. IQ and GCSE score in the example above.

[Figure: Positive correlation and Negative correlation]

                                   

Negative: as one variable increases the other decreases, e.g. it might be fair to assume that the higher your stress levels the lower your life expectancy.  Again we are unable to show cause and effect.  As mentioned frequently in ‘Stress,’ illnesses could be due to secondary habits such as smoking, poor diet etc.
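
The two types of correlation above can be sketched in a few lines of Python.  This is only an illustration under stated assumptions: all four lists of scores are invented, and the function name `pearson_r` is my own label for the standard Pearson correlation coefficient, which runs from -1 (perfect negative) through 0 (no relationship) to +1 (perfect positive).

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # How far the two variables move together around their means.
    co_variation = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sum((x - mean_x) ** 2 for x in xs)
    spread_y = sum((y - mean_y) ** 2 for y in ys)
    return co_variation / sqrt(spread_x * spread_y)


# Hypothetical data, invented for illustration only.
iq = [95, 100, 105, 110, 120]
gcse_points = [38, 42, 45, 50, 58]          # rises as IQ rises
stress = [2, 4, 6, 8, 10]
life_expectancy = [85, 82, 80, 76, 73]      # falls as stress rises

r_positive = pearson_r(iq, gcse_points)
r_negative = pearson_r(stress, life_expectancy)
print(round(r_positive, 2), round(r_negative, 2))
```

Even a coefficient very close to +1 or -1 says nothing about which variable, if either, is doing the causing.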

 

Advantages of correlations

  • Correlations allow us to study links between variables that could not be studied in any other way.  We could not inflict so much stress on a person that we endanger their life.  However, we can use a correlational analysis to show a possible link between the two occurring naturally.
  • Economical and fast: large amounts of data can be compared quickly and cheaply, e.g. by using a questionnaire to collect data.

Disadvantages of correlations

  • Cause and effect: do I really need to explain this one?  A correlation shows a possible link between 2 variables; it does not prove that one causes the other, e.g. smoking and heart disease.
  • Correlations can disadvantage certain people in society if misused.  For example it was reported long ago that black participants scored lower on average on IQ tests than white participants.  This finding was misinterpreted as evidence of white superiority.

 

Naturalistic Observation

This is an easy one to explain.  People or animals are observed in their natural environment, without any sort of intervention or manipulation of variables and without their knowledge.

Examples include:

  • Seyfarth & Cheney’s research on the calls of the vervet monkey (more on this fascinating subject in year 13)
  • Sylva’s study of play in young children.
  • Much of the work carried out by Konrad Lorenz

Ethologists specialise in studying animals in their natural environment.

 

Konrad Lorenz

Was it the tickly beard or the alluring aroma of his rough shag?

Either way Lorenz was still attracting the birds into his 70s.

 

 

Advantages of naturalistic observation

  • Ecological validity: since the setting is natural and there are no demand characteristics it is reasonable to assume that this is how people really behave!
  • Sometimes this is the only possible way of doing research, especially if people are unwilling or unable to complete questionnaires or interviews.

Disadvantages of naturalistic observation

  • Reliability: there is the issue of bias.  For example if a researcher is looking at aggressive acts in a football game and assumes that boys are going to be more aggressive, the results may inadvertently be interpreted in this way.
  • Replication: in many cases it would be impossible to recreate exactly the same situation so that someone else could verify your findings.

 

 

Ethical issues raised by observations

These were covered in more detail in the social topic (conformity and obedience).  If participants are unaware of being watched then clearly they are unable to give their consent and unable to withdraw from the research. 

 

Interviews

There are a number of species of interview each with their own advantages and disadvantages.  I’ll consider the main ones only:

Informal interviews

The interviewer has an aim in mind at the outset but is willing to be flexible about getting answers.  The interviewer tries not to direct the interviewee but instead listens and lets the interview take its natural course.

 

Advantages

  • Lots of information can be gathered
  • Interviewee made to feel relaxed

Disadvantages

  • Difficult to analyse, especially if different participants discuss different issues
  • Low reliability

 

Clinical interview

These were made popular by Freud and in particular Piaget and are a type of informal interview.  Piaget, for example, would read ‘moral stories’ to a child and start off by asking the same questions of all the children, for example ‘Who is the naughtier boy in the stories?’  However, follow-up questions would be informal and vary from child to child.

Structured or formal interviews

These follow a set pattern with the interviewer having prepared a set of questions in advance that are asked in a particular order.

Note: sometimes the questions may be open and allow the interviewee to respond how they like, for example ‘how did you feel when Freddie ate your pet hamster?’  Or they can be closed and allow only a ‘yes’ or ‘no’ response.  For example ‘were you upset when Freddie ate your pet hamster?’ 

 

Advantages

  • Easily replicated
  • Data is easier to analyse
  • Data is less likely to be influenced by the interviewer

Disadvantages

  • Little flexibility so important points may be missed
  • Questions may be ambiguous (think of the SRRS for determining stress levels).
  • This format may encourage brief answers

 

 Limitation of interviews in general:

Social desirability bias

We all like to create a favourable impression.  When faced with an interviewer we are less likely to be honest than when filling out an anonymous questionnaire.  For example people being questioned about their love life are likely to exaggerate in face to face interviews. 

Lie scales can be introduced to assess how honest answers may be.  For example if people were being questioned about their childhood a ‘lie question’ might be: ‘As a child did you always do as you were told first time and without moaning?’  A response of ‘yes’ would be assumed to be a fib and indicate that perhaps the interviewee’s answers may not be reliable.
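
A lie scale like the one just described could be scored along the following lines.  This is a rough sketch only: the second lie item, the participant labels, the responses and the cut-off of 2 are all invented for illustration, and real scales set their own items and thresholds.

```python
# Hypothetical lie items: answering 'yes' (True) to these is assumed to be a fib.
LIE_ITEMS = ["always_did_as_told_first_time", "never_told_a_lie"]

# Assumed cut-off: this many 'yes' answers to lie items flags the respondent.
LIE_THRESHOLD = 2

# Invented responses for three participants (True means they answered 'yes').
responses = {
    "P1": {"always_did_as_told_first_time": True, "never_told_a_lie": True},
    "P2": {"always_did_as_told_first_time": False, "never_told_a_lie": False},
    "P3": {"always_did_as_told_first_time": True, "never_told_a_lie": False},
}


def lie_score(answers):
    """Count how many lie items were answered 'yes'."""
    return sum(1 for item in LIE_ITEMS if answers[item])


def flag_unreliable(all_responses, threshold=LIE_THRESHOLD):
    """Return the participants whose lie score reaches the threshold."""
    return [
        participant
        for participant, answers in all_responses.items()
        if lie_score(answers) >= threshold
    ]


flagged = flag_unreliable(responses)
print(flagged)
```

Flagged participants are not proven liars; the score simply suggests their other answers should be treated with more caution.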

 

Questionnaires

We all know what they are and have all filled lots of them in.  Basically a questionnaire is a list of written questions that is able to gather lots of relevant information relatively quickly and cheaply. 

The biggest problem is wording of the questions.  Again there is the issue of ‘open’ or ‘closed’, but more importantly, as we saw in EWT, the issue of leading questions.  These are a favourite of politicians or of newspapers that want to find support or criticism of a particular issue.  For example imagine you wanted to find out if people wanted more money spent on the NHS, a relatively neutral question might be

‘Should more money be spent on the NHS?’

The Mirror (presumably wanting a ‘yes’ response) might get their pollsters to ask:

‘Should extra money be provided to the NHS to take care of Britain’s sick and elderly?’

Whereas the Telegraph (being very stereotypical here) may get their pollsters to ask:

‘Would you be happy to pay more taxes to fund bureaucracy in the NHS?’

 

Rather extreme examples admittedly; real surveys produced by Alastair Campbell (or Tony’s new head of spin) would be far more subtle, but you get the idea!

It is always a good idea to test your questionnaire in a pilot study first to make sure it doesn’t take hours to complete and that participants understand the questions.  Feedback like this may provide ideas for follow up questions to be asked in the real study.

 

Advantages

  • Lots of people can be tested quickly
  • This allows more reliable generalisation to the overall population
  • Data can often be analysed easily

Disadvantages

  • Lots of questionnaires will not be returned!
  • People may tell fibs.  Even in anonymous questionnaires this may be an issue.  Again lie questions may be included, e.g. in Eysenck’s Personality Questionnaire (EPQ).

 

 

Typical questions on Research Methods

Describe two disadvantages of investigations using correlational analysis (2 + 2 marks)

Identify the research method used in this study and explain one advantage and one disadvantage of this method.                                                                   (2 + 2 marks)

Give one advantage of using a questionnaire in this study.              (2 marks)

Following the survey it was decided to carry out an observational study into under-age drinking.  Outline procedures for carrying out such an observation. (6 marks)

 

 

Research Design and Implementation

 

Aims and Hypotheses

Aims

When carrying out a piece of research it is essential that you have an aim in mind.  This needs to be reasonably precise, for example ‘I’m gonna study memory’ would not be sufficiently precise.  However the aims are broader, or less precise than the hypotheses.  A suitable aim for memory might be ‘to see if age affects the duration of STM.’ 

Miller’s aim was to discover the capacity of STM.

Milgram’s aim was to see if normal people would obey when told to kill someone!

 

 

 

Hypotheses

These are more precise and should be operationalised, i.e. give some clue as to how the research will be carried out.  You must remember that two hypotheses are included:

a. Experimental or alternative hypothesis

This makes your prediction, for example:

‘As age increases the duration of STM decreases.’

 

b. Null hypothesis

This might at first glance seem redundant, since what you are saying is that you will not find what you’re expecting.  A suitable null hypothesis for the experiment above could be:

‘Age will have no effect on the duration of STM.’

 

The two examples above are simplified to give you the overall idea.  When deciding on an experimental hypothesis you need to give some indication of the method to be used.  For the above experiment it might be:

‘Duration of STM, as measured by the Brown-Peterson technique, will decrease with age.’

 

The null hypothesis would normally read:

‘Age will have no effect on duration of STM.  Any effect found could be due to chance.’

 

Why do we have a null hypothesis?

It is far easier to show that a prediction is wrong than to prove that it is right, and the null hypothesis takes advantage of this.  For example, suppose we were trying to show that all MZ twins had the same voting intentions.  Our hypothesis might be:

‘Pairs of MZ twins will always vote in the same direction in the coming General Election.’

The null hypothesis might be:

‘Twin status will have no effect on direction of voting.  Any similarity found may be due to chance.’

 

Suppose we test 50 pairs of twins and both members of each pair are intending to vote in the same direction.  Have we proved that all twins will vote in the same direction?  No: it could be that the next pair we test won’t.  No number of agreeing pairs can prove the hypothesis, but a single pair with different intentions is enough to disprove it.

For this reason, when we finally get round to testing the results with a stats test it will be the null hypothesis that we’re testing: we ask whether the results are unlikely enough under chance alone for the null hypothesis to be rejected.

 

One tailed or two tailed?

Having decided on your hypothesis and aims you need to decide on the direction.  In the examples above I’ve already done this.

When we say that we expect ‘duration of STM to decrease as age increases’ we are making a definite prediction.  That prediction has direction.  Compare this to the statement that ‘duration of STM will be affected by an increase in age.’  Will duration increase or decrease?  The hypothesis doesn’t say.  It could go either way. 

 

One tailed

If the hypothesis has a direction we say it is ‘directional’ or one-tailed.  In the first example we are saying that duration of STM will decrease.

Two tailed

If we are not prepared to commit ourselves and simply say there will be an effect then this is non-directional or two-tailed.

Try the exercises for practice and further illumination.
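
For the curious, the difference between one-tailed and two-tailed can be seen in a worked sketch.  The technique used here is an exact permutation test, which is my own choice for illustration and not a test named in these notes; the STM durations and the function name `permutation_p_values` are likewise invented.  The idea is to reshuffle the scores between the two groups in every possible way and count how often chance alone gives a difference at least as big as the one observed.

```python
from itertools import combinations
from statistics import mean


def permutation_p_values(group_a, group_b):
    """Return (one_tailed_p, two_tailed_p) for mean(group_a) - mean(group_b).

    One-tailed: the directional prediction that group_a scores higher.
    Two-tailed: a difference in either direction.
    """
    pooled = group_a + group_b
    positions = range(len(pooled))
    observed = mean(group_a) - mean(group_b)
    total = as_large = as_extreme = 0
    # Every way of relabelling the pooled scores into two groups.
    for chosen in combinations(positions, len(group_a)):
        chosen_set = set(chosen)
        a = [pooled[i] for i in chosen]
        b = [pooled[i] for i in positions if i not in chosen_set]
        diff = mean(a) - mean(b)
        total += 1
        if diff >= observed - 1e-9:            # same direction as predicted
            as_large += 1
        if abs(diff) >= abs(observed) - 1e-9:  # either direction
            as_extreme += 1
    return as_large / total, as_extreme / total


# Hypothetical STM durations in seconds (invented for illustration).
younger = [18, 20, 21, 23]
older = [12, 14, 15, 17]

one_tailed, two_tailed = permutation_p_values(younger, older)
print(one_tailed, two_tailed)
```

With these made-up scores the two-tailed probability comes out at twice the one-tailed one, which is the price of not committing to a direction.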

In year 13 coursework you will almost certainly be copying (sorry replicating) someone else’s work, for example Peterson and Peterson’s.  In this case your hypothesis will be based on their research.  If you were to replicate Milgram’s (don’t even think about it), you would choose a one-tailed hypothesis such as:

‘Participants will follow instructions and fry an innocent person with 450V of electricity when told to do so by a man in a white coat.’ 

You could predict this with some confidence since all past research suggests that this is the case.

 Note: this business of one or two tailed does not apply to the null hypothesis.  This will always read, “not be affected” or “no correlation” etc. 

 

 

Research Design

Here we decide how we are going to sort or group our participants.  Do we use the same people in all conditions or groups, or do we choose different people for different conditions or groups?  In some cases, as we’ll see, the decision is made for us.  In others the solution isn’t so obvious and there may be pros and cons for each.

Repeated Measures Design

Here we use the same participants in each group or condition. 

For example, returning to the earlier experiment on coffee and reaction times.

In a repeated measures design we could give our group of participants the test on day one with no coffee and record their reaction times.

The next day we could repeat the procedure, with the same group of people, but this time give them coffee before the experiment began.

 

Advantages

The two groups have the same age, sex, personality, ideas, past experiences, IQ, reaction times (crucially for this one) etc.  They are the same people.

Disadvantage

 

Order effects: Assuming, as we expect, the group does better on the second day, can we be sure that this increase in performance is due to the coffee?  It could be that they’ve had the chance to practise the task the day before!  It’s not surprising they’re better the second time around.  This is called an order or practice effect.

Boredom: Of course, on some tasks it could work the other way, and a task done the second time shows a deterioration because they’re fed up with doing it.

Extra materials: For example if you use the same participants for two memory experiments you will need two lists of words etc. for them to recall

 

Counterbalancing

To overcome order or boredom effects we could use ABBA.

One half of the participants could do no coffee followed by coffee the next day (condition A followed by condition B).

The other half could do coffee on the first day and no coffee on the next day (condition B followed by condition A).  Hence ABBA (nothing to do with thanking anyone for the music!).

In some cases repeated measures has to be used:

If you’re comparing gender with subject choice at AS you use the same people in each condition and compare a person’s gender score with their AS choices.

Examples we’ve seen this year

  • The love quiz: the same people take each questionnaire
  • The 44 thieves: later behaviour is compared to early attachments in the same people
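
The ABBA counterbalancing described above can be sketched as follows.  This is a minimal illustration, assuming an even number of participants; the participant labels and the function name `counterbalance` are invented.

```python
import random


def counterbalance(participants, seed=None):
    """Give half the participants order A-B and the other half order B-A."""
    rng = random.Random(seed)
    shuffled = list(participants)
    rng.shuffle(shuffled)              # so the halves are not chosen by us
    half = len(shuffled) // 2
    orders = {}
    for person in shuffled[:half]:
        orders[person] = ("no coffee", "coffee")    # condition A then B
    for person in shuffled[half:]:
        orders[person] = ("coffee", "no coffee")    # condition B then A
    return orders


# Hypothetical participant labels.
participants = ["P1", "P2", "P3", "P4", "P5", "P6", "P7", "P8"]
orders = counterbalance(participants, seed=1)
for person in sorted(orders):
    print(person, orders[person])
```

Counterbalancing does not remove practice or boredom effects; it spreads them equally across the two conditions so they should cancel out.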
 
 
 
 
 
 
 

Independent Measures Design

You guessed it.  If we used the same people in each group last time, this time we use different people in each group.  Clearly this overcomes practice and boredom effects ‘cos they only do it the once.

Each participant is randomly allocated to one group or the other, so in our coffee experiment:

One group, comprising one set of participants, does the test with coffee.

The other group, comprising a different set of participants, does the test without coffee.

Sorted, no problems with practice or repeat effects or with boredom or tiredness effects.

 

BUT

Can we be certain that the likely faster reactions of the first group are down to the coffee?

It could be that the participants that we’ve randomly assigned to that condition have naturally faster reactions.  They may be younger, or some of them may engage in activities that require fast reactions. 

In other experiments, sex, personality, age, IQ etc. could all be an issue because the participants are going to differ on all of these.

There are some occasions when independent measures design has to be used:

Sex differences

Age differences

By definition the two conditions are different.  You couldn’t have someone in the male condition and the female condition, or in the under 30 condition and the over 30 condition!

 

Advantages

  1. No order or practice effects
  2. Can use the same stimulus material (such as word lists in memory) for each group

 

Disadvantages

  1. Participants are not matched in terms of IQ, personality, age etc.
  2. You will need twice as many participants. 

 

Matched Pairs Design

This is the ideal compromise.  In the reactions experiment you would have different people in each condition, i.e. some would have the coffee and others not.  However, the two sets would be matched in terms of IQ or whatever characteristics are relevant, in this case reaction times, age etc. 
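
One way of forming matched pairs can be sketched like this.  It assumes we have already measured each participant’s baseline reaction time, which is my own assumption about how the matching would be done; the scores and the function name `matched_pairs` are invented for illustration.

```python
import random


def matched_pairs(baseline_times, seed=None):
    """Pair participants with similar baselines, then split each pair at random.

    baseline_times maps participant -> baseline reaction time (ms).
    Assumes an even number of participants.
    """
    rng = random.Random(seed)
    # Rank everyone by baseline so that neighbours are the closest matches.
    ranked = sorted(baseline_times, key=baseline_times.get)
    coffee_group, no_coffee_group = [], []
    for i in range(0, len(ranked) - 1, 2):
        pair = [ranked[i], ranked[i + 1]]
        rng.shuffle(pair)              # chance decides who gets the coffee
        coffee_group.append(pair[0])
        no_coffee_group.append(pair[1])
    return coffee_group, no_coffee_group


# Hypothetical baseline reaction times in milliseconds.
baseline = {"P1": 300, "P2": 420, "P3": 310, "P4": 410, "P5": 350, "P6": 360}
coffee_group, no_coffee_group = matched_pairs(baseline, seed=0)
print(coffee_group, no_coffee_group)
```

Each person in the coffee group now has a partner in the no-coffee group with a very similar baseline, though they are matched on this one characteristic only.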

Advantages

  1. No order effects since each participant only does the task once
  2. You can use the same material twice
  3. Groups are similar in terms of individual characteristics

 

Disadvantages

  1. Very time consuming and difficult to match all of your participants in this way
  2. It is impossible to match people for all characteristics even if you were to use MZ twins between the two groups!

 

Selecting your victims (sorry participants)

Having decided on your method (experiment, correlation etc.) and your design (repeated measures, independent measures or matched pairs) you now need to decide how you will choose the people who will be assigned to your conditions or groups.

When asked, everyone replies in unison: ‘RANDOMLY.’   WRONG!!!

Random Sample

It is practically impossible to get a truly random sample.  In a random sample every member of your target population would have an equal chance of being selected.  So for example if you wanted a random sample of primary school children in the Market Harborough area you would need to obtain all of their names, put them in a hat and draw your sample out.  In actual fact that would be the easy bit.  The difficult task would be finding them and persuading their parents to let your chosen ones take part!
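
The ‘names in a hat’ procedure can be sketched in Python.  This is an illustration only: the list of 200 children is made up, and the function name `draw_random_sample` is my own.

```python
import random


def draw_random_sample(population, size, seed=None):
    """Draw `size` names so that every member has an equal chance of selection."""
    return random.Random(seed).sample(population, size)


# Hypothetical target population: every child on an imaginary register.
population = [f"Child {i}" for i in range(1, 201)]
sample = draw_random_sample(population, 20, seed=42)
print(sample)
```

The draw itself is the easy part; the sample only stays random if everyone drawn actually takes part.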

The main disadvantages of this method are:

  1. Time consuming
  2. Inevitably some of those selected will not take part

Other, more realistic methods of obtaining a representative sample:

 

Systematic sample (similar to random, with the same disadvantages)

This could be done by visiting the target schools and selecting every 5th child in the register.  This would still be time consuming.  If your target was people in MH, you could select every 20th street and then visit every 10th house in those streets etc.  However, it cannot be claimed that every person in MH has an equal chance of being selected!
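
Selecting every 5th child from a register might look like this.  A sketch under assumptions: the register of 30 children is invented, as is the function name `systematic_sample`.

```python
def systematic_sample(register, step, start=0):
    """Take every `step`-th entry from the register, beginning at index `start`."""
    return register[start::step]


# Hypothetical class register.
register = [f"Child {i}" for i in range(1, 31)]

# start=4 is the fifth name on the list, so we get children 5, 10, 15 and so on.
sample = systematic_sample(register, step=5, start=4)
print(sample)
```

Once the starting point is fixed, the children in between have no chance at all of being picked, which is why this is not a truly random sample.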

Stratified sample

Here each variable affecting the outcome of the procedure needs to be considered.  For example if you were investigating voting intentions you would want to select on grounds of: gender, occupation, age, education, home ownership etc.  So because the male:female ratio is about 50:50 your sample would be 50:50.  Because about 65% of the adult population are home owners then 65% of your sample would be too. Etc., etc.
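
The proportional idea behind a stratified sample can be sketched as below, using home ownership as the only stratum.  The population of 100 adults is invented to match the 65% figure above, and the function name `stratified_sample` is my own; a real study would stratify on several variables at once.

```python
import random


def stratified_sample(population, stratum_of, sample_size, seed=None):
    """Sample from each stratum in proportion to its share of the population."""
    rng = random.Random(seed)
    strata = {}
    for person in population:
        strata.setdefault(stratum_of(person), []).append(person)
    sample = []
    for members in strata.values():
        # Quota in proportion to the stratum's size.  With awkward numbers,
        # rounding means the quotas may not add up exactly to sample_size.
        quota = round(sample_size * len(members) / len(population))
        sample.extend(rng.sample(members, quota))
    return sample


# Hypothetical population: 65 home owners and 35 non-owners.
population = [
    (f"Adult {i}", "owner" if i <= 65 else "non-owner") for i in range(1, 101)
]
sample = stratified_sample(
    population, stratum_of=lambda person: person[1], sample_size=20, seed=1
)
owners = sum(1 for _, status in sample if status == "owner")
print(owners, len(sample) - owners)
```

Here a sample of 20 contains 13 owners and 7 non-owners, mirroring the 65:35 split in the made-up population.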

Disadvantages

Time consuming

Not a truly representative sample

Opportunity sample

Now we’ve hit rock bottom!  This is probably the least effective way since it involves selecting whoever happens to be available and willing to take part!

Next year in your search for victims, chances are you’ll go to the sixth form centre and pick out a few friends or non-threatening strangers.  Valentine (1982) estimates that 75% of all American and British psychology research is conducted on students, and the majority of these will have been selected in this way! 

Disadvantages

A very poorly representative sample: it is unlikely to be a fair cross-section of the population!

 

Sample size

How many people are going to be part of your opportunity sample?

Things to consider:

Large samples can be expensive and are definitely time consuming

Small samples make it difficult to get a significant result (20 is about the minimum for most statistical tests).

Generally, the larger the sample the better since bias is likely to be reduced.

 

Reliability and Validity

Both of these have been mentioned during the year, particularly ‘validity’ as in ‘ecological validity’ or ‘experimental validity.’  However, you now need to fully understand what both of them mean, how they can be increased and most importantly how to remember which is which!

 

Reliability

Reliability is akin to consistency

If you use a metre rule to measure the length of your classroom today, and you repeat the procedure next week, you will expect to get the same result.  The metre rule is consistent in its measurement, or as we say, it is reliable.

Reliability in Psychology

This can be measured in a number of ways depending upon the circumstances.  However, each time we are looking for consistency of measurement:

Reliability of observations

This year some of the students have observed aggressive acts in men’s and women’s football to see if the men’s game is really more aggressive.  (Personally I never realised that real men played football but that’s a different issue).

Inter-rater reliability

One way of tackling this problem would be for one person to watch a game played by each gender, look for various aggressive acts and score them accordingly.  However, you only have one person’s opinion.  Better would be to get two or three people to do it independently and compare scores afterwards.  To ensure that results were reliable the raters would sit down beforehand and decide on the criteria to use and how to apply these.  For example decide exactly what was meant by ‘dirty tackle’ (no jokes please) or an ‘aggressive act.’  This would ensure inter-rater reliability.  Or in English it would ensure consistency in measurement between the observers.  All singing from the same hymn sheet in politico-speak.
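
One simple way of checking consistency between two observers is the percentage of events they coded the same way.  This is a sketch only: percentage agreement is my choice of measure for illustration, and the codings and category names below are invented.

```python
def percent_agreement(rater_a, rater_b):
    """Percentage of events that two raters placed in the same category."""
    if len(rater_a) != len(rater_b):
        raise ValueError("Both raters must code the same number of events")
    matches = sum(1 for a, b in zip(rater_a, rater_b) if a == b)
    return 100 * matches / len(rater_a)


# Hypothetical codings of the same ten incidents by two observers.
rater_1 = ["foul", "none", "foul", "dissent", "none",
           "foul", "none", "none", "dissent", "foul"]
rater_2 = ["foul", "none", "none", "dissent", "none",
           "foul", "none", "foul", "dissent", "foul"]

agreement = percent_agreement(rater_1, rater_2)
print(agreement)
```

The two observers disagree on the third and eighth incidents, so agreement is 80%.  The clearer the criteria agreed beforehand, the higher this figure should be.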

Reliability of tests

If you measure someone’s IQ today you would expect to get a similar result if you used the same test to assess the same person in a few weeks’ time.  If the results were the same each time (i.e. if the results were consistent (that word again)), you could assume the test was reliable!  This is known as test-retest reliability.
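
The retest check described above (often called test-retest reliability) is usually summarised with a correlation between the two sets of scores.  What follows is a rough sketch only: the IQ scores are invented and `pearson_r` is a hand-written helper, not part of any particular statistics package.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sum((x - mean_x) ** 2 for x in xs)
    spread_y = sum((y - mean_y) ** 2 for y in ys)
    return covariance / sqrt(spread_x * spread_y)


# Hypothetical IQ scores for the same six people, tested a few weeks apart
first_sitting = [95, 102, 110, 121, 88, 130]
second_sitting = [97, 100, 112, 119, 90, 128]

retest_r = pearson_r(first_sitting, second_sitting)
print(f"Test-retest correlation: {retest_r:.2f}")
```

A correlation close to +1 means people kept roughly the same position on both occasions, which is what we mean by a consistent (reliable) test.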

Split-half reliability

Rather than waiting a few weeks to try the test again, it is possible to use split-half reliability.  For example, with an IQ test, split it in half, give both halves to the participant and compare their scores on the two halves.  If the scores on each half are similar, psychologists assume the test to be reliable.
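
The half-against-half comparison above can be sketched as follows.  This is an illustration under stated assumptions: the answers are invented, the test is split into odd-numbered and even-numbered questions (one common way of halving a test, not the only one), and the two halves are compared across several participants using a hand-written `pearson_r` helper.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sum((x - mean_x) ** 2 for x in xs)
    spread_y = sum((y - mean_y) ** 2 for y in ys)
    return covariance / sqrt(spread_x * spread_y)


def split_half_scores(item_scores):
    """Return (total on odd-numbered questions, total on even-numbered questions)."""
    odd_total = sum(item_scores[0::2])   # questions 1, 3, 5, ...
    even_total = sum(item_scores[1::2])  # questions 2, 4, 6, ...
    return odd_total, even_total


# Hypothetical results: one row per participant, 1 = correct, 0 = wrong
responses = [
    [1, 1, 1, 1, 1, 1, 1, 0],
    [1, 0, 1, 1, 0, 1, 0, 0],
    [0, 0, 1, 0, 0, 0, 0, 1],
    [1, 1, 1, 1, 1, 1, 1, 1],
    [1, 1, 0, 1, 1, 0, 0, 0],
]

halves = [split_half_scores(row) for row in responses]
odd_totals = [odd for odd, _ in halves]
even_totals = [even for _, even in halves]

split_half_r = pearson_r(odd_totals, even_totals)
print(f"Odd halves: {odd_totals}, even halves: {even_totals}")
print(f"Split-half correlation: {split_half_r:.2f}")
```

Splitting by odd and even questions, rather than first half against second half, avoids one half being harder or being answered when participants are more tired.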


Validity

Does the test or the experiment measure what it’s supposed to be measuring?

We have mentioned this word ‘validity’ on a number of occasions, usually in relation to ‘ecological validity.’  However, there are a number of different types of validity; here we’ll concentrate on ‘internal’ and ‘external’ validity, the latter sometimes referred to as ‘ecological.’

Internal Validity

Are the effects that have been observed actually due to the independent variable?  For example, if we’ve found that coffee (the I.V.) does increase speed of reaction (the D.V.), can we be certain that this increase is really due to the coffee, or could it be due to a confounding variable such as the time of day, or simply the second group having faster reactions?

Basically any of the confounding variables mentioned so far such as IQ, gender, age, time of day, lighting conditions etc., can cause false conclusions to be drawn.

External validity (like ecological validity)

How much can the results obtained tell us about real life or, put another way, can we generalise our findings to the real world?

Coolican (1994) points out 4 major issues:

  • Population: Can we generalise from our small sample, probably all students, to the population as a whole? 
  • Location, location, location (Coolican only said it once): Can the results that we’ve obtained in a laboratory setting really tell us how people will behave in real life?  Think back to memory experiments, most of which were carried out in laboratories, or to ‘Stan the Man’ Milgram’s experiment in the labs of Yale University.  Would people really behave this way in real life?
  • Measures: If we use the Eysenck Personality Questionnaire (EPQ) and measure a person as very extrovert and slightly neurotic, can we be sure that they are really like this in real life or in social situations?  Similarly when we measure IQ, is the test we are using telling us anything real about the person?
  • Times: Can experiments carried out 40 or 50 years ago, such as Asch, Milgram etc., still tell us anything about people today?  I have mentioned how conformity, for example, changes over time.  Wars tend to bring populations together and make us more conformist, as was measured following the Falklands Conflict of 1982.

 

Methods of checking validity

Clearly it is useful for a psychologist to have some idea of whether or not tests are valid.  There are a number of ways this can be done:

Meta-analyses: data can be collected from lots of different studies in different parts of the world to see if results are similar.  For example, Bouchard & McGue compared findings for IQ tests between MZ twins and found similar levels of correlation between them all.

Concurrent validity: if we are measuring IQ we could compare the scores obtained to school tests in maths and English, or we could compare the results of personality tests with assessments by a person’s friends and family.

Predictive validity: a test should be able to predict later performance, behaviour or personality.  So again, a high score on an IQ test should be able to predict later success at school etc.  In school you sit YELLIS and ALIS tests, which are used by teachers as predictors of your future performance.

 

Relationship between researcher and participant

As we’ve already seen, this can cause problems, particularly in the experimental method.  In the Milgram evaluation I touched on demand characteristics, the idea that the mere fact that the participant was taking part in an experiment would affect his behaviour (all Milgram’s participants were ‘he’s).

 

Possible effects 

Participant reactivity

Put simply, participants will behave differently or unnaturally because they know they are being watched.  This doesn’t just apply under experimental conditions but in any walk of life!  The classic example, which is well worth a read but not necessary for the exam, is to be found on page 272 and is known as the ‘Hawthorne Effect.’

Demand characteristics

The idea that participants will behave the way they believe you want them to behave.  It could be that participants guess what the experiment is about, or at least think they’ve guessed, and this will influence their behaviour accordingly.  This was a criticism of the Milgram procedure.  In Asch’s study on conformity, some of the participants said afterwards that they conformed because they didn’t want to mess up the experiment!

Orne (1962) persuaded participants to do strange, if not very foolish things.  This argument is often used in the debate over hypnosis.  Orne, for example, persuaded his participants to put their hands into a tank containing a supposedly very venomous snake.  His most famous ‘experiment’ was to persuade participants to spend hours adding up random numbers and then to tear up all their hard work!

Sometimes, of course, the reverse may be true, and for whatever reason, e.g. having been conned in previous studies, participants may deliberately seek to mess up your experiment by behaving counter to how they think you want them to behave.