Salvador Dali: The Making of New Man   Psychology as Science
 

 

Reliability and Validity

Both of these have been mentioned during the year, particularly ‘validity’ as in ‘ecological validity’ or ‘experimental validity.’  However, you now need to fully understand what both of them mean, how they can be increased and most importantly how to remember which is which!

Reliability

Reliability is akin to consistency

If you use a metre rule to measure the length of your classroom today, and you repeat the procedure next week, you will expect to get the same result.  The metre rule is consistent in its measurement, or we say it is reliable!

Reliability in Psychology

This can be measured in a number of ways depending upon the circumstances.  However, each time we are looking for consistency of measurement:

Reliability of observations

This year some of the students have observed aggressive acts in men’s and women’s football to see if the men’s game is really more aggressive.  (Personally I never realised that real men played football but that’s a different issue).

 Inter-rater reliability

One way of tackling this problem would be for one person to watch a game played by each gender, look for various aggressive acts and score them accordingly.  However, you would only have one person’s opinion.  Better would be to get two or three people to do it independently and compare scores afterwards.  To ensure that results were reliable, the raters would sit down beforehand and decide on the criteria to use and how to apply them.  For example, they would decide exactly what was meant by a ‘dirty tackle’ (no jokes please) or an ‘aggressive act.’  This would ensure inter-rater reliability.  Or, in English, it would ensure consistency of measurement between the observers.  All singing from the same hymn sheet, in politico-speak.
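As a rough illustration of how two observers' records might be compared, here is a minimal Python sketch.  It uses Cohen's kappa, a common agreement statistic that these notes do not themselves mention, so treat it as one possible approach rather than the required method.  The category labels and the ten coded incidents are made up purely for the example.

```python
from collections import Counter

def cohens_kappa(codes_a, codes_b):
    """Agreement between two observers, corrected for chance agreement."""
    n = len(codes_a)
    # Proportion of incidents where both observers chose the same category.
    observed = sum(a == b for a, b in zip(codes_a, codes_b)) / n
    # Agreement expected by chance, from how often each observer used each category.
    freq_a = Counter(codes_a)
    freq_b = Counter(codes_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    return (observed - expected) / (1 - expected)

# Hypothetical data: the same ten incidents coded independently by two observers.
observer_1 = ["foul", "foul", "none", "shove", "none",
              "foul", "none", "shove", "none", "foul"]
observer_2 = ["foul", "foul", "none", "shove", "none",
              "none", "none", "shove", "none", "foul"]

kappa = cohens_kappa(observer_1, observer_2)
print(round(kappa, 2))
```

A value close to 1 suggests the observers are applying the agreed criteria in the same way; a value near 0 suggests they agree no more often than chance would predict.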

 

Reliability of tests

If you measure someone’s IQ today you would expect to get a similar result if you used the same test to assess the same person in a few weeks’ time.  If the results were the same each time (i.e. if the results were consistent (that word again)), you could assume the test was reliable!

 

 

Split-half reliability

Rather than waiting a few weeks to try the test again, it is possible to use split-half reliability.  For example, with an IQ test, split it in half, give both halves to the participant and compare their score on each separate half.  If scores on each half are similar, psychologists assume the test to be reliable.
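The comparison of the two halves described above can be sketched in Python.  This is only an illustration under some assumptions: the test is split into odd-numbered and even-numbered items (the notes do not say how to split it), the item marks for five participants are invented, and the Spearman-Brown step, which estimates reliability for the full-length test, is a standard addition that the notes do not mention.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

def split_half_reliability(item_scores):
    """item_scores holds one list per participant, with a 0/1 mark per item."""
    # Assumed split: items 1, 3, 5, ... form one half; items 2, 4, 6, ... the other.
    odd_totals = [sum(row[0::2]) for row in item_scores]
    even_totals = [sum(row[1::2]) for row in item_scores]
    r_half = pearson_r(odd_totals, even_totals)
    # Spearman-Brown correction: estimated reliability of the full-length test.
    r_full = 2 * r_half / (1 + r_half)
    return r_half, r_full

# Hypothetical marks for five participants on an eight-item test.
marks = [
    [1, 1, 1, 1, 1, 1, 1, 0],
    [1, 0, 1, 1, 0, 1, 0, 0],
    [1, 1, 1, 0, 1, 1, 0, 1],
    [0, 0, 1, 0, 0, 0, 0, 1],
    [1, 1, 1, 1, 1, 1, 1, 1],
]

r_half, r_full = split_half_reliability(marks)
print(round(r_half, 2), round(r_full, 2))
```

A high correlation between the two halves suggests the items are measuring the same thing consistently.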

Validity

Does the test or the experiment measure what it’s supposed to be measuring?

We have mentioned this word ‘validity’ on a number of occasions, usually in relation to ‘ecological validity.’  However, there are a number of different types of validity; here we’ll concentrate on ‘internal’ and ‘external’ validity, the latter sometimes referred to as ‘ecological.’

Internal (or experimental) Validity

Are the effects that have been observed actually due to the independent variable?  For example, if we’ve found that coffee (the I.V.) does increase speed of reaction (the D.V.), can we be certain that this increase is really due to the coffee, or could it be due to a confounding variable such as the time of day, or simply faster reactions in the second group?

Basically any of the confounding variables mentioned so far such as IQ, gender, age, time of day, lighting conditions etc., can cause false conclusions to be drawn.

External validity (like ecological validity)

How much can the results obtained tell us about real life, or put another way; can we generalise our findings to the real world?

Coolican (1994) points out 4 major issues:

  • Population: Can we generalise from our small sample, probably all students, to the population as a whole? 
  • Location, location, location (Coolican only said it once): Can the results that we’ve obtained in a laboratory setting really tell us how people will behave in real life?  Think back to memory experiments, most of which were carried out in laboratories, or to ‘Stan the Man’ Milgram’s experiment in the labs of Yale University.  Would people really behave this way in real life?
  • Measures: If we use the Eysenck Personality Questionnaire (EPQ) and measure a person as very extrovert and slightly neurotic, can we be sure that they are really like this in real life or in social situations?  Similarly when we measure IQ, is the test we are using telling us anything real about the person?
  • Times: Can experiments carried out 40 or 50 years ago, such as those of Asch, Milgram etc., still tell us anything about people today?  I have mentioned how, for example, conformity changes over time.  Wars, for example, tend to bring populations together and make us more conformist, as was measured following the Falklands Conflict of 1982.

Methods of checking validity

Clearly it is useful for a psychologist to have some idea of whether or not tests are valid.  There are a number of ways this can be done:

Meta-analyses: data can be collected from lots of different studies in different parts of the world to see if results are similar.  For example, Bouchard & McGue compared findings for IQ tests between MZ twins and found similar levels of correlation between them all.

Concurrent validity: if we are measuring IQ we could compare the scores obtained to school tests in maths and English, or we could compare the results of personality tests with assessments by a person’s friends and family.

Predictive validity: a test should be able to predict later performance, behaviour or personality.  So again, a high score on an IQ test should be able to predict later success at school etc.  In school you sit YELLIS and ALIS tests, which are used by teachers as predictors of your future performance.
