AS Psychology: Research Methods

 




 

Is Psychology a Science?

What is science?

According to BF Skinner (1980), ‘there is no place in a scientific study of behaviour for a mind or self’.  But pick up any A-level psychology text and psychology will be described as the study of mind and behaviour, often in the title.

Others go further and believe that even an observable characteristic such as behaviour cannot be studied objectively and certainly not when it’s human behaviour. 

To what extent therefore can psychology claim to be a science?

 

Modern definitions of science

Recent attempts to define what makes science scientific have generally included the following characteristics:

1. Controlled observations

Generally in scientific research something (the IV) is manipulated and we observe the effect this has on something else (the DV).  A physicist might manipulate the weight of a pendulum and measure its period, whilst obviously keeping length of string and height of release constant. 

In psychology this characteristic is best exemplified by the laboratory experiment, where as many variables as possible are kept constant to see if the IV is causing the change in the DV.

 

2. Objectivity

Physics and chemistry are objective and, we hope, mostly free of personal opinion, but is psychology?  Popper demonstrated the problem to an audience of students.  He said ‘observe.’  After a pause the reply was ‘observe what?’  Popper had made his point.  When we observe we look for certain things.  In research we set out looking for certain behaviours or characteristics.  We have a predetermined idea of what we’re looking for and, as we all know, if we set out with expectations we’re quite likely to meet them.  Essentially this is the argument of the social constructionists.  Our existing knowledge determines our expectations and our viewpoint.  This is particularly noticeable in psychology, with researchers belonging to one approach or another, e.g. cognitive or behaviourist.

 

3. Testing theoretical predictions

Having created models or theories we are able to make predictions based upon these.  We can then test these predictions with research.  Work on spatial memory in meadow voles produced the spatial adaptation model of animal memory.  Subsequent research on lizards showed this to be wrong, resulting in the pliancy model.

 

4. Falsifiability

A concept introduced by Popper in 1969.  Having a theory that can be objectively tested and ultimately proven wrong is what distinguishes science from religion and from pseudoscience such as psychoanalysis.  Psychological research tests an alternative or experimental hypothesis; however, we are not seeking to prove this.  Rather, we seek to disprove our null hypothesis.  As Popper put it:

‘No amount of observations of white swans can allow the inference that all swans are white, but the observation of a single black swan is sufficient to refute that conclusion.’

In psychology many theories have been tested over the years and been shown to be wrong.  Weiss’ replication of Brady’s ‘executive monkeys’ experiment highlighted a crucial error in the research.

Similarly Schachter and Singer’s controversial research of 1962 was tested and questioned on methodological grounds seventeen years later!

However, there are concepts in psychology that do elude testing and falsifiability.  Freud’s hypothetical constructs, the Id, Ego and Superego, along with Eros and Thanatos and the psychosexual stages and the Oedipus complex can never be tested objectively.  They will forever remain non-falsifiable.  Similarly Maslow’s hierarchy of needs.

 

5. Replicability

Already mentioned on numerous occasions above; others must be able to test your findings.  Generally in psychology, laboratory experiments can be repeated, provided sufficient detail is included in the published article.  Much of the behaviourist approach has been tested many times, and the schedules of reinforcement, for example, are seen as being as close to psychological fact as it’s possible to get.  Piaget’s work has been tested to death too!

Replication in social psychology however is more hit and miss.  Generally, research such as that of Asch and Milgram that was set in tightly controlled environments has been repeated.  Research reliant on real life observations is not always so easy to recreate. 

In order to replicate research all details need to be included in the write up, including details about participants, procedures, design decisions and of course the raw results.  Sir Cyril Burt’s research (below) is a lesson on what can happen when research is not published in full. 

 

On a note of historical interest it is worth pointing out that Burt’s work formed the basis of the eleven plus system in this country, which I missed by one year.  Testing at age 11 determined whether you went to grammar school and got a good standard of education or were relegated to a secondary modern school and got a poorer standard.  According to the argument, since intelligence was largely genetic then being thick at 11 meant that you’d always be thick so there was no point in wasting a good education on bad genes!

Working with 21 pairs of MZ twins reared apart in 1955 Burt obtained a correlation in IQ of 0.771.   Eleven years later, and now working with 53 pairs of MZ twins Burt’s figure for correlation was remarkably still 0.771.

Leon Kamin was the first to question the findings and subsequently the co authors who Burt claimed to work with could not be found.  His work could not be verified since his wife burnt it on his death!  This led close friend of Burt, Leslie Hearnshaw, who had read the eulogy at Burt’s funeral and had been chosen by Burt’s sister to write his biography, to conclude that much of Burt’s research had in fact been fraudulent.
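
Why the repeated figure raised eyebrows can be seen with a small simulation.  The sketch below is illustrative only: the IQ scores are randomly generated and are not Burt’s data, and the helper names (`pearson_r`, `simulate_twin_pairs`) and the assumed underlying correlation of 0.77 are inventions for this example.  It works out the correlation coefficient for 21 simulated pairs and then again once the sample has grown to 53 pairs.

```python
import math
import random


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def simulate_twin_pairs(n_pairs, true_r, rng):
    """Generate made-up IQ scores (mean 100, SD 15) for n_pairs of twins."""
    twin_a, twin_b = [], []
    for _ in range(n_pairs):
        z1 = rng.gauss(0, 1)
        z2 = rng.gauss(0, 1)
        # Mix the two draws so the pairs correlate at roughly true_r.
        second = true_r * z1 + math.sqrt(1 - true_r ** 2) * z2
        twin_a.append(100 + 15 * z1)
        twin_b.append(100 + 15 * second)
    return twin_a, twin_b


rng = random.Random(1955)  # fixed seed so the run is repeatable
a, b = simulate_twin_pairs(53, 0.77, rng)

r_first_21 = pearson_r(a[:21], b[:21])  # the original, smaller sample
r_all_53 = pearson_r(a, b)              # the same sample after adding 32 pairs
print(f"21 pairs: r = {r_first_21:.3f}")
print(f"53 pairs: r = {r_all_53:.3f}")
```

Try it with different seeds: the two figures will usually differ somewhere in the first three decimal places, which is why an identical 0.771 from a much larger sample looked suspicious.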

 

 

6. Paradigm

According to Kuhn, a paradigm is the most important aspect of a true science.  Essentially a paradigm is a framework or central concept around which the science fits.  For example, the laws of relativity in physics.

As a science emerges and develops it progresses through three distinct stages:

Pre-science

No clear paradigm has yet emerged.  There may be a central thread holding the subject together but it is a broad church, perhaps embracing many different theoretical perspectives and as yet unclear as to which one to follow.

Normal science

A widely accepted view has emerged that seems best able to explain current observations in the field.  For example Newton’s Laws of Motion (prior to Relativity).

Revolutionary science

A sort of transitional phase to a new paradigm.  As evidence accumulates and new observations emerge the old paradigm may look increasingly shaky.  Gradually thinking in the area starts to gravitate (speaking of Newton) to a newer paradigm.  However, this tends to be a slow process as many of the proponents of the existing paradigm refuse to give way.  Eventually, however, there is a paradigm shift as the new paradigm acquires consensus.  An example would be the movement from the geocentric to the heliocentric view of our solar system.

 

 

Applying paradigms to psychology

I think you can see where this one’s leading.  Re-read the three stages above.  Where does psychology appear to be?  I think the clue is in the phrase ‘different theoretical perspectives’ which seems to sum psychology up quite nicely. 

According to Kuhn (1962) psychology is in the pre-science stage.

  1. There is no one central approach, rather a collection of different theoretical ideas centred around psychodynamic, behaviourist, cognitive and humanistic thinking.
  2. Because psychology covers such a huge area it tends to tread on the toes of those around it.  It overlaps with biology, sociology, neuroscience, philosophy and so on, which have little in common with each other. 

 

Of all the theoretical approaches it is behaviourism that comes closest to having a paradigm since it is simply a study of behaviour with the idea that everything is learned. 

However, you psychologists feeling put down by not being seen as scientists (by Kuhn at any rate) take heart.  Many of the physical sciences struggle to produce a paradigm.  Chemistry for example sub-divides itself into organic, inorganic and physical each with different assumptions and approaches.  

 

Peer review (refereeing)

This is seen as essential in scientific research and again relies on thorough and accurate reporting of scientific research.

The process doesn’t usually start until a report has been submitted for publication in a scientific journal.  Prior to publication the editor will ask external scrutineers to look through the reports that have been submitted and essentially pick out the best.

Research is only published if:

  1. It makes an important contribution to scientific knowledge
  2. It has sound methodology
  3. It is ethically sound

However, there are important problems with the process that may lead to bias.  Imagine you’re carrying out the process.  What will impress you most, an original study or a replication?  A significant finding or one where firm conclusions cannot be drawn?

Non-significant findings and repeated studies produce the ‘file-drawer problem.’  They never see the light of day and result in a skewing of scientific publications.

Generally however, the process is seen to ‘add merit’ to the scientific approach.

Note: the leaked emails of ‘climategate’ talked about rigging the peer review process!  

 

Are scientific methods appropriate for psychology?

Fields of research such as psychology and sociology attempt scientific methods of study, for example the laboratory experiment.  However, the very focus of this research makes truly objective investigation difficult, some would say impossible.

Unlike the physical or biological sciences the focus of research is human behaviour with all the added complications this brings.  As we’ve seen time and time again in the past two years there is no perfect way to observe or research people. 

The laboratory experiment allows for tight control of variables enabling cause and effect relationships to be established but it is woefully short of ecological validity and there are serious issues with demand characteristics.  People behave differently when they know they’re being observed. 

To overcome demand characteristics and the issue of validity we can attempt naturalistic observations or field and natural experiments but then we lose that tight control so we can’t be sure what’s causing what.

 

The Hawthorne works, site of a famous study on productivity carried out between 1927 and 1932.  Any change in work conditions, such as adjusting lighting levels, resulted in short-term increases in production of telephone relays.

Researchers eventually realised it was knowledge of their observations that was causing the effects. 

With any method we also have the issue of expectations and blinkered viewpoints that Popper claims make truly objective analysis impossible.

 

Nomothetic versus idiographic

Psychology tends to look for general rules of behaviour or cognition, which it formulates into models or laws.  This is referred to as the nomothetic approach.  Fewer attempts are made to consider individuals and their differences, the idiographic approach.  One exception to this is the approach favoured by the humanists:

 

Humanists and their non-scientific approach

Perhaps in the past I’ve given the impression that I don’t appreciate the humanistic approach but that’s not true.  Fair enough I tend to refer to them as the ‘Lib-Dems’ of psychology, but I like the Lib-Dems.  However, unlike the other approaches there’s no sex and violence, no hard-core genetic determinism and certainly no ping-pong playing pigeons.  Perhaps that’s why the AQA course generally passes the humanists by, just a nodding glance of acknowledgement to Marie Jahoda and her deviation from ideal mental health.  There’s not a lot to hang your coat on!

Unlike the medical, behaviourist and cognitive approaches, humanists reject the scientific approach to studying people.  Its main proponents such as Carl Rogers, Abraham Maslow, Rollo May and RD Laing prefer the phenomenological approach, relying on detailed self-report of conscious thoughts and analysis by qualitative techniques. 

The humanists therefore claim to produce a more holistic explanation of the human condition that focuses on uniquely human characteristics such as self-actualisation, hope, love, creativity and striving to be an individual. 

 

Social constructionism

Prepare for some philosophy!  Social constructionism is the idea that reality doesn’t exist as an external entity.  Reality is essentially what we as individuals interpret through our senses and our minds.  Crucial to this process is interaction and communication with others.  The nearest we’ve come to this concept during the course is Vygotsky’s concept of internalization.  According to the Lev-meister, we talk with more knowledgeable others, watch how they solve problems and then their ideas and their tactics become part of our way of thinking.  As a result, our understanding of the world is culturally determined.

Social constructionists therefore appreciate that our current way of thinking is relative.  It is relative to the time in which we live and the culture in which we live.  Unlike more empirical approaches, constructionists realise that what we know today will change tomorrow.  The main proponent is Gergen (Kenneth to his friends).

To try and ground this in more concrete terms; consider the cognitive approach.  It makes use of present day technology to produce analogies for cognitions.  Currently it adopts the information processing approach in which cognitive functions are all about inputs, processing and outputs.  Memory is described in terms of storage, deleting files and increasing efficiency.  Before the advent of mainframes and PCs cognitions were likened to telephone exchanges.  Tomorrow who knows?

Examples of social constructs include language, games, money, A-level grades.  More controversially some see race, ethnicity and even gender as social constructs.  Apparently in the 1960s only three distinct races were considered, today it’s over 30, suggesting that these things aren’t quite as black and white as we like to think! 

 

 

Research Methods in Psychology

 

Introduction

What follows is meant only as a summary or brief overview of this topic area.  It is essential that a combination of class exercises and/or texts is used with the notes to provide a fuller understanding of the issues covered.  Easily the best way of learning research methods is a combination of reading followed by practice.  Read a section, e.g. on levels of data, and then practise what you’ve just learned by answering questions on the topic.  Questions on the paper will require short response answers.

 

An overview

The board requires that you have knowledge of the following areas and some texts cover them in this order. 

 

Ethical issues

Deception

Consent

Right to withdraw

Protection from physical and psychological harm

Dealing with ethical issues e.g. debrief, committees and guidelines

 

Research Methods

Experimental

Laboratory 

Field

Natural

Quasi

 

Non experimental

Correlations

Questionnaires

Interviews and surveys

Case studies

Content analysis

 

Research Design and implementation

Aims and Hypotheses

Research design

Independent, dependent and extraneous variables

Sampling

Pilot studies

Reliability and validity

Demand characteristics and investigator effects

 

Data analysis

Analysis of quantitative data

Measures of central tendency and of dispersion

Correlation coefficients

Presentation and interpretation of quantitative data

Analysis and interpretation of qualitative data

Presentation of qualitative data

           

However, this is not a logical teaching order so although this booklet will cover all the stuff mentioned above it will follow a different sequence and hopefully one more similar to the route taken in class.

 

Government Health warning

The following information does contain sums and other material likely to cause offence to the squeamish.  However, I’ll endeavour to keep the aforementioned to an absolute minimum and will, wherever possible avoid the gratuitous use of numbers!

 

Ethical issues in Psychological research

Ethics are the moral codes laid down by professional bodies to ensure that their members or representatives adhere to certain standards of behaviour.  All scientific bodies have such codes but those in psychology are particularly important because of the subject matter of the topic.

1.  Psychology is unlike most other subject areas in that its subject matter is entirely human or animal.  Because of this practically all research involves living things that can be caused physical or psychological harm.

2.  Psychological research also needs to consider the wider community.  Milgram’s research taught us something unpleasant about the human race in general.  Some research, for example studies on IQ, has been used to discriminate against different races or ethnic groups.  It could be argued that Bowlby’s research was used to discriminate against women, making them feel guilty for not being at home caring for their children.

3.  The knowledge gained from psychological research can be exploited by people or groups to gain an advantage over others.  Skinner’s work on behaviour shaping could be abused in this way.

 

Protecting the individual in psychological research

Many of the ideas mentioned in this section will be raised as we cover other topics later in the year and particularly in the last topic on social influence. 

  • Deception
  • Consent (informed or not)
  • Protection of participants from physical and psychological harm
  • The right to withdraw
  • The right to withdraw data
  • Confidentiality and Privacy

We shall then consider ways of determining whether or not studies should take place, and strategies for minimising risks if they do.

 

Mr Wallace with the ‘dicky ticker.’

Milgram’s procedure involved deception, a lack of informed consent, and physical and psychological harm, and (allegedly) denied participants their confidentiality and right to withdraw.  However, a therapeutic debrief was provided and no ethical guidelines were broken since they didn’t exist at the time!

Did what we learn justify these methods?

 

 

Deception

Examples of studies involving deception: Asch, Milgram, Crutchfield

Deception involves either concealing the real intention of a study from participants or taking steps to mislead them at the outset.  All of the examples above used the second ploy, deliberately lying to participants about the genuine reason for a study.  Two of them also used stooges or confederates (people pretending to be participants who are really part of the experimental set up).  The use of stooges always means deception has been used.

However, is deception necessary?  The researchers above would all argue that their experiments could not have taken place without it.  Imagine if Milgram had said at the start, ‘Mr Wallace is really a stooge, who will scream a bit but will receive no shocks.’  The study would have told us nothing of interest and obedience would doubtless have been close to 100%.

To a lesser extent nearly all studies involve an element of deception, in that it generally isn’t a good idea to tell your participants what you are looking for in advance.  Menges (1973) estimated that as few as 3% of studies involve no deception at all.  When using the Bem Sex Role Inventory to test gender, telling male participants in advance that you are trying to find how masculine or feminine they are will almost certainly influence the way they respond to the questionnaire!

Baumrind on the other hand argues that deception is always wrong since it prevents informed consent (see below), researchers have an obligation to protect their participants (see below) and psychologists should be seen as professional and therefore trustworthy.

 

Debriefing

It is really a matter of common courtesy to debrief your participants at the end of any procedure and inform them of the point of the research.  Debriefing is crucial if any form of deception has been employed. 

A proper debrief should:

1.  Inform participants of the purpose of the research

2.  Ensure that there are no negative or unforeseen consequences of the procedure

3.  Ensure that the participant leaves in ‘a frame of mind that is at least as sound as when they entered.’  (Aronson 1988).

4.  Give the participant the right to withdraw their data and to see the finished write-up of the report if they so wish.

As well as having the best interests of the participant in mind, debriefs can also be a useful source of additional information in an experiment.  Participants may tell you things that you would otherwise not be aware of.

 

George ‘dubya’ being debriefed following his eight year participation in a study into the effects of having a Dick* in the Whitehouse.

 

George is thanked for taking part, assured that his behaviour was normal and given the right to withdraw (from Iraq).  Researchers are assured that his frame of ‘mind’ has not been impaired by the lobotomy.

 

Therapeutic debriefing

In extreme cases such as Zimbardo’s study, participants may receive questionnaires, be asked to complete diaries and have follow up meetings with the experimental team.  In the case of Milgram some participants also received follow up psychiatric visits!

 

Consent and Informed consent

Consent

Simply refers to participants willingly and voluntarily taking part in your experiment.  Milgram and Asch for example did obtain consent.  In the case of Milgram he placed his infamous advert in the local paper and people turned up.  During WWII the Nazis carried out many procedures on prisoners without their consent.  Following the war it was decided that consent should be enshrined as a basic human right.

Informed consent

This refers to participants giving their consent in full knowledge of the aims of the study, the expectations of them and their right to withdraw and to confidentiality.  This clearly was not the case with Asch or Milgram, but arguably was with the Zimbardo procedure.  This raises the issue of whether fully informed consent is ever possible.  If researchers know the likely outcomes of a study then what is the point in carrying it out in the first place? 

Informed consent and deception are closely related in that there cannot be informed consent in any situation where deception is used.

 

Special cases

 

 Children

Children under the age of 16 are deemed not to be old enough to give consent.  In this case permission has to be sought from parents or guardians.

 

 Detained

People in prisons or psychiatric hospitals need particular consideration.  Prisoners may feel pressured into taking part as failing to do so may prejudice their situation.  Similar concerns apply to patients.  Additionally with psychiatric patients permission may need to be sought from either relatives or psychologists.

 

 Students

It has been common practice by many universities to expect students to participate in experiments as a requirement of the course.  In my fresher year I was expected to earn a certain number of points by being a participant in studies. Those involving pain (like the electric shocks I suffered in acquiring my aversion to the number 3) gained higher points.  Here a certain degree of coercion is used and may not be entirely ethical.

 

 Field experiments

Piliavin conducted research on the NY underground in which stooges pretending to be blind or drunk (not both!) fell over.  The research team observed the reactions of bystanders.  In situations like this ‘participants’ are not aware that they are taking part in a study so cannot give consent.  In addition it is usually impossible to carry out debriefs afterwards.

 

Various ways of overcoming the issue of consent will be discussed later.  These include presumptive consent and prior general consent.

 

Protection from physical and psychological harm

Physical harm

The BPS guidelines suggest that participants should be exposed to no more risk than they would be in everyday life.  For example people driving cars are exposed to a certain level of risk.  If psychologists wish to study some aspect of driving related behaviour then the procedure they use should not put their participants at greater risk than this. 

There are occasions when researchers have caused their participants physical harm although these tend to be rare.  Milgram appears to have delighted in the response of some of his participants who would ‘bite their lips and dig their fingernails into their flesh.  Full blown, uncontrollable seizures were experienced by three subjects.’  (Wrightsman and Deaux 1979).

Psychological harm

This is more difficult to gauge but may involve embarrassment, loss of self esteem, stress and anxiety.  

Asch, Zimbardo and Milgram procedures would all have involved loss of self esteem, embarrassment and some stress, and in the case of Milgram and Zimbardo, extreme anxiety.

Confidentiality is one way of protecting participants from psychological harm.  If you do something shameful or embarrassing then others not knowing will help reduce the impact.

 

Confidentiality

The Data Protection Act requires that the identity of all participants remains confidential.  As well as safeguarding privacy there is an obvious practical benefit from this approach.  Participants are unlikely to volunteer for procedures if they believe that their identity and behaviour will be divulged.

There were clear breaches of confidentiality in the Milgram and Zimbardo studies as in both cases participants were secretly filmed. 

 

Guidelines require that participants are not identified unless they give their permission and various methods may be used to disguise their identity.  For example in case studies patients may be identified only by their initials such as KF or HM. 

 

The right to withdraw and to withdraw data

This should be available and made clear to participants before the research starts.  Both Milgram and Zimbardo claim that withdrawal was possible in their studies although when questioned afterwards it is clear that not all participants realised this. 

Advance payment was an issue in the Milgram study.  This may put additional pressure on participants who may feel obliged to earn the money that they have received.

The debrief should make it clear that participants have the right to withdraw their data on being told the nature of the study.  If serious deception has taken place then participants have the right to witness their data being destroyed!

 

Dealing with the ethical issues

This is a favourite question in which you are expected to describe and/or evaluate measures taken by psychologists to minimise the adverse effects of research.  Obvious points to mention would be seeking consent, avoiding deception, providing the right to withdraw, debriefs and confidentiality. 

For fuller marks some or all of the following could also be discussed:

  • Ethical guidelines and codes of conduct
  • Cost-benefit analyses
  • Ways of obtaining consent and avoiding deception

 

Ethical guidelines and codes of conduct

Following the immoral experiments of the Nazis in WWII, each country set up its own set of guidelines for performing scientific research.  In Britain the British Psychological Society (BPS) and in the USA the American Psychological Association (APA), produce codes of conduct for both experimentation and for clinical practice. 

For human participants the codes cover topics already mentioned such as deception, consent, withdrawal of data, confidentiality etc.

Additionally all institutes that perform psychological research have ethical committees that consider whether or not particular pieces of research should be carried out.  This body should have non psychologists that can express more objective views on research.

Cost-benefit analyses

Committees may carry out cost-benefit analyses in which the likely benefits of a particular piece of research are weighed up against the costs to human or animal participants.  Put simply, does the knowledge we gain about human behaviour, and the advantages this might have for the wider population, warrant the suffering or embarrassment of a few individuals?  Such analyses are notoriously difficult to carry out objectively, particularly in advance of a piece of research.  Psychologists still argue about the costs and the benefits of the Milgram procedure, and that’s with the benefit of forty years of hindsight!  Additionally the costs to the larger social group may also be considered, for example an ethnic or racial group, or women.

Obtaining consent and avoiding deception

Presumptive consent (of ‘reasonable people’)

This asks people for their views on a particular procedure.  If generally they find it acceptable then that procedure is used… but NOT on those asked. 

Prior general consent

A pool of possible participants would be asked for their views on research.  For example they may be asked about their views on the use of deception or embarrassment during research.  Only those participants who consider these ploys acceptable would then be selected for later research involving fibs etc. 

Role playing

People are asked to act out the role of participants in problematical studies involving deception or psychological harm etc.  Clearly these are less than satisfactory since people can only guess at how they would respond in such situations.   When asked, fewer than 1% of people believe that they would obey in Milgram’s study!

 


Research Methods

The Experiment

  • Definition of an experiment
  • Advantages and disadvantages of the experimental method
  • Types of experiment

 

Definition

In an experiment a variable is manipulated to see what effect it will have on another. 

For example if we wanted to know whether caffeine affected reaction times:

We could take two groups, give one group coffee (experimental group) and compare them to another group without coffee (control group).  We would then set them a task designed to measure their reaction times.

In experiments there are 2 variables:

  • Independent variable (the one we alter or manipulate) in this case whether or not the person has had coffee.
  • Dependent variable (the one that alters as a result of what we do), in this case reaction time.  The dependent variable is usually the one we measure or record.

Not rocket science, BUT the problem is always remembering which is which.  The way I do it is simply to think of the dependent variable as the one that depends on what we do!

In this case reaction time (dependent variable) depends on whether or not the participant has had a cup of coffee.

Crucially an experiment allows us to establish a causal link between the IV and the DV. Following an experimental procedure we should be certain that the alteration we have made in the IV has caused the change in the DV.
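
The coffee example can be sketched in a few lines of code.  This is only an illustration: the reaction times are made up, and the names `coffee_group`, `no_coffee_group` and `compare_groups` are hypothetical.  The IV is which list a score sits in (coffee or no coffee) and the DV is the reaction time itself.

```python
from statistics import mean

# Hypothetical reaction times in milliseconds (invented for illustration).
coffee_group = [245, 260, 238, 251, 249]     # experimental group: had coffee
no_coffee_group = [270, 262, 281, 275, 268]  # control group: no coffee


def compare_groups(experimental, control):
    """Return the difference in the mean DV (control minus experimental)."""
    return mean(control) - mean(experimental)


difference = compare_groups(coffee_group, no_coffee_group)
print(f"Mean with coffee:    {mean(coffee_group):.1f} ms")
print(f"Mean without coffee: {mean(no_coffee_group):.1f} ms")
print(f"Difference:          {difference:.1f} ms")
```

A gap between the two means is only evidence that coffee caused the faster times if nothing else differed between the groups.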

Other variables

The I.V. and D.V., as we psychologists refer to them, are not the only variables to worry about. 

In the coffee experiment suppose we find that the coffee group have faster reaction times.  Can we be certain the coffee has caused this?  Other possible reasons:

  • The experimental group (on coffee) might just by chance contain people with faster reactions
  • Perhaps we measured one group in the morning, the other in the afternoon.
  • Perhaps those in the control group had a hangover etc.

Confounding variables

These are variables that get in the way of our results or make our results difficult to interpret.

Think of Brady’s executive monkeys.  Brady assumed that being in control had caused the stress that led to the ulcers.  Control being the IV and ulcers being the DV.  In fact it was more likely to be the activity levels of the monkeys that caused the results.  This is an example of a confounding variable.

Common confounding variables include:

  • Intelligence of participants
  • Personality of participants
  • Gender of participants
  • Time of day
  • Weather
  • Noise levels
  • Temperature…

Obviously in an experiment we take steps to minimise these, for example we could ensure that the procedure is carried out at the same time of day, in the same room, with similar temperature settings etc.

 

Laboratory experiments

Lab experiments don’t have to be carried out in a laboratory.  However, any experiment that is carried out in a special, tightly controlled environment is classed as laboratory.  Importantly it is obvious to those taking part that they are in an experimental procedure. 

Laboratory experiments are therefore artificial and tightly controlled, leading to the following advantages and disadvantages:

Advantages of lab experiments

  • Cause and effect: We can usually see that the IV has caused the alteration in the DV.  Provided we have controlled our experiment we should be able to show that it was the coffee that was responsible for the faster reaction times.
  • Replication: Provided care has been taken in conducting and reporting the procedure another person should be able to repeat your procedure to see if they get the same results.

Disadvantages of lab experiments

  • Lacks ecological validity: As we’ve seen so many times (e.g. in memory and in Milgram), experiments, especially those in laboratories, are very artificial.  Can they really tell us how people will behave in real life situations?
  • Demand characteristics: New one for you; this refers to participants behaving differently because they know they’re being watched.  We saw this in Milgram.  It could be that they guess what you want and try to please the experimenter, e.g. by obeying!

 

 

Laboratory experiments we have seen this term:

Most studies on memory would have been lab experiments: Brown-Peterson technique, Baddeley’s study of encoding, Loftus’ research into leading questions, Sperling’s study of sensory memory etc…

Not all experiments however are carried out in artificial settings and not all allow full control of the IV.  Other types of experiment are covered next:

 

Field experiments

Not, as the name implies, experiments conducted in fields, although they could be!

More likely settings would include the work place, school, the street etc.  Basically the same rules apply: an independent variable is manipulated to see how it affects a dependent variable.  Confounding variables can still get in the way, and cause and effect can still be determined.  However, the setting is more natural. 

 

Advantages of field experiments

  • Ecological validity: because the settings are more natural it is assumed that people will behave more naturally, so field experiments should have greater ecological validity.
  • Demand characteristics: these can be less since participants may not be aware that they are in an experiment, as was the case with Hofling!

Disadvantages of field experiments

  • Less control of variables: the experimenter has less control over the environment so more variables may affect the outcome.  As a result we cannot be certain that the IV has caused the change in DV.  Cause and effect relationships are therefore difficult to establish.
  • Ethics: If participants are unaware of the study how can they consent to take part or withdraw from the experiment?
  • Replication: It is difficult to repeat the procedure exactly as it was the first time.

 

In practice it can be difficult to distinguish a laboratory experiment from a field experiment.  Consider Brewer and Treyens’ office schema study.  This is clearly an experiment and in an artificial setting, but the participants were not aware at the time of the procedure that they were taking part in a study so their behaviour was quite natural.  Similarly with Loftus’ study on weapons focus.  Participants listened to an argument whilst waiting to start an experiment.  Again the setting was artificial and there was full control over the IV (blood soaked knife or pen).  However, again the participants were unaware of the procedure so again their response would have been natural.  Are these laboratory experiments or field experiments?  No right answers really.  However, the crucial thing is that you can justify your answer and explain the positive and negative points. 

 

Natural experiments

Not, as the name implies, experiments carried out in the buff, although they could be if you were comparing the memory of those who naturally prefer to go au naturel with those who prefer to wear clothes.  These are similar to and often confused with quasi-experiments, but there is one crucial difference.  Natural experiments take advantage of a naturally occurring event.  The effect of the eruption of Mount St Helens on stress related illnesses is the one all the texts prefer to mention.  In this case the IV was the eruption, a naturally occurring event. 

A better example and one that we’ve studied is Hodges & Tizard’s study of institutional care which examined the effect of different types and duration of care on the children’s subsequent behaviour and development.

IV is the type and duration of care (in this case not controlled by the researchers, it happened anyway).  DV is the effect this has on subsequent development (and which can be measured using various tests).

Advantages of natural experiments

  • Demand characteristics: it is often the case that the experimenter isn’t even present when the event occurs, thankfully in the case of Mount St Helens!  As a result participants are not trying to please the researchers.
  • Research opportunities: it is possible to research events that it would be unethical to study any other way or that may be impossible to set up.

Disadvantages of natural experiments

  • Lack of control: the researchers have no control at all over the variables and there may be lots of confounding variables.  In the Mount St Helens case, ill health caused by smoke, or stress due to loss of house etc.
  • Replication: in some cases clearly impossible, in others very difficult.  As a result it may be impossible to check the validity of research.
  • Cause and effect: following on from lack of control, it may be impossible to decide if the IV is causing the change in the DV.

 

Quasi-experiments

Not, as the name implies, experiments where you run around in the dark pretending to shoot each other with lasers!  But they could be if you were looking for age or sex differences.

In a real experiment you can manipulate the IV and you can decide who goes in which group.  In your study on coffee you decide which participants go in which group.  However supposing you wanted to see if 40-somethings had faster reactions than teenagers: you never know it could happen! 

In one group your participants will have to be teenagers and the other group will have to comprise 40-somethings.  You are unable to randomly allocate your participants to the different groups.  Similarly with sex differences; by definition the boys are going to be in one group and the girls in the other!

 

Experimental research design

Here we decide how we are going to sort or group our participants.  Do we use the same people in all conditions or groups, or do we choose different people for different conditions or groups?  In some cases, as we’ll see the decision is made for us.  In others the solution isn’t so obvious and there may be pros and cons for each.

 

Repeated Measures Design

Here we use the same participants in each group or condition. 

For example, returning to the earlier experiment on coffee and reaction times.

In a repeated measures design we could give our group of participants the test on day one with no coffee and record their reaction times.

The next day we could repeat the procedure, with the same group of people, but this time give them coffee before the experiment began.

 

Advantages

The two groups have the same age, sex, personality, ideas, past experiences, IQ, reaction times (crucially for this one) etc.  They are perfectly matched.  They are the same people!

 

Disadvantage

Order effects: Assuming, as we expect, the group does better on the second day, can we be sure that this increase in performance is due to the coffee?  It could be that they’ve had the chance to practise the task the day before!  It’s not surprising they’re better the second time around.  This is called an order or practice effect. 

Boredom: Of course, on some tasks it could work the other way, and a task done the second time shows a deterioration because they’re fed up with doing it.

Extra materials: For example if you use the same participants for two memory experiments you will need two lists of words etc. for them to recall.  This introduces other variables.  Perhaps the second list is easier than the first.

 

Counterbalancing       

To overcome order or boredom effects we could use ABBA.

One half of the participants could do no coffee followed by coffee the next day (condition A followed by condition B).

The other half could do coffee on the first day and no coffee on the next day (condition B followed by condition A).  Hence ABBA (nothing to do with thanking anyone for the music!).
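
The ABBA idea can be sketched in a few lines of Python.  This is only an illustration: the function name `counterbalance`, the participant labels and the `seed` argument are my own, not part of any standard procedure.

```python
import random

def counterbalance(participants, seed=None):
    """Split participants at random into two halves with opposite condition orders."""
    rng = random.Random(seed)          # seed only makes the example repeatable
    shuffled = list(participants)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    orders = {}
    for person in shuffled[:half]:
        orders[person] = ("no coffee", "coffee")   # condition A then B
    for person in shuffled[half:]:
        orders[person] = ("coffee", "no coffee")   # condition B then A
    return orders

schedule = counterbalance(["P1", "P2", "P3", "P4", "P5", "P6"], seed=1)
for person, (day_one, day_two) in sorted(schedule.items()):
    print(person, "| day 1:", day_one, "| day 2:", day_two)
```

Any practice or boredom effect now falls equally on both conditions, so it should cancel out when the two conditions are compared.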

In some cases repeated measures has to be used:

If you’re comparing gender with subject choice at AS you use the same people in each condition and compare a person’s gender score with their AS choices.

Examples we’ve seen this year

The love quiz:    the same people take each questionnaire

The 44 thieves:  later behaviour is compared to early attachments in same people

 

 

Independent Measures design

You guessed it.  If we used the same people in each group last time, this time we use different people in each group.  Clearly this overcomes practice and boredom effects ‘cos they only do it the once!

Each participant is randomly allocated to one group or the other, so in our coffee experiment:

One group, comprising one set of participants, does the test with coffee.

The other group, comprising a different set of participants, does the test without coffee.

Sorted, no problems with practice or repeat effects or with boredom or tiredness effects.
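
Random allocation is easy to get wrong by hand (we tend to put the keen ones in the same group), so here is a minimal Python sketch.  The function name `allocate` and the participant labels are hypothetical.

```python
import random

def allocate(participants, seed=None):
    """Randomly split participants into a coffee group and a no-coffee group."""
    rng = random.Random(seed)          # seed only makes the example repeatable
    shuffled = list(participants)
    rng.shuffle(shuffled)              # every ordering is equally likely
    half = len(shuffled) // 2
    return {"coffee": shuffled[:half], "no coffee": shuffled[half:]}

groups = allocate(["P1", "P2", "P3", "P4", "P5", "P6", "P7", "P8"], seed=42)
print("Coffee:   ", groups["coffee"])
print("No coffee:", groups["no coffee"])
```

Each person appears in exactly one group, which is what makes this an independent measures design.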

 

However

Can we be certain that the likely faster reactions of the first group are down to the coffee?

It could be that the participants that we’ve randomly assigned to that condition have naturally faster reactions.  They may be younger, or some of them may engage in activities that require fast reactions. 

In other experiments, sex, personality, age, IQ etc. could all be an issue because the participants are going to differ on all of these.

There are some occasions when independent measures design has to be used:

Sex differences

Age differences

By definition the two conditions are different.  You couldn’t have someone in the male condition and the female condition, or in the under 30 condition and the over 30 condition!

Advantages

1.       No order or practice effects

2.       Can use the same stimulus material (such as word lists in memory) for each group

Disadvantages

1.       Participants are not matched in terms of IQ, personality, age etc.

2.       You will need twice as many participants. 

Clearly these are the same as the advantages and disadvantages of repeated measures but in reverse.

 

Matched Pairs Design

This is the ideal compromise.  In the reactions experiment you would have different people in each condition, i.e. some would have the coffee and others not.  However, the two sets would be matched in terms of IQ or whatever characteristics are relevant, in this case reaction times, age etc. 
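
One way to do the matching, sketched in Python: rank everyone on a baseline reaction-time test, treat neighbours in the ranking as a pair, then toss a coin to decide which member of each pair gets the coffee.  The baseline scores are invented and the function name `matched_pairs` is my own; matching on a single characteristic is a simplification.

```python
import random

def matched_pairs(baseline_ms, seed=None):
    """Pair participants with similar baseline scores, then split each pair at random."""
    rng = random.Random(seed)
    ranked = sorted(baseline_ms, key=baseline_ms.get)   # fastest to slowest
    coffee, no_coffee = [], []
    # Step through the ranking two at a time; neighbours form a matched pair.
    # With an odd number of people the last one is left unpaired and dropped.
    for i in range(0, len(ranked) - 1, 2):
        pair = [ranked[i], ranked[i + 1]]
        rng.shuffle(pair)
        coffee.append(pair[0])
        no_coffee.append(pair[1])
    return coffee, no_coffee

# Hypothetical baseline reaction times in milliseconds
baseline = {"P1": 240, "P2": 310, "P3": 245, "P4": 305, "P5": 270, "P6": 275}
coffee_group, control_group = matched_pairs(baseline, seed=3)
print("Coffee:   ", coffee_group)
print("No coffee:", control_group)
```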

Advantages

1.       No order effects since each participant only does the task once

2.       You can use the same material twice

3.       Groups are similar in terms of individual characteristics

 

Disadvantages

1.       Very time consuming and difficult to match all of your participants in this way

2.       It is impossible to match people for all characteristics even if you were to use MZ twins between the two groups!

 

Observations

These are a vital tool in the psychologist’s armoury and if done properly can provide oodles of ecologically valid and detailed information about all manner of behaviours.  However, they also have many pitfalls and raise a whole host of methodological and ethical issues.  Observations can be subdivided in many ways, each with their own distinct advantages and disadvantages.  In unit one we have seen a few worthy examples of observations including the strange situation and Lorenz’ work on geese; in addition to these you should also be familiar with Bandura’s bobo doll research.

The following divisions of observations will be considered:

Naturalistic v controlled

Structured v unstructured

Participant v non-participant

Disclosed v undisclosed

 

Naturalistic Observation

This is an easy one to explain.  People or animals are observed in their natural environment, without any sort of intervention or manipulation of variables and without their knowledge.

Examples include:

  • Seyfarth & Cheney’s research on the warning calls of the vervet monkey
  • Sylva’s study of play in young children.
  • Much of the work carried out by Konrad Lorenz

Ethologists specialise in studying animals in their natural environment.

 

Was it the tickly beard or the alluring aroma of his rough shag?

Either way Lorenz was still attracting the birds into his 70s.

 

The researcher observes behaviour in its natural environment, which is how many of the ethologists studying animal behaviour record their information.  Ainsworth’s study of attachments in Ugandan women would be a human example of naturalistic observation.

 

Advantages of naturalistic observation

  • Ecological validity: Clearly this provides data that is very high in ecological validity since it has not been tainted by observer intervention, with the observed not usually knowing that their behaviour is being watched.
  • Demand characteristics: For the same reason there should be no demand characteristics.  If you’re not aware that you’re being observed then you won’t be trying to please the researcher.
  • Detailed: Information collected tends to be more detailed and provides a fuller idea of behaviour than the sort of information that can be collected in a laboratory.  Think of the criticisms of behaviour in the strange situation.
  • Sometimes this is the only possible way of doing research, especially if people are unwilling or unable to complete questionnaires or interviews.

Disadvantages of naturalistic observation

  • Reliability: there is the issue of bias.  For example if a researcher is looking at aggressive acts in a football game and assumes that boys are going to be more aggressive, the results may inadvertently be interpreted in this way.
  • Ethics: these are a major problem with many observational studies and especially with naturalistic.  Not knowing you’re being watched creates issues with privacy and participants not consenting to take part.
  • Cause and effect: However control of the environment is not possible and confounding variables make it impossible to determine cause and effect relationships.  You cannot be certain what factors are creating the behaviour being observed.
  • Replication: in many cases it would be impossible to recreate exactly the same situation so that someone else could verify your findings.

 

 

Controlled

As the name suggests the researcher in some way manipulates the behaviour of the observers or the observed.  Ainsworth’s strange situation is the best example seen to date with researchers organising the behaviour of the mother and stranger to see how the child reacts.  Other examples include the Bobo dolls and Piliavin’s work on bystander apathy on the New York subway.

These allow for greater control of confounding variables meaning it is easier to establish cause and effect relationships. 

However, they are lower in ecological validity since the trigger for the behaviour is usually not a natural event.  Often, but not always, participants may also know they are being observed creating demand characteristics.

Disclosed

Participants know they are being observed.  This reduces ethical issues of consent and privacy but reduces validity due to increased demand characteristics.

Undisclosed

Participants are unaware of the observation.  This raises ethical issues (privacy and consent) but increases validity by reducing demand characteristics.  Sometimes one-way mirrors might be used to discreetly observe people, for example shopping behaviour in a supermarket. 

 

Participant

Here the researchers get involved with the group of participants they are observing.  Festinger (1956) joined a cult to observe how they would react when their predicted end-of-the-world deadline came and went.  The cult leader was able to reassure her flock that their prayers had saved the planet!  This is an example of undisclosed participant observation.  On occasions researchers may join in but make others aware of their role as psychologists. 

On occasions researchers have been able to infiltrate groups and remain members for a period of time allowing for detailed, longitudinal information to be gathered, for example about the behaviour and motivations of street gangs and religious cults.  It is difficult to see how such groups could be studied in any other way. 

Clearly there are ethical issues with this type of deceitful observation and the researcher themselves may unwittingly interfere with the group dynamics and the behaviour of the group. 

Non-participant

The more likely scenario in which participants are observed from a distance rather than the researchers infiltrating the group. 

 

Ethics of observations

Observations raise a number of unique ethical issues.  These vary depending on the nature of the observation taking place but here are a few:

Consent: participants are often unaware of being observed so have no opportunity to consent to taking part in your research.

Debrief: often there is no opportunity for a debrief.  For example in Piliavin’s observation of bystander apathy on the New York subway, participants were observed without their knowledge and would have left the train before researchers had a chance to debrief.

Deception: participants being unaware of observation is deception in itself.  Additionally, researchers may cause additional deception by using stooges.  Again Piliavin used members of the research team to pretend to be blind or drunk.

 

Controlled observation: Bandura altered the conditions while children watched the doll being hit.

Participant observation: Marsh (1996) joined football fans to observe their behaviour. 

Naturalistic observation: Seyfarth and Cheney recorded the calls of vervet monkeys.

 

Correlational analysis

In the past year we’ve seen lots of examples of this.  For example whenever I’ve criticised a study because it doesn’t show cause and effect it’s probably been a correlational study.

An example:

We could look for a correlation between IQ and performance at GCSE or A-level.  Common sense would perhaps tell us that students that have higher IQs are more likely to perform well at GCSE. 

In year 13 we look at the controversial area of IQ and find that there is a high correlation between the IQs of MZ twins.  If one twin has a high IQ it is likely the other does too.  This is taken as evidence for the nature or genetic determination of IQ.  However, as we will see there are a host of other reasons why this might be the case. 

Types of correlation

Positive:  the most common; as one variable increases so does the other, e.g. IQ and GCSE score in the example above.

 

[Figure: scattergraphs illustrating a positive correlation and a negative correlation]

Negative: as one variable increases the other decreases, e.g. it might be fair to assume that the higher your stress levels the lower your life expectancy.  Again we are unable to show cause and effect.  As mentioned frequently in ‘Stress,’ illnesses could be due to secondary habits such as smoking, poor diet etc.
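
The strength and direction of a correlation is usually summarised as a coefficient between -1 and +1.  Below is a minimal Python sketch of Pearson’s r, written out longhand so you can see what it does.  All the scores are invented for illustration; they are not real data.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient: +1 perfect positive, -1 perfect negative."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # How far each pair of scores moves together, away from the two means
    co_variation = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sum((x - mean_x) ** 2 for x in xs)
    spread_y = sum((y - mean_y) ** 2 for y in ys)
    return co_variation / sqrt(spread_x * spread_y)

# Invented scores, for illustration only
iq = [95, 100, 105, 110, 120]
gcse_points = [38, 42, 41, 50, 55]
stress_score = [2, 4, 6, 8, 10]
life_expectancy = [84, 82, 79, 77, 72]

print("IQ and GCSE:", round(pearson_r(iq, gcse_points), 2))                      # positive
print("Stress and lifespan:", round(pearson_r(stress_score, life_expectancy), 2))  # negative
```

Whatever the size of the coefficient, it says nothing about which variable (if either) is doing the causing.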

 

Advantages of correlations

  • Correlations allow us to study links between variables that could not be studied in any other way.  We could not inflict so much stress on a person that we endanger their life.  However, we can use a correlational analysis to show a possible link between the two occurring naturally.
  • Economical and fast: large amounts of data can be compared quickly and cheaply, e.g. by using a questionnaire to collect data.

Disadvantages of correlations

  • Cause and effect: do I really need to explain this one?  A correlation shows a possible link between 2 variables; it does not prove that one causes the other, e.g. smoking and heart disease.
  • Correlations can disadvantage certain people in society if misused.  For example it was found long ago that black people, on average, scored lower on IQ tests than white people.  This finding was misinterpreted as evidence of white superiority.

 

 

Case studies

These involve study of an individual, small group, institution or an event.  A case study can involve a whole host of techniques including observations, questionnaires, surveys, interviews, testing and even on occasion experiments.  They are frequently longitudinal in nature and may also involve asking others, such as friends and associates.

Examples:

Clive Wearing, HM, KF, S, Genie, Czech twins, Anna O, Little Albert, Little Hans, Phineas Gage

 

Good points

They provide a wide variety of in-depth and detailed information that would be impossible to acquire using heavily controlled situations such as experiments.  They can provide a real feel for what it is like to be suffering from a particular disorder or be involved in a certain situation.

They often provide the only method possible of studying a certain condition or event.  It would not be possible to artificially re-create situations such as Genie or HM experimentally, so our only access to information about privation or severe amnesia is through case studies.

 

 

 

However

It is notoriously difficult to generalise from a case study and create a general theory.  Case studies by their very nature are one-offs or unusual and often involve people who are not themselves representative of the general population.  The cases of Genie and the Czech twins show this nicely.  Both suffered severe deprivation over a prolonged period but their outcomes are very different; the Czech twins seeming to make a full recovery, whereas, as far as we know, Genie never recovered from her early problems.

Often case studies require an element of retrospective data collection, with parents, friends etc. being asked to think back to the participant’s earlier years.  Retrospective data collection is not reliable.

Objectivity by the researchers can be difficult, with psychologists getting too close to patients as with the case of David Rigler and Jean Butler and their research/fostering of Genie.

Confidentiality can be an issue though some of this can be overcome by the use of pseudonyms or initials. 

 

Interviews

There are a number of species of interview each with their own advantages and disadvantages.  I’ll consider the main ones only:

Informal interviews

The interviewer has an aim in mind at the outset but is willing to be flexible about getting answers.  The interviewer tries not to direct the interviewee but instead listens and lets the interview take its natural course.

Advantages

  • Lots of information can be gathered
  • Interviewee made to feel relaxed

Disadvantages

  • Difficult to analyse, especially if different participants discuss different issues
  • Low reliability

 

Clinical interview

These were made popular by Freud and in particular Piaget and are a type of informal interview.  Piaget for example would read ‘moral stories’ to a child and start off by asking the same questions to all the children, for example ‘who is the naughtier boy in the stories?’  However, follow up questions would be informal and vary from child to child. 

 

Structured or formal interviews

These follow a set pattern with the interviewer having prepared a set of questions in advance that are asked in a particular order.

Note: sometimes the questions may be open and allow the interviewee to respond how they like, for example ‘how did you feel when Freddie ate your pet hamster?’  Or they can be closed and allow only a ‘yes’ or ‘no’ response.  For example ‘were you upset when Freddie ate your pet hamster?’ 

 

Advantages

  • Easily replicated
  • Data is easier to analyse
  • Data is less likely to be influenced by the interviewer

Disadvantages

  • Little flexibility so important points may be missed
  • Questions may be ambiguous (think of the SRRS for determining stress levels).
  • This format may encourage brief answers

 

 Limitation of interviews in general:

Social desirability bias

We all like to create a favourable impression.  When faced with an interviewer we are less likely to be honest than when filling out an anonymous questionnaire.  For example people being questioned about their love life are likely to exaggerate in face to face interviews. 

Lie scales can be introduced to assess how honest answers may be.  For example if people were being questioned about their childhood a ‘lie question’ might be: ‘As a child did you always do as you were told first time and without moaning?’  A response of ‘yes’ would be assumed to be a fib and indicate that perhaps the interviewee’s answers may not be reliable.
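
Scoring a lie scale amounts to counting ‘yes’ answers to the lie questions.  Here is a rough Python sketch; the question labels, the respondents and the cut-off (`threshold`) are all made up for illustration, and real questionnaires set their own cut-offs.

```python
def flag_unreliable(responses, lie_items, threshold=1):
    """Return respondents who answered 'yes' to at least `threshold` lie questions."""
    flagged = []
    for respondent, answers in responses.items():
        lie_score = sum(1 for item in lie_items if answers.get(item) == "yes")
        if lie_score >= threshold:
            flagged.append(respondent)
    return flagged

# Hypothetical answers to two hypothetical lie questions
responses = {
    "Anna": {"always_did_as_told": "no", "never_told_a_fib": "no"},
    "Ben": {"always_did_as_told": "yes", "never_told_a_fib": "no"},
    "Cara": {"always_did_as_told": "yes", "never_told_a_fib": "yes"},
}
lie_items = ["always_did_as_told", "never_told_a_fib"]

print(flag_unreliable(responses, lie_items))               # anyone with one or more
print(flag_unreliable(responses, lie_items, threshold=2))  # stricter cut-off
```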

 

Questionnaires

We all know what they are and have all filled lots of them in.  Basically a questionnaire is a list of written questions that is able to gather lots of relevant information relatively quickly and cheaply. 

The biggest problem is wording of the questions.  Again there is the issue of ‘open’ or ‘closed’, but more importantly, as we saw in EWT, the issue of leading questions.  These are a favourite of politicians or of newspapers that want to find support or criticism of a particular issue.  For example imagine you wanted to find out if people wanted more money spent on the NHS, a relatively neutral question might be

‘Should more money be spent on the NHS?’

The Mirror (presumably wanting a ‘yes’ response) might get their pollsters to ask:

‘Should extra money be provided to the NHS to take care of Britain’s sick and elderly?’

Whereas the Telegraph (being very stereotypical here) may get their pollsters to ask:

‘Would you be happy to pay more taxes to fund bureaucracy in the NHS?’

 

Rather extreme examples admittedly, real surveys carried out by experienced pollsters would be far more subtle, but you get the idea!

It is always a good idea to test your questionnaire in a pilot study first to make sure it doesn’t take hours to complete and that participants understand the questions.  Feedback like this may provide ideas for follow up questions to be asked in the real study.

 

Advantages

  • Lots of people can be tested quickly
  • This allows more reliable generalisation to the overall population
  • Data can often be analysed easily

Disadvantages

  • Lots of questionnaires will not be returned!
  • People may tell fibs.  Even in anonymous questionnaires this may be an issue.  Again lie questions may be included, e.g. in Eysenck’s Personality Questionnaire (EPQ).

 

 

Typical questions on Research Methods

Describe two disadvantages of investigations using correlational analysis (2 + 2 marks)

Identify the research method used in this study and explain one advantage and one disadvantage of this method. (2 + 2 marks)

Give one advantage of using a questionnaire in this study. (2 marks)

Following the survey it was decided to carry out an observational study into under-age drinking.  Outline procedures for carrying out such an observation. (6 marks)

 

Content analysis

Content analysis studies human behaviour indirectly usually by studying the things we produce, e.g. television programmes, magazines etc.

An analysis of what we produce should be able to tell us a lot about the way we structure our society and about our values, prejudices and so on.  For example a content analysis of television advertisements of the 1970s would probably paint a far more sexist view of the World than that present today, certainly in the UK at least.

Manstead and McCulloch (1981) watched 170 television advertisements in a week and scored them on a whole range of factors such as gender of product user, gender of person in authority, gender of person providing the technical information about the product and so on.
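
The scoring in a content analysis boils down to coding each item against your categories and then counting.  A minimal Python sketch follows.  The five coded adverts are invented (they are not Manstead and McCulloch’s data) and the category names are my own shorthand.

```python
from collections import Counter

# Each advert is coded once per category (codes invented for illustration).
coded_adverts = [
    {"product_user": "female", "authority": "male"},
    {"product_user": "female", "authority": "male"},
    {"product_user": "male", "authority": "male"},
    {"product_user": "female", "authority": "female"},
    {"product_user": "male", "authority": "male"},
]

def tally(adverts, category):
    """Count how often each code appears for one category."""
    return Counter(advert[category] for advert in adverts)

for category in ("product_user", "authority"):
    print(category, dict(tally(coded_adverts, category)))
```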

 

 

The classic ‘Do the Shake and Vac and put the freshness back’ advert of the 1980s portrayed a woman obsessed by the fresh smelling nature of her house.

 

 

Good points

They can produce lots of detailed and easily analysed material about a particular aspect of society. 

Since the research is observational it is high in validity.

Provided the information is well presented, and sufficient records are kept and published relating to the material sourced and the content analysed, replication and verification of results should be possible.

However

There is the possibility of bias, with the observations being subjective.  To overcome this a number of raters should be used (inter-rater reliability).

The choice of material and content to be analysed also introduces a potentially huge source of bias.
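
Inter-rater reliability can be put into numbers.  The Python sketch below shows two common measures for a pair of raters: simple percentage agreement, and Cohen’s kappa, which corrects for the agreement you would expect by chance.  Neither measure is named in these notes, so treat the choice as mine; the ratings are invented.

```python
from collections import Counter

def percent_agreement(rater_a, rater_b):
    """Proportion of items on which the two raters gave the same code."""
    matches = sum(a == b for a, b in zip(rater_a, rater_b))
    return matches / len(rater_a)

def cohens_kappa(rater_a, rater_b):
    """Agreement corrected for chance (undefined if chance agreement is exactly 1)."""
    n = len(rater_a)
    observed = percent_agreement(rater_a, rater_b)
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    # Chance agreement: probability both raters pick the same code independently
    expected = sum(counts_a[code] * counts_b[code] for code in counts_a) / (n * n)
    return (observed - expected) / (1 - expected)

# Two hypothetical raters coding the gender of the authority figure in 8 adverts
rater_a = ["m", "m", "f", "f", "m", "f", "m", "m"]
rater_b = ["m", "m", "f", "m", "m", "f", "m", "f"]

print("Agreement:", percent_agreement(rater_a, rater_b))
print("Kappa:", round(cohens_kappa(rater_a, rater_b), 2))
```

Kappa comes out lower than raw agreement because some of the matches would have happened even if both raters had been guessing.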

 

Other research methods

As already stated, many research studies use a combination of techniques.  We saw this with case studies but as Cardwell and Flanagan point out, Schaffer and Emerson’s Glasgow babies study used natural observation, interviews and even occasionally experiments when mothers recorded how the children responded to a series of everyday events.

Meta analysis

The results of a number of studies (usually by a variety of researchers) in a related area are combined to see if overall trends are visible. This can increase reliability since contradictory findings may be uncovered.  However, different studies may be difficult to compare because of different sampling, design and methods used.
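
As a rough illustration of ‘combining’ results, the Python sketch below averages the effect found in each study, giving bigger studies more say.  This is a deliberate simplification (published meta-analyses usually weight studies in more sophisticated ways), and the three studies are invented.

```python
def pooled_effect(studies):
    """Sample-size-weighted mean effect; studies is a list of (effect, n) pairs."""
    total_n = sum(n for _, n in studies)
    return sum(effect * n for effect, n in studies) / total_n

# Hypothetical studies: (effect size, number of participants)
studies = [(0.2, 50), (0.5, 100), (-0.1, 50)]

print("Pooled effect:", round(pooled_effect(studies), 3))
```

Notice that the third study found an effect in the opposite direction; pooling makes that sort of contradiction visible rather than hiding it.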

Longitudinal studies

A favourite with developmental psychologists since they allow children to be revisited to see how they change and grow over time. 

The big disadvantage is attrition.  People move to different areas or become impossible to contact. 

 

Research Design and Implementation

Aims and Hypotheses

Aims

When carrying out a piece of research it is essential that you have an aim in mind.  This needs to be reasonably precise, for example ‘I’m gonna study memory’ would not be sufficiently precise.  However the aims are broader, or less precise than the hypotheses.  A suitable aim for memory might be ‘to see if age affects the duration of STM.’ 

Miller’s aim was to discover the capacity of STM.

Milgram’s aim was to see if normal people would obey when told to kill someone!

 

 

Hypotheses

These are more precise and should be operationalised, i.e. give some clue as to how the research will be carried out.  You must remember that two hypotheses are included:

a. Experimental or alternative hypothesis

This makes your prediction, for example:

‘As age increases the duration of STM decreases.’

 

b. Null hypothesis

This might at first glance seem redundant, since what you are saying is that you will not find what you’re expecting.  A suitable null hypothesis for the experiment above could be:

‘Age will have no effect on the duration of STM.’

 

The two examples above are simplified to give you the overall idea.  When deciding on an experimental hypothesis you need to give some indication of the method to be used.  For the above experiment it might be:

‘Duration of STM, as measured by the Brown-Peterson technique, will decrease with age.’

The null hypothesis would normally read:

‘Age will have no effect on duration of STM.  Any effect found could be due to chance.’

 

Why do we have a null hypothesis?

A null hypothesis is easier to prove.  For example, suppose we were trying to show that all MZ twins had the same voting intentions.  Our hypothesis might be:

‘Pairs of MZ twins will always vote in the same direction in the coming General election.’

The null hypothesis might be:

‘Twin status will have no effect on direction of voting.  Any similarity found may be due to chance.’

Suppose we test 50 twins and both members of each pair are intending to vote in the same direction.  Have we proved that all twins will vote in the same direction?  It could be that the next pair we test won’t.  In that case all we need to do is find one pair that have different intentions to prove our null hypothesis.

When we finally get round to testing the results with a stats test it will be the null hypothesis that we’re testing.

 

One tailed or two tailed?

Having decided on your hypothesis and aims you need to decide on the direction.  In the examples above I’ve already done this.

When we say that we expect ‘duration of STM to decrease as age increases’ we are making a definite prediction.  That prediction has direction.  Compare this to the statement that ‘duration of STM will be affected by an increase in age.’  Will duration increase or decrease?  The hypothesis doesn’t say.  It could go either way. 

One tailed

If the hypothesis has a direction we say it is ‘directional’ or one-tailed.  In the first example we are saying that duration of STM will decrease.

Two tailed

If we are not prepared to commit ourselves and simply say there will be an effect then this is non-directional or two tailed.
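
To see why the choice matters when the stats test comes round, here is a minimal sketch using the logic of a sign test.  It assumes a made-up study in which 10 out of 12 participants changed in the predicted direction, and asks how likely that is if the null hypothesis is true (i.e. each direction is a 50:50 chance).  The numbers and function name are my own invention for illustration.

```python
from math import comb


def binomial_tail(successes, n):
    """P(at least `successes` out of n) when each outcome is a 50:50 chance."""
    return sum(comb(n, k) for k in range(successes, n + 1)) / 2 ** n


n_participants = 12   # hypothetical sample size
in_predicted = 10     # hypothetical number who changed in the predicted direction

# One tailed: only results in the predicted direction count as support.
one_tailed_p = binomial_tail(in_predicted, n_participants)

# Two tailed: a result this extreme in EITHER direction counts.
# With a 50:50 chance the two tails are mirror images, so double (capped at 1).
extreme = max(in_predicted, n_participants - in_predicted)
two_tailed_p = min(1.0, 2 * binomial_tail(extreme, n_participants))

print(f"one tailed p = {one_tailed_p:.3f}")
print(f"two tailed p = {two_tailed_p:.3f}")
```

The same data give a probability twice as large for the two tailed version, which is the price of not committing to a direction.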

Try the exercises for practice and further illumination.

In year 13 coursework you will almost certainly be copying (sorry replicating) someone else’s work, for example Peterson and Peterson’s.  In this case your hypothesis will be based on their research.  If you were to replicate Milgram’s (don’t even think about it), you would choose a one tailed hypothesis such as:

‘Participants will follow instructions and fry an innocent person with 450V of electricity when told to do so by a man in a white coat.’ 

 

You could predict this with some confidence since all past research suggests that this is the case.

Note: this business of one or two tailed does not apply to the null hypothesis.  This will always read, “not be affected” or “no correlation” etc. 

 

Selecting your victims (sorry participants)

Having decided on your method (experiment, correlation etc.) and your design (repeated or independent measures) you now need to decide how you will choose the people who will be assigned to your conditions or groups.

When asked, everyone replies in unison (not the Trades Union): ‘RANDOMLY.’   WRONG!!!

 

Random Sample

It is practically impossible to get a truly random sample.  In a random sample every member of your target population would have an equal chance of being selected.  So for example if you wanted a random sample of primary school children in the Market Harborough area you would need to obtain all of their names, put them in a hat and draw your sample out.  In actual fact that would be the easy bit.  The difficult task would be finding them and persuading their parents to let your chosen ones take part! 

The main disadvantages of this method are:

1.       Time consuming

2.       Inevitably some of those selected will not take part
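
The ‘names in a hat’ idea can be sketched in a few lines.  This is a minimal example that assumes you already have the full list of names (the hard part in real life); the pupil labels are invented.

```python
import random

# Hypothetical target population: 200 pupils, each with a made-up label.
population = [f"pupil_{i:03d}" for i in range(1, 201)]

# A fixed seed is used only so that the example gives the same draw each run.
rng = random.Random(42)

# Every pupil has an equal chance of being drawn, and nobody is drawn twice.
sample = rng.sample(population, 20)
print(sample)
```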

Other, more realistic methods of obtaining a representative sample:

 

Systematic sample (similar to random, with the same disadvantages)

This could be done by visiting the target schools and selecting every 5th child in the register.  This would still be time consuming.  If your target was people in MH, you could select every 20th street and then visit every 10th house in those streets etc.  However, it cannot be claimed that every person in MH has an equal chance of being selected!
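
Selecting every 5th child in the register might look like the sketch below.  The register of 30 names is hypothetical, and always starting at the 5th name is my own simplification.

```python
# Hypothetical class register of 30 children, in register order.
register = [f"child_{i:02d}" for i in range(1, 31)]


def systematic_sample(names, step):
    """Take every `step`-th name, starting with the `step`-th one."""
    return names[step - 1::step]


sample = systematic_sample(register, 5)
print(sample)
```

Notice that child number 1 could never be picked, which is exactly why this is not a truly random sample.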

 

Stratified sample

Here each variable affecting the outcome of the procedure needs to be considered.  For example if you were investigating voting intentions you would want to select on grounds of: gender, occupation, age, education, home ownership etc.  So because the male:female ratio is about 50:50 your sample would be 50:50.  Because about 65% of the adult population are home owners then 65% of your sample would be too. Etc., etc.
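
Here is a minimal sketch of the home ownership example, assuming just one variable for simplicity (a real stratified sample would juggle several at once).  The population of 1000 adults and the function name are invented for illustration.

```python
import random


def stratified_sample(strata, sample_size, seed=0):
    """Pick from each stratum in proportion to its share of the population."""
    rng = random.Random(seed)  # seed fixed only to make the example repeatable
    total = sum(len(members) for members in strata.values())
    chosen = {}
    for name, members in strata.items():
        # Rounding means the quotas may not always add up to sample_size exactly.
        quota = round(sample_size * len(members) / total)
        chosen[name] = rng.sample(members, quota)
    return chosen


# Hypothetical population of 1000 adults, 65% of them home owners.
population = {
    "home owners": [f"owner_{i}" for i in range(650)],
    "non owners": [f"renter_{i}" for i in range(350)],
}

sample = stratified_sample(population, 20)
for name, people in sample.items():
    print(name, len(people))
```

Within each stratum the people are still drawn at random; it is only the proportions that are fixed in advance.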

Disadvantages

Time consuming

Not a truly representative sample

 

Opportunity sample

Now we’ve hit rock bottom!  This is probably the least effective way since it involves selecting whoever happens to be available and willing to take part!

Next year in your search for victims, chances are you’ll go to the sixth form centre and pick out a few friends or non-threatening strangers.  Valentine (1982) estimates that 75% of all American and British psychology research is conducted on students, and the majority of these will have been selected in this way! 

Disadvantages

A very poor representative or cross-sectional sample!

 

Sample Size

How many people are going to be part of your opportunity sample?

Things to consider:

Large samples can be expensive and are definitely time consuming

Small samples make it difficult to get a significant result (20 is about the minimum for most statistical tests).

Generally, the larger the sample the better since bias is likely to be reduced.

 

Reliability and Validity

Both of these have been mentioned during the year, particularly ‘validity’ as in ‘ecological validity’ or ‘experimental validity.’  However, you now need to fully understand what both of them mean, how they can be increased and most importantly how to remember which is which!

Reliability

Reliability is akin to consistency

If you use a metre rule to measure the length of your classroom today, and you repeat the procedure next week, you will expect to get the same result.  The metre rule is consistent in its measurement or we say it is reliable!

Reliability in Psychology

This can be measured in a number of ways depending upon the circumstances.  However, each time we are looking for consistency of measurement:

Reliability of observations

This year some of the students have observed aggressive acts in men’s and women’s football to see if the men’s game is really more aggressive.  (Personally I never realised that real men played football but that’s a different issue).

Inter-rater reliability

One way of tackling this problem would be for one person to watch a game played by each gender, look for various aggressive acts and score them accordingly.  However, you only have one person’s opinion.  Better would be to get two or three people to do it independently and compare scores afterwards.  To ensure that results were reliable the raters would sit down beforehand and decide on the criteria to use and how to apply these.  For example decide exactly what was meant by ‘dirty tackle’ (no jokes please) or an ‘aggressive act.’  This would ensure inter-rater reliability.  Or in English it would ensure consistency in measurement between the observers.  All singing from the same hymn sheet in politico-speak.
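
One way to put a number on how well two raters agree is sketched below.  The ratings are invented, and Cohen's kappa (a standard statistic that is not covered in these notes) is included as an optional extra because it corrects for the agreement you would expect by luck alone.

```python
from collections import Counter

# Hypothetical ratings of the same ten tackles by two independent observers.
rater_a = ["foul", "fair", "foul", "fair", "fair", "foul", "fair", "fair", "foul", "fair"]
rater_b = ["foul", "fair", "fair", "fair", "fair", "foul", "fair", "foul", "foul", "fair"]


def proportion_agreement(a, b):
    """Fraction of events on which the two raters gave the same score."""
    return sum(x == y for x, y in zip(a, b)) / len(a)


def cohens_kappa(a, b):
    """Agreement corrected for the agreement expected by chance."""
    n = len(a)
    observed = proportion_agreement(a, b)
    counts_a, counts_b = Counter(a), Counter(b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    return (observed - expected) / (1 - expected)


agreement = proportion_agreement(rater_a, rater_b)
kappa = cohens_kappa(rater_a, rater_b)
print(f"agreement = {agreement:.2f}, kappa = {kappa:.2f}")
```

If agreement turns out to be low, that is the cue to go back and tighten up the definition of a ‘dirty tackle.’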

 

Reliability of tests

If you measure someone’s IQ today you would expect to get a similar result if you used the same test to assess the same person in a few weeks’ time.  If the results were the same each time (i.e. if the results were consistent (that word again)), you could assume the test was reliable!
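
In practice you would test a group of people twice and correlate the two sets of scores.  A minimal sketch, assuming six invented participants and using Pearson's correlation as the measure of consistency (my choice for illustration):

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Hypothetical IQ scores for the same six people, a few weeks apart.
first_test = [98, 105, 112, 120, 91, 130]
retest = [101, 103, 115, 118, 94, 127]

r = pearson_r(first_test, retest)
print(f"test-retest correlation = {r:.2f}")
```

A correlation close to +1 suggests the test is giving consistent results.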

 

 

Split-half reliability

Rather than waiting a few weeks to try the test again it is possible to use split-half reliability.  For example with an IQ test, split it in half, give both halves to the participant and compare their score on each separate half.  If scores on each half are similar psychologists assume the test to be reliable.
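
Splitting a test in half can be sketched as follows.  I have assumed an odd/even split of the items (one common way of doing it) and invented the answers of five participants on an eight item test.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Rows are participants, columns are test items (1 = correct, 0 = wrong).
# The data are invented for illustration.
answers = [
    [1, 1, 1, 1, 1, 1, 1, 0],
    [1, 1, 1, 0, 1, 1, 0, 0],
    [1, 0, 1, 1, 0, 0, 0, 0],
    [1, 1, 0, 0, 0, 0, 0, 0],
    [1, 1, 1, 1, 1, 1, 1, 1],
]

# Score each participant on items 1, 3, 5, 7 and on items 2, 4, 6, 8.
odd_half = [sum(row[0::2]) for row in answers]
even_half = [sum(row[1::2]) for row in answers]

r = pearson_r(odd_half, even_half)
print(f"correlation between the two halves = {r:.2f}")
```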

 

 

 

Validity

Does the test or the experiment measure what it’s s’pose to be measuring?

We have mentioned this word ‘validity’ on a number of occasions, usually in relation to ‘ecological validity.’  However, there are a number of different types of validity; here we’ll concentrate on ‘internal’ and ‘external’, sometimes referred to as ‘ecological.’

Internal (or experimental) Validity

Are the effects that have been caused actually due to the independent variable?  For example if we’ve found that coffee (the I.V.) does increase speed of reaction (the D.V.), can we be certain that this increase is really due to the coffee or could it be due to a confounding variable such as the time of day or just faster reactions of the second group etc.?

Basically any of the confounding variables mentioned so far such as IQ, gender, age, time of day, lighting conditions etc., can cause false conclusions to be drawn.

 

External validity (like ecological validity)

How much can the results obtained tell us about real life, or put another way; can we generalise our findings to the real world?

Coolican (1994) points out 4 major issues:

  • Population: Can we generalise from our small sample, probably all students, to the population as a whole? 
  • Location, location, location (Coolican only said it once): Can the results that we’ve obtained in a laboratory setting really tell us how people will behave in real life?  Think back to memory experiments most of which were carried out in laboratories, or to ‘Stan the Man’ Milgram’s experiment in the labs of Yale University.  Would people really behave this way in real life?
  • Measures: If we use the Eysenck Personality Questionnaire (EPQ) and measure a person as very extrovert and slightly neurotic, can we be sure that they are really like this in real life or in social situations?  Similarly when we measure IQ, is the test we are using telling us anything real about the person?
  • Times: Can experiments carried out 40 or 50 years ago such as Asch, Milgram etc. still tell us anything about people today?  I have mentioned how for example conformity changes over time.  Wars, for example tend to bring populations together and make us more conformist as was measured following the Falklands Conflict of 1982.

Methods of checking validity

Clearly it is useful for a psychologist to have some idea of whether or not tests are valid.  There are a number of ways this can be done:

Meta analyses: data can be collected from lots of different studies in different parts of the world to see if results are similar.  For example Bouchard & McGue compared findings for IQ tests between MZ twins and found similar levels of correlation between them all.

Concurrent validity: if we are measuring IQ we could compare the scores obtained to school tests in maths and English, or we could compare the results of personality tests with assessments by a person’s friends and family.

Predictive: a test should be able to predict later performance, behaviour or personality.  So again, a high score on an IQ test should be able to predict later success at school etc.  In school you sit YELLIS and ALIS tests which are used by teachers as predictors of your future performance.

 

 

Relationship between researcher and participant

As we’ve already seen this can cause problems, particularly in the experimental method.  In the Milgram evaluation I touched on demand characteristics, the idea that, simply because the participant was taking part in an experiment, this would affect his behaviour (all Milgram’s participants were ‘he’s). 

Possible effects:

Participant reactivity

Put simply, participants will behave differently or unnaturally because they know they are being watched.  This doesn’t just apply under experimental conditions but in any walk of life!  The classic example which is well worth a read, but not necessary for the exam, is to be found on page 272 and is known as the ‘Hawthorne Effect.’

Demand characteristics

The idea that participants will behave the way they believe you want them to behave.  It could be that participants guess what the experiment is about, or at least think they’ve guessed, and this will influence their behaviour accordingly. 

This was a criticism of the Milgram procedure.  In Asch’s study on conformity, some of the participants said afterwards that they conformed because they didn’t want to mess up the experiment!  

Orne (1962) persuaded participants to do strange, if not very foolish things.  This argument is often used in the debate over hypnosis.  Orne, for example, persuaded his participants to put their hands into a tank containing a supposedly very venomous snake.  His most famous ‘experiment’ was to persuade participants to spend hours adding up random numbers and then getting them to tear up all their hard work!

Sometimes, of course, the reverse may be true, and for whatever reason, e.g. having been conned in previous studies, participants may deliberately seek to mess up your experiment by behaving counter to how they think you want them to behave.

 

People can be persuaded to behave in the most bizarre and unnatural ways just because someone asks them to.  This doesn’t mean that they’d behave that way in real life.

Of course money or the promise of publicity goes a long way too!

 

 

Reducing demand characteristics

The most common ploy is called the single blind technique in which participants are not told details of the study or in which they are led to believe it’s about something different.  This is a ploy I used in my research on hypnosis.  (Will bore you with the details sometime soon!).

Clearly this raises ethical issues such as deception and informed consent.

 

Investigator effects

A confession.  As a sixth former, many years ago, I spent, what seemed at the time, about five years doing titrations in A-level chemistry.  Not one of the results I obtained was genuine.  We calculated the ‘right answer’ for all of them and then obtained a reading that was close to this.  This was cheating, and we know that results in Psychology have also been fiddled, some on a grand scale.  Obtaining ‘expected results’ like this can be deliberate. Or it can happen without intent.  We often find what we are expecting or hoping to find.  Having decided that women are worse drivers we notice bad driving by women whilst ignoring similar driving by men.  This happens in research and is called experimenter expectancy.  The classic example is Rosenthal and Lawson (1964).  They gave rats to students, telling some that their rats were ‘maze bright’ and could navigate a maze very quickly, and telling others that their rats were ‘maze dull’ and not very good at navigating a maze.  In fact the rats were all similar and allocated to each group of students randomly. 

From what I’ve said, you can probably guess the findings:  Students with the supposedly maze bright rats found that their rats could navigate mazes significantly faster! 

 

Reducing experimenter effects

The most common ploy is called the double blind technique, in which neither the participants nor the researchers dealing with the participants know the conditions etc.  Obviously someone distant from the procedure still needs to know which participants are in which condition so that results can be analysed!  This procedure is commonly used in drug testing when genuine medicines are compared to placebos. 

Remember that on top of this there is the 'data analysis' section, which involves some number crunching.  See contents page for details.  However, the worksheets provided in lessons should cover this area in sufficient detail so I'll skip notes on this and concentrate on conformity instead.  Hope this has not been too heavy a read, I've tried to keep it brief and to the point!

 

 

 Planning and Reporting your Practical Report

All psychological investigations are written in a common format whether they are GCSE, A-level, degree level or professional research.  You may also find similarities with other subjects such as Biology. 

Psychologists usually publish their research in Journals such as The British Journal of Psychology.  Their main purposes are to make other interested parties aware of their methods and findings and crucially to provide sufficient detail to allow for replication.  It is essential that others can check the reliability and validity of the methods and results.  Typically a published article has the following structure.  I shall consider each aspect in turn and explain its contents and format.

 

Title

Contents

Abstract (Summary)

Introduction:      Background research

Aims/Hypotheses

Method:                    Design

                        Participants

                        Materials/Apparatus

                        Procedure

                        Control

Results:                     Summary-and Descriptive Statistics

                        Inferential Statistics

Discussion:               Explanation of findings

                        Relationship to background research

                        Limitations and modifications

                        Implications and further research

Conclusion

References

Appendices

 

TITLE

The title should be concise yet clear enough to give the reader an idea of the investigation's central concerns.  'Memory study' would be too vague whereas 'Testing the Passive Decay Theory of Forgetting from Short-Term Memory using a task similar to that employed by Peterson and Peterson (1959)' would be far too long.  Best also avoid titles which start 'An investigation into...' or 'A study of...'.

 

General Points

When writing up a report try to be concise yet precise.  Include everything relevant, but do not take 8 pages to describe something that could be explained in a paragraph or two.  Remember that your investigation should have enough information in it for someone else [maybe a few years later] to replicate it exactly, so you need to include sufficient detail.

Write in the past tense and the third person passive.  In other words do not say 'The participants will be...' say 'The participants were...'.

Write it up as if you were describing someone else's experiment that took place last week, and you were not actually there.  Do not say 'our experiment...', 'I calculated...' or 'we noted...' but say 'the experiment...', 'it was calculated...' or 'it was noted...'.

Include page numbers at the foot of each page.

Do not forget to report ethical aspects of a study where appropriate, such as consent, right of withdrawal, confidentiality and protection from harm, etc.

Your report must be written in your own words.  If you present a report which contains material either copied from books, handouts or other students' work, this is PLAGIARISM and would make you no better than Alastair Campbell!

Now let’s turn to the write up itself:

 

Introduction, Aims & Hypotheses

·         This contains the background to your report and should be planned and written like a mini essay.  Crucially the introduction explains where your hypotheses come from.

·         Start with general theory, briefly introducing the topic.  Talk about other psychologists’ research in the area.  It is all too tempting to throw in all the really interesting material you have found during your extensive research of the topic.  Be concise and selective.  Only include material that really is relevant.

·         Narrow this down to specific and directly relevant research.  If you are planning a replication or adaptation of an existing study then give sufficient detail about this.

·         Lead logically into the aims and hypotheses.  In the old days of coursework, marks were awarded for a logical progression.  Basically think of the traditional ‘V’ shape of an athlete. 

Start broad at the shoulders with an overview of the general topic area, for example research into STM and LTM.  Then gradually home in ‘missile-like’ to the crucial study that you will be investigating, for example a serial recall task investigating the capacity of STM.  This would be down by the waistline if we extend my initial metaphor. 

 


·         Aims should not appear out of thin air.  The psychological literature that you have reviewed should lead up to the aims.

·         A paragraph should be written explaining what you intend to investigate and why.  Use previously cited research to explain your expectations.  Later these expectations will be formally stated as the hypothesis.  Include a justification for the direction of the hypothesis explaining why you have decided on a one or two tailed hypothesis.

·         Hypotheses: should be written in the present or future tense (remember you are making a prediction).  They should be fully operationalised, stating precisely what is to be manipulated or measured.  (e.g. instead of talking about the ‘participant’s memory’, talk about ‘the number of words recalled,’ since this is how you will be measuring the participant’s memory).

·         State the minimal level of significance that will be acceptable (usually 5%) and why you have chosen this level.

 

Method

This is just a section heading.  The section has several sub-sections, and there are no hard and fast rules about what information goes where, providing you use a little common sense and try to avoid repeating yourself and you try to avoid repeating yourself!

The purpose of the whole section is to inform the reader precisely how the investigation was undertaken.  The acid test is whether or not the reader would be able to replicate your research precisely, just by reading your description.  The section should also demonstrate that you have taken ethical issues into consideration when designing your study.  You should use the third person when describing the method.  E.g. ‘it was decided’ rather than ‘I decided’ or ‘we decided...’  And try to avoid repeating yourself :-)

Design

Describe any design decisions made, for example:

·         Choice of method, e.g. field experiment, observation etc.

·         Choice of experimental design e.g. independent or repeated measures, or matched pairs

·         Identification of variables: I.V., D.V. or if your method is correlational describe your co-variables.

·         Identify any confounding variables (e.g. participant characteristics, order effects).  Explain attempts to be made to overcome confounding variables (e.g. counterbalancing, standardisation of instructions).

·         Ethical considerations.

·         Identify the level of data (e.g. ordinal) and the statistical test you will use together with the level of probability for rejecting the null hypothesis.
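
Counterbalancing, mentioned in the list above as a way of dealing with order effects, can be sketched very simply.  This minimal example assumes a repeated measures design with two conditions, which I have called A and B, and six invented participant codes.

```python
# Hypothetical participant codes (never real names, for confidentiality).
participants = ["P1", "P2", "P3", "P4", "P5", "P6"]

# The two possible orders in which the conditions can be taken.
orders = [("A", "B"), ("B", "A")]

# Alternate the orders so that half do A first and half do B first.
schedule = {p: orders[i % 2] for i, p in enumerate(participants)}

for participant, order in schedule.items():
    print(participant, "does", order[0], "then", order[1])
```

Any practice or fatigue effect should then be shared equally between the two conditions rather than favouring one of them.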

 

Participants

Include the number employed and how they were sampled.  Include any factors that are relevant to your study such as age, gender, education level etc.  When you mention participants be careful not to give away clues as to their identity.

Consider treatment groups and how participants were allocated to their respective groups (i.e. how did you decide which participants were to be in the control group and which were to be in the experimental).  NB If you are comparing boys to girls or different age groups then this is decided for you!
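
If allocation is down to you, random allocation is easy to do and easy to describe in your write up.  A minimal sketch, assuming an independent measures design with two groups and invented participant codes:

```python
import random


def allocate(participants, seed=None):
    """Shuffle the participants and split them into two groups."""
    rng = random.Random(seed)
    shuffled = participants[:]      # copy, so the original list is untouched
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]


# Hypothetical participant codes.
people = [f"P{i}" for i in range(1, 21)]

# The seed is fixed here only so the example is repeatable.
control, experimental = allocate(people, seed=1)
print("control:", control)
print("experimental:", experimental)
```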

Ethical issues

You must protect your participants’ confidentiality and therefore all must remain anonymous.  To the same end do not even mention the name of the school!

If you use minors (i.e. under 16s) then you need to state how you obtained permission for them to take part in the study.

 

 

Materials/Apparatus

The materials and apparatus used such as questionnaires should be described here.  You need to include sufficient detail so that it is clear how they were compiled.  The reader requires enough detail to be able to reconstruct the materials.  Do not be overly patronising and mention detail such as pens being required to fill out the questionnaires!  All questionnaires etc. should be referenced accurately to the appendix.  Remember to include any mark schemes for questionnaires used.

Procedure

This section should be written in the past tense since you are reporting what you did.  You need to state precisely what you did with the apparatus and the participants.  The rule is simple, describe what you did in sufficient detail for your study to be replicated, including any preliminary work or pilot studies undertaken.  Any standardised instructions or debriefs should be referred to and a justification for their use included.  Include a copy of these in the appendix. 

If you have scored any questionnaires or ranked categories such as GCSE subjects in terms of science or verbal content then you need to say how this was carried out.

 

Ethical issues

It is vital that your participants are fully debriefed after the testing phase is completed.  This is particularly important if any deception has been used.  A standardised debrief can be used and this must be checked by a member of staff.

All participants should be informed that their data will be kept confidential, given the right to withdraw from the investigation, and assured that if they do withdraw their data will not be used.

 

 

Collecting your data

Once you have planned your research, written your introduction and method and have had your ideas and materials checked by your coursework supervisor, you are ready to carry out your procedure and collect your data.  Remember to give yourself sufficient time - don't expect to get it all done in one private study period!

Hint:

Always make a note of anything that happens in your INVESTIGATION LOG BOOK, no matter how inconsequential it may seem at the time.  A casual comment made by one of the participants may turn out to be the key that explains why your study produced the strange results it did.  Some things are too obvious and can often be overlooked because they are under our noses!  Note everything, record everything, and think about the effect it may have had on your data.

 

 

Important points to remember when carrying out research:

Ensure that you are fully au fait with the ethical guidelines as laid down by the BPS. 

Ensure that you are fully comfortable with the procedure and know precisely what is going to happen and in what order.  It is useful to have a dummy run of your procedure before ‘going live.’  Ask a few friends to act as participants and ensure that you haven’t overlooked any details.

Generally speaking any instructions given should be standardised and preferably presented in writing.  This way you can be certain that all participants are given precisely the same instructions.

Always inform participants of their right to withdraw from the procedure at any time if they so wish.

Always be courteous with participants and act in a professional manner.

Always carry out a ‘debrief’ at the end of the procedure.  Again this is best written to ensure that nothing is forgotten and should inform participants of:

The exact nature and point of your study

Their right to withdraw their data from the study

Their right to see the final write-up once it has been completed.

Be polite and thank your participants for taking part in your research!

Copies of both the ‘standardised instructions’ and the ‘debrief’ should be included in the appendices of your final write-up.

 

Results

This section is where you summarise your data and provide a report on any statistical analysis that you have carried out.  Clarity is all important in this section so you need to find concise and informative ways of presenting data. 

Note: all raw data should go in the appendix and be referenced.

 

Descriptive (or summary) statistics.

These are every bit as important as inferential.  The purpose of this section is to describe what you have found and give the reader ‘a feel’ for the data.  This can be done in the form of averages, standard deviations, tables, bar charts, scattergrams and graphs. 

Special care should be taken that all graphs etc. have titles and appropriately labelled axes.  All such graphs should be self explanatory; it should not be necessary to refer to the text to make sense of them. 

Keep graphs etc. simple.  If more complex or multiple graphs are used put these in the appendix and make reference to them in the results section.  Try to avoid death by a thousand graphs!

Tables, graphs and other figures should be consecutively numbered.

Be sure to explain what the obtained data shows in the text.
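
The averages and standard deviation can be worked out by hand, on a calculator or in a few lines of code.  A minimal sketch, assuming some invented scores for the number of words recalled by eight participants:

```python
import statistics

# Hypothetical data: number of words recalled by eight participants.
words_recalled = [7, 5, 8, 6, 9, 7, 6, 8]

mean = statistics.mean(words_recalled)
median = statistics.median(words_recalled)
sd = statistics.stdev(words_recalled)   # sample standard deviation (n - 1)

print(f"mean = {mean}, median = {median}, standard deviation = {sd:.2f}")
```

Whatever you calculate, the figures belong in a clearly labelled summary table, with the raw scores in the appendix.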

 

Inferential Statistics

These tell us how likely it is that the results we have obtained could have occurred by chance alone, and therefore whether we can reject the null hypothesis.

Begin by stating what test was done and why!  Refer to the research design (repeated measures etc.), level of data (ordinal etc.).  Most importantly were you looking for a difference or relationship?  You need to fully justify your choice of test.  

In your write up you will need to include the value that you have calculated (the observed value).  This needs to be compared with the critical value, which is the value you look up in a table.  These can be found in the back of most psychology text books. 

Does this show that your results are significant or not for that number of participants, at the stated level of significance for either a one or two tailed hypothesis?
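
The comparison itself is simple, but the direction of the rule depends on the test, which catches people out.  Here is a minimal sketch; the function name and the example values are made up for illustration and are not taken from any real table.

```python
def is_significant(observed, critical, rule):
    """Compare an observed value with the critical value from the table.

    rule = "at_most":  observed must be equal to or LESS than critical
                       (e.g. sign test, Wilcoxon, Mann-Whitney).
    rule = "at_least": observed must be equal to or MORE than critical
                       (e.g. chi-squared, Spearman's rho).
    """
    if rule == "at_most":
        return observed <= critical
    if rule == "at_least":
        return observed >= critical
    raise ValueError("rule must be 'at_most' or 'at_least'")


# Made-up values purely to show the logic.
print(is_significant(observed=3, critical=5, rule="at_most"))
print(is_significant(observed=0.40, critical=0.56, rule="at_least"))
```

Always check the note underneath the table you are using, since it will tell you which way round the rule goes for that test.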

All calculations should be included in the appendix.  Do not put that type of dull information in the main body of the report!

 

Discussion

AQA like this split into the following four subsections:

Explanation of Findings

This sounds like a repeat of the results section, but here you need to state what you’ve found in terms of psychology rather than in statistical terms, in particular relate your findings to your hypotheses.  Mention the strength of your findings, for example were they significant and at what level.  If your hypothesis was one tailed and your results have gone in the opposite direction this needs to be indicated.  If you have any additional findings to report, other than those relating to the hypotheses then they too can be included.

 

A word of warning: avoid the use of words such as ‘proves’ or ‘disproves.’  This is Psychology; there are few, if any, hard facts.  ‘Suggests’ or ‘indicates’ are better alternatives.  Each year students insist on reporting that their research carried out on an opportunity sample of eight Year 5 pupils disproves Piaget, a well respected figure in the area who spent fifty or more years of his life testing thousands of children!

 

 

Relationship to background research

If your results agree with previous studies then this section may be brief, but there may still be ways in which your findings differ in some way.  If your results run counter to previous studies then you need to make this clear and try to explain the discrepancy.

 

Limitations and Modifications

Consider how your findings may have been influenced by confounding and extraneous variables (i.e. factors other than the one you were testing). 

If you got the result you expected, can you be sure of what made this happen?  Look at the method for possible confounding variables that could have caused a type one error.  Consider whether any aspects of the study were unsatisfactory.  Believe me, no matter how careful you have been, you will not have carried out the perfect study.  How many pieces of research have you come across in Year 12 that were perfect?  For example were the participants entirely honest in their responses?  Were there any experimenter effects?  Was the sample size or method of selection adequate?

Comment on the statistical procedures used, particularly the power and sensitivity of the test.  Chances are you have had to use a non parametric test, which is not as powerful as its parametric alternative. 

Having mentioned the limitations of your research be sure to include how you could improve it.  This section is titled ‘limitations and modifications.’  All too often students only discuss the shortcomings.

 

Implications and suggestions for further research


This is not asking you to repeat the previous section!  Finally you need to discuss how relevant your research findings are to real life.  Think of any practical applications of your findings.  Also, how could you follow up the research to find out more?  This is different to the modifications discussed in the previous section.  You will probably be up to your word limit by now so only a couple of suggestions are required!

 

Having completed their discussion many students assume that all is done.  To be fair all of the hard work is now over but there are still a few vital sections to consider. 

 

Abstract

Although this appears at the front of the report it is the last section to be written.  The purpose of the abstract is to tell the reader the bare essentials of the research you have carried out.  The style should be brief, but not in note form. 

Include a one sentence summary, giving the topic to be studied.  This may include the hypothesis and some brief theoretical background research, for example the name of the researchers whose work you have replicated.

Describe the participants, number used and how they were selected. 

Describe the method and design used and any questionnaires etc. you employed.

State your major findings, which should include a mention of the statistics used, the observed and critical values, and whether or not your results were found to be significant, including the level of significance.

Briefly summarise what your study shows, the conclusion of your findings and any implications it may have.

Example:

Office-Based Treatment of Opiate Addiction with a Sublingual-Tablet Formulation of Buprenorphine and Naloxone. Fudala et al 2003

Background Office-based treatment of opiate addiction with a sublingual-tablet formulation of buprenorphine and naloxone has been proposed, but its efficacy and safety have not been well studied.

Methods We conducted a multicenter, randomized, placebo-controlled trial involving 326 opiate-addicted persons who were assigned to office-based treatment with sublingual tablets consisting of buprenorphine (16 mg) in combination with naloxone (4 mg), buprenorphine alone (16 mg), or placebo given daily for four weeks. Safety data were obtained on 461 opiate-addicted persons who participated in an open-label study of buprenorphine and naloxone  and another 11 persons who received this combination only during the trial.

Results The double-blind trial was terminated early because buprenorphine and naloxone in combination and buprenorphine alone were found to have greater efficacy than placebo. The proportion of urine samples that were negative for opiates was greater in the combined-treatment and buprenorphine groups (17.8 percent and 20.7 percent, respectively) than in the placebo group (5.8 percent, P<0.001 for both comparisons); the active-treatment groups also reported less opiate craving (P<0.001 for both comparisons with placebo).

Conclusions Buprenorphine and naloxone in combination and buprenorphine alone are safe and reduce the use of opiates and the craving for opiates among opiate-addicted persons who receive these medications in an office-based setting.

 

 

References

This section should be straightforward provided the following guidelines are adhered to.  However, every year students drop one mark, or even both, because they think they know better!  The references should contain details of all the research you have covered.  It is not sufficient (as below) to simply list the books used! 

What not to do:

A New Introduction to Psychology, Gross & McIlveen

Bluffers Guide to Psychology, Uddin, Rice and Moss

This is a list of books, or a bibliography. 

 

What you should do:

Look through your report and ensure you include every researcher mentioned.

For each one provide information on where that particular study was originally published, for example:

Paivio, A., Madigan, S.A. (1970).  Noun imagery and frequency in paired-associate and free learning recall.  Canadian Journal of Psychology. 24, pp353-361.

This is the researchers and year, the title of their publication, the journal in which it was published with volume number and specific page references.

Other rules:

The references should be in alphabetical order, not the order in which they appear in your report.  See how it's done in the back of the set texts.

In the unlikely event of the same researcher having two reports, both in the same year, use 'a' and 'b' to separate them out, e.g. Waring (1962a) and Waring (1962b).

Sometimes the text books are naughty and do not provide a reference for a piece of research they've mentioned.  In this case you have one of two options:

a.       Look it up in another text book or

b.       If that fails use the following format:

Freud, S. (1922) cited in Gross RD (1996) The Science of Mind and Behaviour (3rd Edition). London: Hodder & Stoughton.

 

This is the researcher and year, the text book in which you found the information, where the book was published and the name of the publishers. 

Hint:  As I mentioned earlier it is all too common for students to reach the end of their report and realise that they can't remember where they found a particular reference.  Write them down as you go along, preferably in your log book, but also anywhere else where you won't lose the information.

 

ALL AUTHORS MENTIONED MUST BE REFERENCED!

 

Choosing the correct Statistical Test

 

An important point to consider at the outset, particularly for those amongst you that don’t like sums: you will not be expected to calculate a level of statistical significance.  However, you will need to know when to use a particular test and also, having been given an observed value, be able to decide its level of significance.  This isn’t as complex as it sounds.  It’s simply a matter of looking up the information in a table, though you will need to understand what the table tells us!

When choosing a test there are three things to consider.  Two of these have already been covered in this booklet, the third was covered at AS so a quick reminder.

 

1.  NOIR: What is the level of your data?

Nominal Data: is the simplest thing a number can do.  It can tell us how many things there are!  Basically nominal data is a headcount or a tally.  It doesn’t tell us if something is bigger, brighter or bolder, just how many.  For example, get a show of hands; how many people in the class study English.  Your head count provides nominal data.  If you were replicating Piaget’s research at a primary school you might count the number of five year olds who can successfully complete the three mountains task and compare this to the number of seven year olds.  Nominal data.

Ordinal data: allows us to put things in order.  For example A might be more attractive than B but uglier than C.  We have the order C A  B in terms of attractiveness.  Crucially however, we can’t be sure that the difference between C and A is the same as the difference between A and B.  C and A might both be very attractive whereas B might be a complete minger.  We can’t tell that the intervals are the same.  

Usain Bolt won the men’s 200m at Beijing, Shawn Crawford was second and Walter Dix third.  From this we can’t tell if the difference between first and second was the same as the difference between second and third.  First, second, third provides ordinal data.

Interval and Ratio: allows us to put things in order (ascending or descending) just as ordinal, however this time we can be sure that the intervals are the same.  We know that the difference between 10cm and 11cm is the same as the difference between 15cm and 16cm.  The same applies to weight or mass, temperature and time. 

 

An odd one to consider is IQ.  The jury is out on this one.  Some psychologists believe it yields interval/ratio data, others that it is merely ordinal.

Generally speaking if you need a piece of equipment to measure it, then it’s interval or ratio.

For the purposes of statistics interval and ratio are taken as the same.  There is, however, a subtle difference.  Ratio data has a true zero, so there are no minus values, e.g. time, weight, height.  Interval data can be minus, e.g. temperature in degrees Celsius.  As a result you can say that 20cm is twice as long as 10cm, but you cannot say that 20°C is twice as hot as 10°C.
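If you want to convince yourself of the interval versus ratio point, a quick calculation helps.  The following is only an illustrative sketch (the function name is made up for this example): it converts Celsius to kelvin, a temperature scale that does have a true zero, and then compares the two temperatures.

```python
def celsius_to_kelvin(celsius):
    """Convert Celsius (interval scale) to kelvin (ratio scale, true zero)."""
    return celsius + 273.15


# Length is ratio data, so dividing the raw numbers is meaningful.
length_ratio = 20 / 10  # 20cm really is twice as long as 10cm

# Celsius is interval data: its zero is arbitrary, so the temperatures are
# converted to a scale with a true zero before being compared.
temperature_ratio = celsius_to_kelvin(20) / celsius_to_kelvin(10)

print(length_ratio)
print(round(temperature_ratio, 3))
```

Measured from a true zero, 20°C comes out only a few percent ‘hotter’ than 10°C, nowhere near double.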

 

2. Correlation or difference?

Provided you’ve given careful consideration to your procedure and are confident in what you’re looking for this should be easy.  Some groups have appeared confused in the past, particularly with issues such as the relationship between attractiveness and punishment.  This could be done either way:

You could produce an ascending scale of attractiveness and compare this to the level of punishment given to each person.  You would predict a negative correlation; as attractiveness increases, level of punishment given decreases. 

Alternatively you could split your photographs into two groups, with the beautiful people in one group and the mingers in the other.  Then record the level of punishment offered for each.  You are now looking for a difference between the two groups.  The danger is that, having formulated a hypothesis, you don’t stick to it.

Generally however, it should be obvious from your hypothesis what you’re looking for!

 

3.  Repeated or independent measures design

Again obvious since we’ve covered it many times.  If you’re using the same group of participants to assess both variables it’s repeated measures.  If the participants in one condition differ from the other it’s independent measures.  There are times when the decision is made for you.  Sex differences, age differences, cultural differences… they have to be different participants in each condition.

 

Decision time

Having decided on the above three dimensions, use the chart below to decide which test to use.  You will be expected to know about four of them: Chi squared, Wilcoxon’s sign test, Mann-Whitney ‘U’ and Spearman’s ‘rho.’

  

 

Type of Data   | Test of difference: Repeated Measures / matched Pairs | Test of difference: Independent measures / single participant | Test of correlation (relationship)
Nominal        | Sign Test                                              | Chi Squared                                                    | Chi Squared
Ordinal        | Wilcoxon sign test                                     | Mann Whitney ‘U’                                               | Spearman ‘rho’
Interval/ratio | Related ‘t’ test                                       | Independent (unrelated) ‘t’ test                               | Pearson product moment (‘r’)

 

e.g. if you have ordinal data with independent measures design and you’re looking for a difference, you will use Mann-Whitney ‘U.’ 
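If it helps, the chart can be thought of as a simple look-up.  The sketch below is just one way of writing it down; the function name and the labels used for the options are invented for this illustration, and the test names are copied from the chart.

```python
# Tests of difference, keyed by (level of data, design).
# Matched pairs is treated as repeated measures.
DIFFERENCE_TESTS = {
    ("nominal", "repeated"): "Sign test",
    ("nominal", "independent"): "Chi squared",
    ("ordinal", "repeated"): "Wilcoxon sign test",
    ("ordinal", "independent"): "Mann-Whitney U",
    ("interval/ratio", "repeated"): "Related t test",
    ("interval/ratio", "independent"): "Independent (unrelated) t test",
}

# Tests of correlation depend only on the level of data.
CORRELATION_TESTS = {
    "nominal": "Chi squared",
    "ordinal": "Spearman's rho",
    "interval/ratio": "Pearson product moment (r)",
}


def choose_test(level, aim, design=None):
    """Return the test the chart gives for these three decisions.

    level  : "nominal", "ordinal" or "interval/ratio"
    aim    : "difference" or "correlation"
    design : "repeated" or "independent" (only needed for a difference)
    """
    if aim == "correlation":
        return CORRELATION_TESTS[level]
    if aim == "difference":
        if design is None:
            raise ValueError("a test of difference needs a design")
        return DIFFERENCE_TESTS[(level, design)]
    raise ValueError("aim must be 'difference' or 'correlation'")


print(choose_test("ordinal", "difference", "independent"))  # Mann-Whitney U
```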

Now a little bit of play acting or imagination.  Let’s pretend you’ve done your experiment, collected your raw data, chosen the correct test to use and made your calculation.  All your numbers will have been put into tables or grids, you’ll have calculated means and added things up, squared and square-rooted, subtracted one group from another and perhaps done some dividing too.  At the end of this you’ll have calculated ONE number.  This number will magically tell you whether your results are meaningful and statistically significant, or whether they’ve more than likely occurred by chance and are little more than a fluke. 

 

Critical and observed values

The number you calculate is your observed value.  This needs to be compared with the critical value in the appropriate table.  Each test has its own table with various critical values depending on the level of significance 5% (0.05), 1% (0.01), 0.5% (0.005) and so on.  The critical value also varies depending on the number of participants or degrees of freedom. 

With Spearman’s rho and chi squared tests the number you calculate needs to be equal to or greater than the critical value for your findings to be significant.

Aide memoire

 ‘Spearman’s rho’ and ‘chi squared’ both contain ‘Rs’ as does the word gReater

‘Mann Whitney U’ and ‘Wilcoxon’s sign’ do not contain R.  With these two tests the observed value needs to be equal to or smaller than the critical value.
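The ‘gReater’ rule can be written out as a short check.  This is only a sketch: the function name and test labels are invented here, the numbers in the example are made up, and real critical values must always be read from the proper table.  It also assumes that any direction predicted by your hypothesis has been checked separately, since for Spearman’s rho it compares the size of the correlation and ignores the sign.

```python
# Tests with an 'R': observed must be equal to or gReater than critical.
GREATER_OR_EQUAL = {"spearman", "chi squared"}
# Tests without an 'R': observed must be equal to or smaller than critical.
SMALLER_OR_EQUAL = {"mann-whitney", "wilcoxon"}


def is_significant(test, observed, critical):
    """Compare an observed value with the critical value for the given test."""
    test = test.lower()
    if test in GREATER_OR_EQUAL:
        # abs() so that a negative correlation is judged by its size.
        return abs(observed) >= critical
    if test in SMALLER_OR_EQUAL:
        return observed <= critical
    raise ValueError("unknown test: " + test)


# Made-up values purely for illustration.
print(is_significant("chi squared", 5.2, 3.84))  # True
print(is_significant("mann-whitney", 30, 23))    # False
```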

 

Type one and type two errors

Type 1

This is believing you have found a significant result when you haven’t.  You reject the null hypothesis when it should be retained.  For example you might set too lenient a level of significance. 

Type 2

You’ve guessed it… this is believing you have found nothing of significance when you have.  This one is particularly annoying for an undergraduate piece of research.  You have accepted the null hypothesis when it should have been rejected.  This could happen if you set yourself too strict a level of significance (for example 0.1% rather than 5%).

 

 

Chi squared test

Use when you have nominal data with independent measures design.  Unlike the other tests, chi-squared can be used to test for a correlation or a difference. 

For example: Piaget’s three mountains test:

 

               | 5 year olds | 7 year olds | Totals
Successful     | a.  4       | b.  18      | 22
Not successful | c.  16      | d.  2       | 18
Total          | 20          | 20          | 40

 

You would put your raw data into a grid and then calculate the expected frequencies for each cell (a,b,c,d)

You then compare the scores you obtained with what would be expected by chance.  With some appropriate and very repetitive number crunching (especially if you have 20 cells) you calculate your observed value.

The chi squared test uses degrees of freedom calculated:

(Number of columns - 1)  x  (Number of rows - 1)

In this case (2 - 1)  x  (2 - 1) = 1 x 1 = 1

You look up your observed value in the appropriate table for 1 degree of freedom at the 5% level.

Your number needs to be equal to or greater than the critical value.
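For the curious, here is a rough sketch of the number crunching for the three mountains figures above.  You will not be asked to do this in the exam.  It assumes the usual textbook method (expected frequency = row total x column total / overall total) and leaves out the Yates’ correction that some books apply to 2 x 2 tables; the function and variable names are made up for the example.

```python
observed = [
    [4, 18],   # successful:     5 year olds (a), 7 year olds (b)
    [16, 2],   # not successful: 5 year olds (c), 7 year olds (d)
]


def chi_squared(table):
    """Return (chi squared, degrees of freedom) for a grid of frequencies."""
    row_totals = [sum(row) for row in table]
    column_totals = [sum(column) for column in zip(*table)]
    grand_total = sum(row_totals)

    chi2 = 0.0
    for i, row in enumerate(table):
        for j, obs in enumerate(row):
            # Frequency expected by chance for this cell.
            expected = row_totals[i] * column_totals[j] / grand_total
            chi2 += (obs - expected) ** 2 / expected

    # (number of columns - 1) x (number of rows - 1)
    degrees_of_freedom = (len(table[0]) - 1) * (len(table) - 1)
    return chi2, degrees_of_freedom


chi2, df = chi_squared(observed)
print(round(chi2, 2), df)  # 19.8 1
```

The value usually printed in tables for 1 degree of freedom at the 5% level is 3.84, so an observed value of about 19.8 is comfortably greater than the critical value.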

 

 

If asked to justify a choice of test do so in terms of whether you’re looking for a correlation or a difference, using an independent or repeated measures design and level of data obtained.

For example:  I chose to use Mann Whitney ‘U’ because I was looking for a difference with an independent measures design and would be obtaining data at the ordinal level.

Note: if using matched pairs design treat as repeated measures. 


 

 

Spearman’s Rho

Use when you are looking for an association (for example a correlation) with ordinal level of data.

For example, testing the matching hypothesis which predicts that men and women with similar levels of attractiveness are more likely to get married. 

This time you put your raw data in a table that looks like this:

 

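Couple | Groom | Bride | Rank (groom) | Rank (bride) | Difference between ranks | Difference squared
A      | 4     | 5     |              |              |                          |
B      | 4     | 4     |              |              |                          |
C      | 9     | 8     |              |              |                          |
D      | 2     | 10    |              |              |                          |
E      | 7     | 7     |              |              |                          |
F      | 8     | 8     |              |              |                          |
G      | 3     | 4     |              |              |                          |
H      | 8     | 9     |              |              |                          |
I      | 6     | 6     |              |              |                          |
J      | 4     | 5     |              |              |                          |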

You can complete the rest when we look at ranking a set of data.

Essentially you give each groom a rank dependent on their attractiveness compared to the other grooms and then repeat the process for the brides.  The higher the correlation the more similar the two sets of ranks (i.e. the more similar their levels of attractiveness).  When you calculate the difference in ranks, the more similar the attractiveness the smaller the differences.  You square the values to get rid of any negative values (remember -2 squared is 4 not -4!).

After a little more jiggery pokery you end up with an observed value… this time always between -1 and +1.

You look it up in the appropriate table.  This time the number of pairs is important.  There is a critical value at 5% that varies depending upon the number of pairs of participants.  Your observed value needs to be gReater than or equal to the critical value.
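As a sketch of what the ‘jiggery pokery’ might look like for the couples in the table, the code below ranks each column (tied scores share the average of the ranks they cover) and then applies the standard textbook formula, rho = 1 - 6 x (sum of squared differences) / (n x (n squared - 1)).  These notes do not spell the formula out, so treat it as an assumption, and note that with tied ranks it is only an approximation.  The function names are invented for the example.

```python
def average_ranks(scores):
    """Rank scores from lowest (rank 1) upwards; ties share the mean rank."""
    ordered = sorted(scores)
    ranks = []
    for score in scores:
        first = ordered.index(score) + 1       # first position this score takes
        count = ordered.count(score)           # how many scores are tied here
        ranks.append(first + (count - 1) / 2)  # mean of the tied positions
    return ranks


def spearman_rho(x_scores, y_scores):
    """Spearman's rho from the squared differences between paired ranks."""
    if len(x_scores) != len(y_scores):
        raise ValueError("Spearman's rho needs paired scores")
    n = len(x_scores)
    rank_x = average_ranks(x_scores)
    rank_y = average_ranks(y_scores)
    sum_d_squared = sum((rx - ry) ** 2 for rx, ry in zip(rank_x, rank_y))
    return 1 - (6 * sum_d_squared) / (n * (n ** 2 - 1))


# Couples A to J from the table.
grooms = [4, 4, 9, 2, 7, 8, 3, 8, 6, 4]
brides = [5, 4, 8, 10, 7, 8, 4, 9, 6, 5]

rho = spearman_rho(grooms, brides)
print(round(rho, 3))  # 0.409
```

Couple D (groom 2, bride 10) accounts for most of the squared differences, which is why the observed value is fairly modest even though the other nine couples are closely matched.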

 

 

 

Mann Whitney ‘U’ Test

Use when you are looking for a difference with ordinal data and an independent measures design. 

For example you might want to test the hypothesis that boys and girls take different subjects at A-level, boys preferring spatial and mathematical, girls preferring subjects that are more verbal.

To do this you allocate a score for each A-level subject…for example allocating spatial and mathematical subjects a low score: physics and maths (1), chemistry (2) etc and verbal subjects a high score English, French, German (10), politics and history (9) and so on…

You put your raw data in a table that looks like this:

 

Boys scores | Girls scores | Rank (boys) | Rank (girls)
            |              |             |
            |              |             |
            |              |             |
            |              |             |
            |              |             |
            |              |             |
            |              |             |
            |              |             |
            |              |             |
            |              |             |
            |              | Σ           |

Unlike a correlational test (Spearman’s), the boys and girls can go in any order… this is independent measures so there are no pairs as such.  Also, unlike Spearman’s, the number of boys’ and girls’ scores can be different.  You could have 10 boys and 12 girls for example.

This time you rank all the scores together… place all the boys’ AND girls’ scores in ascending order and calculate a rank.  For the calculation you only need to add up one set of ranks, in this case the boys.  Then following some other number crunching you end up with TWO values.  The smaller value is called ‘U’ and the larger value ‘U’’ (pronounced U prime).

You check U (smaller number) against the critical value for the number of participants in each column.  This time the observed value needs to be equal to or smaller than the critical value. 
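Here is a rough sketch of the ‘other number crunching’, using made-up scores for five boys and six girls.  It assumes the standard formula (number of boys x number of girls, plus boys x (boys + 1) / 2, minus the sum of the boys’ ranks), which these notes do not give, and the function names are invented for the example.

```python
def average_ranks(scores):
    """Rank scores from lowest (rank 1) upwards; ties share the mean rank."""
    ordered = sorted(scores)
    ranks = []
    for score in scores:
        first = ordered.index(score) + 1
        count = ordered.count(score)
        ranks.append(first + (count - 1) / 2)
    return ranks


def mann_whitney_u(group_a, group_b):
    """Return (U, U prime): the smaller and larger of the two U values."""
    n_a, n_b = len(group_a), len(group_b)
    ranks = average_ranks(group_a + group_b)  # rank everyone together
    rank_sum_a = sum(ranks[:n_a])             # only one set of ranks is added up
    u_a = n_a * n_b + n_a * (n_a + 1) / 2 - rank_sum_a
    u_b = n_a * n_b - u_a                     # the two values always sum to n_a x n_b
    return min(u_a, u_b), max(u_a, u_b)


# Hypothetical subject scores (low = spatial/mathematical, high = verbal).
boys = [1, 2, 2, 5, 9]
girls = [4, 6, 9, 10, 10, 10]

u, u_prime = mann_whitney_u(boys, girls)
print(u, u_prime)  # 3.5 26.5
```

It is the smaller value, here 3.5, that you would check against the table for 5 and 6 participants.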

 

 

 

 

Wilcoxon’s sign test

Use when you are looking for a difference, with a repeated measures design and ordinal data.

For example investigating the Mozart effect.  This is the idea that listening to the music of Wolfgang Amadeus Mozart (his real name was Johannes Chrysostomus Wolfgangus Theophilus Mozart, but I digress) will improve all manner of cognitive functions.  This could be tested using a repeated measures design.  On day 1 you get your participants to complete a memory task whilst listening to a popular contemporary instrumental track.  On day 2 they return and complete a similar task listening to Mozart.

Obviously a better design option here would be to counterbalance the order of the two conditions (ABBA, if you prefer).

Raw data would go on a table like this:

 

Participant | Mozart | Non-Mozart | Difference | Rank
A           |        |            |            |
B           |        |            |            |
C           |        |            |            |
D           |        |            |            |
E           |        |            |            |
F           |        |            |            |
G           |        |            |            |
H           |        |            |            |
I           |        |            |            |
J           |        |            |            |

Any differences of ‘0’ are ignored and not ranked.  The ranks of the positive differences are added up, and then the ranks of the negative differences.  The smaller of the two totals is taken and then it’s a very quick job to look up the value in an appropriate table for the appropriate number of participants (in the above case 10, less anyone whose difference was 0).  The simplest of all inferential tests to calculate.

Wilcoxon’s sign test contains no letter ‘R’ so this time the observed value needs to be equal to or smaller than the critical value found in the table.
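To see the steps in action, here is a sketch using made-up memory scores for ten participants.  The differences are ranked by size, ignoring whether they are plus or minus, and the function names are invented for the example.

```python
def average_ranks(values):
    """Rank values from lowest (rank 1) upwards; ties share the mean rank."""
    ordered = sorted(values)
    ranks = []
    for value in values:
        first = ordered.index(value) + 1
        count = ordered.count(value)
        ranks.append(first + (count - 1) / 2)
    return ranks


def wilcoxon_t(condition_1, condition_2):
    """Return (observed value, number of participants left after dropping zeros)."""
    differences = [a - b for a, b in zip(condition_1, condition_2)]
    non_zero = [d for d in differences if d != 0]    # zero differences are ignored
    ranks = average_ranks([abs(d) for d in non_zero])  # rank by size, ignoring sign
    positive_sum = sum(r for d, r in zip(non_zero, ranks) if d > 0)
    negative_sum = sum(r for d, r in zip(non_zero, ranks) if d < 0)
    return min(positive_sum, negative_sum), len(non_zero)


# Hypothetical memory scores for participants A to J.
mozart = [12, 15, 9, 14, 11, 13, 10, 16, 12, 14]
non_mozart = [10, 15, 11, 9, 10, 9, 10, 12, 13, 11]

t, n = wilcoxon_t(mozart, non_mozart)
print(t, n)  # 5.0 8
```

Two of the ten participants scored the same in both conditions, so the table would be read for 8 participants rather than 10.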