SALMON: Study And Learning Materials ONline
Planning and Designing your Research: From research question to statistical analysis

This page contains the advice I give to students embarking upon the difficult task of designing and running their own research projects.

Pose a research question

Students sometimes experience difficulty arriving at a research question. Although it is useful to read the literature, this can become a delaying tactic. Your research question will not simply emerge if you read more and more research papers. As you read more and more papers you will discover that you know less and less. You end up realizing that you will never know all there is to know. This is a useful lesson in life, but it will not help you design your project. By all means read something before you begin. A good textbook is often better at giving you a broad overview than plowing through a pile of detailed journal articles.

The crucial thing is to think about what you have read.

A common mistake at this stage is to give a broad area of research interest rather than a research question per se. To state the blindingly obvious, a research question must end with a question mark! Thus "I am interested in care shown by family members" is not a research question.

Here is the research question I will use for illustrative purposes on this web page: "Does inclusive fitness theory predict grandparents' investment in their grandchildren?". The theoretical background to this question is discussed on the web page available via the SALMON home page.

Refine your research question

This stage can be tricky. It is often very time consuming and it does involve creative thinking. It will not be solved by simply reading more of the literature. But it is crucial to devote time and effort at this stage. Poor planning at the early stage of a project either causes difficulties later on, or simply produces a poor piece of work. It involves much thought. Here are some pointers that you can use to check that you are going in the right direction:

  • Quite often this stage involves writing down several questions.
  • Each question is narrowly focussed and very specific.
  • Each question suggests a research design.
  • For example, our initial question: "Does inclusive fitness theory predict grandparents' investment in their grandchildren?" might lead to the following set of questions:

  • Which grandparent did participants spend most time with when growing up?
  • Which grandparent did participants spend least time with when growing up?
  • Which grandparent did participants feel most emotionally close to when growing up?
  • Which grandparent did participants feel least emotionally close to when growing up?
    Design an experiment or series of experiments

    Consider the first question: "Which grandparent did participants spend most time with when growing up?"

    You might ask participants to rank each grandparent in order of the amount of time they spent with them.

    • Rank order 1 is given to the grandparent they spent the most time with
    • Rank order 2 is given to the grandparent they spent the second most time with
    • Rank order 3 is given to the grandparent they spent the third most time with
    • Rank order 4 is given to the grandparent they spent the least time with

    You will need to spend some time designing the task, i.e. deciding how you present this forced-choice rank ordering to participants. For example, you need to use simple, straightforward language, and you need to make sure you identify each grandparent unambiguously.
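One practical consequence of a forced-choice ranking is that every response must use each of the ranks 1 to 4 exactly once. The following is a minimal sketch of how you might check responses before analysis. The function name and the dictionary layout are assumptions made for illustration, and the labels are the abbreviations used in Table 1 further down this page.

```python
# Abbreviations: MoMo = mother's mother, MoFa = mother's father,
# FaMo = father's mother, FaFa = father's father.
GRANDPARENTS = ("MoMo", "MoFa", "FaMo", "FaFa")


def is_valid_ranking(response):
    """Return True if every grandparent has a distinct rank from 1 to 4.

    `response` is a hypothetical layout: a dict mapping each grandparent
    label to the rank the participant gave.
    """
    if set(response) != set(GRANDPARENTS):
        return False  # a grandparent is missing or an unknown label is present
    return sorted(response.values()) == [1, 2, 3, 4]


# A complete ranking passes; a response with a tied rank does not.
print(is_valid_ranking({"MoMo": 1, "MoFa": 4, "FaMo": 3, "FaFa": 2}))
print(is_valid_ranking({"MoMo": 1, "MoFa": 1, "FaMo": 3, "FaFa": 2}))
```

A check like this will also flag participants who left a grandparent blank, which is worth thinking about at the design stage.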

    Write down your hypotheses

    You need to be clear in your own mind what you expect to find when you analyse the data. Consider the question: "Which grandparent did participants spend most time with when growing up?" The null hypothesis is that there will be no difference in the amount of time participants spent with each of their four grandparents whilst growing up. However, inclusive fitness theory predicts that they will have spent more time with their maternal grandmother. Thus the alternative hypothesis is that they will have spent more time with their maternal grandmother than with their other grandparents whilst growing up.

    Decide how to statistically analyse your results

    It is crucial to decide how you will analyse your data before you begin to collect any data. This will protect you from collecting data that is difficult or impossible to analyse.

    You may find it useful to create a table showing the data you will collect. For example, if you asked participants to rank each grandparent in the order of the amount of time they spent with each grandparent, you might end up with the following set of results:

    Table 1. Rank order of the relative amount of time spent with each grandparent.

    Participant number   MoMo   MoFa   FaMo   FaFa
    101                  1      4      3      2
    102                  2      1      4      3
    103                  1      2      4      3
    104                  4      2      3      1

    Notes: MoMo = Mother's mother; MoFa = Mother's father; FaMo = Father's mother; FaFa = Father's father. 

    Rank order 1 = most time spent with this grandparent, rank 4 = least amount of time spent with this grandparent
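Before running any test it can help to summarise a table like this. The sketch below enters the four illustrative rows of Table 1 and computes the mean rank given to each grandparent. The variable and function names are assumptions chosen for this example. A lower mean rank means more time spent with that grandparent.

```python
from statistics import mean

COLUMNS = ("MoMo", "MoFa", "FaMo", "FaFa")

# The illustrative rows of Table 1, keyed by participant number.
TABLE_1 = {
    101: (1, 4, 3, 2),
    102: (2, 1, 4, 3),
    103: (1, 2, 4, 3),
    104: (4, 2, 3, 1),
}


def mean_ranks(table):
    """Return the mean rank per grandparent (lower = more time spent)."""
    columns = zip(*table.values())  # turn participant rows into columns
    return {name: mean(ranks) for name, ranks in zip(COLUMNS, columns)}


for name, value in mean_ranks(TABLE_1).items():
    print(f"{name}: {value}")
```

Because each participant hands out the ranks 1 to 4 exactly once, the four mean ranks must add up to 10. If they do not, something has gone wrong with data entry.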

    How would you analyse this set of results? You need to ask yourself questions about the nature of the measurement scale you employed. Can you use parametric statistics or is a non-parametric test more appropriate? Your Research Methods course should have prepared you to tackle this issue. I would use a non-parametric test (Friedman) to test whether there was a difference in the amount of time spent with each grandparent and then, if the Friedman test proved significant, follow it up with pairwise Wilcoxon signed-rank tests. Each participant ranks all four grandparents, so the comparisons are within participants; a Mann-Whitney U test is designed for independent groups and would not be appropriate here.
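To make the Friedman test less of a black box, here is a minimal sketch that computes the Friedman statistic by hand for the four illustrative rows of Table 1. It uses the standard formula for data with no tied ranks within a participant, which a forced-choice ranking guarantees. The function name is an assumption, and 7.815 is the usual tabled chi-square critical value for 3 degrees of freedom at the 0.05 level. With only four participants the chi-square approximation is poor, so treat this purely as a dry run of the arithmetic, not as a result.

```python
def friedman_statistic(rows):
    """Friedman chi-square statistic for rows of untied ranks.

    Each row holds one participant's ranks across the k conditions.
    """
    n = len(rows)      # number of participants
    k = len(rows[0])   # number of conditions (here, grandparents)
    rank_sums = [sum(column) for column in zip(*rows)]
    sum_of_squares = sum(r * r for r in rank_sums)
    return 12.0 / (n * k * (k + 1)) * sum_of_squares - 3.0 * n * (k + 1)


# Rows of Table 1 in the order MoMo, MoFa, FaMo, FaFa.
TABLE_1_RANKS = [
    (1, 4, 3, 2),
    (2, 1, 4, 3),
    (1, 2, 4, 3),
    (4, 2, 3, 1),
]

# Tabled chi-square critical value for df = k - 1 = 3 at alpha = 0.05.
CRITICAL_VALUE = 7.815

q = friedman_statistic(TABLE_1_RANKS)
print(f"Friedman statistic = {q:.2f}")
print("Exceeds critical value:", q > CRITICAL_VALUE)
```

You should get the same statistic from your stats package when you enter the same four rows. If you do not, check how you have laid out the data.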

    Draw graphs showing all possible outcomes of your experiment

    Consider the research question: Which grandparent did most of the participants spend most time with when growing up?

    There are several possible outcomes to our study:

    • participants may spend most time with their maternal grandmother when growing up, or

    • participants may spend most time with their paternal grandfather when growing up, or

    • participants may spend an equal amount of time with each grandparent when growing up, or

    • participants may spend more time with maternal than paternal grandparents

    You need to decide if a particular pattern of results is consistent or inconsistent with inclusive fitness theory.

    Why bother to consider the possible outcomes of an experiment at the design stage? You should try to design an experiment in which any possible pattern of results would be interesting. But - to be frank - it is very difficult to achieve this ideal. For example, the lack of a significant difference between groups is usually not very interesting.

    The following four histograms illustrate the four possible outcomes to our experiment. You should understand how each outcome relates to the theory underlying your research.
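If you want to sketch such outcomes for your own study, a few lines of code will do. The example below draws a rough text histogram for each of the four outcomes listed above and flags whether the maternal grandmother comes out on top. All of the counts are invented purely for illustration (a hypothetical 40 participants, counting how many ranked each grandparent first), and the function names are assumptions.

```python
# Invented counts: how many of 40 hypothetical participants ranked each
# grandparent first. These are NOT real data.
OUTCOMES = {
    "Most time with maternal grandmother":
        {"MoMo": 22, "MoFa": 7, "FaMo": 6, "FaFa": 5},
    "Most time with paternal grandfather":
        {"MoMo": 5, "MoFa": 6, "FaMo": 7, "FaFa": 22},
    "Equal time with each grandparent":
        {"MoMo": 10, "MoFa": 10, "FaMo": 10, "FaFa": 10},
    "More time with maternal than paternal grandparents":
        {"MoMo": 15, "MoFa": 15, "FaMo": 5, "FaFa": 5},
}


def text_histogram(counts):
    """Return one bar of '#' characters per grandparent."""
    return "\n".join(f"{name:<5}{'#' * n} {n}" for name, n in counts.items())


def favours_maternal_grandmother(counts):
    """True if MoMo was ranked first strictly more often than anyone else."""
    others = [n for name, n in counts.items() if name != "MoMo"]
    return counts["MoMo"] > max(others)


for title, counts in OUTCOMES.items():
    print(title)
    print(text_histogram(counts))
    print("MoMo clearly first:", favours_maternal_grandmother(counts))
    print()
```

The check used here is deliberately crude. Deciding what counts as consistent with the theory is your job, and the statistical test, not the shape of the bars, tells you whether a difference is reliable.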

    Consider the ethical issues involved in your study

    Do not deceive participants about the purpose of your study. Do not deceive participants about what will happen to them during the study.

    You should be open and honest about the purpose of your study and what it will involve for the participant. Do not withhold any information which, had participants known it beforehand, might have led them to refuse their consent.

    For example, you might say to participants: "I am interested in how much care and attention you received from your grandparents when you were growing up. If you agree to participate in this study you will be asked to say which of your grandparents paid you the most attention, and which paid you the least attention, whilst you grew up."

    Do you have any questions about the purpose of this study?

    Do you have any questions about what you will be asked to reveal about yourself during the study?

    Do you want to take part in the study?

    You do not need to reveal to participants the precise pattern of results you expect to obtain from the study. After all, you do not know this information before you carry out the study.

    Dealing with the data from your experiment

    You may find it useful to create a scatter plot of your raw data. This will help you visually identify trends in your data and highlight 'outliers' - data points that lie outside the range of the other data points in a particular condition. It will also give you an idea of what to expect when you analyse the data using a stats package on a computer.
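Alongside a plot, you can screen for outliers numerically. One common convention (not the only one) is to flag any point lying more than 1.5 times the interquartile range beyond the quartiles. The sketch below assumes a continuous measure, such as hours per week spent with a grandparent. The scores are invented for illustration and the function name is an assumption. A rule like this makes little sense for rank data such as Table 1.

```python
from statistics import quantiles


def iqr_outliers(values):
    """Return the values lying more than 1.5 * IQR beyond the quartiles."""
    q1, _, q3 = quantiles(values, n=4)  # lower quartile, median, upper quartile
    spread = q3 - q1
    low, high = q1 - 1.5 * spread, q3 + 1.5 * spread
    return [v for v in values if v < low or v > high]


# Invented scores for one condition; the value 40 is the deliberate oddity.
hours = [10, 12, 11, 13, 12, 11, 40]
print(iqr_outliers(hours))
```

Flagging a point is not the same as deleting it. Look at the raw record first: it may be a typing error, or it may be a genuine and interesting participant.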

    If you are using stats software on a computer you may find it useful to perform a dry run using data from a statistics textbook. Find a design that is the same as your own. Put the data from the book's worked example into the stats package. You should get the same answer from the stats package as that given in the textbook. If not, you have done something wrong.

    Writing up your research

    At the beginning of this page I urged you not to get bogged down in reading all the journal articles in your research area before designing your experiment. I advised you to think about what you had read, rather than go on reading ad nauseam. The reason for this advice is to prevent you getting bogged down, confused and depressed before you have even started your project. Bear in mind that you are conducting an undergraduate project. It is not the end of the world if someone else has already carried out what you thought was an original piece of research. However, it is the end of the world if a member of staff carries out a piece of research they considered to be original only to discover it has been published already. Scientists want to publish original findings. It's a bit like the Olympics but more so - there are no silver and bronze medals. This is why professional scientists are paranoid about reading absolutely everything in their area.

    But of course you will eventually have to show in the introduction that you understand the research literature in your chosen field of research. Use your time wisely. Try to get hold of a recent review article of your research area in a good journal. Read the article and follow up any of the references that seem particularly important to your study.

    Remember to adopt the 'hour-glass' structure for introduction and discussion sections that you were taught to use earlier in your course.

    Do and don't

    • Don't add pieces on to your design just for the sake of padding out your project. This can lead to problems of justification at the write-up stage. Don't include a variable (e.g. sex of participant) unless there is a reasonable chance that it will make a difference to the results.
    • Students often ask how many participants they need to run. The answer can be estimated using a power analysis. This is covered in stats books / lectures.
    • Don't worry about beginning to design your project as a straightforward replication of an existing piece of research. Students often worry unnecessarily that they should avoid replication. In reality, if you start with what seems like a replication you will often end up extending the existing study, particularly if you spend some time thinking about the design and analysis of your project.
    • Don't put off your project. It can lead to extra stress when you come to analyse and write up your work.
    • Do contact your supervisor by email if you need help or guidance. Writing an email helps to clarify what the problem is before a face-to-face meeting. I may request a face-to-face meeting, especially if I think the advice I need to offer is not straightforward, or there is a choice that you need to understand.
    • Please don't assume that I will be able to answer emails or see you face-to-face outside term time. For example, don't plan to "do the project" during the Easter vacation if you think you will need supervision.
    • Finally, good luck with your project. Enjoy it, after all you will have hopefully picked a topic that fires your imagination.
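As a footnote to the point above about participant numbers, here is a minimal sketch of a power calculation. It assumes the simplest textbook case, a two-tailed comparison of two independent groups, and uses the normal approximation, so it will differ slightly from the figures a dedicated power program gives. It does not apply directly to a Friedman design. The function name and the default values for alpha and power are assumptions based on common convention.

```python
from math import ceil
from statistics import NormalDist


def participants_per_group(effect_size, alpha=0.05, power=0.80):
    """Approximate n per group for a two-tailed, two-group comparison.

    effect_size is Cohen's d, the expected difference between the group
    means in standard deviation units.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-tailed criterion
    z_power = NormalDist().inv_cdf(power)
    return ceil(2 * ((z_alpha + z_power) / effect_size) ** 2)


# A 'medium' effect (d = 0.5) versus a 'large' effect (d = 0.8).
print(participants_per_group(0.5))
print(participants_per_group(0.8))
```

The main lesson is how quickly the required number of participants rises as the expected effect gets smaller, which is worth knowing before you commit to a design.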