Consider the figure below:

This model contains eight arrows, and each one represents a separate hypothesis. Here are three examples of hypotheses.

  1. Arrow A represents a positive association between a dichotomous independent variable (Protestant) and an ordinal dependent variable (church attendance). A good formulation of this association is: “Protestants have a higher frequency of church attendance than members of other religious groups”. This formulation is better than ‘Protestant has an effect on church attendance’. The word ‘Protestant’ is variable speak. The formulation ‘Protestantism affects church attendance’ would also be incorrect. The term ‘Protestantism’ refers to a movement, not to a characteristic of survey respondents.
  2. Arrow B represents a positive association between an ordinal independent variable (church attendance) and a dichotomous dependent variable (volunteering). A good way to formulate this association is: “Volunteers attend church more often than non-volunteers.” This is a slightly better formulation than “the higher the frequency of church attendance, the higher the likelihood of volunteering” because it is easier to understand.
  3. Arrow C represents a positive association between two ordinal variables. A good formulation of this association is: “Altruistic values increase with the frequency of church attendance.”

Common pitfalls in the formulation of hypotheses are the following:

  1. The use of the words ‘important’ and ‘role’. If you say “religion plays an important role in volunteering” then it is unclear what your expectation is. When you find yourself using these words, reformulate your hypothesis so that it is clear to the reader how you are going to test it and what you expect the result of the test to be. For example, your hypothesis could be that Protestants are more likely to volunteer than Catholics, particularly in religious organizations.
  2. The failure to specify a direction, as in: “Gender is related to giving” or “There are gender differences in giving”. Usually there are good reasons to predict specific differences. In this case, a better formulation would be: “Women give more often than men, but in lower amounts”.
  3. The use of ‘data language’, such as variable labels: “PROT is positively associated with ATTEND” (representing Arrow A in the figure above). A better formulation would be: “Protestants attend church more often than members of other religious groups”. By the way: the use of data language in the description of your results is also a bad practice.
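
One way to check whether a hypothesis is specific enough is to write down the test it implies. The sketch below does this for Arrow A, under several assumptions that are not in the text above: church attendance is simplified to "attends frequently" versus "does not", the counts are made up purely for illustration, and the helper name `one_sided_two_proportion_test` is hypothetical. A directional hypothesis ("Protestants attend more often") maps onto a one-sided test, whereas "PROT is related to ATTEND" would not tell you which side to test.

```python
from math import sqrt
from statistics import NormalDist


def one_sided_two_proportion_test(successes_a, n_a, successes_b, n_b):
    """Test the directional hypothesis: proportion in group A > proportion in group B.

    Returns the difference in proportions, the z statistic, and the
    one-sided p-value from a pooled two-proportion z-test.
    """
    p_a = successes_a / n_a
    p_b = successes_b / n_b
    # Pooled proportion under the null hypothesis of no difference.
    pooled = (successes_a + successes_b) / (n_a + n_b)
    standard_error = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_a - p_b) / standard_error
    # One-sided: only a positive difference counts as support.
    p_value = 1 - NormalDist().cdf(z)
    return p_a - p_b, z, p_value


# Illustrative counts only (not real survey data):
# frequent attenders among Protestants vs. members of other religious groups.
diff, z, p_value = one_sided_two_proportion_test(60, 100, 40, 100)
print(f"difference = {diff:.2f}, z = {z:.2f}, one-sided p = {p_value:.4f}")
```

Because the hypothesis states a direction, a negative difference could never count as support, no matter how large it is. A vague hypothesis gives no such guidance.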

Suppose you would like to base your hypothesis on the results of previous studies. One way to do this is to write: “As many previous studies have shown, …”. A better way is to cite a number of those studies right after the first part of your sentence: “As many previous studies have shown (e.g., author 1 (year), author 2 (year)), …”.