Statistical Literacy and Mind Mapping LAB
Goals:
- To learn basic statistics required for interpreting physiology studies
- To learn how to make mind maps
Background:
Today we will cover a few important skills:
- We will discuss basic statistics, which is vital for scientific literacy. Most scientific findings rest on a statistical analysis of data.
- You will learn how to make a mind map, which is a powerful tool recommended by several local nursing programs. This tool requires you to organize the information that you learn in a way that shows the relationships between concepts and ideas.
General Instructions:
We will proceed as a class through each activity. Make sure to answer questions in order.
Pre-lab: No pre-lab today, but please read this section before coming to lab.
Activity 1: Basic Statistics
Statistics serves a number of purposes. First, it condenses large amounts of raw data into one or a few meaningful figures. Second, it facilitates comparisons between different sets of observations. Third, it allows scientists to test hypotheses and predict future trends. Because statistics are heavily used in scientific studies, and because these studies guide health care decisions, it is imperative that all students entering the health care field understand statistics.
The mean is the average of the data, its numerical center. The standard deviation (SD) measures the variation around the mean: if the data are highly variable, the standard deviation will be large. Roughly 68% of the data fall within 1 standard deviation of the mean, and 95% fall within 2 standard deviations of the mean. Outliers are values that fall several standard deviations above or below the mean. For example, suppose the mean on an exam is 75 points. If the standard deviation is 2, almost everyone got a C on that exam. If the standard deviation is 12, some people got very low scores and others very high ones. The mean and standard deviation are often depicted in a bar graph as the bar height (mean) and error bars above and below the mean. In the example below of the effect of antibiotic concentration on bacterial inhibition, the bars represent the mean area of inhibition, and the error bars show the size of one standard deviation above and below the mean. If the error bars do not overlap (or barely overlap), the two groups are likely different from each other. In this context, that would mean that a 100% concentration of the antibiotic inhibited more bacteria than a 50% concentration.
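To see how these numbers are computed, here is a minimal Python sketch (optional, not part of the lab) that calculates the mean and standard deviation of a small set of exam scores; the scores themselves are invented for illustration.

```python
# Mean and standard deviation for a small, made-up set of exam scores.
from statistics import mean, stdev

scores = [68, 72, 75, 75, 77, 79, 80, 83, 86, 90]   # invented data

m = mean(scores)    # the average
s = stdev(scores)   # sample standard deviation (variation around the mean)

within_1sd = [x for x in scores if m - s <= x <= m + s]
within_2sd = [x for x in scores if m - 2*s <= x <= m + 2*s]

print(f"mean = {m:.1f}, SD = {s:.1f}")
print(f"{len(within_1sd)} of {len(scores)} scores fall within 1 SD of the mean")
print(f"{len(within_2sd)} of {len(scores)} scores fall within 2 SD of the mean")
```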
Statistical tests can be used to determine whether two or more groups are different from each other. If only two groups are being compared, such as in the above example of bacterial growth, then a t-test can be used. A t-test uses mean and variability to determine whether two groups are significantly different from each other. If the variability is high and the means are close, then the two groups will likely be indistinguishable. If the variability is low and the means are far apart, there is more likely a difference between the groups (as in the above example). If more than two groups are being compared, an analysis of variance (ANOVA) test is performed. This is similar to the t-test, with more groups.
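As an optional illustration, the sketch below runs a t-test on two groups and a one-way ANOVA on three, using the third-party SciPy library; the inhibition-zone numbers are made up purely to show the mechanics.

```python
# t-test (two groups) and one-way ANOVA (three or more groups) on invented data.
from scipy import stats

zone_50  = [14, 15, 13, 16, 15]   # made-up inhibition zones, 50% antibiotic
zone_100 = [19, 21, 20, 22, 18]   # made-up inhibition zones, 100% antibiotic

t_stat, p_two_groups = stats.ttest_ind(zone_50, zone_100)
print(f"t-test p-value: {p_two_groups:.4f}")

zone_25 = [9, 10, 8, 11, 10]      # a third made-up group
f_stat, p_three_groups = stats.f_oneway(zone_25, zone_50, zone_100)
print(f"ANOVA p-value: {p_three_groups:.4f}")
```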
If data are not in discrete groups but are instead continuous, such as the number of hours each student in a class spent studying for an exam, you can measure the correlation, or strength of the relationship, between two variables. If the correlation is positive, as in the following graph of studying and exam scores, then as one variable goes up, so does the other. In a negative correlation, as one variable goes up, the other goes down (for example, the greater the depression, the lower the self-esteem). In either case, the strength of the relationship can be measured with a correlation coefficient, r. If r is 0, the dots are randomly scattered on the graph, and you can conclude that the two variables are unrelated. If r is 1 or -1, the dots all fall on a straight line. The closer the correlation coefficient is to 1 or -1, the stronger the relationship between the two variables.
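If you want to see how r is computed in practice, the short sketch below calculates a correlation coefficient for invented study-hours and exam-score data using SciPy.

```python
# Pearson correlation coefficient r for two continuous variables (invented data).
from scipy import stats

hours_studied = [1, 2, 3, 4, 5, 6, 7, 8]
exam_scores   = [62, 65, 70, 72, 78, 80, 85, 88]   # made-up scores

r, p_value = stats.pearsonr(hours_studied, exam_scores)
print(f"r = {r:.2f}, p = {p_value:.4f}")
# r near +1: strong positive relationship; r near -1: strong negative; r near 0: no relationship.
```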
Many studies consider the effect of more than one variable on the outcome of interest. For example, if you are curious about the factors affecting exam scores, you are interested in the number of hours studied, but also the number of hours of sleep, hours of work, and hours of partying. These can all be factored in with an analysis called a regression. A regression analysis will tell you which of the factors best predict the exam score.
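As an optional sketch of what a regression looks like in practice, the example below fits a multiple regression to invented data using the third-party pandas and statsmodels packages; the variable names and values are assumptions chosen only to mirror the example above.

```python
# Multiple regression: which of several predictors best explains exam score?
import pandas as pd
import statsmodels.api as sm

data = pd.DataFrame({                                    # invented data
    "hours_studied":  [2, 4, 6, 8, 3, 5, 7, 9],
    "hours_sleep":    [5, 7, 6, 8, 6, 7, 8, 7],
    "hours_partying": [6, 2, 3, 1, 5, 4, 2, 1],
    "exam_score":     [60, 75, 80, 92, 65, 78, 88, 95],
})

X = sm.add_constant(data[["hours_studied", "hours_sleep", "hours_partying"]])
y = data["exam_score"]

results = sm.OLS(y, X).fit()
print(results.params)    # estimated effect of each predictor on exam score
print(results.pvalues)   # which predictors are statistically significant
```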
Remember, however, that correlation is not causation! The graph above might mean that increasing the number of hours a particular student studies will increase her exam grades, but it might not. It is also possible that the type of people who study for many hours are the type of people who score well on exams, even though the extra study hours themselves are making no difference. The actual cause of the higher exam grades could be something else that people who tend to study more have in common, like a high degree of interest in the material, good writing skills, or paying attention in class.
Another test commonly performed in medicine is the chi-square test. This test is used to determine whether there is a significant difference between the expected data and the observed data, so that you can test whether your data fit your hypothesis.
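A minimal sketch of a chi-square test, assuming an invented example in which a hypothesis predicts a 50/50 split of outcomes among 100 patients:

```python
# Chi-square test: do observed counts differ from what the hypothesis predicts?
from scipy import stats

observed = [62, 38]   # made-up observed counts (recovered vs. not recovered)
expected = [50, 50]   # counts expected under the hypothesis

chi2, p_value = stats.chisquare(f_obs=observed, f_exp=expected)
print(f"chi-square = {chi2:.2f}, p = {p_value:.4f}")
```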
Hazard ratios are often used in medical research when there is a risk of a negative outcome. The hazard ratio is set to 1 for no increased risk. If the hazard ratio of a certain outcome (for example, cancer) is greater than 1, it means that there is an increased risk of that outcome. If the hazard ratio for a certain outcome (for example, osteoporosis) is less than 1, there is a decreased risk of that outcome.
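Hazard ratios are usually estimated from a survival model such as the Cox proportional hazards model. The optional sketch below assumes the third-party lifelines and pandas packages and uses invented follow-up data; the exp(coef) value it reports for the risk factor is the hazard ratio.

```python
# Estimating a hazard ratio with a Cox proportional hazards model (invented data).
import pandas as pd
from lifelines import CoxPHFitter

df = pd.DataFrame({
    "years_followed": [2, 5, 3, 8, 6, 4, 7, 1, 9, 5, 6, 3],
    "got_cancer":     [1, 0, 1, 0, 0, 1, 0, 1, 0, 1, 0, 1],  # 1 = outcome occurred
    "smoker":         [1, 0, 1, 0, 0, 1, 0, 1, 0, 0, 1, 1],  # risk factor of interest
})

cph = CoxPHFitter()
cph.fit(df, duration_col="years_followed", event_col="got_cancer")
print(cph.summary)   # the exp(coef) column for "smoker" is the hazard ratio
```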
Statistical tests estimate the chance that an apparent effect of your treatment is due to chance alone; they do not give you a definitive answer. Look at the first graph comparing antibiotic treatments. Does it definitively prove that the concentration of antibiotic makes a difference? The answer is no. There is always the possibility that what looks like a difference is actually random luck. Statistics can be used to measure the chance that a difference as large as the one you observed would appear even if the groups were NOT actually different. This chance is called the p-value. Scientists have arbitrarily set the cutoff for calling a result “statistically significant” at a p-value of 0.05 (or 5%). This means that if the p-value is above 0.05 (say 0.1, a 10% chance of seeing such a difference by luck alone), we are not confident enough to say that the groups are indeed different. However, if the p-value is below 0.05 (say 0.01, or a 1% chance), then we ARE confident enough to call the result statistically significant. Significant p-values are often marked with asterisks above the bars.
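In code, applying the 0.05 cutoff is a one-line comparison; the p-value below is an assumed result used only to show the decision rule.

```python
# Applying the conventional 0.05 cutoff to a p-value from any of the tests above.
p_value = 0.01   # assumed result for illustration

if p_value < 0.05:
    print("Statistically significant: unlikely to be due to chance alone.")
else:
    print("Not statistically significant: the apparent difference could be chance.")
```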
If multiple groups or factors are all being tested at the same time, we must be more conservative with the p-value to decrease the number of incorrect conclusions. This is when corrections are made to the p-value for multiple comparisons. These corrections are referred to by names such as Tukey, Scheffe, and Bonferroni.
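The Bonferroni correction is the simplest of these to illustrate: the significance cutoff is divided by the number of comparisons. The sketch below uses three assumed p-values.

```python
# A simple Bonferroni correction for multiple comparisons.
p_values = [0.04, 0.01, 0.20]             # assumed results from three comparisons
alpha = 0.05
corrected_alpha = alpha / len(p_values)   # 0.05 / 3, roughly 0.0167

for p in p_values:
    verdict = "significant" if p < corrected_alpha else "not significant"
    print(f"p = {p}: {verdict} at the corrected cutoff of {corrected_alpha:.4f}")
```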
You have now learned about the p-value, which measures statistical significance. However, keep in mind that a statistically significant (i.e., non-random) effect can also be extremely small, or even irrelevant. Thus, it is at least equally important to consider biological significance. This refers to whether the finding actually matters, and includes factors such as the magnitude and relevance of the effect, how the effect was achieved (how much of what kind of treatment), and any negative side effects. For example, a study might find that eating five pounds of carrots a day causes statistically significant weight loss. Is it time for the carrot diet? What if the average weight loss after eight weeks was 0.02 pounds, the incidence of diarrhea was 75%, and the incidence of carotenemia (yellow pigmentation of the skin) was 69%?
In addition to biological significance, you should also consider the scope of inference, which is the population to which the study pertains. If the study was done on lab rats, is it necessarily applicable to humans? If the study was done on middle-aged Caucasian women, is it necessarily applicable to men? Young women? People of Asian descent?
Last, consider any study biases. There are two main forms of error in a scientific study. Random error occurs because the groups do not consist of clones, as is almost always the case. Suppose you are studying the effect of vitamin D on the development of cancer, and there are people in each group who are closet smokers, or who eat more meat than others. Studies should attempt to minimize this form of error by screening participants, because it increases variability and makes it harder to get a significant p-value. However, this form of error is normal and not lethal to a study; in fact, with a large sample size, its effect decreases. The other form of error, systematic error, is lethal to a study because it introduces bias. This occurs when the group compositions differ in a way that was not intended. For example, when studying the effect of vitamin D on cancer, if you place all of the vegetarians in the vitamin D group and all of the meat eaters in the placebo group, you will no longer be able to determine whether any difference between the groups is due to vitamin D or to diet. In sum, systematic error leads to bias, making it impossible to draw conclusions about the experiment.
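One common guard against systematic error is random assignment, which spreads hidden differences (diet, smoking, and so on) evenly across groups. A minimal sketch, using invented participant names:

```python
# Randomly assigning participants to groups to help prevent systematic error (bias).
import random

participants = ["Ana", "Ben", "Chau", "Dev", "Elena", "Farid", "Grace", "Hugo"]
random.shuffle(participants)   # shuffle so assignment does not follow any pattern

half = len(participants) // 2
vitamin_d_group = participants[:half]
placebo_group   = participants[half:]

print("Vitamin D group:", vitamin_d_group)
print("Placebo group:  ", placebo_group)
```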
Statistical Literacy and Mind Mapping LAB Questions:
1-5. Consider the following data graph:
Figure 1: Mean ± SD duration of cold symptoms in days with zinc or with placebo. Asterisks denote statistically significant differences at p<0.05.
- Approximately what are the mean durations of cold symptoms with zinc and with placebo?
- Approximately what is the size of 1 standard deviation with zinc? With placebo?
- Which statistical test was likely performed to determine whether there is a statistically significant effect of zinc on duration of cold symptoms?
- Are you confident about the effect of zinc on duration of cold symptoms? Why or why not?
- This study was performed on adults who took several zinc pills within 3 days of onset of symptoms. The dominant side effect was that 64% of patients had nausea. Discuss the biological significance of the effect of zinc on cold duration.
- Do you think the standard deviation of heights in this classroom is closer to 6” or 16”? Why?
- The mean on an exam was 79 and the standard deviation was 7. You got an 89. Are you within 1, 2, or 3 standard deviations of the mean? Does this make you an “outlier”?
- In an antibiotic trial, there is a difference in mean bacterial load between treatment and control. Why is this difference not enough to conclude that there is an effect of treatment? What else needs to be factored in before statistical significance can be determined?
- In the same antibiotic trial, what do you need to know to determine if the treatment is biologically significant?
- Which of the following most likely has a correlation coefficient of 0.8: commute time and exam score, quadriceps diameter and jump distance, or head circumference and intelligence?
- Which of the following will have a hazard ratio of less than 1: amount of smoking and lung cancer, amount of jumping jacks and math abilities, or amount of dietary fiber ingestion and constipation?
- A study showed that people who eat a variety of nuts are less likely to develop heart disease, with a p-value of 0.02. What does this p-value mean? Is this a statistically significant finding?
- Does the nut study in the previous question mean that eating nuts causes improved heart health? What are other possible explanations for the result? What else would you like to know before recommending that your patients eat more nuts?
- A patient tells you that she has started drinking homemade shakes into which she pours 5 tablespoons of cinnamon because she read a news headline: “Cinnamon could boost weight loss, recent study finds.” You look up the study and find that it was done on rats. What should you tell the patient?
- A study found a link between excessive screen time and suicide risk. The study was performed on teenagers in Florida. Does it mean that screen time causes suicide risk? What are other explanations?
- Based on this study, would you advise parents of 10 year olds to reduce screen time in order to reduce suicide risk? Why or why not?
- You want to test the effect of the energy drink Red Bull on athletic performance. You invite your classmates to pick either Red Bull or water, and then you have everyone race to see if the Red Bull group or the water group ran faster. Could you have biased your study? How?
- You may have noticed that health recommendations change over time. Whether or not eggs are deemed to be healthy has flip-flopped several times. Why do you think these recommendations change? What does that tell you about the scientific process?
Activity 2: Mind Mapping
A mind map is a learning tool that helps students organize information and find relationships between facts. Mind maps are very popular in health care programs, particularly in nursing school. A mind map begins with a main concept or idea that the rest of the map revolves around. From that main idea, branches are created, each representing a word or phrase that relates to the main topic. Sub-branches that stem from the main branches further expand on those concepts. It is helpful to use different colors and images to differentiate the branches.
A side note: concept maps are similar to mind maps, though they are closer to flow-charts in nature. They usually contain general concepts at the top of the map, with more specific concepts arrayed hierarchically below. The connector lines usually imply causation. These are the types of maps often shown in lecture.
Today you will practice making a mind map. Take a moment to look at the one for homeostasis, below.
Now create a mind map of the muscular system. (7 points) Hopefully this tool will help you study in the future!