- If there are 6 green marbles and 3 red marbles in a bag, and you were asked to draw one without looking, what is the probability of drawing one green marble?
- If you don’t put that green marble back in the bag, what is the probability of drawing another green marble?
In §4.4 we are introduced to the concept of conditional probability. The notation P(A | B) denotes the probability of event A occurring given that we know event B has occurred. That’s the definition.
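One way to check the definition against the green and red marble questions above is to list every equally likely pair of draws and count. This is only a sketch: the event names (B = "first marble is green", A = "second marble is green") and the variable names are assumptions made for illustration, and it relies on the standard identity P(A | B) = P(A and B) / P(B).

```python
from fractions import Fraction
from itertools import permutations

# 6 green and 3 red marbles, drawn twice without replacement.
marbles = ["G"] * 6 + ["R"] * 3

# Every ordered pair of distinct marbles is equally likely (9 * 8 = 72 pairs).
draws = list(permutations(marbles, 2))

# B: the first marble is green. A and B: both marbles are green.
b = [d for d in draws if d[0] == "G"]
a_and_b = [d for d in b if d[1] == "G"]

p_b = Fraction(len(b), len(draws))
p_a_and_b = Fraction(len(a_and_b), len(draws))

# Conditional probability: P(A | B) = P(A and B) / P(B)
p_a_given_b = p_a_and_b / p_b

print(p_b)          # 2/3
print(p_a_given_b)  # 5/8
```

Notice that the second probability differs from the first only because the bag changed after the first draw.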
Now consider conditional probability in the Monty Hall Problem, which is introduced in the following videos:
- After watching these videos, we know that if we are given the option to switch doors, it is, probabilistically speaking, in our best interest to switch. Why is the probability of winning NOT 50/50 when the contestant is given the opportunity to switch?
- Find the answers to the following. Given that there are 5 yellow marbles and 7 blue marbles in a bag,
- A. What is the probability of drawing a blue marble given that a yellow one was already drawn?
- B. What is the probability of drawing a yellow marble, not replacing it, and then drawing a blue marble?
- Which one, A or B, exemplifies conditional probability, and why?
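Returning to the Monty Hall question in this section: if the 50/50 intuition is hard to shake, a simulation can help. The sketch below assumes the usual rules (three doors, one prize, and a host who always opens a door that is neither your pick nor the prize). The function names `play` and `win_rate` are made up for this example.

```python
import random

def play(switch, rng):
    """Play one round; return True if the contestant wins the prize."""
    doors = [0, 1, 2]
    prize = rng.choice(doors)
    pick = rng.choice(doors)
    # Assumed rule: the host opens a door that is neither the pick nor the prize.
    opened = rng.choice([d for d in doors if d != pick and d != prize])
    if switch:
        # Move to the one remaining closed door.
        pick = next(d for d in doors if d != pick and d != opened)
    return pick == prize

def win_rate(switch, trials=100_000, seed=0):
    """Estimate the chance of winning over many simulated rounds."""
    rng = random.Random(seed)
    return sum(play(switch, rng) for _ in range(trials)) / trials

stay_rate = win_rate(switch=False)
switch_rate = win_rate(switch=True)
print(f"stay: {stay_rate:.3f}, switch: {switch_rate:.3f}")
```

Under these assumptions the two rates should settle near 1/3 and 2/3. The host's choice depends on where the prize is, which is why the two remaining doors are not equally likely.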
In §6.2 we are introduced to the Normal Probability Distribution and its special case, the Standard Normal Probability Distribution, which is a Normal Probability Distribution with mean (μ) zero and variance (σ²) one.
Watch these videos
- What is a z score?
- What is the purpose of a z score?
- If a z score were found to be 3.1, where on the graph would it be? Is this a rare or a common score? Why?
One way to find probabilities from a Standard Normal Distribution is to use probability tables, which are in your book or in online tables.
A hint from Dr. Klotz: This is my favorite online z table. http://www.z-table.com/ Don’t lose that link!
- According to the table, what is the probability when z ≤ -2.21? The probability when z ≤ 2.21?
- According to the table, what is the probability when z ≤ -0.47? The probability when z ≤ 0.47?
- Show the math of adding each of the pairs in the two previous questions. What is the total each time? Why is that the total?
- Elenore says that her probability found from the z table was -1.78. How do you know she made an error?
- What are the properties of the Standard Normal Probability Distribution?
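If you want to double-check your table lookups, Python's standard library can act as a z table. This is a sketch, not a replacement for the table: `z_score` is a hypothetical helper using the usual formula z = (x − μ) / σ, and the numbers 85, 70, and 10 are invented purely to show the call.

```python
from statistics import NormalDist

# Standard Normal Distribution: mean 0, standard deviation 1.
std_normal = NormalDist(mu=0, sigma=1)

def z_score(x, mu, sigma):
    """Number of standard deviations that x lies from the mean."""
    return (x - mu) / sigma

# Hypothetical values, for illustration only.
example_z = z_score(85, mu=70, sigma=10)
print(f"example z score: {example_z}")

# cdf(z) is the area to the left of z, which is what a z table reports.
for z in (2.21, 0.47):
    lower = std_normal.cdf(-z)
    upper = std_normal.cdf(z)
    print(f"P(Z <= {-z}) = {lower:.4f}  P(Z <= {z}) = {upper:.4f}  "
          f"sum = {lower + upper:.4f}")
```

Every value that `cdf` returns is an area under the curve, so it always falls between 0 and 1.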
Sampling with a Pair of Dice
Watch this video
Go to the link: http://www.random.org/dice/ (or any other online dice rolling link)
- Roll the 2 virtual dice and calculate the sum of the pair of virtual dice. Do this 5 times. List your rolls. Show your calculations.
- Then, after you have rolled the virtual pair of dice 5 times and calculated 5 sums, calculate the average of these 5 sums. Show your calculations.
- Conduct this experiment again, showing your list and average, but this time roll the virtual pair of dice 20 times. Calculate the 20 sums, and then find the average of these 20 sums. Show your work.
- Explain the Central Limit Theorem.
- How do these exercises relate to this week’s lesson, particularly the Central Limit Theorem?
- If we took everyone’s averages for the 20 rolls and made a histogram, what would we expect to see, and why?
Hint from Dr. Klotz: I strongly recommend recording the dice rolls and calculations in Excel because you need them again next week. It will save you time if you have the spreadsheet ready to go.
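The dice experiment can also be simulated, which gives a feel for what a whole class's worth of averages might look like. This sketch does not replace your own recorded rolls. The seed, the helper `average_of_sums`, and the class size of 1000 are arbitrary choices made for illustration.

```python
import random
from statistics import mean, stdev

def average_of_sums(n_rolls, rng):
    """Roll a pair of fair dice n_rolls times; return the average of the sums."""
    sums = [rng.randint(1, 6) + rng.randint(1, 6) for _ in range(n_rolls)]
    return mean(sums)

rng = random.Random(42)  # fixed seed so the run is repeatable

avg_5 = average_of_sums(5, rng)
avg_20 = average_of_sums(20, rng)
print(f"average of 5 sums: {avg_5}, average of 20 sums: {avg_20}")

# Pretend 1000 students each report the average of their own 20 rolls.
class_averages = [average_of_sums(20, rng) for _ in range(1000)]
print(f"mean of the averages:   {mean(class_averages):.2f}")
print(f"spread of the averages: {stdev(class_averages):.2f}")
```

The averages should cluster around 7, with much less spread than individual sums, which range from 2 to 12.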
Constructing Confidence Intervals
Watch this video
What is a confidence interval?
Go to the Notations and Symbols area for Week 5 and look at the equations for confidence intervals. Why will there be two numbers in every confidence interval? Which one is given first?
In that same Notations and Symbols area, notice that the title of every confidence interval contains the words “population mean” or “population proportion.” This is a “game changer” in terms of your Project. What doorway has been opened in your discussion of ROI? (Hint: let’s say the person you were talking to said, “Your data means nothing because I am not going to any of THAT SAMPLE of 20 colleges.”)
In the discussion for Week 4, you rolled a pair of dice 5 times and calculated the average sum of your rolls. Then you did the same thing with 20 rolls. Use your results from the Week 4 discussion for the average of 5 rolls and for the average of 20 rolls to construct 95% confidence intervals for the true mean of the sum of a pair of dice (assume σ = 2.41). You will need a formula from the Notations and Symbols. We want to try this here so that the assignment and project are easier to do.
What do you notice about the length of the interval for the mean of 5 rolls versus the mean of 20 rolls? Did you expect this? Why or why not?
Using your mean for 20 rolls, calculate the 90% confidence interval and the 99% confidence interval. Look at the width of the interval for 90%, 95%, and 99%. What is happening? Why?
Hint: these websites might help you find z or t sub alpha over 2 (you have to know which one to use).
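The interval calculations can be sketched in a few lines. This sketch uses the common form x̄ ± z·σ/√n and chooses z because the prompt supplies σ = 2.41. Check that choice against the Notations and Symbols before relying on it. The sample means 7.4 and 6.9 are hypothetical placeholders, so substitute your own Week 4 averages.

```python
from math import sqrt
from statistics import NormalDist

SIGMA = 2.41  # population standard deviation given in the prompt

def z_interval(sample_mean, n, confidence, sigma=SIGMA):
    """Return (lower, upper) for sample_mean +/- z * sigma / sqrt(n)."""
    alpha = 1 - confidence
    z = NormalDist().inv_cdf(1 - alpha / 2)  # z sub alpha over 2
    margin = z * sigma / sqrt(n)
    return sample_mean - margin, sample_mean + margin

# HYPOTHETICAL sample means; replace with your own results.
mean_5, mean_20 = 7.4, 6.9

for n, x_bar in ((5, mean_5), (20, mean_20)):
    low, high = z_interval(x_bar, n, 0.95)
    print(f"n = {n:2d}, 95%: ({low:.2f}, {high:.2f}), width = {high - low:.2f}")

for level in (0.90, 0.95, 0.99):
    low, high = z_interval(mean_20, 20, level)
    print(f"n = 20, {level:.0%}: ({low:.2f}, {high:.2f}), width = {high - low:.2f}")
```

Comparing the printed widths is a quick way to see how sample size and confidence level each change the interval.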
Errors in Hypothesis Testing
Watch the video: https://www.youtube.com/watch?v=0zZYBALbZgg
There are several ways to do hypothesis testing. At Grantham University, we always use the p-value method.
Make sure that you are always discussing the p-value method and not critical values.
Let’s start with some background questions to answer:
What are the steps in hypothesis testing? List and explain them for the p-value method.
What is the goal of hypothesis testing? Look at what is in the title of the formulas in Notations and Symbols. How is this like last week’s lesson?
What are null and alternative hypotheses? Why are they necessary?
Now, let’s try to get through a hypothesis test together. The homework and projects are hypothesis tests, so if we can get some examples here, you will have something to follow. Get as deep into this hypothesis test as possible. Guess if you need to. Put something out there – even if it is not right. We learn from examples and through fixing errors. Let the process happen. Then go visit LOTS of other posts this week. Let what others are doing give you food for thought. You may be replying with “I don’t think so…” messages, questions, or “ah-ha” moments.
Here’s the scenario for the hypothesis test:
A major university claimed that the mean number of credit hours that their entire population of undergraduate students took each semester was 13.2. A counselor questioned whether this was true. She took a random sample of 250 undergraduate students, and the mean of that sample of students showed that they completed 12.6 credit hours. The population standard deviation is 1.6. Conduct a full hypothesis test using the p-value approach. Let α = .05. Determine if the mean credit hours for the sample is significantly different than that of the population.
Let’s walk through some early steps.
List your “givens.” This is the information that is given in the prompt. Try to use the symbols that will be in the formula.
Type what we are supposed to determine. Is it a mean? A proportion? A standard deviation?
Is this test one-tailed or two-tailed? How do you know? What boldface word gives it away in the scenario?
Now, go to this week’s Notations and Symbols. What formula seems to match what we have been given and what we need to find?
Now, give the steps of hypothesis testing a try. Follow the p-value process, not the critical value process. Pages 390-392 may be very helpful.
Last part: Let’s say that you rejected the null hypothesis when you should not have. Is that a Type I or a Type II error?
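Once you have attempted the steps yourself, the arithmetic for the credit-hours scenario can be checked with a short script. This sketch makes two assumptions: it uses a z test, because the scenario gives the population standard deviation, and it treats the test as two-tailed. Confirm both against this week's Notations and Symbols. The variable names are chosen for illustration.

```python
from math import sqrt
from statistics import NormalDist

# Givens from the scenario.
mu_0 = 13.2    # claimed population mean (null hypothesis value)
x_bar = 12.6   # sample mean
sigma = 1.6    # population standard deviation
n = 250        # sample size
alpha = 0.05   # significance level

# Test statistic: how many standard errors x_bar lies from mu_0.
z = (x_bar - mu_0) / (sigma / sqrt(n))

# Two-tailed p-value (assumed): area in both tails beyond |z|.
p_value = 2 * NormalDist().cdf(-abs(z))

# p-value method: reject the null hypothesis when p_value < alpha.
reject_null = p_value < alpha

print(f"z = {z:.2f}, p-value = {p_value:.2e}, reject H0: {reject_null}")
```

The script only produces the numbers. Stating the hypotheses and writing the conclusion in context are still part of the full test.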
Relationship of Height and Weight
- What does it mean to say that there is a relationship between two variables?
- We use linear regression to determine if there is a linear relationship between two variables. Several statistics can provide information about that relationship. What is a coefficient of determination? What is Pearson’s Product Moment Correlation Coefficient? How are the two related? What do these two values tell us?
- Using the given Height and Weight data set, follow the steps in the weekly video or on pages 584-585 of the textbook for performing a regression analysis to analyze the Height and Weight data set (assume height is the input variable x and weight is the output variable y). Once you have performed the analysis in Excel, state the correct simple linear regression equation. You can make Excel put that equation on the graph by clicking on the green plus sign that appears next to the upper-right corner of the graph. Then select “trendline.” The trendline has a black triangle that can be selected to add elements to the graph.
- Use the regression equation to predict the weight (in pounds) of a person who is 65 inches tall and the weight (in pounds) of a person who is 120 inches tall.
- Why might the regression equation you have found not be a good prediction of the weight of someone who is 120 inches tall?
- Many students say that a person who is 120 inches tall is an “outlier.” Can we just say that without proof? No! How do you prove (formula) that a person who is 120 inches tall is/is not an outlier? Find the formula in the book…
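For anyone who wants to see the regression arithmetic outside Excel, here is a sketch using the usual least-squares formulas. The heights and weights below are HYPOTHETICAL numbers invented for illustration, not the course data set, so the equation will not match yours. The helper names `fit_line` and `predict` are also made up for this example.

```python
from math import sqrt

# HYPOTHETICAL data (inches, pounds); NOT the course Height and Weight data set.
heights = [60, 62, 64, 66, 68, 70, 72, 74]
weights = [115, 122, 133, 140, 152, 161, 172, 180]

def fit_line(xs, ys):
    """Least-squares slope and intercept, plus Pearson's r."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / sqrt(sxx * syy)
    return slope, intercept, r

slope, intercept, r = fit_line(heights, weights)
r_squared = r ** 2  # coefficient of determination in simple linear regression

def predict(height):
    """Predicted weight from the fitted line."""
    return intercept + slope * height

print(f"y = {intercept:.2f} + {slope:.2f}x, r = {r:.3f}, r^2 = {r_squared:.3f}")
print(f"predicted weight at 65 in:  {predict(65):.1f}")
# 120 inches is far outside the observed heights, so this is extrapolation.
print(f"predicted weight at 120 in: {predict(120):.1f}")
print(f"observed heights range from {min(heights)} to {max(heights)} in")
```

The script will return a number for 120 inches just as readily as for 65 inches. Whether that number means anything is the question to discuss.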