**Part 1: Questions to be solved manually using a calculator**

Q.1 Mr. Biden collected the data in Tables 1 and 2 as part of his gaming business. He knows that the covariance between Customers and Profit for Game 1 is "H". To decide his strategy for next year, **he wants to know the correlation and covariance between Customers and Profit for Game 2 in terms of H. Please help Mr. Biden get the required information.**

| Table 1: Game 1 | | Table 2: Game 2 | |
|---|---|---|---|
| Customers | Profit | Customers | Profit |
| 10 | 5 | 60 | 30 |
| 20 | 15 | 90 | 40 |
| 30 | 20 | 150 | 40 |
| 40 | 25 | 30 | 10 |
| 50 | 10 | 120 | 75 |
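
A hand calculation for Q.1 can be cross-checked with a short script. The sketch below is only a checking aid, not a prescribed method. It assumes the sample covariance (divisor n − 1); because both tables have five rows, the ratio of the Game 2 covariance to H is the same with divisor n. The helper names `covariance` and `correlation` are illustrative.

```python
from math import sqrt
from statistics import mean

game1_customers = [10, 20, 30, 40, 50]
game1_profit = [5, 15, 20, 25, 10]
game2_customers = [60, 90, 150, 30, 120]
game2_profit = [30, 40, 40, 10, 75]

def covariance(xs, ys):
    """Sample covariance; the n - 1 divisor is an assumption."""
    mx, my = mean(xs), mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (len(xs) - 1)

def correlation(xs, ys):
    """Pearson correlation: covariance scaled by both standard deviations."""
    return covariance(xs, ys) / sqrt(covariance(xs, xs) * covariance(ys, ys))

h = covariance(game1_customers, game1_profit)           # plays the role of H
cov_game2 = covariance(game2_customers, game2_profit)
ratio_to_h = cov_game2 / h                               # Cov(Game 2) = ratio_to_h * H
corr_game2 = correlation(game2_customers, game2_profit)

print(f"Cov(Game 2) = {ratio_to_h:.2f} * H, correlation = {corr_game2:.4f}")
```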

Q.2 Mr. Kumar is a computer operator. He is interested in determining the relationship between the Input and Output values generated by the computer he is working on. Use simple linear regression to help Mr. Kumar answer a few questions based on the data provided in the table below.

| Input | Output |
|---|---|
| 1 | 4 |
| 2 | 7 |
| 3 | 10 |
| 4 | 13 |
| 5 | 16 |
| 6 | 19 |

1. Develop a relationship between Input (independent variable) and Output (dependent variable) using simple linear regression.
2. Predict the output values for input values 7 and 8.
3. Calculate the Sum of Squared Errors (SSE).
4. Calculate the Total Sum of Squares (SST).
5. Find the $r^2$ for the model developed in sub-question 1.
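
The quantities asked for in Q.2 follow from the standard least squares formulas, slope = Sxy / Sxx and intercept = ȳ − slope · x̄. The sketch below is a way to verify manual work under those formulas; the function name `fit_line` and the variable names are illustrative choices.

```python
from statistics import mean

inputs = [1, 2, 3, 4, 5, 6]
outputs = [4, 7, 10, 13, 16, 19]

def fit_line(xs, ys):
    """Ordinary least squares for y = b0 + b1 * x."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    b1 = sxy / sxx
    b0 = my - b1 * mx
    return b0, b1

b0, b1 = fit_line(inputs, outputs)

# Fitted values on the training inputs, then the error sums.
fitted = [b0 + b1 * x for x in inputs]
sse = sum((y - f) ** 2 for y, f in zip(outputs, fitted))
sst = sum((y - mean(outputs)) ** 2 for y in outputs)
r_squared = 1 - sse / sst

# Predictions for the new inputs named in sub-question 2.
forecast = {x: b0 + b1 * x for x in (7, 8)}

print(b0, b1, forecast, sse, sst, r_squared)
```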

Q.3

Consider the following set of points:

| x1 | x2 | y |
|---|---|---|
| 1 | 1 | 9 |
| 2 | 0 | 15 |
| 0 | 1 | 2 |

- Find the least squares regression line $\hat{y} = \beta_0 + \beta_1 x_1 + \beta_2 x_2$ from the given data points.
- Find the residual corresponding to each y.
- Find the Sum of Squared Errors (SSE).
- Find the Total Sum of Squares (SST).
- Estimate the value of y given ($x_1 = 2$, $x_2 = 3$).
- Find the $R^2$ of the model.
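
For Q.3, one way to check a hand solution is to solve the normal equations $(X^T X)\beta = X^T y$ directly. The sketch below does this with exact fractions so that rounding does not hide small arithmetic slips. The Gauss-Jordan helper `solve` and the function `predict` are illustrative names, and this is only one of several equivalent routes to the coefficients.

```python
from fractions import Fraction

points = [((1, 1), 9), ((2, 0), 15), ((0, 1), 2)]

# Design matrix with a leading 1 for the intercept term.
X = [[Fraction(1), Fraction(x1), Fraction(x2)] for (x1, x2), _ in points]
y = [Fraction(v) for _, v in points]

def solve(a, b):
    """Solve a @ x = b by Gauss-Jordan elimination (a square and nonsingular)."""
    n = len(a)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if m[r][col] != 0)
        m[col], m[pivot] = m[pivot], m[col]
        m[col] = [v / m[col][col] for v in m[col]]
        for r in range(n):
            if r != col:
                factor = m[r][col]
                m[r] = [vr - factor * vc for vr, vc in zip(m[r], m[col])]
    return [row[n] for row in m]

# Normal equations: (X^T X) beta = X^T y
xtx = [[sum(row[i] * row[j] for row in X) for j in range(3)] for i in range(3)]
xty = [sum(row[i] * yi for row, yi in zip(X, y)) for i in range(3)]
beta = solve(xtx, xty)

def predict(x1, x2):
    return beta[0] + beta[1] * x1 + beta[2] * x2

residuals = [yi - predict(x1, x2) for ((x1, x2), _), yi in zip(points, y)]
sse = sum(r ** 2 for r in residuals)
y_bar = sum(y) / len(y)
sst = sum((yi - y_bar) ** 2 for yi in y)
r_squared = 1 - sse / sst
estimate = predict(2, 3)

print(beta, residuals, sse, sst, estimate, r_squared)
```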

Q.4

Mr. King performed logistic regression for his manufacturing project. The actual observed odds ratio for his project is 3/2, but his predicted odds ratio is 2/3. Please help Mr. King calculate the absolute difference between the observed probability and the estimated (predicted) probability.
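
The conversion needed in Q.4 can be sketched as below. It assumes that "odds ratio" here means the odds p / (1 − p), so that p = odds / (1 + odds). The function name `odds_to_probability` is illustrative.

```python
from fractions import Fraction

def odds_to_probability(odds):
    """Invert odds = p / (1 - p) to get p = odds / (1 + odds)."""
    return odds / (1 + odds)

observed = odds_to_probability(Fraction(3, 2))
predicted = odds_to_probability(Fraction(2, 3))
difference = abs(observed - predicted)

print(observed, predicted, difference)
```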

Q.5

Using the information provided below, convert the problem into a logistic regression framework and solve it.

- Framework for logistic regression: $\ln(\text{odds ratio}) = \beta_0 + \beta_1 \times \text{independent variable}$

| Length | Number of Positive Cases | Number of Negative Cases |
|---|---|---|
| 4810 | 47 | 139 |
| 4520 | 177 | 241 |
| 4400 | 1087 | 1183 |
| 4370 | 187 | 175 |
| 4350 | 397 | 671 |
| 3780 | 40 | 14 |
| 3660 | 39 | 17 |

- Find the appropriate equation for the above data. (Hint: you will need the formulas from simple linear regression, but you must first calculate your Y values from the information provided, keeping the logistic regression framework in mind.)
- Find the estimated probability for length = 5000.
- Based on the estimated probability from the previous part, estimate how many total cases one needs to gather in order to get 100 positive cases for length = 5000.
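
The approach suggested by the hint can be sketched as follows. Several choices here are assumptions rather than part of the problem statement: the second column is read as positive cases, Y is taken as ln(positive / negative) for each row, the regression is unweighted, and the required total is rounded up to a whole case. The function name `estimated_probability` is illustrative.

```python
import math

# (length, positive cases, negative cases); reading column 2 as "positive" is an assumption.
data = [
    (4810, 47, 139),
    (4520, 177, 241),
    (4400, 1087, 1183),
    (4370, 187, 175),
    (4350, 397, 671),
    (3780, 40, 14),
    (3660, 39, 17),
]

lengths = [length for length, _, _ in data]
log_odds = [math.log(pos / neg) for _, pos, neg in data]  # Y values for the regression

# Simple linear regression of log-odds on length (unweighted).
n = len(data)
x_bar = sum(lengths) / n
y_bar = sum(log_odds) / n
sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(lengths, log_odds))
sxx = sum((x - x_bar) ** 2 for x in lengths)
b1 = sxy / sxx
b0 = y_bar - b1 * x_bar

def estimated_probability(length):
    """Invert the log-odds: p = 1 / (1 + exp(-(b0 + b1 * length)))."""
    return 1 / (1 + math.exp(-(b0 + b1 * length)))

p_5000 = estimated_probability(5000)
total_cases = math.ceil(100 / p_5000)  # expected total needed for 100 positives

print(f"b0 = {b0:.4f}, b1 = {b1:.6f}, p(5000) = {p_5000:.4f}, total = {total_cases}")
```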