Sunday, February 16, 2014

Problem Set 12 Hints

Firstly, know that EVERY problem MUST have a population parameter (p or mu), a sample statistic (x-bar or p-hat), a standard deviation (given for means but must be calculated for proportions), and a number of observations, which is always n. In some problems these may not be obvious. For example, the problem might use language like "a majority of...", in which case you must think for yourself what a "majority" looks like as a percentage or as a decimal. Think about it....

Problem One:
The little thing that looks like the Jesus fish is the Greek letter lowercase alpha, which is often used in stat to represent the level of significance. So for example, if alpha = .05 then the problem is using a 5% level of significance.

Problem Two:
Think carefully about which number is which. Remember that your population parameter always comes from the bigger group. NO SHORTCUTS - tell me specifically in CONTEXT what all the assumptions are, and remember that your interpretation is going to require two sentences with the significance test.

Problem Three:
a. I'm looking for a fantastic CUSS here. Label your graph properly. A rough sketch of the histogram is fine but give me an idea of scale. As Toni suggested, starting with "54" probably isn't a good idea - start with something more even.
b. Since the n is so small here, I don't want to hear anything about how we can use the "central limit theorem" in this problem because....we can't. Show WHY we can't and tell me what else we can do (then do that).

Problem Four:
a. use "fewer than half" as a reference for what values you should use for p, your population parameter
b. please do the whole significance testing procedure again for this part
c. DON'T SKIP THIS QUESTION - I know that about half of you are going to think about skipping it. It's worth 10 points and will knock you down a letter grade if you skip. Instead, think carefully about what is happening between a and b and tell me what you observe. Write a congratulations to Erin and/or Drew for making state choir after this question for five bonus points. If you get stuck with this question, just ask someone for help.

Problem Five:
Do two different significance tests for this question, and think carefully about what "a majority" means. If you got the last question, you'll know what I'm talking about. If not, ask a friend or me for help.

Sunday, February 2, 2014

Friday, January 31st: Significance Testing with Means

A significance test (also known as a hypothesis test) compares a sample with a population. Given what we know about the population, is it likely that the sample is representative of the population? Do we have evidence to show that the population parameter may not be accurate? These questions are just a few that statisticians address when performing significance tests.

There are four steps to performing a significance test:
1) State hypotheses (null and alternative) and the population parameter of interest.
Ho: mu = (population number)
Ha: mu >, <, or not = to (population number)
where mu represents.....(whatever your problem is about)

2) State assumptions. They are exactly the same as your confidence interval assumptions (random assumption, size assumption, and independent assumption).

3) Calculate your z-score using the z-score formula from earlier this year:
z = (x-bar - mu) / (Sx / root(n)), and calculate the p-value by using the normalcdf function on your calculator.

4) Interpret in context for both your results and the context of the problem.
-Since the p-value (insert p-value) is (less than/greater than) the significance level (insert sig level here as a decimal), we reject/fail to reject the Null Hypothesis.
*Reject Null if less than, fail to reject Null if greater than
-We do/do not have sufficient evidence to show that (whatever your alternative hypothesis is saying in words, not symbols)
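Here is a rough sketch of steps 3 and 4 in Python, in case seeing the arithmetic helps. The function name z_test_mean and the numbers (x-bar = 103, mu = 100, Sx = 12, n = 36, alpha = .05) are made up for illustration and do not come from any problem set. NormalDist().cdf is standing in for normalcdf on the calculator.

```python
from math import sqrt
from statistics import NormalDist


def z_test_mean(x_bar, mu_0, s, n, alternative="greater"):
    """Return (z, p_value) for a one-sample z test on a mean.

    alternative is "greater", "less", or "not equal" and must match Ha.
    """
    # Step 3: z = (x-bar - mu) / (Sx / root(n))
    z = (x_bar - mu_0) / (s / sqrt(n))

    # Area to the LEFT of z under the standard Normal curve
    lower_tail = NormalDist().cdf(z)

    if alternative == "greater":
        p_value = 1 - lower_tail
    elif alternative == "less":
        p_value = lower_tail
    else:  # "not equal": double the smaller tail
        p_value = 2 * min(lower_tail, 1 - lower_tail)
    return z, p_value


# Hypothetical numbers, only for illustration
z, p_value = z_test_mean(x_bar=103, mu_0=100, s=12, n=36, alternative="greater")

# Step 4: compare the p-value with the significance level
alpha = 0.05
decision = "reject" if p_value < alpha else "fail to reject"

print(f"z = {z:.2f}, p-value = {p_value:.4f}")
print(f"We {decision} the Null Hypothesis at alpha = {alpha}.")
```

You still have to write the two interpretation sentences in context yourself; the code only does the calculation and the comparison.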

Wednesday, January 29th: Test Review, Thursday January 30th: Confidence Interval Test

We reviewed with two confidence interval problems in class on Wednesday to help reinforce main ideas. I gave you an additional problem to take home with you. Remember that the AP exam people are tricky and that they are going to try to ask you some sort of conceptual problem about the interval along with the calculation of the interval itself.

The tests that I've graded fall somewhere in the middle in terms of difficulty: it wasn't the hardest test and it wasn't the easiest. My advice to you guys is this, as I've been saying for a while: try to tackle the open response part FIRST so that you don't run out of time at the end. You can always pick an answer for multiple choice, but it's really tough to make up work at the last minute for the free response.

I will pass back your tests to you as soon as everyone has taken the test. At this moment, I still have a student who needs to make up the exam.

Tuesday January 28th: Confidence Intervals with Small Samples

What happens if your sample size is under 30, and you cannot use the Central Limit Theorem to verify the shape assumption? In this instance, we cannot use the Normal Curve, so we must use a flatter distribution with a wider spread called the t-distribution. The data is more spread out because less data = more variability =  greater spread.

The t* critical value changes with the size of the sample. Unlike the z critical values, there is no single set value for 90%, 95%, and 99%. Therefore, we have to calculate the critical value based on the size of our sample.

To calculate a t* critical value, first draw the curve with your level of confidence in the middle. For instance, if I was using 95%, I would draw the curve and put 95% in the middle. Then, take whatever area is left over on the LOWER end of the curve only and add it to the middle. I would have 2.5% left over, so I would add it to 95% to get 97.5%. Then, on your TI-84, go to 2nd - distribution and go to #4: invT. Fill in your percentage, but put it as a decimal (.975), and put in your degrees of freedom, which is your sample size minus 1 (n - 1). Hit enter, and this gives you the t-critical value to use in your confidence interval formula!
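If you want to check a t* value without the calculator, here is a rough sketch in plain Python. It is NOT how the TI-84 does it internally; it just adds up area under the t curve with Simpson's rule and then searches for the cutoff. The names t_pdf, t_cdf, and inv_t are made up for this sketch, and it uses degrees of freedom (sample size minus 1) the same way the calculator function does. It is only meant for small samples.

```python
import math


def t_pdf(x, df):
    """Height of the t curve with df degrees of freedom at x."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)


def t_cdf(x, df, steps=2000):
    """Area to the LEFT of x under the t curve (Simpson's rule)."""
    a = abs(x)
    h = a / steps
    # Simpson's rule on [0, |x|]; steps must be even
    total = t_pdf(0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3
    # The curve is symmetric, so half the area sits on each side of 0
    return 0.5 + area if x >= 0 else 0.5 - area


def inv_t(area, df):
    """Find the t value with the given area to its left (bisection)."""
    lo, hi = -100.0, 100.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < area:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


# 95% confidence with a hypothetical sample of n = 10:
# area = .95 + .025 = .975, degrees of freedom = 10 - 1 = 9
t_star = inv_t(0.975, df=9)
print(f"t* = {t_star:.3f}")
```

Notice that the t* you get is bigger than the 1.96 you would use with the Normal curve, which is exactly the "wider spread" idea.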

The interpretation and calculation are the same for t-intervals. The only difference is that, under assumptions, we need to state that the sample size was not sufficiently large, so a t-distribution must be used.

Monday January 27th: Confidence Intervals with Proportions

Proportion confidence intervals work the same way as x-bar confidence intervals, except we use p-hat and p instead of x-bar and mu. Use the exact same confidence interval formula:

p-hat +/- crit. val (std. dev) where the std. dev is equal to root(p-hat(1-p-hat)/n)

In most instances, you'll see something like "45 out of 56..." in a problem, so you'll need to use this information to calculate what your p-hat is. The procedure for a proportion confidence interval is the same, except that for the size assumption you must check:

np-hat >= 10 and n(1 - p-hat) >= 10. This checks the size assumption: if both are true, the sample is sufficiently large to use the Normal curve; if either one fails, it is not.
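As a rough sketch, here is the "45 out of 56" example worked as a 95% confidence interval in Python. The function name proportion_interval is made up for illustration, the 95% level is my choice, and NormalDist().inv_cdf is standing in for the z critical value you would normally look up.

```python
from math import sqrt
from statistics import NormalDist


def proportion_interval(successes, n, confidence=0.95):
    """Return (low, high, size_ok) for a one-proportion z interval."""
    p_hat = successes / n

    # Size check: n*p-hat >= 10 and n*(1 - p-hat) >= 10.
    # n*p-hat is just the count of successes, n*(1 - p-hat) the failures.
    size_ok = successes >= 10 and (n - successes) >= 10

    # Critical value: area in the middle plus the lower tail
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)

    # std. dev = root(p-hat(1 - p-hat) / n)
    std_dev = sqrt(p_hat * (1 - p_hat) / n)
    moe = z_star * std_dev
    return p_hat - moe, p_hat + moe, size_ok


low, high, size_ok = proportion_interval(successes=45, n=56)
print(f"size assumption met: {size_ok}")
print(f"95% interval: ({low:.4f}, {high:.4f})")
```

The size check barely passes here because there are only 11 "failures" (56 - 45), so always check both pieces.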

Thursday January 23rd: Calculating Sample Size in a Confidence Interval

Let's suppose that you already have a confidence interval, but you want to know how many observations your sample needs in order for your interval to look that way. Or let's say that you want your interval to be a certain width: say, only 10 on both the plus and the minus side - what sample size will make sure that this is true? (The distance from the center of the confidence interval to either end, by the way, is called the margin of error. It's what you get when you multiply the critical value by the standard deviation of the sampling distribution.)

To find the sample size, set your margin of error equal to your critical value times your standard deviation over root(n):
MOE = (crit. val)(std. dev / root(n)).
Divide each side by your critical value to cancel it out. Square each side to get rid of the radical. Then multiply each side by n and divide by whatever is left on the MOE side so that n is by itself:
n = ((crit. val)(std. dev) / MOE)^2.
Always round n UP to the next whole number, since rounding down would make the margin of error slightly too big.
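Here is a small sketch of that algebra in Python. The margin of error of 10 matches the example above, but the standard deviation of 40 and the 95% level are made-up numbers, and sample_size_for_mean is just a name I picked for illustration.

```python
from math import ceil
from statistics import NormalDist


def sample_size_for_mean(moe, std_dev, confidence=0.95):
    """Smallest n so that (crit. val)(std. dev / root(n)) <= moe."""
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    # Solve MOE = z* (std. dev / root(n)) for n, then round UP
    return ceil((z_star * std_dev / moe) ** 2)


# Hypothetical: margin of error of 10, standard deviation of 40, 95% confidence
n = sample_size_for_mean(moe=10, std_dev=40)
print(f"n = {n}")
```

The unrounded answer is a little over 61, which is why it gets rounded up to 62 instead of down.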

Wednesday January 22nd: More on Confidence Intervals

We know that in statistics all samples are different. I could take a sample of student GPAs, and Mrs. Cain could take a sample of student GPAs, and chances are that our mean GPA would not be exactly the same.

Since a confidence interval is created AROUND a sample statistic (x-bar or p-hat), and since all samples are different, all confidence intervals will then also be different. My 95% confidence interval will be different from Mrs. Cain's 95% confidence interval.

In the long run, a 95% confidence interval will still capture the true population parameter about 95% of the time. For example, if I took a sample and made a 95% confidence interval every day, about 95% of the intervals that I make would have the true population mean in them. The other 5% won't - that's why we're only 95% confident that our interval is correct.
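You can watch this happen with a quick simulation. This is only a sketch under made-up assumptions: a population of GPAs that is Normal with a true mean of 3.0 and a standard deviation of 0.5, samples of 40 students, and 2000 repeated samples. The function name coverage_rate is hypothetical.

```python
import random
from math import sqrt
from statistics import NormalDist


def coverage_rate(true_mean=3.0, sigma=0.5, n=40, trials=2000,
                  confidence=0.95, seed=1):
    """Fraction of simulated confidence intervals that capture true_mean."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    moe = z_star * sigma / sqrt(n)

    hits = 0
    for _ in range(trials):
        # Take a new random sample and build an interval around its x-bar
        sample = [rng.gauss(true_mean, sigma) for _ in range(n)]
        x_bar = sum(sample) / n
        if x_bar - moe <= true_mean <= x_bar + moe:
            hits += 1
    return hits / trials


rate = coverage_rate()
print(f"{rate:.1%} of the intervals captured the true mean")
```

Every interval in the loop is centered somewhere different, but the share that captures the true mean should land close to 95%.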