Tuesday, December 17, 2013

Hints for Problem Set 10

Problem 1:
(B) You need to give me an explanation using CORRECT statistical terminology to get full credit. Just writing "Central Limit Theorem" is going to earn you about a 2/10...

Problem 2:
(A) You will have two x-bars for this question. This question is a bit tricky - my advice is to draw a picture of a Normal Curve and shade the area that you're trying to find. Once you find your answer, ask yourself, "Does this make sense?" You are finding the probability that it will shut down, or NOT work....

(B) I recommend first finding the values that are two standard deviations below and above the mean, then using those two values to find two separate z-scores. Plug each z-score into Normalcdf; the difference of those two probabilities is your answer.

Problem 3:
This is a tough one because it doesn't explicitly tell you what the means are. Consider the population mean to be the total baggage limit divided by the number of passengers on the flight. It gives you the standard deviation, and suppose your sample mean is represented by the individual passenger.

EARLY CHRISTMAS PRESENT - Skip number four on the problem set. It is worded poorly. Instead of taking paragraphs on paragraphs to explain it, just don't worry about it - Happy Holidays.

Problem 5:
(A) Change p-hat to p. The mean is already given to you (think about what p=0.5 represents...), and you find the standard deviation by taking the square root of p(1-p)/n.

(C) Remember that you're looking for the upper part of the curve here, so think about what you would need to do to your probabilities to get the correct answer...

(D) Think about the Central Limit Theorem here, and the computer program that I showed you in class. What happens to the Normal Curve when there are more observations? And because of the way the Normal Curve looks, what then should happen to your probabilities?

Monday, December 16, 2013

Sampling Distributions with Proportions

Sampling distributions with proportions are very similar to sampling distributions with means. We are still going to use the z-score formula, but we're going to use it with the proportion versions of each piece. They are:

p-hat, the sample proportion (instead of x-bar)
p, the population proportion (instead of mu)
std. dev. of p-hat, which is found by taking the square root of p(1-p)/n
and n, the number of observations.

Use the same z-score formula, (sample - population)/std. dev., to solve.
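Here is a minimal sketch of that calculation in Python, in case it helps to see the steps written out. The numbers (p = 0.5, n = 100, p-hat = 0.56) are made up for illustration and are not from any problem set. NormalDist().cdf plays the role of Normalcdf with the lower bound at negative infinity.

```python
from math import sqrt
from statistics import NormalDist

def proportion_z(p_hat, p, n):
    """z-score of a sample proportion: (p-hat - p) / sqrt(p(1-p)/n)."""
    std_dev = sqrt(p * (1 - p) / n)
    return (p_hat - p) / std_dev

# Hypothetical numbers, for illustration only
p = 0.5       # population proportion
n = 100       # number of observations
p_hat = 0.56  # sample proportion

z = proportion_z(p_hat, p, n)
prob_below = NormalDist().cdf(z)  # area to the left of z
prob_above = 1 - prob_below       # area to the right of z

print(round(z, 2), round(prob_below, 4), round(prob_above, 4))
```

Use prob_below when the question asks for "at most" or "less than," and prob_above when it asks for the upper part of the curve.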

I realize that this explanation is kind of vague, so please check the link below for a video example of how to solve this type of problem:

Saturday, December 14, 2013

Central Limit Theorem

The Central Limit Theorem says that, with more observations in each sample, the sampling distribution of the sample mean (x-bar) looks more and more like a Normal Curve (bell-shaped), even if the population itself is not Normal. With too few observations, we can't assume Normality, which means we cannot use Normalcdf.

The AP Stats rule is that, if the population standard deviation is known and n >= 30, then we can assume that the sampling distribution takes a Normal shape. If n is less than 30, then we can't use the Normal Distribution. We will have to use something called the t-distribution, which we won't go over until the beginning of next year.
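If you want to watch the Central Limit Theorem happen, here is a rough simulation sketch in Python. It is not the program from class. The population (a right-skewed exponential distribution with mu = 1 and sigma = 1), the seed, and the number of repetitions are all assumptions picked for illustration.

```python
import random
from statistics import mean, stdev

random.seed(17)  # fixed seed so the run is repeatable

def sample_means(n, reps=2000):
    """Take `reps` samples of size n from a skewed population; return each sample's mean."""
    return [mean(random.expovariate(1.0) for _ in range(n)) for _ in range(reps)]

for n in (2, 30, 100):
    means = sample_means(n)
    # Population has mu = 1 and sigma = 1, so the x-bars should center
    # near 1 with a spread near 1/sqrt(n)
    print(n, round(mean(means), 3), round(stdev(means), 3))
```

As n goes up, the x-bars bunch more tightly around mu. A histogram of each list of means would show the shape getting closer to a bell.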

Conceptual Stuff about Sampling Distributions

Here are a few key stats-y points about sampling distributions that you want to keep in mind, especially when you are doing your problem set (10 bonus points if you recommend me a good workout song at the end of your problem set!).

1. If your z-score is positive, then your x-bar should be greater than your mu. If your z-score is negative, then your x-bar should be lower than the mu. If they are the same, then the z-score should be zero and your probability will be 50%.

2. When n is lower, the z-score is closer to zero. When n is higher, the z-score is farther from zero (if all other variables are held constant).

3. Your z-score represents how many standard deviations your sample mean is from your population mean. If the z-score is within + or - 3 standard deviations of the mean, then we can assume that the sample is representative of the population without calculating the probability. If the z-score is beyond -3 or +3 standard deviations from the mean, then our sample mean is too far from the population mean to say that the sample accurately represents the population.

4. If a problem asks you to find the probability that the sample is "between" two different sample means, find the probability of the first (Normalcdf(first z-score)), then find the probability of the second (Normalcdf(second z-score)), then subtract those probabilities.

Remember that a probability MUST be between zero and one - ALWAYS!!!
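Points 2 and 4 above can be sketched in a few lines of Python. The numbers (mu = 100, sigma = 15, x-bars of 97 and 103) are hypothetical and only there to make the arithmetic concrete.

```python
from math import sqrt
from statistics import NormalDist

def z_score(x_bar, mu, sigma, n):
    """z = (x-bar - mu) / (sigma / sqrt(n))"""
    return (x_bar - mu) / (sigma / sqrt(n))

# Hypothetical population values, for illustration only
mu, sigma = 100, 15

# Point 2: same x-bar, bigger n, z-score farther from zero
for n in (9, 36, 144):
    print(n, round(z_score(103, mu, sigma, n), 2))

# Point 4: probability that x-bar is between 97 and 103 when n = 36
z_low = z_score(97, mu, sigma, 36)
z_high = z_score(103, mu, sigma, 36)
prob_between = NormalDist().cdf(z_high) - NormalDist().cdf(z_low)

print(round(prob_between, 4))
```

Subtracting the smaller area from the larger one keeps the answer between zero and one.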

Intro to Sampling Distributions

A sampling distribution is the distribution of a sample statistic, such as x-bar, over all of the samples of a given size that we could take. We would much rather use the information from the population (mu) instead, but sometimes that information is not available or sometimes it is too hard/not possible to sample an entire population. So, we use a sampling distribution instead.

The main question then becomes: if we take a sample from the population, what is the probability that the sample statistic (x-bar) is actually representative of the population parameter (mu)? In other words, are the sample and population close enough to one another that they are essentially the same? That depends. To find out, we need a sample mean, a population mean, a population standard deviation, and n, which is the number of observations in the sample. Plug these values into the z-score formula:

z = (x-bar - mu)/(sigma/sqrt(n)). (Awkward writing the formula in the blog without all of the math symbols!)

This will give you the z-score. Then, plug this value into Normalcdf, and this will give you the probability that our sample represents the population. Note: if your x-bar is higher than mu, do 1-Normalcdf instead.
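Here is a small Python sketch of those steps with made-up numbers (x-bar = 73, mu = 70, sigma = 10, n = 25 are assumptions, not from a problem set). NormalDist().cdf stands in for Normalcdf and gives the area to the left of the z-score.

```python
from math import sqrt
from statistics import NormalDist

# Hypothetical numbers, for illustration only
x_bar = 73   # sample mean
mu = 70      # population mean
sigma = 10   # population standard deviation
n = 25       # number of observations in the sample

z = (x_bar - mu) / (sigma / sqrt(n))

lower = NormalDist().cdf(z)  # like Normalcdf: area to the left of z
upper = 1 - lower            # 1 - Normalcdf: area to the right of z

print(round(z, 2), round(lower, 4), round(upper, 4))
```

Because x-bar is above mu here, the z-score comes out positive and the area to the left is more than 50%.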

Monday, December 9, 2013

Hints for Problem Set 9

Problem 1: I am looking for a paragraph answer. Please be specific. If you choose to design an experiment, you may include an experiment diagram to SUPPORT your answer but that should not be your entire answer.

Problem 2: Positively skewed means skewed to the right, and negatively skewed means skewed to the left. You may use the TI to calculate your boxplot, but you must include a scale and titles for full credit.

Problem 3: Binomial/geometric probabilities. Think about the stuff we just covered.

Problem 4: When you comment on the differences between the two histograms that you made (with titles, labels, and a scale), use CUSS and comparative language to do this most effectively.

Problem 5: This is probably the most challenging question. Note that you cannot describe linearity without a scatter plot. There are several key calculator functions that you'll need to do this problem - check your notes. Note that we determine transformations using improvements in both the residual plot and in r-squared. Don't forget to put the transformed variable in your new regression equation (e.g., y = 3.2 log(x) + 7).

If you need help with question 5, or any other question for that matter, please contact me directly or come after school tomorrow. Remember to always write in complete sentences and to never begin a sentence with "because."

Tuesday, December 3, 2013

Binomial Distribution to Normal Curve

Today we discussed instances where the binomial distribution can be approximated by the Normal Curve (because the Normal Curve is easier to work with when n is large). This works if the number of trials (n) times the probability (p) is at least 10, and n times (1-p) is also at least 10. In other words:

np >= 10 and n(1-p) >= 10 for us to use the Normal Curve.

To solve, we must recall that the mean of a binomial distribution is np, and the standard deviation of a binomial distribution is the square root of np(1-p). If we're using the Normal Curve, then we must use a z-score (because we use z-scores with Normal Curves). Remember that the formula for a z-score is:

z = (x - mu)/std. dev.   (here x is the number of successes that the question asks about)

Calculate the mean (although you already should have, to check the assumption for Normal Curve) and the standard deviation, then plug it into the formula to get your z-score. Once you have your z-score, plug this into Normalcdf(z-score) and that will give you your probability!

Don't forget that we would do 1 - Normalcdf(z-score) if we wanted the upper tail of the curve.

The example that we did at the end of class today was: Suppose that there is a 5% chance that Mr. Guyton stops you in the hallway. If Amber surveys 300 students, what is the probability that fewer than 16 of them will have been stopped by Mr. Guyton?
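One way to set this example up in Python is sketched below. It follows the steps from this post and uses 16 as the cutoff; the video solution may set it up differently (for example, by using 15 or 15.5 as the cutoff). The exact binomial sum at the end is an addition for comparison, not something we did in class.

```python
from math import comb, sqrt
from statistics import NormalDist

n = 300   # students surveyed
p = 0.05  # chance of being stopped in the hallway
x = 16    # cutoff from the question

# Check the assumption before using the Normal Curve
assert n * p >= 10 and n * (1 - p) >= 10

mu = n * p                       # mean of the binomial: np
std_dev = sqrt(n * p * (1 - p))  # std. dev. of the binomial: sqrt(np(1-p))

z = (x - mu) / std_dev
approx = NormalDist().cdf(z)     # Normalcdf(z-score)

# Exact binomial answer for comparison: P(X = 0) + ... + P(X = 15)
exact = sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(x))

print(round(z, 3), round(approx, 4), round(exact, 4))
```

The two probabilities should land in the same neighborhood, which is the point of the approximation.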

For a video solution to this, please click here:


Monday, December 2, 2013

Binomial and Expected Value open response

We did two tough problems in class today! Well done for those of you who followed along and gave it your best shot.  VERY impressed with the efforts considering that it was the first day back from a holiday break.

For the second problem that we did, I realize that we didn't get to part D in class. For those of you interested in increasing your statistical knowledge, here's a video solution to part D of problem 2 (the one from 2005).

http://www.educreations.com/lesson/view/2005-part-d/14386947/?s=AHSFQB&ref=app