Sampling Distribution of the Sample Mean, x-bar (2024)


    CO-6: Apply basic concepts of probability, random variation, and commonly used statistical probability distributions.

    Behavior of the Sample Mean (x-bar)

    Learning Objectives

    LO 6.22: Apply the sampling distribution of the sample mean as summarized by the Central Limit Theorem (when appropriate). In particular, be able to identify unusual samples from a given population.

    So far, we’ve discussed the behavior of the statistic p-hat, the sample proportion, relative to the parameter p, the population proportion (when the variable of interest is categorical).

    We are now moving on to explore the behavior of the statistic x-bar, the sample mean, relative to the parameter μ (mu), the population mean (when the variable of interest is quantitative).

    Let’s begin with an example.

    EXAMPLE 9: Behavior of Sample Means

    Birth weights are recorded for all babies in a town. The mean birth weight is 3,500 grams, μ (mu) = 3,500 g. If we collect many random samples of 9 babies at a time, how do you think the sample means will behave?

    Here again, we are working with a random variable, since random samples will have means that vary unpredictably in the short run but exhibit patterns in the long run.

    Based on our intuition and what we have learned about the behavior of sample proportions, we might expect the following about the distribution of sample means:

    Center: Some sample means will be on the low side — say 3,000 grams or so — while others will be on the high side — say 4,000 grams or so. In repeated sampling, we might expect that the random samples will average out to the underlying population mean of 3,500 g. In other words, the mean of the sample means will be µ (mu), just as the mean of sample proportions was p.

    Spread: For large samples, we might expect that sample means will not stray too far from the population mean of 3,500. Sample means lower than 3,000 or higher than 4,000 might be surprising. For smaller samples, we would be less surprised by sample means that varied quite a bit from 3,500. In other words, we might expect greater variability in sample means for smaller samples. So sample size will again play a role in the spread of the distribution of sample means, as we observed for sample proportions.

    Shape: Sample means closest to 3,500 will be the most common, with sample means far from 3,500 in either direction progressively less likely. In other words, the shape of the distribution of sample means should bulge in the middle and taper at the ends with a shape that is somewhat normal. This, again, is what we saw when we looked at the sample proportions.

    Comment:

    • The distribution of the values of the sample mean (x-bar) in repeated samples is called the sampling distribution of x-bar.

    Let’s look at a simulation:

    Video: Simulation #3 (x-bar) (4:31)

    Did I Get This?: Simulation #3 (x-bar)
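For readers without access to the video, the repeated-sampling idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the population is taken to be normal, the population standard deviation of 500 g is a made-up value (the example gives only the mean of 3,500 g), and the function name `simulate_sample_means` is hypothetical.

```python
import random
import statistics

def simulate_sample_means(mu, sigma, n, reps, seed=0):
    """Draw `reps` random samples of size `n` from a normal population
    and return the list of their sample means (x-bars)."""
    rng = random.Random(seed)
    return [
        statistics.fmean([rng.gauss(mu, sigma) for _ in range(n)])
        for _ in range(reps)
    ]

# mu = 3,500 g comes from Example 9; sigma = 500 g is an ASSUMED value.
means = simulate_sample_means(mu=3500, sigma=500, n=9, reps=10_000)

# Center and spread of the simulated sampling distribution of x-bar.
print("mean of sample means:", round(statistics.fmean(means), 1))
print("SD of sample means:  ", round(statistics.stdev(means), 1))
```

Under these assumptions the sample means should cluster around 3,500 g, with noticeably less spread than individual birth weights would show.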

    The results we found in our simulations are not surprising. Advanced probability theory confirms that by asserting the following:

    The Sampling Distribution of the Sample Mean

    If repeated random samples of a given size n are taken from a population of values for a quantitative variable, where the population mean is μ (mu) and the population standard deviation is σ (sigma), then the mean of all sample means (x-bars) is the population mean μ (mu).

    As for the spread of all sample means, theory dictates the behavior much more precisely than saying that there is less spread for larger samples. In fact, the standard deviation of all sample means is directly related to the sample size, n, as indicated below.

    The standard deviation of all sample means (\(\bar{x}\)) is exactly \(\dfrac{\sigma}{\sqrt{n}}\)

    Since the square root of sample size n appears in the denominator, the standard deviation does decrease as sample size increases.
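To make the effect of the square root concrete, here is a small sketch that evaluates σ/√n for several sample sizes. The value σ = 500 and the helper name `sd_of_sample_mean` are assumptions chosen for illustration, not part of the lesson.

```python
import math

def sd_of_sample_mean(sigma, n):
    """Standard deviation of x-bar for samples of size n: sigma / sqrt(n)."""
    if n <= 0:
        raise ValueError("sample size n must be positive")
    return sigma / math.sqrt(n)

sigma = 500  # ASSUMED population standard deviation, for illustration only
for n in (1, 4, 9, 25, 100):
    print(f"n = {n:>3}: SD of x-bar = {sd_of_sample_mean(sigma, n):.1f}")
```

Because n sits under a square root, the sample size must be quadrupled to cut the standard deviation of the sample mean in half.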

    Learn by Doing: Sampling Distribution (x-bar)

    Let’s compare and contrast what we now know about the sampling distributions for sample means and sample proportions.

    Sampling Distribution of the Sample Mean, x-bar (1)

    Now we will investigate the shape of the sampling distribution of sample means. When we were discussing the sampling distribution of sample proportions, we said that this distribution is approximately normal if np ≥ 10 and n(1 – p) ≥ 10. In other words, we had a guideline based on sample size for determining the conditions under which we could use normal probability calculations for sample proportions.

    When will the distribution of sample means be approximately normal? Does this depend on the size of the sample?

    It seems reasonable that a population with a normal distribution will have sample means that are normally distributed even for very small samples. We saw this illustrated in the previous simulation with samples of size 10.

    What happens if the distribution of the variable in the population is heavily skewed? Do sample means have a skewed distribution also? If we take really large samples, will the sample means become more normally distributed?

    In the next simulation, we will investigate these questions.

    Video: Simulation #4 (x-bar) (5:02)

    Did I Get This?: Simulation #4 (x-bar)

    To summarize, the distribution of sample means will be approximately normal as long as the sample size is large enough. This discovery is probably the single most important result presented in introductory statistics courses. It is stated formally as the Central Limit Theorem.

    We will depend on the Central Limit Theorem again and again in order to do normal probability calculations when we use sample means to draw conclusions about a population mean. We now know that we can do this even if the population distribution is not normal.

    How large a sample size do we need in order to assume that sample means will be normally distributed? Well, it really depends on the population distribution, as we saw in the simulation. The general rule of thumb is that the means of samples of size 30 or greater will have a fairly normal distribution regardless of the shape of the distribution of the variable in the population.
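The effect of sample size on shape can be explored with a rough simulation. The sketch below assumes a right-skewed exponential population with mean 1 (an arbitrary choice, not the population used in the video) and measures how skewed the distribution of sample means remains as n grows. The helper names are hypothetical.

```python
import random
import statistics

def skewness(values):
    """Skewness as the average cubed z-score (0 for a symmetric distribution)."""
    m = statistics.fmean(values)
    s = statistics.pstdev(values)
    return statistics.fmean([((v - m) / s) ** 3 for v in values])

def sample_mean_skewness(n, reps=5_000, seed=0):
    """Skewness of `reps` sample means, each computed from n draws
    of an exponential population (ASSUMED population, mean 1)."""
    rng = random.Random(seed)
    means = [
        statistics.fmean([rng.expovariate(1.0) for _ in range(n)])
        for _ in range(reps)
    ]
    return skewness(means)

skew_by_n = {n: sample_mean_skewness(n) for n in (2, 10, 30, 100)}
for n, sk in skew_by_n.items():
    print(f"n = {n:>3}: skewness of sample means = {sk:.2f}")
```

The skewness should shrink toward 0 as n increases, which is the Central Limit Theorem at work. A more heavily skewed population would need a larger n to reach the same degree of symmetry.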

    Applet: Sampling Distribution for a Sample Mean

    Comment:

    • For categorical variables, our claim that sample proportions are approximately normal for large enough n is actually a special case of the Central Limit Theorem. In this case, we think of the data as 0’s and 1’s and the “average” of these 0’s and 1’s is equal to the proportion we have discussed.
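A short sketch may make this connection concrete. The responses and the values p = 0.6 and n = 100 below are made up for illustration. The code checks that the mean of 0/1 data is the sample proportion, and that σ/√n reduces to the standard deviation formula for sample proportions once σ is written as √(p(1 − p)), the standard deviation of a 0/1 variable.

```python
import math
import statistics

# Hypothetical categorical data coded as 1 = "yes", 0 = "no".
responses = [1, 0, 1, 1, 0, 0, 1, 0, 1, 1]

p_hat = sum(responses) / len(responses)   # sample proportion
x_bar = statistics.fmean(responses)       # sample mean of the 0/1 values

# For a 0/1 population with proportion p, the population SD is sqrt(p(1-p)),
# so sigma / sqrt(n) is the same quantity as sqrt(p(1-p)/n).
p, n = 0.6, 100                           # ASSUMED values
sigma = math.sqrt(p * (1 - p))
sd_mean_form = sigma / math.sqrt(n)
sd_prop_form = math.sqrt(p * (1 - p) / n)

print("p-hat:", p_hat, " x-bar:", x_bar)
print("sigma/sqrt(n):", sd_mean_form, " sqrt(p(1-p)/n):", sd_prop_form)
```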

    Before we work some examples, let’s compare and contrast what we now know about the sampling distributions for sample means and sample proportions.

    Sampling Distribution of the Sample Mean, x-bar (2)

    Learn by Doing: Using the Sampling Distribution of x-bar

    EXAMPLE 10: Using the Sampling Distribution of x-bar

    Household size in the United States has a mean of 2.6 people and standard deviation of 1.4 people. It should be clear that this distribution is skewed right as the smallest possible value is a household of 1 person but the largest households can be very large indeed.

    (a) What is the probability that a randomly chosen household has more than 3 people?

    A normal approximation should not be used here, because the distribution of household sizes would be considerably skewed to the right. We do not have enough information to solve this problem.

    (b) What is the probability that the mean size of a random sample of 10 households is more than 3?

    By anyone’s standards, 10 is a small sample size. The Central Limit Theorem does not guarantee that the mean of a sample from a skewed population is approximately normal unless the sample size is large, so we cannot solve this problem either.

    (c) What is the probability that the mean size of a random sample of 100 households is more than 3?

    Now we may invoke the Central Limit Theorem: even though the distribution of household size X is skewed, the distribution of sample mean household size (x-bar) is approximately normal for a large sample size such as 100. Its mean is the same as the population mean, 2.6, and its standard deviation is the population standard deviation divided by the square root of the sample size:

    \(\dfrac{\sigma}{\sqrt{n}}=\dfrac{1.4}{\sqrt{100}}=0.14\)

    To find

    \(P(\bar{x}>3)\)

    we standardize 3 into a z-score by subtracting the mean and dividing the result by the standard deviation (of the sample mean). Then we can find the probability using the standard normal calculator or table.

    \(P(\bar{x}>3)=P\left(Z>\dfrac{3-2.6}{\dfrac{1.4}{\sqrt{100}}}\right)=P(Z>2.86)=0.0021\)

    Households of more than 3 people are, of course, quite common, but it would be extremely unusual for the mean size of a sample of 100 households to be more than 3.
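The calculation in part (c) can be checked with Python's standard library instead of a normal table. This is a sketch: the variable names are arbitrary, and `statistics.NormalDist` simply stands in for the standard normal calculator mentioned above.

```python
import math
from statistics import NormalDist

mu, sigma, n = 2.6, 1.4, 100        # values from Example 10(c)
sd_xbar = sigma / math.sqrt(n)      # standard deviation of x-bar

z = (3 - mu) / sd_xbar              # standardize 3 into a z-score
p_more_than_3 = 1 - NormalDist().cdf(z)   # upper-tail probability P(Z > z)

print(f"z = {z:.2f}, P(x-bar > 3) = {p_more_than_3:.4f}")
```

The result should agree with the table-based answer of about 0.0021. Small differences in later decimal places can arise because a table lookup first rounds z to two decimals.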

    The purpose of the next activity is to give guided practice in finding the sampling distribution of the sample mean (x-bar), and use it to learn about the likelihood of getting certain values of x-bar.

    Learn by Doing: Using the Sampling Distribution of x-bar #2

    Did I Get This?: Using the Sampling Distribution of x-bar

    FAQs

    What is the sampling distribution of X̅? ›

    For a variable x and a given sample size n, the distribution of the variable x̅ (all possible sample means of size n) is called the sampling distribution of the mean. Note: The larger the sample size, the smaller the sampling error tends to be in estimating a population mean, μ, by a sample mean x̅.

    How to find the mean of the sampling distribution of x bar? ›

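    X-bar in statistics is a symbol for the sample mean. Given a sample of n observations, the sample mean is found by adding up all of the observations, then dividing by the total number of observations (n). The mean of the sampling distribution of x-bar, that is, the mean of all possible sample means, equals the population mean μ.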

    What is x̅ in statistics? ›

    The x bar (x̄) symbol is used in statistics to represent the sample mean, or average, of a set of values. It's calculated by adding up all the numbers in the sample and then dividing by the number of values in that sample.

    What is the sampling distribution of x overbar? ›

    The sampling distribution of the sample mean x̅ is the probability distribution of all possible values of the random variable x̅ computed from a sample of size n from a population with mean μ and standard deviation σ.

    How do you describe the sampling distribution of XBAR? ›

    If repeated random samples of a given size n are taken from a population of values for a quantitative variable, where the population mean is μ (mu) and the population standard deviation is σ (sigma) then the mean of all sample means (x-bars) is population mean μ (mu).

    How to find the sampling distribution? ›

    How to Find Sampling Distribution
    1. Draw Random Samples: Randomly select numerous samples of size n from the population. ...
    2. Calculate Sample Statistic: For each sample, calculate the desired statistic (e.g., mean).
    3. Determine the Difference: Calculate the difference between the sample means for each sample drawn.

    What do you call the symbol X̅? ›

    X bar, x̄ (or X̄) or X-bar may refer to: X-bar theory, a component of linguistic theory. Arithmetic mean, a commonly used type of average.

    What does the following symbol refer to X̅? ›

    sample statistic | population parameter | description
    x̅ “x-bar” | μ “mu” or μx | mean
    M or Med | (none) | median
    s (TIs say Sx) | σ “sigma” or σx | standard deviation (for variance, apply a squared symbol: s² or σ²)
    r | ρ “rho” | coefficient of linear correlation

    What measure of center does X̅ represent? ›

    The population mean is indicated by the Greek symbol µ (pronounced 'mu'). When the mean is calculated on a distribution from a sample it is indicated by the symbol x̅ (pronounced X-bar).

    Why is the sampling distribution of x-bar approximately normal? ›

    The sampling distribution of x-bar is approximately normal when either the population itself is normally distributed or the sample size is large enough for the Central Limit Theorem to apply.

    Is the mean of a sampling distribution equals x-bar the mean of a sample? ›

    Answer: True.

    Explanation: The sampling distribution of x-bar is the probability distribution of the sample mean over all possible samples of a given size. The mean of this sampling distribution always equals the mean of the population being sampled, μ, although the x-bar from any single sample will usually differ somewhat from μ.

    What does distribution of x mean? ›

    Informally, we call the set of possible outcomes of a random variable X and associated probabilities the distribution of X. We summarize this information with a probability distribution function when X is discrete, or a probability density function when X is continuous.

    What is the sampling distribution of the mean called? ›

    Mean. The mean of the sampling distribution of the mean is the mean of the population from which the scores were sampled. Therefore, if a population has a mean μ, then the mean of the sampling distribution of the mean is also μ. The symbol μM is used to refer to the mean of the sampling distribution of the mean.

    What is the formula for calculating the standard deviation of the sampling distribution of X̅? ›

    To find the standard deviation of the sample mean (\(\sigma_{\bar{x}}\)), divide the population standard deviation (σ) by the square root of the sample size (n): \(\sigma_{\bar{x}}=\dfrac{\sigma}{\sqrt{n}}\).

    What is the sampling distribution of the mean quizlet? ›

    The sampling distribution of the mean is defined as the probability distribution of means for all possible random samples of a given size from some population.

    What is the sampling distribution of Z mean? ›

    Sampling distribution of z :

    Specifically, suppose that we drew an infinite number of samples, each of size N. In each sample, we could compute the z statistic \(z=\dfrac{\bar{y}-\mu_0}{\sigma/\sqrt{N}}\). Different samples would give different z values. The distribution of all these z values is the sampling distribution of z.


    Author: Gregorio Kreiger
