University of the District of Columbia CH8 Process of Obtaining a Sample Size HW Businesses collect data to eventually improve their products and services. It is critical to have a sampling process that is statistically sound, so that the results of the subsequent analysis will also be sound. This chapter provides a sampling exercise similar to those the student will encounter on the job. Case 6 is on page 114. Review chapter 6 and consult the additional material uploaded prior to attempting it.

Chapter 7

Sampling and Sampling Distributions

- Selecting a Sample

- Point Estimation

- Introduction to Sampling Distributions

- Sampling Distribution of $\bar{x}$

- Sampling Distribution of $\bar{p}$

- Other Sampling Methods

Introduction

An element is the entity on which data are collected.

A population is a collection of all the elements of interest.

A sample is a subset of the population.

The sampled population is the population from which the sample is drawn.

A frame is a list of the elements that the sample will be selected from.

Introduction

The reason we select a sample is to collect data to answer a research question about a population.

The sample results provide only estimates of the values of the population characteristics.

The reason is simply that the sample contains only a portion of the population.

With proper sampling methods, the sample results can provide good estimates of the population characteristics.

Selecting a Sample

Sampling from a Finite Population

Sampling from an Infinite Population

Sampling from a Finite Population

Finite populations are often defined by lists such as:

- Organization membership roster

- Credit card account numbers

- Inventory product numbers

A simple random sample of size n from a finite population of size N is a sample selected such that each possible sample of size n has the same probability of being selected.

Sampling from a Finite Population

- Replacing each sampled element before selecting subsequent elements is called sampling with replacement.

- Sampling without replacement is the procedure used most often.

- In large sampling projects, computer-generated random numbers are often used to automate the sample selection process.
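
As a rough illustration of the difference between the two procedures, the following Python sketch draws 10 elements from a finite population both ways. The population of 100 numbered elements, the sample size, and the seed are hypothetical choices made only for this sketch; Python's standard `random` module stands in for whatever random number generator a sampling project would actually use.

```python
import random

rng = random.Random(42)           # fixed seed only so the sketch is repeatable
population = list(range(1, 101))  # hypothetical finite population, elements numbered 1..100

# Sampling without replacement: an element can be selected at most once.
without_replacement = rng.sample(population, k=10)

# Sampling with replacement: every draw is made from the full population,
# so the same element may be selected more than once.
with_replacement = rng.choices(population, k=10)

print(without_replacement)
print(with_replacement)
```

The first list never contains a repeated element, while the second may.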

Sampling from a Finite Population

Example: St. Andrews College

St. Andrews College received 900 applications for admission in the upcoming year from prospective students. The applicants were numbered, from 1 to 900, as their applications arrived. The Director of Admissions would like to select a simple random sample of 30 applicants.

Sampling from a Finite Population

Example: St. Andrews College

Step 1: Assign a random number to each of the 900 applicants. The random numbers generated by Excel's RAND function follow a uniform probability distribution between 0 and 1.

Step 2: Select the 30 applicants corresponding to the 30 smallest random numbers.
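
The two steps above can also be sketched outside Excel. The Python version below is only an illustration: the function name `select_simple_random_sample` and the seed are assumptions of this sketch, and `random.Random.random()` plays the role of Excel's RAND function (both return uniform values between 0 and 1).

```python
import random

def select_simple_random_sample(population_size, sample_size, seed=None):
    """Return applicant numbers chosen by the 'smallest random numbers' procedure."""
    rng = random.Random(seed)
    # Step 1: assign a uniform(0, 1) random number to each applicant number.
    assigned = [(rng.random(), applicant)
                for applicant in range(1, population_size + 1)]
    # Step 2: sort by random number and keep the applicants with the smallest values.
    assigned.sort()
    return [applicant for _, applicant in assigned[:sample_size]]

sample = select_simple_random_sample(900, 30, seed=1)
print(sample)
```

Because each applicant appears exactly once in the list, this selects a sample without replacement. A different seed would identify a different sample.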

Sampling from a Finite Population Using Excel

Excel Formula Worksheet

      A                  B
1     Applicant Number   Random Number
2     1                  =RAND()
3     2                  =RAND()
4     3                  =RAND()
5     4                  =RAND()
6     5                  =RAND()
7     6                  =RAND()
8     7                  =RAND()
9     8                  =RAND()

Note: Rows 10-901 are not shown.

Sampling from a Finite Population Using Excel

Excel Value Worksheet

      A                  B
1     Applicant Number   Random Number
2     1                  0.61021
3     2                  0.83762
4     3                  0.58935
5     4                  0.19934
6     5                  0.86658
7     6                  0.60579
8     7                  0.80960
9     8                  0.33224

Note: Rows 10-901 are not shown.

Sampling from a Finite Population Using Excel

Put Random Numbers in Ascending Order

Step 1: Select any cell in the range B2:B901

Step 2: Click the Home tab on the Ribbon

Step 3: In the Editing group, click Sort & Filter

Step 4: Choose Sort Smallest to Largest

Sampling from a Finite Population Using Excel

Excel Value Worksheet (Sorted)

      A                  B
1     Applicant Number   Random Number
2     12                 0.00027
3     773                0.00192
4     408                0.00303
5     58                 0.00481
6     116                0.00538
7     185                0.00583
8     510                0.00649
9     394                0.00667

Note: Rows 10-901 are not shown.

Sampling from an Infinite Population

- Sometimes we want to select a sample, but find it is not possible to obtain a list of all elements in the population.

- As a result, we cannot construct a frame for the population.

- Hence, we cannot use the random number selection procedure.

- Most often this situation occurs in infinite population cases.

Sampling from an Infinite Population

Populations are often generated by an ongoing process where there is no upper limit on the number of units that can be generated.

Some examples of ongoing processes, with infinite populations, are:

- parts being manufactured on a production line

- transactions occurring at a bank

- telephone calls arriving at a technical help desk

- customers entering a store

Sampling from an Infinite Population

In the case of an infinite population, we must select a random sample in order to make valid statistical inferences about the population from which the sample is taken.

A random sample from an infinite population is a sample selected such that the following conditions are satisfied:

- Each element selected comes from the population of interest.

- Each element is selected independently.

Point Estimation

Point estimation is a form of statistical inference.

In point estimation we use the data from the sample to compute a value of a sample statistic that serves as an estimate of a population parameter.

We refer to $\bar{x}$ as the point estimator of the population mean $\mu$.

$s$ is the point estimator of the population standard deviation $\sigma$.

$\bar{p}$ is the point estimator of the population proportion $p$.

Point Estimation

Example: St. Andrews College

Recall that St. Andrews College received 900 applications from prospective students. The application form contains a variety of information including the individual's Scholastic Aptitude Test (SAT) score and whether or not the individual desires on-campus housing.

At a meeting in a few hours, the Director of Admissions would like to announce the average SAT score and the proportion of applicants that want to live on campus, for the population of 900 applicants.

Point Estimation

Example: St. Andrews College

However, the necessary data on the applicants have not yet been entered in the college's computerized database. So, the Director decides to estimate the values of the population parameters of interest based on sample statistics. The sample of 30 applicants is selected using computer-generated random numbers.

Point Estimation Using Excel

Excel Value Worksheet (Sorted)

      A                  B               C           D
1     Applicant Number   Random Number   SAT Score   On-Campus Housing
2     12                 0.00027         1207        No
3     773                0.00192         1143        Yes
4     408                0.00303         1091        Yes
5     58                 0.00481         1108        No
6     116                0.00538         1227        Yes
7     185                0.00583         982         Yes
8     510                0.00649         1363        Yes
9     394                0.00667         1108        No

Note: Rows 10-31 are not shown.

Point Estimation

$\bar{x}$ as Point Estimator of $\mu$

$$\bar{x} = \frac{\sum x_i}{30} = \frac{50{,}520}{30} = 1684$$

$s$ as Point Estimator of $\sigma$

$$s = \sqrt{\frac{\sum (x_i - \bar{x})^2}{29}} = \sqrt{\frac{210{,}512}{29}} = 85.2$$

$\bar{p}$ as Point Estimator of $p$

$$\bar{p} = 20/30 = .67$$

Note: Different random numbers would have identified a different sample, which would have resulted in different point estimates.
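
The three point estimates can be computed with a few lines of Python. This is a minimal sketch, not part of the original example: the helper `point_estimates` is a hypothetical name, and because the full 30-row sample is not shown, the check at the bottom uses only the totals reported above (a sum of 50,520, a sum of squared deviations of 210,512, and 20 applicants wanting on-campus housing).

```python
import math
import statistics

def point_estimates(values, successes):
    """Return (sample mean, sample standard deviation, sample proportion)."""
    n = len(values)
    x_bar = statistics.mean(values)
    s = statistics.stdev(values)   # sample standard deviation, divides by n - 1
    p_bar = successes / n
    return x_bar, s, p_bar

# Check of the arithmetic from the reported totals (n = 30).
x_bar = 50_520 / 30
s = math.sqrt(210_512 / 29)
p_bar = 20 / 30
print(round(x_bar), round(s, 1), round(p_bar, 2))  # 1684 85.2 0.67
```

Given the full list of 30 SAT scores, `point_estimates(scores, 20)` would return the same three values directly from the data.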

Point Estimation

Once all the data for the 900 applicants were entered in the college's database, the values of the population parameters of interest were calculated.

Population Mean SAT Score

$$\mu = \frac{\sum x_i}{900} = 1697$$

Population Standard Deviation for SAT Score

$$\sigma = \sqrt{\frac{\sum (x_i - \mu)^2}{900}} = 87.4$$

Population Proportion Wanting On-Campus Housing

$$p = 648/900 = .72$$

Summary of Point Estimates Obtained from a Simple Random Sample

Population Parameter                                 Parameter Value   Point Estimator                                        Point Estimate
$\mu$ = Population mean SAT score                    1697              $\bar{x}$ = Sample mean SAT score                      1684
$\sigma$ = Population std. deviation for SAT score   87.4              $s$ = Sample standard deviation for SAT score          85.2
$p$ = Population proportion wanting campus housing   .72               $\bar{p}$ = Sample proportion wanting campus housing   .67

Practical Advice

The target population is the population we want to make inferences about.

The sampled population is the population from which the sample is actually taken.

Whenever a sample is used to make inferences about a population, we should make sure that the target population and the sampled population are in close agreement.

Sampling Distribution of $\bar{x}$

Process of Statistical Inference

Step 1: The population has mean $\mu$ = ? (unknown).

Step 2: A simple random sample of n elements is selected from the population.

Step 3: The sample data provide a value for the sample mean $\bar{x}$.

Step 4: The value of $\bar{x}$ is used to make inferences about the value of $\mu$.

Sampling Distribution of $\bar{x}$

The sampling distribution of $\bar{x}$ is the probability distribution of all possible values of the sample mean $\bar{x}$.

Expected Value of $\bar{x}$

$$E(\bar{x}) = \mu$$

where: $\mu$ = the population mean

When the expected value of the point estimator equals the population parameter, we say the point estimator is unbiased.

Sampling Distribution of $\bar{x}$

Standard Deviation of $\bar{x}$

We will use the following notation to define the standard deviation of the sampling distribution of $\bar{x}$.

$\sigma_{\bar{x}}$ = the standard deviation of $\bar{x}$

$\sigma$ = the standard deviation of the population

n = the sample size

N = the population size

Sampling Distribution of $\bar{x}$

Standard Deviation of $\bar{x}$

Finite Population

$$\sigma_{\bar{x}} = \sqrt{\frac{N - n}{N - 1}} \left( \frac{\sigma}{\sqrt{n}} \right)$$

Infinite Population

$$\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}$$

A finite population is treated as being infinite if $n/N < .05$.

$\sqrt{(N - n)/(N - 1)}$ is the finite population correction factor.

$\sigma_{\bar{x}}$ is referred to as the standard error of the mean.
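
The choice between the two formulas can be expressed as a small Python helper. This is a sketch under the rule stated above (treat the population as infinite when n/N < .05); the function name `standard_error_of_mean` and its `threshold` parameter are assumptions of this illustration. The values used are the St. Andrews figures, with population standard deviation 87.4, n = 30, and N = 900.

```python
import math

def standard_error_of_mean(sigma, n, N=None, threshold=0.05):
    """Standard deviation of the sample mean.

    If a population size N is given and n/N is not below the threshold,
    the finite population correction factor sqrt((N - n)/(N - 1)) is applied.
    """
    se = sigma / math.sqrt(n)
    if N is not None and n / N >= threshold:
        se *= math.sqrt((N - n) / (N - 1))
    return se

# St. Andrews: n/N = 30/900 is below .05, so no correction is applied.
print(round(standard_error_of_mean(87.4, 30, N=900), 2))  # 15.96
```

With a much smaller population, for example N = 100, the same call would apply the correction factor and return a smaller standard error.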
Sampling Distribution of $\bar{x}$

When the population has a normal distribution, the sampling distribution of $\bar{x}$ is normally distributed for any sample size.

In most applications, the sampling distribution of $\bar{x}$ can be approximated by a normal distribution whenever the sample size is 30 or more.

In cases where the population is highly skewed or outliers are present, samples of size 50 may be needed.

Sampling Distribution of $\bar{x}$

The sampling distribution of $\bar{x}$ can be used to provide probability information about how close the sample mean $\bar{x}$ is to the population mean $\mu$.

Central Limit Theorem

When the population from which we are selecting a random sample does not have a normal distribution, the central limit theorem is helpful in identifying the shape of the sampling distribution of $\bar{x}$.

CENTRAL LIMIT THEOREM

In selecting random samples of size n from a population, the sampling distribution of the sample mean $\bar{x}$ can be approximated by a normal distribution as the sample size becomes large.
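
A small simulation can make the theorem concrete. The sketch below is only an illustration under assumed settings: the population is a hypothetical skewed one (exponential with mean 1 and standard deviation 1), and the sample size of 30, the 2,000 repetitions, and the seed are arbitrary choices.

```python
import math
import random
import statistics

rng = random.Random(0)   # fixed seed only so the sketch is repeatable
n = 30                   # sample size
num_samples = 2000       # number of repeated samples (illustrative)

# Draw repeated samples from a skewed population and record each sample mean.
sample_means = [
    statistics.mean([rng.expovariate(1.0) for _ in range(n)])
    for _ in range(num_samples)
]

mean_of_means = statistics.mean(sample_means)
sd_of_means = statistics.stdev(sample_means)

# For a normal distribution, about 68% of values fall within one standard deviation.
share_within_one_sd = sum(
    abs(m - mean_of_means) <= sd_of_means for m in sample_means
) / num_samples

print(mean_of_means)            # expected to be near the population mean, 1
print(sd_of_means)              # expected to be near 1 / sqrt(30)
print(1 / math.sqrt(n))
print(share_within_one_sd)      # expected to be near .68
```

Even though the individual values come from a strongly skewed population, the sample means cluster around the population mean with a spread close to the standard error, which is the behavior the theorem describes.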
Sampling Distribution of $\bar{x}$

Example: St. Andrews College

[Figure: Sampling distribution of $\bar{x}$ for SAT scores, with $E(\bar{x}) = 1697$ marked on the $\bar{x}$ axis]

$$\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}} = \frac{87.4}{\sqrt{30}} = 15.96$$

Sampling Distribution of $\bar{x}$

Example: St. Andrews College

What is the probability that a simple random sample of 30 applicants will provide an estimate of the population mean SAT score that is within +/-10 of the actual population mean $\mu$?

In other words, what is the probability that $\bar{x}$ will be between 1687 and 1707?

Sampling Distribution of $\bar{x}$

Example: St. Andrews College

Step 1: Calculate the z-value at the upper endpoint of the interval.

z = (1707 - 1697)/15.96 = .63

Step 2: Find the area under the curve to the left of the upper endpoint.

P(z < .63) = .7357
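
Steps 1 and 2 can be reproduced with Python's standard library instead of a normal table. This is a hedged sketch: it assumes the sampling distribution is normal with mean 1697 and standard error 15.96, as in the example, and the final line (subtracting the area to the left of the lower endpoint, 1687) is an assumed continuation that answers the question posed above rather than a step shown here.

```python
from statistics import NormalDist

mu = 1697              # population mean SAT score
sigma_x_bar = 15.96    # standard error of the mean for n = 30

# Step 1: z-value at the upper endpoint, rounded to two decimals as in a z table.
z_upper = round((1707 - mu) / sigma_x_bar, 2)

# Step 2: area under the standard normal curve to the left of the upper endpoint.
area_left_of_upper = NormalDist().cdf(z_upper)

# Assumed continuation: probability that the sample mean falls in [1687, 1707].
sampling_dist = NormalDist(mu, sigma_x_bar)
prob_within_10 = sampling_dist.cdf(1707) - sampling_dist.cdf(1687)

print(z_upper, round(area_left_of_upper, 4))
print(prob_within_10)
```

Because the code does not round z before computing the final probability, its result can differ slightly in the third decimal place from a table-based calculation.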
