Basic Statistics – Make Inferences from a Sample

Population – All members of the studied group

Sample – A portion of the studied group used to represent the entire population

Random – Every member of the studied group has an equal chance of selection

Census – Every member of the studied group is included

Bias – Occurs when the sample does not adequately represent the population

Error – Degree to which the results from the sample differ from the actual results of the population

Outlier – A value that is far larger or smaller than most

Mode – Most commonly occurring value of a data set

Median – Middle value of a data set when the values are placed in order

Mean – Average value of a data set

Range – Distance between the least and greatest values of a data set


The purpose of a sample is to gather information about a population. Studying every member of a population can be very costly (in time, money, and effort), especially if the population has many members or if they are difficult to study. A sample (a smaller portion of the population) can be studied instead, but the savings in cost come with a possible decrease in the accuracy of the results. Larger samples (relative to population) increase the certainty that the results truly represent the population, as they decrease the effect of outliers on the overall data.
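The effect of sample size can be illustrated with a small simulation. The sketch below is only an illustration: the population is a made-up list of 550 shoe sizes, and the function name `spread_of_sample_means` is hypothetical. It draws many random samples of a given size and measures how much the sample mean varies from one sample to the next.

```python
import random
import statistics

def spread_of_sample_means(population, sample_size, trials=2000, seed=0):
    """Standard deviation of the sample mean across many random samples."""
    rng = random.Random(seed)
    means = [
        statistics.mean(rng.sample(population, sample_size))
        for _ in range(trials)
    ]
    return statistics.pstdev(means)

# Hypothetical population: 550 made-up shoe sizes between 5 and 15.
rng = random.Random(42)
population = [rng.randint(5, 15) for _ in range(550)]

small = spread_of_sample_means(population, sample_size=10)
large = spread_of_sample_means(population, sample_size=100)
print(f"spread of the mean with n=10:  {small:.2f}")
print(f"spread of the mean with n=100: {large:.2f}")
```

Under these assumptions, the means of the larger samples cluster more tightly around the population mean than the means of the smaller samples.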

Random sampling is commonly recommended for statistical purposes. However, most samples are not truly random, as some members of a population are typically easier to study than others. Some common sampling techniques include cluster (members are assigned to groups, and then one or more entire groups are selected to represent the whole population), stratified (members are assigned to groups, then a specific number or percent is selected from each group), systematic (a rule is applied to determine the sample group, such as selecting every nth member), and convenience (the easiest-to-get members are selected).
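The sampling techniques described above can be sketched in a few lines of Python. This is a minimal illustration, not a standard library: the roster, the homeroom groups of 25, and all function names are hypothetical, and only the 300/250 split is borrowed from the example that follows.

```python
import random

def simple_random_sample(members, k, rng):
    """Every member has an equal chance of selection."""
    return rng.sample(members, k)

def systematic_sample(members, step):
    """Select every step-th member (the 5th, 10th, 15th, ... when step is 5)."""
    return members[step - 1::step]

def stratified_sample(groups, percent, rng):
    """Select the same percent from each group (rounded down)."""
    picked = []
    for group_members in groups.values():
        k = len(group_members) * percent // 100
        picked.extend(rng.sample(group_members, k))
    return picked

def cluster_sample(groups, n_groups, rng):
    """Select one or more entire groups to represent the whole population."""
    chosen = rng.sample(sorted(groups), n_groups)
    return [member for name in chosen for member in groups[name]]

def convenience_sample(members, k):
    """Take the easiest-to-get members (here, simply the first k)."""
    return members[:k]

# Hypothetical roster: 300 males and 250 females.
students = [("M", i) for i in range(300)] + [("F", i) for i in range(250)]
by_gender = {
    "M": [s for s in students if s[0] == "M"],
    "F": [s for s in students if s[0] == "F"],
}
# Hypothetical clusters: 22 homerooms of 25 students each.
homerooms = {f"room{r:02d}": students[r * 25:(r + 1) * 25] for r in range(22)}

rng = random.Random(7)
simple = simple_random_sample(students, 27, rng)
systematic = systematic_sample(students, 5)
stratified = stratified_sample(by_gender, 5, rng)
cluster = cluster_sample(homerooms, 1, rng)
convenience = convenience_sample(students, 27)
print(len(simple), len(systematic), len(stratified), len(cluster), len(convenience))
```

In this sketch the stratified sample always contains 15 males and 12 females, while the convenience sample contains only males because of how the made-up roster is ordered, which shows how a convenient choice can introduce bias.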

Ex: Jacob’s high school has 300 males and 250 females. Jacob wants to determine the average shoe size in his high school for a statistics project. Which description of the population is best?

a. High school students
b. Elementary school students
c. Students at Jacob’s high school
d. Male students

Correct Answer: C

Ex: Jacob’s teacher said his sample should include about 25-30 students. Which sample group is best?

a. Members of Jacob’s high school football team
b. Every 5th high school student as they enter Jacob’s school
c. Jacob’s high school girls’ volleyball team
d. The students in Jacob’s 2nd period class

Correct Answer: B

Ex: Explain a potential problem with selecting every 5th student as Jacob’s sample.

- Not every student has a chance of being selected (the four students between each selected student have no chance).
- Jacob could get a sample that is not representative (too many males or females, too many freshmen, etc.).

As students enter the building, Jacob asks every 5th high school student for their shoe size. He records the responses in a table:

12	14	9	5	11	15	8	13
8	8	15	7	13	9	12	7
13	10	12	14	5	8	9	10

Ex: Put the responses in order, from smallest to largest.

5-5-7-7-8-8-8-8-9-9-9-10-10-11-12-12-12-13-13-13-14-14-15-15

Ex: Determine the mode, median, mean, and range.

Mode: 8 (occurs four times)
Median: 10 (the 12th and 13th of the 24 ordered values are both 10)
Mean: 247/24 ≈ 10.29
Range: 15 - 5 = 10
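These four measures can be checked with Python's standard `statistics` module. The sketch below copies the ordered responses from the list above; the variable names are chosen only for illustration.

```python
import statistics

# Ordered responses, copied from the ordered list above.
sizes = [5, 5, 7, 7, 8, 8, 8, 8, 9, 9, 9, 10, 10, 11,
         12, 12, 12, 13, 13, 13, 14, 14, 15, 15]

mode = statistics.mode(sizes)          # most commonly occurring value
median = statistics.median(sizes)      # mean of the two middle values (even count)
mean = statistics.mean(sizes)          # sum of the values divided by the count
value_range = max(sizes) - min(sizes)  # greatest value minus least value

print(mode, median, round(mean, 2), value_range)  # prints: 8 10.0 10.29 10
```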

Ex: Jacob uses his data to make a statement about the population. Which statement is best? Which statement is worst?

a. No high school student has a size 6 shoe.
b. Most students have a size bigger than 10.
c. The average shoe size of the population is between 10 and 11.
d. Females have bigger shoe sizes than males.

Choice C is the best statement and Choice D is the worst.

Explanation:

A – This statement is supported by the sample data, but the sample contains values both above and below 6, which suggests that a larger sample would likely include a size 6.
B – This statement is not supported by the sample data (11 values were larger than 10, and 13 values were not larger than 10). However, it is close enough that a larger sample could support this statement.
C – This statement is best because it is supported by the sample, and it is unlikely that a larger sample would shift the average significantly.
D – This statement is worst because no data was collected about gender, so no statement can be made and supported.
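The counts quoted in the explanation can be verified directly from the ordered responses. The sketch below is a simple check with illustrative variable names; it only tests what the sample shows, not what is true of the whole population.

```python
# Ordered responses, copied from the ordered list earlier in this example.
sizes = [5, 5, 7, 7, 8, 8, 8, 8, 9, 9, 9, 10, 10, 11,
         12, 12, 12, 13, 13, 13, 14, 14, 15, 15]

# Statement A: does a size 6 appear in the sample?
has_size_6 = 6 in sizes

# Statement B: how many responses are larger than 10?
larger_than_10 = sum(1 for s in sizes if s > 10)
is_majority = larger_than_10 > len(sizes) / 2

# Statement C: is the sample mean between 10 and 11?
mean = sum(sizes) / len(sizes)
mean_between_10_and_11 = 10 <= mean <= 11

print(f"A: size 6 in sample? {has_size_6}")
print(f"B: {larger_than_10} of {len(sizes)} larger than 10, majority? {is_majority}")
print(f"C: mean {mean:.2f} between 10 and 11? {mean_between_10_and_11}")
# Statement D cannot be checked: the responses carry no information about gender.
```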

Common Mistakes Answering Inference Questions on a Test

Confusing Correlation with Causation – Don’t assume that because two variables occur together, one causes the other.
Overgeneralization – Drawing broad conclusions from a small or non-representative sample is likely to give a wrong answer.
Ignoring Sample Bias – Is the sample representative of the population? If not, conclusions drawn from it may be incorrect.
Neglecting Other Variables – Don’t neglect other variables that could influence the results.
Misunderstanding Confidence Intervals – A 95% confidence interval does not mean there is a 95% chance that the true value lies within one particular interval. It means that if the sampling were repeated many times, about 95% of the intervals built this way would contain the true value.
Cherry-Picking Data – Selecting only data that supports a hypothesis while ignoring data that contradicts it.
Watch Out for Sampling Error – Misinterpreting the natural variability in sampling leads to overconfidence in the results and incorrect answers.
Variability – Samples will vary, and results from one sample may not be replicable.
Sample Size is Important – Small samples lead to less reliable inferences and results.
Outliers – Outliers can have a large effect on the results and lead to incorrect inferences about the population.
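The meaning of a 95% confidence interval can be illustrated by simulation. The sketch below rests on stated assumptions: a hypothetical, normally distributed population with mean 10 and standard deviation 3, samples of 30, and the t critical value 2.045 (95%, two-sided, 29 degrees of freedom). It builds an interval from each of many samples and counts how often the interval contains the true mean.

```python
import math
import random
import statistics

TRUE_MEAN = 10.0   # hypothetical population mean
TRUE_SD = 3.0      # hypothetical population standard deviation
N = 30             # size of each sample
T_CRIT = 2.045     # t critical value for 95% confidence with N - 1 = 29 df

def interval_covers_truth(rng):
    """Draw one sample, build a 95% interval, report whether it holds the true mean."""
    sample = [rng.gauss(TRUE_MEAN, TRUE_SD) for _ in range(N)]
    centre = statistics.mean(sample)
    margin = T_CRIT * statistics.stdev(sample) / math.sqrt(N)
    return centre - margin <= TRUE_MEAN <= centre + margin

rng = random.Random(1)
trials = 5000
coverage = sum(interval_covers_truth(rng) for _ in range(trials)) / trials
print(f"{coverage:.1%} of the intervals contained the true mean")
```

Under these assumptions the printed share should land close to 95%. Any single interval either contains the true mean or does not; the 95% describes the procedure across repeated samples.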

2 Comments

Lloyd

October 19, 2023

Why is the answer C not A? They both seem correct

Brian Stocker

November 23, 2023

Choice C is the best answer. For choice A, because there are values above and below 6, a larger sample would include 6. For choice C, it is unlikely that a larger sample would change the average by a lot.
