# Basic Statistics – Make Inferences from a Sample

- Posted by Brian Stocker
- Date October 17, 2017

### Making Inferences from a Sample

Making inferences from a sample, or statistical inference, is the process of using data analysis to infer properties of a population, for example by testing hypotheses and making estimates. It assumes that the observed data set is sampled from a larger population.

### Definitions

**Population** – All members of the studied group

**Sample** – A portion of the studied group used to represent the entire population

**Random** – Every member of the studied group has an equal chance of selection

**Census** – Every member of the studied group is included

**Bias** – Occurs when the sample does not adequately represent the population

**Error** – Degree to which the results from the sample differ from the actual results for the population

**Outlier** – A value that is far larger or smaller than most

**Mode** – Most commonly occurring value of a data set

**Median** – Middle value of an ordered data set (the average of the two middle values when the count is even)

**Mean** – Average value of a data set

**Range** – Distance between the least and greatest values of a data set

### Determine Appropriate Sampling

The purpose of a sample is to gather information about a population. It can become very costly (time, money, effort) to study every member of a population, especially if there are many members in the population group or if they are difficult to study. A sample (smaller portion) of the population can be studied, but what is saved in costs is accompanied by a possible decrease in the accuracy of results. Larger samples (relative to population) increase the certainty that the results truly represent the population, as they decrease the effect of outliers on the overall data.
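The effect of sample size can be illustrated with a small simulation. This is only a sketch under stated assumptions: the population of 550 shoe sizes is invented for illustration, and `spread_of_sample_means` is a hypothetical helper, not something from the lesson. It draws many random samples of a given size and measures how much their means vary.

```python
import random
import statistics

rng = random.Random(1)  # fixed seed so the sketch is repeatable

# Hypothetical population: 550 invented shoe sizes between 5 and 15.
population = [rng.randint(5, 15) for _ in range(550)]


def spread_of_sample_means(sample_size, trials=500):
    """Draw many random samples and measure how much their means vary."""
    means = [
        statistics.mean(rng.sample(population, sample_size))
        for _ in range(trials)
    ]
    return statistics.pstdev(means)


small = spread_of_sample_means(5)
large = spread_of_sample_means(50)
print(f"spread of sample means, n=5:  {small:.2f}")
print(f"spread of sample means, n=50: {large:.2f}")
```

The means of the larger samples should cluster more tightly around the population mean, which is the sense in which a larger sample gives more certainty.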

Random sampling is commonly recommended for statistical purposes. However, most samples are not truly random, as some members of a population are typically easier to study than others. Some common sampling techniques include:

- Cluster: members are assigned to groups, and then one or more entire groups are selected to represent the whole population.
- Stratified: members are assigned to groups, and then a specific number or percentage is selected from each group.
- Systematic: a rule determines the sample group, such as selecting every nth member.
- Convenience: the easiest-to-reach members are selected.
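The sampling techniques described above can be sketched in a few lines of Python. Everything here is an illustrative assumption: the roster of 550 student IDs, the sample size of 30, the 16/14 split, and the classes of 25 are invented to show the mechanics and are not part of the lesson.

```python
import random

rng = random.Random(0)  # fixed seed so the sketch is repeatable

# Hypothetical roster: 550 student IDs, the first 300 labelled "M", the rest "F".
students = [(i, "M" if i < 300 else "F") for i in range(550)]

# Simple random: every student has the same chance of selection.
simple_random = rng.sample(students, 30)

# Systematic: every nth student, starting from a random position.
step = len(students) // 30
start = rng.randrange(step)
systematic = students[start::step][:30]

# Stratified: split into groups, then draw a set number from each group.
males = [s for s in students if s[1] == "M"]
females = [s for s in students if s[1] == "F"]
stratified = rng.sample(males, 16) + rng.sample(females, 14)

# Cluster: split into groups (hypothetical classes of 25), then take a whole group.
classes = [students[i:i + 25] for i in range(0, len(students), 25)]
cluster = rng.choice(classes)

# Convenience: whoever is easiest to reach (here, the first 30 on the roster).
convenience = students[:30]
```

In this made-up roster the convenience sample contains only males, which shows how the easiest sample to collect can also be the least representative.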

**Ex: Jacob’s high school has 300 males and 250 females. Jacob wants to determine the average shoe size in his high school for a statistics project. Which description of the population is best?**

a. High school students

b. Elementary school students

c. Students at Jacob’s high school

d. Male students

**Correct Answer:** C

**Ex: Jacob’s teacher said his sample should include about 25-30 students. Which sample group is best?**

a. Members of Jacob’s high school football team

b. Every 5th high school student as they enter Jacob’s school

c. Jacob’s high school girls’ volleyball team

d. The students in Jacob’s 2nd period class

**Correct Answer:** B

**Ex: Explain a potential problem with selecting every 5th student as Jacob’s sample.**

- Not every student has a chance of being selected: once the starting point is fixed, the four students between each selected student cannot be chosen.

- Jacob could get a sample that is not representative (too many males or females, too many freshmen, etc.).

### Apply Measures of a Sample to a Population

As students enter the building, Jacob asks every 5th high school student for their shoe size. He records the responses in a table:

12 | 14 | 9 | 5 | 11 | 15 | 8 | 13 |

8 | 8 | 15 | 7 | 13 | 9 | 12 | 7 |

13 | 10 | 12 | 14 | 5 | 8 | 9 | 10 |

**Ex. Put the responses in order, from smallest to largest.**

5-5-7-7-8-8-8-8-9-9-9-10-10-11-12-12-12-13-13-13-14-14-15-15

**Ex. Determine the mode, median, mean, and range.**

Mode: 8

Median: 10

Mean: 247/24 ≈ 10.29

Range: 15-5=10
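These four measures can be checked with Python's standard `statistics` module. The sketch below uses the ordered responses listed above; the variable names are chosen here for illustration.

```python
import statistics

# Ordered responses as listed above (24 values).
sizes = [5, 5, 7, 7, 8, 8, 8, 8, 9, 9, 9, 10, 10, 11,
         12, 12, 12, 13, 13, 13, 14, 14, 15, 15]

mode = statistics.mode(sizes)          # most commonly occurring value
median = statistics.median(sizes)      # average of the two middle values (even count)
mean = statistics.mean(sizes)          # sum of the values divided by the count
value_range = max(sizes) - min(sizes)  # greatest value minus least value

print(mode, median, round(mean, 2), value_range)  # prints: 8 10.0 10.29 10
```

Because the count is even, the median is the average of the 12th and 13th values, which are both 10.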

**Ex. Jacob uses his data to make a statement about the population. Which statement is best? Which statement is worst?**

a. No high school student has a size 6 shoe.

b. Most students have a size bigger than 10.

c. The average shoe size of the population is between 10 and 11.

d. Females have bigger shoe sizes than males.

Choice C is the best statement and Choice D is the worst.

**Explanation:**

A – This statement is supported by the sample data, but the sample includes values both above and below 6, which suggests that a larger sample would likely include that value.

B – This statement is not supported by the sample data (11 values were larger than 10, and 13 values were not larger than 10). However, it is close enough that a larger sample could support this statement.

C – This statement is best because it is supported by the sample, and it is unlikely that a larger sample would shift the average significantly.

D – This statement is worst because no data was collected about gender, so the sample cannot support any statement comparing females and males.
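As a rough check, statements A, B, and C can be tested directly against the sample. The sketch below reuses the ordered responses from earlier; the variable names are illustrative. Statement D cannot be tested at all, because the sample records no gender.

```python
import statistics

# Ordered responses from Jacob's sample (24 values).
sizes = [5, 5, 7, 7, 8, 8, 8, 8, 9, 9, 9, 10, 10, 11,
         12, 12, 12, 13, 13, 13, 14, 14, 15, 15]

# A: size 6 is absent from the sample, yet values exist on both sides of it.
a_supported = 6 not in sizes and min(sizes) < 6 < max(sizes)

# B: "most" would require more than half of the responses to exceed 10.
larger_than_10 = sum(1 for s in sizes if s > 10)
b_supported = larger_than_10 > len(sizes) / 2

# C: the sample mean lies between 10 and 11.
c_supported = 10 < statistics.mean(sizes) < 11

print(a_supported, larger_than_10, b_supported, c_supported)
# prints: True 11 False True
```

The code only shows what the sample itself supports; whether a statement holds for the whole population still depends on how well the sample represents it.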


**Written by:** Brian Stocker, MA, Complete Test Preparation Inc.

**Date Published:** Tuesday, October 17th, 2017

**Date Modified:** Friday, January 13th, 2023

Got a Question? Email me anytime - Brian@test-preparation.ca
