Sections 1.1, 1.2
Key Terms:
Variables fall into two main types, scalar and categorical:
Scalar variables. Also called numerical or quantitative variables. Their values are numbers in a range, in some specific unit of measurement. A defining characteristic of scalar variables is that it makes sense to form averages. Examples: GPA, height in feet, income in USD.
Scalar variables are often broken into two types: continuous and discrete. For continuous variables, all real numbers in some range are possible values. For discrete variables, only certain numbers are allowed. For instance, the number of family dependents or a county’s population can be thought of as discrete, while a person’s height would be continuous.
Categorical variables. Also called qualitative variables. They classify individuals into groups. Examples: Gender, Grade.
Categorical variables are further divided into nominal and ordinal, depending on whether the categories have a natural order: ordinal categories do, nominal categories do not. Gender would thus be nominal, and Grade would be ordinal.
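As a rough illustration, the classification above can be written down as a small lookup table. This is only a sketch: the names `VarType`, `EXAMPLES`, and `averaging_makes_sense` are made up for this illustration, and the entries are just the examples already mentioned in these notes.

```python
from enum import Enum


class VarType(Enum):
    # Hypothetical labels for the four sub-types described in the notes
    CONTINUOUS = "scalar, continuous"
    DISCRETE = "scalar, discrete"
    NOMINAL = "categorical, nominal"
    ORDINAL = "categorical, ordinal"


# Only the examples that the notes themselves classify
EXAMPLES = {
    "height": VarType.CONTINUOUS,
    "number of family dependents": VarType.DISCRETE,
    "county population": VarType.DISCRETE,
    "gender": VarType.NOMINAL,
    "grade": VarType.ORDINAL,
}


def is_scalar(var_type):
    """Scalar variables are the continuous and discrete ones."""
    return var_type in (VarType.CONTINUOUS, VarType.DISCRETE)


def averaging_makes_sense(name):
    """Per the notes, forming averages is meaningful for scalar variables."""
    return is_scalar(EXAMPLES[name])


for name, var_type in EXAMPLES.items():
    print(f"{name}: {var_type.value}")
```

The check `averaging_makes_sense("height")` is true while `averaging_makes_sense("gender")` is false, which mirrors the "does an average make sense?" test for telling the two main types apart.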
Activity: Here are a number of different variables we could measure on students. Assign a type to each variable.
Height (in) | Grade (F/S/J/Sr) | Any siblings | Hrs sleep/night |
No units in term | Primary major | Hrs study/week | Days drink/week |
Overall GPA | GPA bracket | On probation |
Relationships between variables. Usually multiple variables are measured on the same individuals. We can then ask how these variables relate to each other. This will be an important component of the class.
Two variables are called dependent, associated, or related if they show some connection to each other; in other words, if knowing the value of one variable for an individual gives us some information about the value of the other variable for that individual.
An example would be GPA and gender. In general, female students tend to have a higher GPA than male students. So knowing a specific student’s gender gives us some information about their GPA, but it does not completely determine their GPA.
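A small sketch of what "some information, but not complete determination" can look like in data. The GPA values below are invented purely for illustration and are not real measurements; the helper `group_values` is likewise a hypothetical name.

```python
from collections import defaultdict
from statistics import mean

# Invented (gender, GPA) pairs, for illustration only
records = [
    ("F", 3.6), ("F", 3.1), ("F", 2.8), ("F", 3.4),
    ("M", 3.3), ("M", 2.6), ("M", 3.0), ("M", 2.9),
]


def group_values(pairs):
    """Collect the scalar values separately for each category."""
    groups = defaultdict(list)
    for category, value in pairs:
        groups[category].append(value)
    return dict(groups)


groups = group_values(records)
means = {category: mean(values) for category, values in groups.items()}

# The group averages differ, so gender carries some information about GPA ...
print(means)

# ... but the groups overlap, so gender does not determine GPA.
overlap = max(groups["M"]) > min(groups["F"])
print("groups overlap:", overlap)
```

In this made-up data the average for one group is higher, yet some individuals in the lower-average group still have a higher GPA than some individuals in the other group.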
When two associated variables are both quantitative, we can talk about a positive or negative association. In a positive association, large values of one variable tend to be paired with large values of the other, and small values with small values. In a negative association it is the other way around: large values of one variable tend to be paired with small values of the other.
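One way to check the direction of an association numerically is the sign of the sample covariance. These notes have not introduced covariance, so treat the following as a sketch under that assumption; the data values and the function names `covariance` and `direction` are invented for illustration.

```python
def covariance(xs, ys):
    """Sample covariance: average product of deviations from the two means."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)


def direction(xs, ys):
    """Classify the association by the sign of the covariance."""
    c = covariance(xs, ys)
    if c > 0:
        return "positive"
    if c < 0:
        return "negative"
    return "no linear association"


# Invented measurements on five hypothetical students
hours_study = [2, 5, 8, 12, 15]
gpa = [2.4, 2.9, 3.1, 3.5, 3.8]
hours_sleep = [9, 8, 7, 6, 5]

print(direction(hours_study, gpa))          # large with large, small with small
print(direction(hours_study, hours_sleep))  # large with small
```

A product of deviations is positive when both values sit on the same side of their means, so a positive total matches the "large with large, small with small" description, and a negative total matches the reverse pairing.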