If \(x\) represents a variable, then a linear transformation is an equation of the form:
\[y = a + b x\]
where \(a\), \(b\) are some numbers. For instance \(y = x + 10\), or \(y = 2x - 1\).
We think of \(y\) here as a new variable, and the equation tells us how to convert values of \(x\) into values of \(y\).
Examples:
A linear transformation between two variables tells us how individual values transform to each other.
So, for instance, if we had a temperature in Fahrenheit, \(x=56\), we can find the corresponding temperature in Celsius: \[y=\frac{1}{1.8}\times 56 - \frac{32}{1.8} \approx 13.333\]
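As a minimal sketch, the Fahrenheit-to-Celsius conversion above can be written as a function of the form \(y = a + bx\), with \(a = -32/1.8\) and \(b = 1/1.8\). The function name `fahrenheit_to_celsius` is chosen here for illustration only.

```python
def fahrenheit_to_celsius(x):
    """Apply the linear transformation y = a + b*x."""
    a = -32 / 1.8  # additive constant
    b = 1 / 1.8    # multiplier
    return a + b * x


# The value from the text: 56 degrees F
print(round(fahrenheit_to_celsius(56), 3))  # 13.333
```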
But how measures of center or spread behave requires more thinking!
Behavior of variables under linear transformation
Assume \(y = a + bx\). Then the properties of \(x\) and \(y\) are related as follows.
- shape
  - Stays the same (modes, skewness, outliers); note that a negative \(b\) mirrors the distribution, so the direction of skew flips
- center
  - Follows the same transformation (mean and median do that)
  - e.g. \(\bar y = a + b\bar x\)
- spread
  - Only follows the multiplier, in absolute value (std. dev. and IQR do that)
  - e.g. \(s_y = |b| s_x\)
aspect | multiplication by \(b\) | addition of \(a\) |
---|---|---|
shape | unaffected | unaffected |
center | affected | affected |
spread | affected | unaffected |
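These rules can be checked numerically. The sketch below uses a small made-up data set (the values in `x` are hypothetical, not from any survey) and Python's standard `statistics` module. It confirms that the mean and median follow the full transformation, while the standard deviation is only scaled by the absolute value of the multiplier. A second case with a negative \(b\) shows why the absolute value matters.

```python
import statistics


def check_rules(x, a, b, tol=1e-9):
    """Return (mean_ok, median_ok, sd_ok) for the transformation y = a + b*x."""
    y = [a + b * v for v in x]
    mean_ok = abs(statistics.mean(y) - (a + b * statistics.mean(x))) < tol
    median_ok = abs(statistics.median(y) - (a + b * statistics.median(x))) < tol
    # Spread ignores a and scales by |b|
    sd_ok = abs(statistics.stdev(y) - abs(b) * statistics.stdev(x)) < tol
    return mean_ok, median_ok, sd_ok


x = [50.0, 59.0, 68.0, 77.0, 95.0]  # hypothetical Fahrenheit readings

print(check_rules(x, a=-32 / 1.8, b=1 / 1.8))  # (True, True, True)
print(check_rules(x, a=10, b=-2))              # (True, True, True)
```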
Practice: If some temperatures have a mean of \(67\) degrees F and a standard deviation of \(5\) degrees F, what are the mean and standard deviation of the corresponding temperatures in degrees C?
Standardized scores, also called \(z\)-scores, are given by the following linear transformation:
\[z = \frac{x-\bar x}{s_x}\]
Alternatively, they relate to \(x\) via:
\[x = \bar x + s_x z\]
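The two formulas above can be sketched in code as follows. The `heights` list is hypothetical data used only for illustration, and `standardize` is a name chosen here. Because standardizing is a linear transformation with \(a = -\bar x / s_x\) and \(b = 1/s_x\), the resulting \(z\)-scores have mean \(0\) and standard deviation \(1\), which the sketch verifies up to rounding error.

```python
import statistics


def standardize(values):
    """Return the z-scores (x - mean) / sd, using the sample standard deviation."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)
    return [(v - mean) / sd for v in values]


heights = [62.0, 65.0, 67.0, 70.0, 74.0]  # hypothetical values
z = standardize(heights)

# z-scores are centered at 0 with spread 1
print(abs(statistics.mean(z)) < 1e-9)        # True
print(abs(statistics.stdev(z) - 1) < 1e-9)   # True

# Going back: x = mean + sd * z recovers the original values
mean, sd = statistics.mean(heights), statistics.stdev(heights)
recovered = [mean + sd * zi for zi in z]
```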
Key properties:
In the Behavioral Survey data we have examined, let \(x\) be the height variable. It has a mean of \(\bar x=67.18\) and a standard deviation of \(s_x = 4.126\).
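With these two summary numbers, converting between heights and \(z\)-scores is a direct calculation. In the sketch below, the height of \(72\) and the \(z\)-score of \(-1\) are hypothetical inputs chosen for illustration; only the mean and standard deviation come from the text.

```python
x_bar = 67.18  # mean height, from the text
s_x = 4.126    # standard deviation of height, from the text


def z_score(x):
    """z = (x - x_bar) / s_x"""
    return (x - x_bar) / s_x


def from_z(z):
    """x = x_bar + s_x * z"""
    return x_bar + s_x * z


print(round(z_score(72), 2))  # 1.17, i.e. about 1.17 standard deviations above the mean
print(round(from_z(-1), 3))   # 63.054, the height one standard deviation below the mean
```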