Random Vector
Definition
A random vector is an ordered collection of random variables defined on the same probability space:
\[ X = (X^{(1)}, X^{(2)}, \dots, X^{(p)}) \]
where each component \(X^{(j)}\) is a random variable.
It represents a single observation with multiple features.
Interpretation
A random vector models the outcome of one observation generated by a data-generating process (DGP).
- Each component corresponds to a feature
- All components are generated together
- Their joint behavior is described by a joint distribution \(F\):
\[ X \sim F \]
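One common way to make \(F\) concrete is to read it as the joint cumulative distribution function. This reading is an assumption here, since the note does not say which representation of the distribution is meant:
\[ F(x^{(1)}, \dots, x^{(p)}) = P\big(X^{(1)} \le x^{(1)}, \dots, X^{(p)} \le x^{(p)}\big) \]
Under this reading, \(F\) carries more than the \(p\) marginal distributions: it also encodes how the components depend on each other.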
Example
In a coffee dataset, one observation may include:
- rating
- acidity
- body
We represent this as:
\[ X = (\text{rating}, \text{acidity}, \text{body}) \]
Each component is a random variable, and together they form a random vector.
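As a minimal sketch of this example, the code below draws one realization of the coffee vector. It assumes, purely for illustration, that \(F\) is a multivariate normal; the names `FEATURES`, `MEAN`, `COV`, and `draw_observation` are hypothetical, and the numbers are made up rather than estimated from any real coffee data.

```python
import numpy as np

# Hypothetical setup: F is assumed to be multivariate normal.
# The mean and covariance values are invented for illustration only.
FEATURES = ("rating", "acidity", "body")
MEAN = np.array([85.0, 7.5, 7.0])
COV = np.array([
    [4.00, 0.60, 0.50],
    [0.60, 0.25, 0.05],
    [0.50, 0.05, 0.25],
])


def draw_observation(rng):
    """Draw one realization x of the random vector X = (rating, acidity, body)."""
    return rng.multivariate_normal(MEAN, COV)


rng = np.random.default_rng(seed=0)
x = draw_observation(rng)             # all three components are drawn jointly
observation = dict(zip(FEATURES, x))  # label each component with its feature name
print(observation)
```

The off-diagonal entries of `COV` are what make the components dependent; setting them to zero would generate the three features independently of each other.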
From Random Vector to Dataset
In practice, the data come from \(n\) copies of \(X\):
\[ X_1, X_2, \dots, X_n \]
- Each \(X_i\) is a random vector
- Each realization \(x_i\) is a row in the dataset
So:
- Rows → realizations of random vectors
- Columns → components of the random vector
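The row and column correspondence can be sketched with an \(n \times p\) array. The sizes `n` and `p` and the choice of a standard normal for \(F\) are assumptions made only to have something to sample from.

```python
import numpy as np

n, p = 5, 3  # hypothetical sample size and number of features

rng = np.random.default_rng(seed=1)
# Assumption: F is a p-dimensional standard normal with independent components.
data = rng.standard_normal(size=(n, p))

row = data[0]        # realization x_1 of the random vector X_1, shape (p,)
column = data[:, 0]  # the n realized values of the first component, shape (n,)

print(data.shape, row.shape, column.shape)
```

Indexing along the first axis picks out one observation; indexing along the second axis picks out one feature across all observations.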
Connection to Random Sample
A random sample is a collection of random vectors:
\[ X_1, \dots, X_n \overset{\text{i.i.d.}}{\sim} F \]
This means:
- Each \(X_i\) has the same distribution \(F\)
- The \(X_i\) are mutually independent
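Taken together, the two conditions say that the joint distribution of the whole sample factorizes. Reading \(F\) as the joint cumulative distribution function of a single observation (an assumption about notation), and taking \(\le\) componentwise:
\[ P(X_1 \le x_1, \dots, X_n \le x_n) = \prod_{i=1}^{n} F(x_i) \]
Independence gives the product, and identical distribution is why the same \(F\) appears in every factor. The factorization is across observations only; the components within each \(X_i\) may still depend on one another.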
Key Idea
A random vector represents:
- One observation
- Multiple features
- Joint randomness across variables
One-line Summary
A random vector is a collection of random variables representing a single observation with multiple features, whose components are generated jointly by one data-generating process.