Principal Component Analysis
A dimensionality reduction technique that transforms data into a new coordinate system whose axes (the principal components) are ordered by variance: the first axis captures the most variance in the data, the second the next most, and so on.
Why It Matters
PCA reduces data complexity while preserving the most important information. It is essential for visualization, noise reduction, and speeding up downstream models.
Example
Reducing a dataset with 100 features down to 10 principal components that capture 95% of the total variance, making it much easier to visualize and model.
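The example above can be sketched in a few lines of NumPy. This is a minimal illustration, not a production implementation: the dataset is synthetic (an assumption made so the script is self-contained), and PCA is computed via the SVD of the centered data matrix, keeping the smallest number of components whose cumulative explained variance reaches 95%.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data: 200 samples, 100 features, where most variance lives
# in 10 latent directions (an illustrative assumption).
latent = rng.normal(size=(200, 10))
mixing = rng.normal(size=(10, 100))
X = latent @ mixing + 0.05 * rng.normal(size=(200, 100))

# Center the data, then get principal components from the SVD.
X_centered = X - X.mean(axis=0)
U, S, Vt = np.linalg.svd(X_centered, full_matrices=False)

# Fraction of total variance explained by each component.
explained = (S ** 2) / np.sum(S ** 2)
cumulative = np.cumsum(explained)

# Keep the smallest number of components capturing >= 95% of variance.
k = int(np.searchsorted(cumulative, 0.95) + 1)
X_reduced = X_centered @ Vt[:k].T

print(X.shape, "->", X_reduced.shape)
```

In practice a library routine such as scikit-learn's `PCA(n_components=0.95)` performs the same centering, decomposition, and component selection in one call.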
Think of it like...
Like summarizing a semester of lecture notes into a study guide — you identify the most important themes and discard the redundant details.
Related Terms
Dimensionality Reduction
Techniques that reduce the number of features (dimensions) in a dataset while preserving the most important information. This makes data easier to visualize, speeds up training, and can improve model performance.
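PCA is one such technique; another simple one is random projection, sketched below. The data shapes are arbitrary assumptions chosen for illustration: 500 samples with 1,000 features are projected down to 50 dimensions by a Gaussian random matrix, which approximately preserves pairwise distances (the Johnson-Lindenstrauss lemma).

```python
import numpy as np

rng = np.random.default_rng(1)

# 500 samples, 1,000 features (illustrative sizes).
X = rng.normal(size=(500, 1000))

# Gaussian random projection to k dimensions; scaling by 1/sqrt(k)
# keeps expected pairwise distances roughly unchanged.
k = 50
R = rng.normal(size=(1000, k)) / np.sqrt(k)
X_low = X @ R

print(X.shape, "->", X_low.shape)
```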
Unsupervised Learning
A type of machine learning where the model learns patterns from unlabeled data without being told what the correct output should be. The algorithm discovers hidden structures, groupings, or patterns in the data on its own.
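A minimal unsupervised example is k-means clustering: the algorithm below is never shown any labels, yet it recovers two groupings in the data on its own. The data and the deterministic initialization (one starting center from each region) are assumptions made to keep the sketch short and reproducible.

```python
import numpy as np

rng = np.random.default_rng(2)

# Unlabeled data drawn from two well-separated blobs; the algorithm
# never sees which blob a point came from.
a = rng.normal(loc=(0.0, 0.0), scale=0.5, size=(50, 2))
b = rng.normal(loc=(5.0, 5.0), scale=0.5, size=(50, 2))
X = np.vstack([a, b])

# Minimal k-means loop with k = 2; initialize with one point from
# each region so the sketch is deterministic.
centers = np.vstack([X[0], X[50]]).copy()
for _ in range(10):
    # Assignment step: each point joins its nearest center.
    dists = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
    labels = np.argmin(dists, axis=1)
    # Update step: move each center to the mean of its points.
    centers = np.array([X[labels == j].mean(axis=0) for j in range(2)])
```

After a few iterations the two discovered clusters match the two blobs, even though no labels were ever provided.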