Machine Learning
Dimensionality Reduction
Techniques that reduce the number of features (dimensions) in a dataset while preserving as much of the important structure as possible. Fewer dimensions make data easier to visualize, speed up training, and can improve model performance by discarding noisy or redundant features.
Why It Matters
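High-dimensional data is hard to work with and can suffer from the 'curse of dimensionality': as the number of features grows, data points become sparse and the distances between them become less informative, so models need far more samples to generalize. Dimensionality reduction makes ML practical for datasets with thousands of features.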
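One symptom of the curse of dimensionality can be sketched numerically: in high dimensions, the nearest and farthest points from a query end up at almost the same distance. The snippet below is only an illustration on uniformly random synthetic points; the helper name `relative_contrast`, the point count, and the chosen dimensions are assumptions made for this sketch.

```python
import numpy as np

rng = np.random.default_rng(0)

def relative_contrast(n_dims, n_points=500):
    """(farthest - nearest) / nearest distance from a random query point."""
    points = rng.random((n_points, n_dims))   # uniform points in the unit cube
    query = rng.random(n_dims)
    dists = np.linalg.norm(points - query, axis=1)
    return (dists.max() - dists.min()) / dists.min()

# Hypothetical dimensions chosen for illustration
contrast_by_dim = {d: relative_contrast(d) for d in (2, 10, 100, 1000)}

for d, contrast in contrast_by_dim.items():
    print(f"{d:>5} dims: relative contrast = {contrast:.2f}")
```

As the contrast shrinks, "nearest neighbor" carries less meaning, which is one reason distance-based methods tend to degrade without some form of dimensionality reduction.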
Example
Reducing a dataset of 1,000 gene expression features to 50 principal components that capture 95% of the variance makes it feasible to cluster patients into groups.
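A minimal sketch of this kind of reduction, using principal component analysis (PCA) computed with NumPy's SVD. The data here is synthetic, not real gene expression data: it is generated from 10 hidden factors plus noise (an assumption made for illustration), so far fewer than 50 components are needed. Only the 95% variance threshold mirrors the example above.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for an expression matrix: 200 samples x 1,000 features,
# driven by 10 hidden factors plus a little noise (assumed for illustration).
n_samples, n_features, n_latent = 200, 1000, 10
latent = rng.normal(size=(n_samples, n_latent))
loadings = rng.normal(size=(n_latent, n_features))
X = latent @ loadings + 0.1 * rng.normal(size=(n_samples, n_features))

# PCA: center each feature, then take the SVD of the centered matrix.
X_centered = X - X.mean(axis=0)
U, S, Vt = np.linalg.svd(X_centered, full_matrices=False)

# Squared singular values are proportional to the variance per component.
explained_ratio = S**2 / np.sum(S**2)
cumulative = np.cumsum(explained_ratio)

# Smallest number of components whose cumulative variance reaches 95%.
k = int(np.searchsorted(cumulative, 0.95) + 1)

# Project the data onto the first k principal directions.
X_reduced = X_centered @ Vt[:k].T

print(f"kept {k} of {n_features} dimensions")
print(f"variance captured: {cumulative[k - 1]:.3f}")
print(f"reduced shape: {X_reduced.shape}")
```

The reduced matrix `X_reduced` could then be passed to a clustering algorithm in place of the original 1,000 features. On real data the number of components needed depends on how much redundancy the features contain.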
Think of it like...
Like summarizing a 500-page book into a 10-page overview: you lose some detail but keep the essential information.