XGBoost
Extreme Gradient Boosting — an optimized implementation of gradient boosting that is fast, accurate, and historically among the most successful algorithms in machine learning competitions on tabular data.
Why It Matters
XGBoost is a go-to default for structured/tabular data problems and frequently matches or outperforms other methods on typical business datasets.
Example
A fintech company uses XGBoost to predict loan defaults, leveraging its built-in handling of missing values and its feature importance rankings.
Think of it like...
Like a Swiss watch of ML algorithms — precision-engineered, reliable, and consistently high-performing at its intended purpose.
Related Terms
Gradient Boosting
An ensemble technique that builds models sequentially, where each new model focuses on correcting the errors made by previous models. It combines many weak learners into a single strong learner.
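The sequential error-correcting idea can be shown in a toy, pure-Python sketch: each round fits a one-split "stump" to the residuals of the ensemble so far (least-squares regression boosting). This is an illustration of the principle, not how production libraries implement it.

```python
def fit_stump(xs, residuals):
    """Find the threshold split that best predicts the residuals."""
    best = None
    for t in xs:
        left = [r for x, r in zip(xs, residuals) if x <= t]
        right = [r for x, r in zip(xs, residuals) if x > t]
        if not left or not right:
            continue
        lmean, rmean = sum(left) / len(left), sum(right) / len(right)
        sse = (sum((r - lmean) ** 2 for r in left)
               + sum((r - rmean) ** 2 for r in right))
        if best is None or sse < best[0]:
            best = (sse, t, lmean, rmean)
    _, t, lmean, rmean = best
    return lambda x: lmean if x <= t else rmean

def gradient_boost(xs, ys, rounds=20, lr=0.5):
    base = sum(ys) / len(ys)                # start from the mean
    stumps, preds = [], [base] * len(xs)
    for _ in range(rounds):
        # Each new stump targets the current errors of the ensemble.
        residuals = [y - p for y, p in zip(ys, preds)]
        stump = fit_stump(xs, residuals)
        stumps.append(stump)
        preds = [p + lr * stump(x) for p, x in zip(preds, xs)]
    return lambda x: base + sum(lr * s(x) for s in stumps)

xs = [1, 2, 3, 4, 5, 6, 7, 8]
ys = [1, 1, 1, 1, 5, 5, 5, 5]               # a step function
model = gradient_boost(xs, ys)
```

Each individual stump is a weak learner, but the boosted sum recovers the step function almost exactly.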
LightGBM
Light Gradient Boosting Machine — Microsoft's gradient boosting framework optimized for speed and efficiency. LightGBM uses histogram-based splitting and leaf-wise growth for faster training.
CatBoost
A gradient boosting library by Yandex that handles categorical features natively without requiring manual encoding. CatBoost also addresses prediction shift and target leakage.
Random Forest
An ensemble learning method that builds multiple decision trees during training and outputs the majority vote (classification) or average prediction (regression) of all the trees. The 'forest' of diverse trees is more robust than any single tree.
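A toy sketch of the "forest" idea: several depth-one trees (stumps), each trained on a bootstrap sample of the data, vote on the class label. Real random forests also subsample features and grow deeper trees; this minimal version just shows bagging plus majority vote.

```python
import random
from collections import Counter

def fit_stump(sample):
    """Pick the threshold that best separates the two classes."""
    best = None
    for t, _ in sample:
        correct = sum(1 for x, label in sample
                      if (1 if x > t else 0) == label)
        acc = max(correct, len(sample) - correct) / len(sample)
        flip = correct < len(sample) - correct   # invert rule if needed
        if best is None or acc > best[0]:
            best = (acc, t, flip)
    _, t, flip = best
    return lambda x: (0 if x > t else 1) if flip else (1 if x > t else 0)

def random_forest(data, n_trees=25, seed=0):
    rng = random.Random(seed)
    trees = [fit_stump(rng.choices(data, k=len(data)))  # bootstrap sample
             for _ in range(n_trees)]
    def predict(x):                                     # majority vote
        votes = Counter(tree(x) for tree in trees)
        return votes.most_common(1)[0][0]
    return predict

data = [(x, 0) for x in [1, 2, 3, 4]] + [(x, 1) for x in [6, 7, 8, 9]]
forest = random_forest(data)
```

An occasional bootstrap sample produces a poor stump, but the majority vote of the forest absorbs those mistakes.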
Ensemble Learning
A strategy that combines multiple models to produce better predictions than any single model alone. Ensemble methods leverage the diversity of different models to reduce errors.
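A minimal numeric illustration of why combining models reduces error, under the assumption that the models' errors are independent: three noisy "models" estimate the same target, and the average of their predictions lands closer to the truth than a single model does, on average.

```python
import random
import statistics

random.seed(42)
truth = 10.0
trials = 2000
ind_errs, ens_errs = [], []
for _ in range(trials):
    # Three independent noisy predictions of the same target.
    preds = [truth + random.gauss(0, 2) for _ in range(3)]
    ind_errs.append(abs(preds[0] - truth))                 # one model alone
    ens_errs.append(abs(statistics.mean(preds) - truth))   # the ensemble
print(round(statistics.mean(ind_errs), 2),
      round(statistics.mean(ens_errs), 2))
```

The gain comes from the diversity assumption: averaging three perfectly correlated models would change nothing.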