Data Science

Data Labeling

The process of assigning meaningful tags or annotations to raw data so it can be used for supervised learning. Labels tell the model what the correct answer should be for each training example.

Why It Matters

Labeled data is the fuel for supervised learning. A model learns whatever patterns its labels encode, so the quality and consistency of those labels strongly affect model accuracy.
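One rough way to check label consistency is to have two annotators label the same items and count how often they agree. The sketch below is illustrative only: the function name `agreement_rate` and the sample labels are hypothetical, and raw agreement is just one simple measure among several.

```python
def agreement_rate(labels_a, labels_b):
    """Return the fraction of items on which two annotators gave the same label."""
    if len(labels_a) != len(labels_b):
        raise ValueError("Both annotators must label the same items")
    if not labels_a:
        raise ValueError("At least one labeled item is required")
    matches = sum(a == b for a, b in zip(labels_a, labels_b))
    return matches / len(labels_a)


# Hypothetical labels from two annotators for the same four emails.
annotator_a = ["spam", "not-spam", "spam", "spam"]
annotator_b = ["spam", "not-spam", "not-spam", "spam"]

rate = agreement_rate(annotator_a, annotator_b)
print(f"Agreement: {rate:.0%}")
```

A low agreement rate usually suggests that the labeling guidelines are ambiguous and should be tightened before more data is labeled.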

Example

Human annotators review thousands of images and draw bounding boxes around pedestrians for autonomous vehicle training, or mark emails as spam or not-spam.
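As a minimal sketch, labeled examples like these might be stored as follows. The class names, field names, file name, and coordinate convention are all assumptions made for illustration, not a standard annotation format.

```python
from dataclasses import dataclass


@dataclass
class BoundingBox:
    # Pixel coordinates of the box corners (assumed convention: origin at top-left).
    x_min: int
    y_min: int
    x_max: int
    y_max: int
    label: str


@dataclass
class LabeledEmail:
    text: str
    label: str  # "spam" or "not-spam"


# One image with a single annotated pedestrian (hypothetical file name and coordinates).
image_annotation = {
    "image": "frame_0001.jpg",
    "boxes": [BoundingBox(x_min=34, y_min=50, x_max=80, y_max=190, label="pedestrian")],
}

# Two labeled emails (made-up examples).
emails = [
    LabeledEmail("Win a free prize now!!!", "spam"),
    LabeledEmail("Meeting moved to 3pm", "not-spam"),
]

# Separate inputs (X) from labels (y), the usual shape for supervised learning.
X = [email.text for email in emails]
y = [email.label for email in emails]
```

Here `X` holds the raw inputs and `y` holds the answers the annotators supplied; a supervised model is trained to predict `y` from `X`.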

Think of it like...

Like a teacher creating an answer key for a test: students (models) need the correct answers to learn from their mistakes.

Related Terms