Instructor Embedding
An embedding approach where you provide instructions that describe the task alongside the text, producing task-specific embeddings from a single model.
Why It Matters
A single instructor model tailors its output to the task at hand — the same text gets different embeddings for search than for classification, without training a separate model per task.
Example
Embedding 'Apple released new products' with instruction 'Represent this for finding technology news' produces a different vector than 'Represent this for finding fruit-related content.'
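The example above can be sketched in code. The toy `toy_embed` function below is a hypothetical stand-in for an instruction-conditioned model (with the real open-source INSTRUCTOR model, via the `InstructorEmbedding` package, the call is roughly `model.encode([[instruction, text]])`); the point it illustrates is that the instruction is part of the model's input, so changing it changes the vector for the same text.

```python
import hashlib
import math

def toy_embed(instruction: str, text: str, dim: int = 8) -> list[float]:
    """Hypothetical stand-in for an instruction-conditioned embedding model:
    the instruction is part of the input, so the same text maps to a
    different unit vector under a different instruction."""
    vec = []
    for i in range(dim):
        h = hashlib.sha256(f"{instruction}|{text}|{i}".encode()).digest()
        vec.append(int.from_bytes(h[:4], "big") / 2**32 - 0.5)
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec]

def cosine(a: list[float], b: list[float]) -> float:
    # Both vectors are unit-length, so the dot product is cosine similarity.
    return sum(x * y for x, y in zip(a, b))

text = "Apple released new products"
v_tech = toy_embed("Represent this for finding technology news", text)
v_fruit = toy_embed("Represent this for finding fruit-related content", text)

# Same text, different instructions -> different vectors.
print(v_tech != v_fruit)          # True
print(round(cosine(v_tech, v_tech), 6))  # 1.0 (identical input, identical vector)
```

A real model would place the two vectors in meaningfully different neighborhoods (near tech news versus near fruit content); the toy version only shows the mechanics of instruction-conditioned encoding.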
Think of it like...
Like a translator who adjusts their translation based on the audience — technical language for engineers, simple language for consumers, from the same source text.
Related Terms
Embedding
A numerical representation of data (text, images, etc.) as a vector of numbers in a high-dimensional space. Similar items are placed closer together in this space, enabling machines to understand semantic relationships.
Embedding Model
A specialized model designed to convert text, images, or other data into vector embeddings. Embedding models are optimized for producing meaningful numerical representations rather than generating text.
Semantic Search
Search that understands the meaning and intent behind a query rather than just matching keywords. It uses embeddings to find results that are conceptually related even if they use different words.
Bi-Encoder
A model that independently encodes two texts into separate vectors, then compares them using a similarity metric like cosine similarity. Bi-encoders are fast because vectors can be pre-computed.
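The bi-encoder pattern can be sketched with a toy encoder. Here a bag-of-words `Counter` stands in for a neural sentence encoder (an assumption for the sake of a self-contained snippet); the pattern is the same: corpus vectors are computed once up front, and each query is encoded independently and compared by cosine similarity.

```python
from collections import Counter
import math

def encode(text: str) -> Counter:
    """Toy stand-in for a bi-encoder: a bag-of-words vector. A real
    bi-encoder would produce a dense neural vector, but each text is
    still encoded independently of the other."""
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[w] * b[w] for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

# Corpus vectors can be pre-computed once and cached...
corpus = ["the cat sat on the mat", "stock prices rose sharply"]
corpus_vecs = [encode(doc) for doc in corpus]

# ...then each query is encoded on its own and compared against them.
query_vec = encode("a cat on a mat")
scores = [cosine(query_vec, dv) for dv in corpus_vecs]
best = corpus[scores.index(max(scores))]
print(best)  # the cat-related document scores higher
```

Pre-computing the corpus side is what makes bi-encoders fast at query time: only the query needs to be encoded per request, and the comparison is a cheap vector operation.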