| all-MiniLM-L6-v2 | ~22M | 384 | ~60-62 | Lightweight, fast, general-purpose | Less nuanced semantics |
| all-mpnet-base-v2 | ~110M | 768 | ~65 | Strong performance, balanced | Slower than smaller models |
| text-embedding-3-small | N/A (API-based) | 1536 | ~65-67 | Excellent semantics, easy to use | API cost, not local |
| text-embedding-3-large | N/A (API-based) | 3072 | ~67-70 | Top-tier accuracy | Higher cost/latency |
| intfloat/e5-small-v2 | ~33M | 384 | ~62 | Fast, retrieval-optimized | Slightly less general |
| intfloat/e5-large-v2 | ~335M | 1024 | ~66 | Strong semantics (English-only; multilingual-e5 is a separate model family) | Larger, slower |
| BAAI/bge-small-en-v1.5 | ~33M | 384 | ~63 | Competitive, fast | English-focused |
| BAAI/bge-large-en-v1.5 | ~335M | 1024 | ~66-67 | Near SOTA, great for RAG | Resource-intensive |
| thenlper/gte-small | ~33M | 384 | ~62 | Versatile, solid performance | Less specialized |
| thenlper/gte-large | ~335M | 1024 | ~65-66 | High accuracy, good for RAG | Slower inference |
| facebook/dpr-ctx_encoder | ~110M | 768 | ~60-62 | Tailored for retrieval | Older, outperformed by newer models |
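
The dimension column also determines how much space a vector index needs, which can matter as much as accuracy when choosing a model. As a rough sketch, the snippet below estimates the raw size of an index for a few of the dimensions listed above. It assumes uncompressed float32 storage and a hypothetical corpus of one million vectors; the helper name `index_size_mib` is illustrative, and real vector stores add index overhead or apply quantization, so treat the numbers as a lower-bound estimate rather than a measurement.

```python
# Embedding dimensions copied from the table above.
DIMENSIONS = {
    "all-MiniLM-L6-v2": 384,
    "all-mpnet-base-v2": 768,
    "intfloat/e5-large-v2": 1024,
    "text-embedding-3-small": 1536,
    "text-embedding-3-large": 3072,
}

BYTES_PER_FLOAT32 = 4  # assumption: vectors are stored as uncompressed float32


def index_size_mib(dim: int, num_vectors: int) -> float:
    """Return the raw size in MiB of num_vectors float32 vectors of width dim."""
    return dim * BYTES_PER_FLOAT32 * num_vectors / (1024 ** 2)


if __name__ == "__main__":
    num_vectors = 1_000_000  # hypothetical corpus size
    for name, dim in DIMENSIONS.items():
        size = index_size_mib(dim, num_vectors)
        print(f"{name:26s} {dim:5d} dims  {size:9.1f} MiB")
```

Under these assumptions, storage grows linearly with dimension, so a 3072-dimensional model needs eight times the raw space of a 384-dimensional one for the same corpus.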