
Knowledge Layer

The Knowledge Layer is the search backbone of Portiere's concept mapping pipeline. It indexes standard vocabularies (SNOMED CT, LOINC, RxNorm, ICD-10, etc.) and retrieves the best candidate concepts for each source code during mapping. Portiere ships with nine backends that can be used independently or combined for maximum accuracy.


Table of Contents

  1. Backend Overview
  2. BM25s (Default)
  3. FAISS
  4. Elasticsearch
  5. ChromaDB
  6. PGVector
  7. MongoDB
  8. Qdrant
  9. Milvus
  10. Hybrid
  11. Embedding Models
  12. Cross-Encoder Reranking
  13. VocabularyBridge -- Cross-Vocabulary Mapping
  14. Configuration Reference
  15. Performance Comparison

Backend Overview

| Backend | Type | Dependencies | Offline | Best For |
|---|---|---|---|---|
| bm25s | Sparse (BM25) | None (pure Python) | Yes | Small-to-medium vocabularies (<1M) |
| faiss | Dense (vectors) | faiss-cpu/gpu, sentence-transformers | Yes | High-accuracy semantic search |
| elasticsearch | Sparse + fuzzy | Running ES cluster | No | Existing ES infrastructure |
| hybrid | Dense + Sparse | faiss + ES/BM25s | Depends | Maximum recall and precision |
| chromadb | Dense (vectors) | chromadb | Yes | Embedded vector DB, simple setup |
| pgvector | Dense (vectors) | psycopg, pgvector | No | Teams using PostgreSQL |
| mongodb | Dense (vectors) | pymongo | No | Teams using MongoDB Atlas |
| qdrant | Dense (vectors) | qdrant-client | Depends | High-perf vector search, production |
| milvus | Dense (vectors) | pymilvus | Depends | Billion-scale distributed vectors |

All backends are configured through the KnowledgeLayerConfig model and are interchangeable at runtime. Switching backends requires only a configuration change -- no code modifications.


BM25s (Default)

BM25s is a pure-Python BM25 implementation that requires zero external dependencies. It tokenizes concept names and descriptions, builds an inverted index, and scores candidates using the Okapi BM25 ranking function.
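
The ranking function itself fits in a few lines. The toy sketch below (hypothetical hand-tokenized corpus, standard k1/b defaults) illustrates the Okapi BM25 scoring that the BM25s backend applies at much larger scale -- it is not Portiere's actual implementation:

```python
import math

def bm25_scores(query_tokens, docs, k1=1.5, b=0.75):
    """Score each tokenized document against the query with Okapi BM25."""
    n = len(docs)
    avg_len = sum(len(d) for d in docs) / n
    # Inverse document frequency per query term
    idf = {}
    for t in set(query_tokens):
        df = sum(1 for d in docs if t in d)
        idf[t] = math.log((n - df + 0.5) / (df + 0.5) + 1)
    scores = []
    for d in docs:
        s = 0.0
        for t in query_tokens:
            tf = d.count(t)
            # Term-frequency saturation (k1) and length normalization (b)
            s += idf[t] * tf * (k1 + 1) / (tf + k1 * (1 - b + b * len(d) / avg_len))
        scores.append(s)
    return scores

docs = [
    ["type", "2", "diabetes", "mellitus"],
    ["diabetes", "insipidus"],
    ["essential", "hypertension"],
]
print(bm25_scores(["type", "2", "diabetes"], docs))
```

The first document matches all three query tokens and scores highest; the last shares no tokens and scores zero -- which is exactly the "token overlap only" behavior noted under Limitations below.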

When to Use

  • Getting started quickly with no infrastructure setup
  • Working offline or in air-gapped environments
  • Vocabulary size under 1 million concepts
  • Keyword-level matching is sufficient (exact or near-exact term overlap)

Configuration

from portiere.config import PortiereConfig, KnowledgeLayerConfig

config = PortiereConfig(
    knowledge_layer=KnowledgeLayerConfig(
        backend="bm25s",
        bm25s_corpus_path="/path/to/bm25s_corpus/",
    )
)

Building the index? See Building the Knowledge Layer for step-by-step instructions using build_knowledge_layer().

Limitations

  • No semantic understanding -- relies on token overlap
  • Performance degrades on paraphrased or abbreviated source terms
  • Not recommended for vocabularies exceeding 1 million concepts (memory and latency)

FAISS

FAISS (Facebook AI Similarity Search) provides dense vector search using sentence-transformer embeddings. Source terms and vocabulary concepts are encoded into high-dimensional vectors, and nearest-neighbor search retrieves the most semantically similar candidates.
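
Conceptually, retrieval reduces to similarity search over embedding vectors. The sketch below uses hand-made 3-dimensional vectors in place of real SapBERT embeddings and a linear scan in place of a FAISS index, purely to illustrate the mechanism:

```python
import math

def cosine(a, b):
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def nearest(query_vec, concept_vecs, top_k=2):
    """Rank concepts by cosine similarity to the query embedding."""
    ranked = sorted(concept_vecs.items(),
                    key=lambda kv: cosine(query_vec, kv[1]), reverse=True)
    return [name for name, _ in ranked[:top_k]]

# Toy 3-d "embeddings" -- real ones come from a sentence-transformer model
concepts = {
    "Type 2 diabetes mellitus": [0.9, 0.1, 0.0],
    "Essential hypertension": [0.1, 0.9, 0.0],
    "Asthma": [0.0, 0.1, 0.9],
}
print(nearest([0.8, 0.2, 0.1], concepts))
# -> ['Type 2 diabetes mellitus', 'Essential hypertension']
```

FAISS replaces the linear scan with approximate nearest-neighbor index structures, which is what keeps query latency low at millions of concepts.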

When to Use

  • High-accuracy semantic matching is required
  • Source data contains abbreviations, misspellings, or paraphrased terms
  • You need to capture synonymy and relatedness beyond keyword overlap

Dependencies

pip install portiere-health[faiss]
# or for GPU acceleration:
pip install portiere-health[faiss-gpu]

This installs faiss-cpu (or faiss-gpu) and sentence-transformers.

Configuration

from portiere.config import PortiereConfig, KnowledgeLayerConfig

config = PortiereConfig(
    knowledge_layer=KnowledgeLayerConfig(
        backend="faiss",
        faiss_index_path="/path/to/faiss.index",
        faiss_metadata_path="/path/to/faiss_metadata.json",
    )
)

Building the index? See Building the Knowledge Layer for step-by-step instructions using build_knowledge_layer().

GPU Acceleration

For large vocabularies (>1M concepts), GPU-accelerated FAISS significantly reduces both index build time and query latency:

import faiss

# Move an existing index to GPU
cpu_index = faiss.read_index("/path/to/faiss.index")
gpu_resource = faiss.StandardGpuResources()
gpu_index = faiss.index_cpu_to_gpu(gpu_resource, 0, cpu_index)

Elasticsearch

The Elasticsearch backend delegates search to an existing ES cluster. It supports keyword matching, fuzzy search, and custom analyzers for biomedical text.

When to Use

  • Your organization already operates an Elasticsearch cluster
  • You need fuzzy matching, stemming, or custom tokenization
  • You want to search across multiple vocabulary fields (name, synonyms, descriptions)
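
As a rough illustration of the kind of query this backend issues, here is a multi_match body with fuzziness enabled. The field names and boosts (concept_name^3, synonyms, description) are assumptions -- align them with your actual index mapping:

```python
import json

def fuzzy_concept_query(term, size=10):
    """Build an ES query body combining multi-field and fuzzy matching.

    Field names below are illustrative, not Portiere's fixed schema.
    """
    return {
        "size": size,
        "query": {
            "multi_match": {
                "query": term,
                "fields": ["concept_name^3", "synonyms", "description"],
                "fuzziness": "AUTO",  # tolerate small misspellings
            }
        },
    }

print(json.dumps(fuzzy_concept_query("diabetis mellitus"), indent=2))
```

With "fuzziness": "AUTO", the misspelled "diabetis" still matches "diabetes" within an edit distance scaled to term length.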

Dependencies

An accessible Elasticsearch cluster (version 7.x or 8.x) is required. No additional Python packages beyond the base SDK are needed.

Configuration

from portiere.config import PortiereConfig, KnowledgeLayerConfig

config = PortiereConfig(
    knowledge_layer=KnowledgeLayerConfig(
        backend="elasticsearch",
        elasticsearch_url="http://localhost:9200",
        elasticsearch_index="portiere_concepts",
    )
)

Setting up the index? See the Elasticsearch Backend guide for detailed index setup and custom analyzers.


ChromaDB

ChromaDB is an embedded vector database that stores embeddings locally with minimal setup. It handles embedding storage, indexing, and nearest-neighbor search in a single lightweight package, making it an excellent choice for local development and small-to-medium deployments.

When to Use

  • You want a simple embedded vector store with zero infrastructure
  • Local development and prototyping with persistent storage
  • Projects that need a self-contained vector database without running external services
  • Vocabulary size under 5 million concepts

Dependencies

pip install portiere-health[chromadb]

This installs the chromadb package.

Configuration

from portiere.config import PortiereConfig, KnowledgeLayerConfig

config = PortiereConfig(
    knowledge_layer=KnowledgeLayerConfig(
        backend="chromadb",
        chroma_collection="portiere_concepts",
        chroma_persist_path="./data/chroma/",
    )
)

Building the index? See Building the Knowledge Layer for step-by-step instructions.


PGVector

PGVector extends PostgreSQL with vector similarity search capabilities. If your team already runs PostgreSQL, PGVector lets you store and search embeddings alongside your relational data without introducing a separate vector database.

When to Use

  • Your team already uses PostgreSQL and wants to avoid adding another database
  • You want vectors and relational data in the same database for transactional consistency
  • Moderate-scale deployments (up to a few million concepts)

Dependencies

pip install portiere-health[pgvector]

This installs psycopg and pgvector. You must also install the pgvector extension in your PostgreSQL instance (CREATE EXTENSION vector;).
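
For orientation, a minimal schema and query might look like the following. The table layout and the 768-dimension column are assumptions (matching the default SapBERT embedding size); Portiere creates and manages its own schema when the index is built:

```python
# Hypothetical DDL -- adjust table name and vector dimension to your deployment
DDL = """
CREATE EXTENSION IF NOT EXISTS vector;
CREATE TABLE IF NOT EXISTS concept_embeddings (
    concept_id   BIGINT PRIMARY KEY,
    concept_name TEXT NOT NULL,
    embedding    vector(768)
);
"""

def knn_query(table="concept_embeddings", top_k=10):
    """Nearest-neighbor query using pgvector's cosine-distance operator <=>."""
    return (
        f"SELECT concept_id, concept_name, embedding <=> %s::vector AS distance "
        f"FROM {table} ORDER BY distance LIMIT {top_k};"
    )

print(knn_query())
```

pgvector also offers `<->` (L2 distance) and `<#>` (negative inner product); cosine distance is the usual choice for normalized sentence embeddings.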

Configuration

from portiere.config import PortiereConfig, KnowledgeLayerConfig

config = PortiereConfig(
    knowledge_layer=KnowledgeLayerConfig(
        backend="pgvector",
        pgvector_connection_string="postgresql://user:pass@localhost:5432/portiere",
        pgvector_table="concept_embeddings",
    )
)

Building the index? See Building the Knowledge Layer for step-by-step instructions.


MongoDB

The MongoDB backend uses MongoDB Atlas Vector Search to store and query embeddings. It is well-suited for teams already using MongoDB who want to add semantic search without introducing a separate vector database.
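
Under the hood, Atlas Vector Search queries are aggregation pipelines with a $vectorSearch stage. A hedged sketch -- the index name and embedding field path are assumptions and must match what is configured on your collection:

```python
def vector_search_pipeline(query_vector, index="vector_index", top_k=10):
    """Aggregation pipeline for Atlas Vector Search (illustrative shape)."""
    return [
        {
            "$vectorSearch": {
                "index": index,
                "path": "embedding",
                "queryVector": query_vector,
                "numCandidates": top_k * 10,  # oversample for better recall
                "limit": top_k,
            }
        },
        {"$project": {"concept_id": 1, "concept_name": 1,
                      "score": {"$meta": "vectorSearchScore"}}},
    ]

# Executed with something like:
#   db.concept_embeddings.aggregate(vector_search_pipeline(vec))
```

Setting numCandidates well above limit trades a little latency for noticeably better approximate-search recall.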

When to Use

  • Your team already uses MongoDB Atlas
  • You want to combine document storage with vector search in a single platform
  • You need flexible schema alongside vector queries

Dependencies

pip install portiere-health[mongodb]

This installs the pymongo package. You must have a MongoDB Atlas cluster with Vector Search enabled, or a self-hosted MongoDB 7.0+ instance with Atlas Search configured.

Configuration

from portiere.config import PortiereConfig, KnowledgeLayerConfig

config = PortiereConfig(
    knowledge_layer=KnowledgeLayerConfig(
        backend="mongodb",
        mongodb_connection_string="mongodb+srv://user:pass@cluster.mongodb.net/",
        mongodb_database="portiere",
        mongodb_collection="concept_embeddings",
    )
)

Building the index? See Building the Knowledge Layer for step-by-step instructions.


Qdrant

Qdrant is a high-performance vector search engine built for production workloads. It supports filtering, payload indexing, and horizontal scaling, making it a strong choice for large-scale production deployments.

When to Use

  • Production deployments requiring high throughput and low latency
  • You need advanced filtering (e.g., search within a specific vocabulary or domain)
  • Large vocabularies (millions of concepts) where performance matters
  • You want a managed cloud option (Qdrant Cloud) or self-hosted flexibility
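
Filtered search in Qdrant attaches a payload filter to the query. The sketch below builds a request body in the REST API shape; the vocabulary_id payload key is an assumption and must match the payload stored when the collection was built:

```python
def filtered_search_request(query_vector, vocabulary=None, top_k=10):
    """Qdrant search request body (REST shape) with an optional payload filter."""
    body = {"vector": query_vector, "limit": top_k, "with_payload": True}
    if vocabulary:
        # Restrict candidates to a single vocabulary before ranking
        body["filter"] = {
            "must": [{"key": "vocabulary_id", "match": {"value": vocabulary}}]
        }
    return body

# POST this to /collections/portiere_concepts/points/search
print(filtered_search_request([0.1, 0.2], vocabulary="SNOMED"))
```

Because Qdrant applies the filter during index traversal rather than post-filtering results, restricting to one vocabulary does not starve the candidate list.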

Dependencies

pip install portiere-health[qdrant]

This installs the qdrant-client package.

Configuration

from portiere.config import PortiereConfig, KnowledgeLayerConfig

config = PortiereConfig(
    knowledge_layer=KnowledgeLayerConfig(
        backend="qdrant",
        qdrant_url="http://localhost:6333",
        qdrant_collection="portiere_concepts",
        qdrant_api_key=None,  # set for Qdrant Cloud
    )
)

Building the index? See Building the Knowledge Layer for step-by-step instructions.


Milvus

Milvus is a distributed vector database designed for billion-scale similarity search. It offers GPU acceleration, horizontal scaling, and is well-suited for the largest vocabulary deployments.

When to Use

  • Billion-scale vocabularies requiring distributed indexing
  • GPU-accelerated vector search is needed
  • Horizontal scaling across multiple nodes is required
  • You need a battle-tested open-source vector database for massive datasets

Dependencies

pip install portiere-health[milvus]

This installs the pymilvus package.

Configuration

from portiere.config import PortiereConfig, KnowledgeLayerConfig

config = PortiereConfig(
    knowledge_layer=KnowledgeLayerConfig(
        backend="milvus",
        milvus_uri="http://localhost:19530",
        milvus_collection="portiere_concepts",
    )
)

Building the index? See Building the Knowledge Layer for step-by-step instructions.


Hybrid

The Hybrid backend combines multiple search backends and fuses the results using Reciprocal Rank Fusion (RRF) or weighted score combination. This approach consistently produces the highest recall and precision across diverse source data quality levels.

The hybrid_backends configuration field lets you explicitly specify which backends to combine. Any combination of backends is supported -- you are not limited to the traditional dense + sparse pairing.

When to Use

  • Maximum mapping accuracy is the priority
  • Source data quality varies (mix of clean terms, abbreviations, and misspellings)
  • You can afford the additional infrastructure and compute cost

How RRF Works

Reciprocal Rank Fusion merges two or more ranked lists without requiring score normalization. For each candidate concept appearing in any result list, the fused score is:

RRF_score(c) = sum(1 / (k + rank_i(c))) for each backend i

where k is a smoothing constant (default 60). Candidates that rank highly in multiple lists receive the highest fused scores, while candidates appearing in only one list are still considered but ranked lower.
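
The formula above is only a few lines of code. A minimal, self-contained RRF implementation over ranked lists of concept IDs:

```python
def rrf_fuse(ranked_lists, k=60):
    """Fuse ranked candidate lists with Reciprocal Rank Fusion."""
    scores = {}
    for ranking in ranked_lists:
        for rank, concept_id in enumerate(ranking, start=1):
            # Each appearance contributes 1 / (k + rank)
            scores[concept_id] = scores.get(concept_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

dense  = ["C1", "C2", "C3"]   # e.g., FAISS results
sparse = ["C1", "C4", "C2"]   # e.g., BM25s results
print(rrf_fuse([dense, sparse]))
# -> ['C1', 'C2', 'C4', 'C3']  (C1 ranks first: top of both lists)
```

Note that no score normalization is needed -- only ranks are used, which is why RRF combines BM25 scores and cosine similarities cleanly.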

Configuration

Use the hybrid_backends field to specify which backends to combine. Each sub-backend's configuration fields must also be provided.

Example: BM25s + ChromaDB (lightweight, offline)

from portiere.config import PortiereConfig, KnowledgeLayerConfig

config = PortiereConfig(
    knowledge_layer=KnowledgeLayerConfig(
        backend="hybrid",
        hybrid_backends=["bm25s", "chromadb"],
        bm25s_corpus_path="./vocab/concepts.json",
        chroma_persist_path="./vocab/chroma/",
    )
)

Example: FAISS + Elasticsearch (classic dense + sparse)

config = PortiereConfig(
    knowledge_layer=KnowledgeLayerConfig(
        backend="hybrid",
        hybrid_backends=["faiss", "elasticsearch"],
        faiss_index_path="/path/to/faiss.index",
        faiss_metadata_path="/path/to/faiss_metadata.json",
        elasticsearch_url="http://localhost:9200",
        elasticsearch_index="portiere_concepts",
        fusion_method="rrf",
        rrf_k=60,
    )
)

Example: Qdrant + BM25s (production vector + keyword)

config = PortiereConfig(
    knowledge_layer=KnowledgeLayerConfig(
        backend="hybrid",
        hybrid_backends=["qdrant", "bm25s"],
        qdrant_url="http://localhost:6333",
        qdrant_collection="portiere_concepts",
        bm25s_corpus_path="./vocab/concepts.json",
        fusion_method="rrf",
        rrf_k=60,
    )
)

Example: Three-way fusion

config = PortiereConfig(
    knowledge_layer=KnowledgeLayerConfig(
        backend="hybrid",
        hybrid_backends=["faiss", "bm25s", "chromadb"],
        faiss_index_path="/path/to/faiss.index",
        faiss_metadata_path="/path/to/faiss_metadata.json",
        bm25s_corpus_path="./vocab/concepts.json",
        chroma_persist_path="./vocab/chroma/",
        fusion_method="rrf",
        rrf_k=60,
    )
)

Weighted Fusion Alternative

If you prefer explicit control over the contribution of each search modality:

config = PortiereConfig(
    knowledge_layer=KnowledgeLayerConfig(
        backend="hybrid",
        hybrid_backends=["faiss", "bm25s"],
        faiss_index_path="/path/to/faiss.index",
        faiss_metadata_path="/path/to/faiss_metadata.json",
        bm25s_corpus_path="/path/to/bm25s_corpus/",
        fusion_method="weighted",
        # Weights are set as extra fields (KnowledgeLayerConfig allows extra)
    )
)
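
Since the weight fields themselves are configuration-specific, here is a generic sketch of what weighted fusion computes: each backend's raw scores are min-max normalized (BM25 scores and cosine similarities live on different scales), then combined with per-backend weights. Portiere's exact normalization may differ:

```python
def weighted_fuse(score_lists, weights):
    """Min-max normalize each backend's scores, then combine with weights.

    score_lists: {backend: {concept_id: raw_score}}
    weights:     {backend: float}
    """
    fused = {}
    for backend, scores in score_lists.items():
        lo, hi = min(scores.values()), max(scores.values())
        span = (hi - lo) or 1.0  # guard against identical scores
        for cid, s in scores.items():
            fused[cid] = fused.get(cid, 0.0) + weights[backend] * (s - lo) / span
    return sorted(fused, key=fused.get, reverse=True)

ranked = weighted_fuse(
    {"faiss": {"C1": 0.91, "C2": 0.80}, "bm25s": {"C2": 14.0, "C3": 9.0}},
    weights={"faiss": 0.7, "bm25s": 0.3},
)
print(ranked)
```

Unlike RRF, weighted fusion lets you bias the result toward one modality, e.g. favoring dense retrieval when source terms are known to be paraphrased.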

Building hybrid indexes? See Building the Knowledge Layer for programmatic examples using build_knowledge_layer() with hybrid_backends.


Embedding Models

The embedding model determines how concept names and source terms are encoded into vectors for FAISS and hybrid search. The default model is optimized for biomedical text.

Default: SapBERT

cambridgeltl/SapBERT-from-PubMedBERT-fulltext

SapBERT is a self-alignment pre-trained model built on PubMedBERT. It is specifically designed for biomedical entity linking and produces embeddings that cluster synonymous medical terms closely together in vector space.

Multi-Provider Embedding Support

Portiere supports multiple embedding providers via the EmbeddingConfig:

| Provider | Description | Use Case |
|---|---|---|
| huggingface | Local sentence-transformers (default) | Privacy-first, no API calls |
| ollama | Local Ollama server | Self-hosted models, no cloud |
| openai | OpenAI or OpenAI-compatible endpoints | vLLM, LiteLLM, Together AI |
| bedrock | AWS Bedrock (Amazon Titan, Cohere Embed) | AWS-native, data stays in AWS |

Note: For fully managed inference (no local models), just provide an api_key — Portiere infers cloud pipeline mode and the server handles embedding/reranking.

from portiere.config import PortiereConfig, EmbeddingConfig, KnowledgeLayerConfig

# HuggingFace (default, local)
config = PortiereConfig(
    embedding=EmbeddingConfig(
        provider="huggingface",
        model="cambridgeltl/SapBERT-from-PubMedBERT-fulltext",
    ),
    knowledge_layer=KnowledgeLayerConfig(backend="faiss", ...),
)

# Ollama (local server)
config = PortiereConfig(
    embedding=EmbeddingConfig(
        provider="ollama",
        model="nomic-embed-text",
        endpoint="http://localhost:11434",
    ),
    knowledge_layer=KnowledgeLayerConfig(backend="faiss", ...),
)

# OpenAI / OpenAI-compatible
config = PortiereConfig(
    embedding=EmbeddingConfig(
        provider="openai",
        model="text-embedding-3-small",
        api_key="sk-...",
    ),
    knowledge_layer=KnowledgeLayerConfig(backend="faiss", ...),
)

# Legacy string field (still supported, uses huggingface provider)
config = PortiereConfig(
    embedding_model="your-org/custom-medical-embedder",
    knowledge_layer=KnowledgeLayerConfig(backend="faiss", ...),
)

Model Selection Guidelines

| Model | Provider | Domain | Dimensions | Notes |
|---|---|---|---|---|
| cambridgeltl/SapBERT-from-PubMedBERT-fulltext | huggingface | Biomedical | 768 | Default, best for clinical/medical terms |
| all-MiniLM-L6-v2 | huggingface | General | 384 | Lightweight, fast, good for non-medical |
| BAAI/bge-base-en-v1.5 | huggingface | General | 768 | Strong general-purpose embeddings |
| nomic-embed-text | ollama | General | 768 | Good local alternative via Ollama |
| text-embedding-3-small | openai | General | 1536 | OpenAI cloud, high quality |
| text-embedding-3-large | openai | General | 3072 | OpenAI cloud, highest quality |

When switching embedding models, you must rebuild the FAISS index since vector dimensions and the embedding space will differ between models.


Cross-Encoder Reranking

After initial retrieval (from any backend), Portiere optionally applies a cross-encoder model to rerank the top-N candidates. Cross-encoders process the (source_term, candidate_name) pair jointly, producing more accurate relevance scores than bi-encoder similarity alone.

Reranker Providers

| Provider | Description | Default Model |
|---|---|---|
| huggingface | Local cross-encoder (default) | cross-encoder/ms-marco-MiniLM-L-6-v2 |
| none | Disable reranking | -- |

from portiere.config import PortiereConfig, RerankerConfig

# Local HuggingFace reranker (default)
config = PortiereConfig(
    reranker=RerankerConfig(provider="huggingface", model="cross-encoder/ms-marco-MiniLM-L-6-v2"),
)

# Disable reranking
config = PortiereConfig(
    reranker=RerankerConfig(provider="none"),
)

# Portiere Cloud reranker
config = PortiereConfig(
    api_key="pt_sk_...",  # auto-selects portiere provider for all models
)

For improved biomedical accuracy, consider:

GanjinZero/coder_eng_pp

How Reranking Fits the Pipeline

  1. The selected backend retrieves the top-K candidates (e.g., K=50).
  2. The cross-encoder scores each (source_term, candidate) pair.
  3. Candidates are re-sorted by cross-encoder score.
  4. The top candidates (e.g., top 5) are returned to the mapping pipeline.

This two-stage retrieval + reranking approach balances recall (broad initial retrieval) with precision (accurate reranking).
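
The four steps above can be sketched end-to-end. Both callables here are hypothetical stand-ins -- retrieve for the backend search and cross_encode for the cross-encoder model:

```python
def two_stage_search(term, retrieve, cross_encode, k=50, top_n=5):
    """Broad retrieval (recall) followed by cross-encoder reranking (precision)."""
    candidates = retrieve(term, k)                              # stage 1
    scored = [(cross_encode(term, c), c) for c in candidates]   # stage 2
    scored.sort(reverse=True)
    return [c for _, c in scored[:top_n]]

# Toy stand-ins: retrieval returns fixed candidates; the "cross-encoder"
# scores by word overlap with the source term.
retrieve = lambda term, k: [
    "diabetes mellitus type 2", "diabetes insipidus", "gestational diabetes"
][:k]
overlap = lambda a, b: len(set(a.split()) & set(b.split()))

print(two_stage_search("type 2 diabetes", retrieve, overlap, top_n=2))
```

In the real pipeline the cross-encoder is far more expensive than word overlap, which is why it is applied only to the K candidates that survive stage 1.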


VocabularyBridge -- Cross-Vocabulary Mapping

The VocabularyBridge uses OHDSI Athena's CONCEPT_RELATIONSHIP.csv to map concepts between vocabularies (e.g., OMOP concept IDs to SNOMED codes, ICD-10 to SNOMED, SNOMED to LOINC). This is distinct from the search backends above -- VocabularyBridge uses known vocabulary relationships rather than similarity search.

When to Use

  • Converting OMOP concept IDs to standard codes in another vocabulary
  • Building crosswalk tables between two vocabularies (e.g., ICD10CM to SNOMED)
  • Cross-standard mapping (e.g., OMOP to FHIR) where concept IDs need vocabulary translation
  • Generating FHIR CodeableConcept or openEHR DV_CODED_TEXT structures from OMOP concept IDs

Setup

VocabularyBridge requires an Athena download directory containing CONCEPT.csv and CONCEPT_RELATIONSHIP.csv. See Vocabulary Setup for download instructions.

from portiere.knowledge import VocabularyBridge

bridge = VocabularyBridge(
    athena_path="./data/athena/",
    vocabularies=["SNOMED", "LOINC", "RxNorm", "ICD10CM"],  # optional filter
)

Concept Lookup

concept = bridge.get_concept(201826)
# {
#     "concept_id": 201826,
#     "concept_name": "Type 2 diabetes mellitus",
#     "vocabulary_id": "SNOMED",
#     "domain_id": "Condition",
#     "concept_class_id": "Clinical Finding",
#     "standard_concept": "S",
#     "concept_code": "44054006",
# }

Cross-Vocabulary Mapping

# Map an OMOP concept to SNOMED equivalents
results = bridge.map_concept(4329847, target_vocabulary="SNOMED")
# [{"concept_id": 4329847, "concept_name": "Blood pressure", "vocabulary_id": "SNOMED", ...}]

Building a Crosswalk

# Build a full ICD10CM → SNOMED crosswalk
crosswalk = bridge.get_crosswalk("ICD10CM", "SNOMED")
# [
#     {"source_concept_id": ..., "source_concept_name": "...", "target_concept_id": ..., ...},
#     ...
# ]

FHIR and OpenEHR Output Formats

# Convert concept to FHIR CodeableConcept
fhir_cc = bridge.concept_to_codeable_concept(201826)
# {"coding": [{"system": "http://snomed.info/sct", "code": "44054006", "display": "Type 2 diabetes mellitus"}], "text": "Type 2 diabetes mellitus"}

# Convert concept to openEHR DV_CODED_TEXT
ehr_ct = bridge.concept_to_dv_coded_text(201826)
# {"_type": "DV_CODED_TEXT", "value": "Type 2 diabetes mellitus", "defining_code": {"terminology_id": {"value": "SNOMED CT"}, "code_string": "44054006"}}

Relationship Types

VocabularyBridge indexes these relationship types from CONCEPT_RELATIONSHIP.csv:

| Relationship | Description | Used For |
|---|---|---|
| Maps to | Equivalence mapping | Default cross-vocabulary mapping |
| Mapped from | Reverse equivalence | Reverse lookups |
| Is a | Hierarchical parent | Hierarchy navigation |
| Subsumes | Hierarchical child | Hierarchy navigation |

By default, map_concept() uses only equivalence relationships (Maps to, Mapped from). Pass relationship_types to include hierarchical relationships:

results = bridge.map_concept(
    201826,
    target_vocabulary="SNOMED",
    relationship_types={"Maps to", "Mapped from", "Is a"},
)

Integration with Cross-Standard Mapping

VocabularyBridge is automatically used by the vocabulary_lookup transform type in crossmap YAML definitions. When a cross-standard mapping references a vocabulary_lookup transform, the CrossStandardMapper delegates to VocabularyBridge for concept translation.

See Cross-Standard Mapping for details.

Statistics

stats = bridge.stats()
# {"concepts": 450000, "forward_relationships": 1200000, "reverse_relationships": 1200000, "vocabularies": ["ICD10CM", "LOINC", "RxNorm", "SNOMED"]}

Configuration Reference

The complete KnowledgeLayerConfig model:

class KnowledgeLayerConfig(BaseModel):
    backend: Literal["faiss", "elasticsearch", "bm25s", "hybrid", "chromadb", "pgvector", "mongodb", "qdrant", "milvus"] = "bm25s"
    faiss_index_path: Optional[Path] = None
    faiss_metadata_path: Optional[Path] = None
    elasticsearch_url: Optional[str] = None
    elasticsearch_index: str = "portiere_concepts"
    bm25s_corpus_path: Optional[Path] = None
    chroma_collection: str = "portiere_concepts"
    chroma_persist_path: Optional[Path] = None
    pgvector_connection_string: Optional[str] = None
    pgvector_table: str = "concept_embeddings"
    mongodb_connection_string: Optional[str] = None
    mongodb_database: str = "portiere"
    mongodb_collection: str = "concept_embeddings"
    qdrant_url: Optional[str] = None
    qdrant_collection: str = "portiere_concepts"
    qdrant_api_key: Optional[str] = None
    milvus_uri: Optional[str] = None
    milvus_collection: str = "portiere_concepts"
    hybrid_backends: Optional[List[str]] = None
    fusion_method: Literal["rrf", "weighted"] = "rrf"
    rrf_k: int = 60

Field Reference

| Field | Type | Default | Description |
|---|---|---|---|
| backend | str | "bm25s" | Search backend: "bm25s", "faiss", "elasticsearch", "hybrid", "chromadb", "pgvector", "mongodb", "qdrant", "milvus" |
| faiss_index_path | Path | None | Path to the FAISS .index file |
| faiss_metadata_path | Path | None | Path to the FAISS metadata JSON file |
| elasticsearch_url | str | None | Elasticsearch cluster URL |
| elasticsearch_index | str | "portiere_concepts" | Name of the ES index |
| bm25s_corpus_path | Path | None | Path to the BM25s corpus directory |
| chroma_collection | str | "portiere_concepts" | ChromaDB collection name |
| chroma_persist_path | Path | None | Path to ChromaDB persistence directory |
| pgvector_connection_string | str | None | PostgreSQL connection string for pgvector |
| pgvector_table | str | "concept_embeddings" | Table name for pgvector embeddings |
| mongodb_connection_string | str | None | MongoDB Atlas connection string |
| mongodb_database | str | "portiere" | MongoDB database name |
| mongodb_collection | str | "concept_embeddings" | MongoDB collection name for embeddings |
| qdrant_url | str | None | Qdrant server URL |
| qdrant_collection | str | "portiere_concepts" | Qdrant collection name |
| qdrant_api_key | str | None | API key for Qdrant Cloud (optional) |
| milvus_uri | str | None | Milvus server URI |
| milvus_collection | str | "portiere_concepts" | Milvus collection name |
| hybrid_backends | List[str] | None | List of backend names to combine in hybrid mode (e.g., ["bm25s", "chromadb"]) |
| fusion_method | str | "rrf" | Fusion method for hybrid: "rrf" or "weighted" |
| rrf_k | int | 60 | RRF smoothing constant (higher = more conservative fusion) |

Performance Comparison

Benchmarked on a standard vocabulary of 500K concepts with 1,000 source terms of varying quality:

| Backend | Recall@10 | Precision@1 | P95 Latency (ms) | Index Size | Memory |
|---|---|---|---|---|---|
| BM25s | 0.72 | 0.58 | 12 | 180 MB | 400 MB |
| FAISS (SapBERT) | 0.86 | 0.71 | 45 | 1.5 GB | 2.2 GB |
| Elasticsearch | 0.75 | 0.62 | 25 | 220 MB | N/A (cluster) |
| Hybrid (FAISS + BM25s, RRF) | 0.92 | 0.79 | 55 | 1.7 GB | 2.6 GB |
| Hybrid + Reranking | 0.92 | 0.85 | 120 | 1.7 GB | 3.0 GB |

Key observations:

  • BM25s is the fastest and lightest option. It works well when source terms closely match vocabulary names but struggles with abbreviations and paraphrases.
  • FAISS provides a substantial accuracy improvement through semantic matching. The trade-off is higher memory usage and index build time.
  • Elasticsearch performs comparably to BM25s with the added benefit of fuzzy matching and infrastructure scalability.
  • Hybrid consistently achieves the highest recall by combining the strengths of both dense and sparse search.
  • Reranking adds latency but significantly boosts Precision@1, which directly improves the auto-acceptance rate in the concept mapping pipeline.

Choosing a Backend

  • Just getting started? Use bm25s (default). No setup required.
  • Need better accuracy? Switch to faiss with SapBERT embeddings.
  • Already have Elasticsearch? Use elasticsearch to leverage existing infrastructure.
  • Already have PostgreSQL? Use pgvector to keep vectors alongside your relational data.
  • Want a simple embedded vector DB? Use chromadb for zero-infrastructure vector search.
  • Need production-grade vector search? Use qdrant for high-performance deployments.
  • Billion-scale vocabularies? Use milvus for distributed vector search.
  • Using MongoDB Atlas? Use mongodb to add vector search to your existing cluster.
  • Maximum accuracy? Use hybrid with reranking.

See Also