Hybrid Search
engrava’s search_hybrid() combines up to five scoring signals
into a single ranked result list.
Signal Model
| # | Signal | Weight key | Default | Source |
|---|---|---|---|---|
| 1 | FTS5 keyword | default_fts_weight | 0.30 | BM25 full-text score (min-max normalized) |
| 2 | Vector similarity | default_vector_weight | 0.55 | Cosine similarity from embedding search |
| 3 | Recency | default_recency_weight | 0.10 | Exponential decay based on current_cycle |
| 4 | Priority | default_priority_weight | 0.05 | Boost multiplier per priority level (P1-P4) |
| 5 | Graph | default_graph_weight | 0.00 | 1-hop-weighted neighbour boost (opt-in) |
Default weights sum to 1.0. When a signal is unavailable (e.g. no
current_cycle → recency skipped, no embeddings → vector skipped),
its weight is redistributed proportionally across active signals.
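The redistribution rule above can be sketched in a few lines. This is a minimal illustration, not engrava’s own code: the function name `redistribute` and the short signal keys are hypothetical, and it assumes "proportionally" means each active weight is divided by the sum of the active weights.

```python
DEFAULT_WEIGHTS = {
    "fts": 0.30,
    "vector": 0.55,
    "recency": 0.10,
    "priority": 0.05,
    "graph": 0.00,
}

def redistribute(weights, active):
    """Rescale the weights of the active signals so they sum to 1.0."""
    kept = {name: w for name, w in weights.items() if name in active}
    total = sum(kept.values())
    if total == 0.0:
        return {}  # every signal disabled; the caller would fall back to a plain listing
    return {name: w / total for name, w in kept.items()}

# No current_cycle: recency is skipped and its 0.10 is shared out among the rest.
no_recency = redistribute(DEFAULT_WEIGHTS, {"fts", "vector", "priority"})
```

Under this assumption the relative ordering of the remaining signals is preserved: vector still weighs 0.55 / 0.30 times as much as FTS.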
Keyword query syntax (FTS)
The keyword signal — and the search_fts() method and the MCP search_keywords
tool that expose it directly — runs your text against an SQLite FTS5 index.
engrava normalises the query before handing it to FTS5, with two modes that
switch automatically on what you type:
Bare queries are matched with OR. A plain natural-language query like
what was my sister doing is treated as a bag of words joined with OR, so a
document matches when it shares any word. BM25’s IDF weighting then ranks the
documents that share the most distinctive words first, so common function words
(what, was, my) carry little weight and need no stopword list or stemmer —
this works in any language.
# Bare query -> OR-matched: finds docs sharing any content word, best-ranked first
hits = await store.search_fts("what was my sister doing", top_k=10)
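To see why common words contribute little under OR matching, it helps to look at the inverse-document-frequency term of BM25. The sketch below uses one widely used non-negative IDF variant; the exact formula inside SQLite’s ranking function may differ, and the corpus numbers are invented for illustration.

```python
import math

def bm25_idf(total_docs, docs_with_term):
    """Non-negative BM25 IDF variant: the rarer the term, the larger the value."""
    ratio = (total_docs - docs_with_term + 0.5) / (docs_with_term + 0.5)
    return math.log(1.0 + ratio)

# Hypothetical corpus of 1000 documents.
idf_common = bm25_idf(1000, 900)  # a word like "was", present almost everywhere
idf_rare = bm25_idf(1000, 5)      # a word like "sister", present in few documents
print(f"common={idf_common:.3f} rare={idf_rare:.3f}")
```

A document that matches only the common word scores far lower than one that matches the rare word, which is what lets the bare OR query rank sensibly without a stopword list.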
Expert syntax is preserved unchanged. If your query uses FTS5 operators, it is passed through as written:
- quoted phrases: "machine learning" matches the exact phrase;
- uppercase booleans: AND, OR, NOT (must be uppercase) compose terms, e.g. python AND NOT snake;
- prefix: a trailing * does prefix matching, e.g. neur*;
- column filters: the essence: and content: prefixes restrict a term to that column, e.g. content:berlin.
Punctuation never raises. Unsafe characters split a token into separate terms
rather than breaking the query: a contraction like sister's becomes
sister OR s, so it still matches a stored sister's dog. Pasting a URL or a
timestamp is safe too — only the real essence: / content: column filters are
honoured, so http://example.com and 12:30 are treated as ordinary search
terms. A genuinely malformed full-text expression is degraded to zero FTS hits,
so the rest of a hybrid search still returns results.
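The two query modes described in this section can be approximated with a small normaliser. This is a sketch under stated assumptions, not engrava’s implementation: the name `normalise_query` and the detection regex are hypothetical, and the real normaliser may treat edge cases differently.

```python
import re

# Markers of expert syntax, following the list above.
_EXPERT = re.compile(
    r'"'                                 # quoted phrase
    r"|\b(?:AND|OR|NOT)\b"               # uppercase boolean operator
    r"|\w\*(?=\s|$)"                     # trailing * prefix match
    r"|(?:^|\s)(?:essence|content):\w"   # one of the two real column filters
)

def normalise_query(text):
    """Return expert syntax untouched; otherwise OR-join the word tokens."""
    if _EXPERT.search(text):
        return text
    tokens = re.findall(r"\w+", text)  # unsafe characters split a token in two
    return " OR ".join(tokens)

bare = normalise_query("what was my sister doing")
contraction = normalise_query("sister's dog")
url = normalise_query("http://example.com")
```

Because `http:` is not one of the recognised column prefixes, the URL falls through to the bare path and is split into ordinary terms rather than being read as a column filter.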
Graceful Degradation
- FTS5 unavailable or empty query → FTS skipped.
- query_vector is None and no embedding provider → vector skipped.
- current_cycle is None → recency skipped.
- priority_weight is 0.0 → priority skipped.
- graph_weight is 0.0 → graph skipped, zero overhead.
- All signals disabled → fallback to list_thoughts(LIMIT top_k).
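The degradation rules above amount to a handful of independent checks. The helper below is a hypothetical sketch: the function and its `fts_available` / `has_embedder` flags are illustrative names, not part of the engrava API.

```python
def active_signals(*, query, fts_available, query_vector, has_embedder,
                   current_cycle, priority_weight, graph_weight):
    """Return the names of the signals that would take part in scoring."""
    active = []
    if fts_available and query.strip():
        active.append("fts")
    if query_vector is not None or has_embedder:
        active.append("vector")
    if current_cycle is not None:
        active.append("recency")
    if priority_weight != 0.0:
        active.append("priority")
    if graph_weight != 0.0:
        active.append("graph")
    return active

signals = active_signals(
    query="python async", fts_available=True, query_vector=None,
    has_embedder=False, current_cycle=None,
    priority_weight=0.05, graph_weight=0.0,
)
# An empty list corresponds to the list_thoughts() fallback described above.
```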
Graph-Aware Ranking
The graph signal applies a weighted 1-hop neighbour boost: if a candidate thought’s graph neighbours also match the query, the candidate receives a boost proportional to each neighbour’s semantic score and the weight of the connecting edge.
Algorithm
For each candidate C in the fusion pool:
    neighbours = get_edges(C, direction="BOTH")  # then cap to max_neighbors,
                                                 # ordered by edge.weight DESC (deterministic)
    For each (edge, neighbour):
        neighbour_base = max(fts_score[neighbour], vector_score[neighbour])
        boost[C] += edge.weight * neighbour_base * graph_edge_decay
    final_score[C] += graph_weight * boost[C]
Key properties:
- Only semantic scores propagate: priority, recency, and graph scores are excluded from neighbour_base to prevent hub-cascade effects.
- No new candidates: the graph signal re-ranks existing results; it does not add thoughts to the result set.
- Deterministic: neighbours are sorted by edge.weight DESC before the max_neighbors cap is applied.
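The pseudocode and properties above translate into runnable Python roughly as follows. This is an illustrative sketch rather than engrava’s code: the function name, the in-memory edge list, and the tie-break on neighbour id are assumptions, and a neighbour without a semantic score is assumed to contribute zero.

```python
from collections import defaultdict

def apply_graph_boost(final_score, fts_score, vector_score, edges,
                      graph_weight=0.1, graph_edge_decay=0.5, max_neighbors=5):
    """Return a copy of final_score with the 1-hop neighbour boost added.

    edges is a list of (thought_a, thought_b, weight) tuples, treated as
    undirected to mirror direction="BOTH".
    """
    adjacency = defaultdict(list)
    for a, b, weight in edges:
        adjacency[a].append((b, weight))
        adjacency[b].append((a, weight))

    boosted = dict(final_score)
    for candidate in final_score:  # only existing candidates are re-ranked
        # Sort by weight DESC (id as an assumed tie-break), then cap.
        neighbours = sorted(adjacency[candidate], key=lambda p: (-p[1], p[0]))
        boost = 0.0
        for neighbour, weight in neighbours[:max_neighbors]:
            # Only semantic scores propagate; a non-matching neighbour adds 0.
            base = max(fts_score.get(neighbour, 0.0),
                       vector_score.get(neighbour, 0.0))
            boost += weight * base * graph_edge_decay
        boosted[candidate] += graph_weight * boost
    return boosted

scores = apply_graph_boost(
    final_score={"A": 1.0, "B": 0.5, "C": 0.1},
    fts_score={"A": 0.9, "B": 0.4},
    vector_score={"A": 0.6, "B": 0.7, "C": 0.2},
    edges=[("A", "B", 0.8), ("A", "C", 0.5), ("B", "C", 0.2)],
    max_neighbors=1,
)
```

With `max_neighbors=1`, candidate A is boosted only through its heaviest edge (to B), so its score rises by 0.1 * (0.8 * 0.7 * 0.5).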
Configuration
search:
  default_graph_weight: 0.0          # opt-in (0.0 = disabled)
  graph_edge_decay: 0.5              # decay factor for 1-hop distance
  max_neighbors_per_candidate: 5     # safety cap
Per-query override:
result = await store.search_hybrid(
    "python async",
    graph_weight=0.1,
    graph_edge_decay=0.3,
)
Performance
When graph_weight=0.0 (default), no graph queries are executed and
there is zero performance impact. When active, the implementation
issues a small number of batched SQL queries (one per ~450 candidates)
and caps each candidate to max_neighbors_per_candidate neighbours, so
per-candidate work is O(max_neighbors_per_candidate).
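The batching pattern described above might look like the sketch below. Everything about the storage layer here is assumed for illustration: the `edges(source_id, target_id, weight)` table, the helper name `fetch_neighbours`, and the exact batch size are hypothetical and may not match engrava’s schema or queries.

```python
import sqlite3
from collections import defaultdict

BATCH_SIZE = 450  # mirrors the "~450 candidates" figure quoted above

def fetch_neighbours(conn, candidate_ids, max_neighbors=5):
    """Map each candidate to its top neighbours using batched IN (...) queries."""
    ids = list(candidate_ids)
    found = defaultdict(list)
    for start in range(0, len(ids), BATCH_SIZE):
        batch = ids[start:start + BATCH_SIZE]
        marks = ",".join("?" * len(batch))
        sql = (
            "SELECT source_id, target_id, weight FROM edges "
            f"WHERE source_id IN ({marks}) OR target_id IN ({marks})"
        )
        wanted = set(batch)
        for src, dst, weight in conn.execute(sql, batch + batch):
            if src in wanted:
                found[src].append((dst, weight))
            if dst in wanted:
                found[dst].append((src, weight))
    # Sort by weight DESC (id as tie-break), then apply the per-candidate cap.
    return {
        cid: sorted(found[cid], key=lambda p: (-p[1], p[0]))[:max_neighbors]
        for cid in ids
    }

# Tiny in-memory demo: thought 1 is linked to thoughts 2..8.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE edges (source_id INTEGER, target_id INTEGER, weight REAL)")
conn.executemany(
    "INSERT INTO edges VALUES (1, ?, ?)",
    [(n, n / 10) for n in range(2, 9)],
)
neighbours = fetch_neighbours(conn, range(1, 1001))
```

Here 1000 candidates are resolved in three queries, and thought 1 keeps only its five heaviest edges even though seven exist.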
Observability
When the graph signal contributes to at least one candidate,
"graph" appears in HybridSearchResult.backends_used.
Per-Query Overrides
All weights can be overridden per call via keyword arguments:
result = await store.search_hybrid(
    "quantum computing",
    query_vector=embedding,
    fts_weight=0.4,
    vector_weight=0.4,
    recency_weight=0.1,
    priority_weight=0.05,
    graph_weight=0.05,
    current_cycle=42,
)
Configuration Reference
See Configuration for the full YAML reference
of SearchConfig fields.
Querying Reflections
After DreamingExtension.run_consolidation() runs its clustering phase,
ThoughtType.REFLECTION meta-thoughts exist in the store. Three knobs
control how hybrid search handles them.
include_reflections (default True)
When False, REFLECTION thoughts are excluded from search_hybrid()
results. Useful when you want raw observations / insights without
higher-order aggregates:
result = await store.search_hybrid(
    "machine learning",
    query_vector=embedding,
    include_reflections=False,
)
reflection_boost (default SearchConfig.reflection_boost = 1.0)
When REFLECTIONs are included, their final score is multiplied by this
factor. The default 1.0 leaves REFLECTIONs on equal footing with
regular thoughts; raise it above 1.0 to give high-level abstractions a
modest upranking so they surface for broad queries without dominating
narrow ones.
# Stronger boost -- reflections rank near the top for broad queries
result = await store.search_hybrid(
    "patterns in memory",
    query_vector=embedding,
    reflection_boost=1.5,
)
# Disable boost -- reflections compete on equal footing
result = await store.search_hybrid(
    "specific fact",
    query_vector=embedding,
    reflection_boost=1.0,
)
Configure the default in YAML:
search:
  reflection_boost: 1.0  # applies when reflection_boost not overridden per-call
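The combined effect of include_reflections and reflection_boost on a scored result list can be sketched as below. The helper `rank_with_reflections` is hypothetical, and it assumes the boost is applied to the fused score before the top_k cut; the source does not state at which point engrava applies it.

```python
def rank_with_reflections(scored, reflection_ids, *, include_reflections=True,
                          reflection_boost=1.0, top_k=10):
    """scored: {thought_id: final_score}; reflection_ids: ids of REFLECTION thoughts."""
    ranked = []
    for thought_id, score in scored.items():
        if thought_id in reflection_ids:
            if not include_reflections:
                continue  # drop REFLECTION thoughts entirely
            score *= reflection_boost
        ranked.append((thought_id, score))
    ranked.sort(key=lambda item: (-item[1], item[0]))
    return ranked[:top_k]

scored = {"t1": 0.80, "t2": 0.60, "r1": 0.60}
boosted = rank_with_reflections(scored, {"r1"}, reflection_boost=1.5)
filtered = rank_with_reflections(scored, {"r1"}, include_reflections=False)
```

With a boost of 1.5 the reflection r1 overtakes t1; with include_reflections=False it disappears and the regular thoughts keep their order.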
search_reflections_only()
Convenience helper that returns only REFLECTION thoughts, scored by cosine similarity to the query vector (plus optional recency blend). Designed for queries like “what themes exist in my memory?”:
result = await store.search_reflections_only(
    "recurring ideas about learning",
    query_vector=embedding,
    top_k=5,
    current_cycle=42,  # optional recency blend
)
for thought_id, score in result.results:
    ref = await store.get_thought(thought_id)
    print(ref.content)  # JSON with member_ids + keywords
Key difference from search_hybrid(include_reflections=True):
search_reflections_only() fetches all REFLECTIONs directly from
the store (no pagination gap) and scores them by cosine similarity
to the query vector, with the optional recency blend when current_cycle is
supplied. REFLECTIONs do not compete against regular thoughts for result slots.
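The cosine-only scoring used for reflections can be illustrated with plain Python. This sketch leaves out the optional recency blend, whose formula is not given here, and the names `cosine` and `rank_reflections` are hypothetical rather than engrava functions.

```python
import math

def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def rank_reflections(query_vector, reflection_vectors, top_k=5):
    """Score every reflection against the query and keep the top_k."""
    scored = [(rid, cosine(query_vector, vec))
              for rid, vec in reflection_vectors.items()]
    scored.sort(key=lambda item: (-item[1], item[0]))
    return scored[:top_k]

top = rank_reflections(
    [1.0, 0.0],
    {"r1": [0.9, 0.1], "r2": [0.0, 1.0], "r3": [-1.0, 0.0]},
    top_k=2,
)
```

Since every reflection is scored, none can be missed because regular thoughts filled the candidate pool first.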