Hybrid Search

engrava’s search_hybrid() combines up to five scoring signals into a single ranked result list.

Signal Model

#	Signal	Weight key	Default	Source
1	FTS5 keyword	`default_fts_weight`	`0.30`	BM25 full-text score (min-max normalized)
2	Vector similarity	`default_vector_weight`	`0.55`	Cosine similarity from embedding search
3	Recency	`default_recency_weight`	`0.10`	Exponential decay based on `current_cycle`
4	Priority	`default_priority_weight`	`0.05`	Boost multiplier per priority level (P1-P4)
5	Graph	`default_graph_weight`	`0.00`	1-hop-weighted neighbour boost (opt-in)

Default weights sum to 1.0. When a signal is unavailable (e.g. no current_cycle → recency skipped, no embeddings → vector skipped), its weight is redistributed proportionally across active signals.
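The redistribution rule can be read as a simple renormalisation: drop the inactive signals, then divide each remaining weight by the new total. The sketch below illustrates that reading. The `redistribute` helper and the short signal names are hypothetical and are not part of engrava's API.

```python
# Default weights from the table above, keyed by short illustrative names.
DEFAULT_WEIGHTS = {
    "fts": 0.30,
    "vector": 0.55,
    "recency": 0.10,
    "priority": 0.05,
    "graph": 0.00,
}


def redistribute(weights: dict, active: set) -> dict:
    """Rescale the weights of the active signals so they sum to 1.0 again.

    Hypothetical helper: one reading of "redistributed proportionally".
    """
    kept = {name: w for name, w in weights.items() if name in active and w > 0.0}
    total = sum(kept.values())
    if total == 0.0:
        return {}  # nothing left to score with
    return {name: w / total for name, w in kept.items()}


# No current_cycle was supplied, so recency drops out and its 0.10 is shared.
effective = redistribute(DEFAULT_WEIGHTS, {"fts", "vector", "priority"})
```

With recency removed, the remaining weights total 0.90, so each is divided by 0.90: FTS rises from 0.30 to about 0.333 and vector from 0.55 to about 0.611, keeping their ratio unchanged.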

Keyword query syntax (FTS)

The keyword signal runs your text against an SQLite FTS5 index, as do the search_fts() method and the MCP search_keywords tool that expose it directly. engrava normalises the query before handing it to FTS5, in one of two modes chosen automatically from what you type:

Bare queries are matched with OR. A plain natural-language query like what was my sister doing is treated as a bag of words joined with OR, so a document matches when it shares any word. BM25’s IDF weighting then ranks the documents that share the most distinctive words first. Common function words (what, was, my) therefore carry little weight, so no stopword list or stemmer is needed, and the approach is not tied to any one language.

# Bare query -> OR-matched: finds docs sharing any content word, best-ranked first
hits = await store.search_fts("what was my sister doing", top_k=10)

Expert syntax is preserved unchanged. If your query uses FTS5 operators, it is passed through as written:

quoted phrases — "machine learning" matches the exact phrase;
uppercase booleans — AND, OR, NOT (must be uppercase) compose terms, e.g. python AND NOT snake;
prefix — a trailing * does prefix matching, e.g. neur*;
column filters — essence: and content: restrict a term to that column, e.g. content:berlin.

Punctuation never raises an error. Unsafe characters split a token into separate terms rather than breaking the query: a contraction like sister's becomes sister OR s, so it still matches a stored sister's dog. Pasting a URL or a timestamp is safe too: only the real essence: / content: column filters are honoured, so http://example.com and 12:30 are treated as ordinary search terms. A genuinely malformed full-text expression degrades to zero FTS hits rather than an exception, so the rest of a hybrid search still returns results.
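To make the two modes concrete, here is a minimal sketch of how such a normaliser could behave. It is not engrava's implementation: the `normalise_query` function and its detection regex are assumptions chosen only to reproduce the examples in this section.

```python
import re

# Markers of expert FTS5 syntax: a double quote, an uppercase boolean,
# a trailing * on a term, or one of the two real column filters.
_EXPERT = re.compile(r'"|\b(?:AND|OR|NOT)\b|\w\*|\b(?:essence|content):')


def normalise_query(query: str) -> str:
    """Hypothetical sketch: pass expert syntax through, OR-join bare queries."""
    if _EXPERT.search(query):
        return query  # expert syntax is preserved unchanged
    # Anything that is not a word character splits a token into separate terms.
    terms = re.findall(r"\w+", query)
    return " OR ".join(terms)
```

Under these assumptions, `normalise_query("sister's")` yields `sister OR s`, `normalise_query("http://example.com")` yields `http OR example OR com`, and `python AND NOT snake` is returned as written.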

Graceful Degradation

FTS5 unavailable or empty query → FTS skipped.
query_vector is None and no embedding provider → vector skipped.
current_cycle is None → recency skipped.
priority_weight is 0.0 → priority skipped.
graph_weight is 0.0 → graph skipped, zero overhead.
All signals disabled → falls back to list_thoughts(LIMIT top_k).
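The skip rules above amount to a small decision table. The sketch below expresses them as a function. The `active_signals` name and its parameters (in particular `fts_available` and `has_embedding_provider`) are hypothetical stand-ins for whatever state the store actually checks.

```python
from typing import Optional, Sequence


def active_signals(
    query: str,
    fts_available: bool,
    query_vector: Optional[Sequence[float]],
    has_embedding_provider: bool,
    current_cycle: Optional[int],
    priority_weight: float,
    graph_weight: float,
) -> list:
    """Return the signals that survive the skip rules listed above."""
    signals = []
    if fts_available and query.strip():
        signals.append("fts")
    if query_vector is not None or has_embedding_provider:
        signals.append("vector")
    if current_cycle is not None:
        signals.append("recency")
    if priority_weight != 0.0:
        signals.append("priority")
    if graph_weight != 0.0:
        signals.append("graph")
    # An empty list corresponds to the list_thoughts fallback.
    return signals
```

For example, a keyword-only call with no vector, no cycle, and the default priority weight leaves just `["fts", "priority"]` active.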

Graph-Aware Ranking

The graph signal uses a 1-hop weighted neighbour boost. If a candidate thought’s graph neighbours also match the query, the candidate receives a boost proportional to each such neighbour’s semantic score and the weight of the connecting edge, scaled by graph_edge_decay.

Algorithm

For each candidate C in the fusion pool:
  boost[C] = 0
  neighbours = get_edges(C, direction="BOTH")
               ordered by edge.weight DESC (deterministic), then capped to max_neighbors
  For each (edge, neighbour) in neighbours:
    neighbour_base = max(fts_score[neighbour], vector_score[neighbour])
    boost[C] += edge.weight * neighbour_base * graph_edge_decay
  final_score[C] += graph_weight * boost[C]

Key properties:

Only semantic scores propagate — priority, recency, and graph scores are excluded from neighbour_base to prevent hub-cascade effects.
No new candidates — graph signal re-ranks existing results; it does not add thoughts to the result set.
Deterministic — neighbours are sorted by edge.weight DESC before the max_neighbors cap is applied.
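The algorithm and its properties can be sketched in plain Python over in-memory dictionaries. This is an illustration only: the `graph_boost` function, the `(source, target, weight)` edge tuples, and the tie-break on neighbour id are assumptions, and the real implementation reads edges from the store.

```python
from collections import defaultdict


def graph_boost(candidates, edges, fts_score, vector_score,
                graph_edge_decay=0.5, max_neighbors=5):
    """Compute boost[C] for each candidate.

    edges: iterable of (source_id, target_id, weight), followed in BOTH directions.
    """
    adjacency = defaultdict(list)
    for src, dst, weight in edges:
        adjacency[src].append((weight, dst))
        adjacency[dst].append((weight, src))

    boost = {}
    for cand in candidates:
        # Sort by weight DESC before capping; the id tie-break is an assumption.
        ranked = sorted(adjacency[cand], key=lambda e: (-e[0], e[1]))[:max_neighbors]
        total = 0.0
        for weight, neighbour in ranked:
            # Only semantic scores propagate; unscored neighbours contribute 0.
            base = max(fts_score.get(neighbour, 0.0), vector_score.get(neighbour, 0.0))
            total += weight * base * graph_edge_decay
        boost[cand] = total
    return boost


fts = {"a": 0.2, "b": 0.9, "c": 0.1}
vec = {"a": 0.5, "b": 0.6, "c": 0.7}
boost = graph_boost(["a", "b", "c"], [("a", "b", 0.8), ("a", "c", 0.4)], fts, vec)
```

Here candidate `a` collects 0.8 × 0.9 × 0.5 from `b` plus 0.4 × 0.7 × 0.5 from `c`, for a boost of 0.5. With `graph_weight=0.1` that adds 0.05 to its final score. The function returns boosts only for the candidates passed in, so no new thoughts enter the result set.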

Configuration

search:
  default_graph_weight: 0.0          # opt-in (0.0 = disabled)
  graph_edge_decay: 0.5              # decay factor for 1-hop distance
  max_neighbors_per_candidate: 5     # safety cap

Per-query override:

result = await store.search_hybrid(
    "python async",
    graph_weight=0.1,
    graph_edge_decay=0.3,
)

Performance

When graph_weight=0.0 (default), no graph queries are executed and there is zero performance impact. When active, the implementation issues a small number of batched SQL queries (one per ~450 candidates) and caps each candidate to max_neighbors_per_candidate neighbours, so per-candidate work is O(max_neighbors_per_candidate).
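As a rough illustration of the batching cost, the sketch below splits a candidate list into chunks of 450. The batch size is taken from the approximate figure above, and the `batches` helper is hypothetical. The actual query shape and exact chunk size are not specified here.

```python
from typing import Iterator, Sequence


def batches(candidate_ids: Sequence[str], size: int = 450) -> Iterator[Sequence[str]]:
    """Yield consecutive slices of at most `size` candidate ids."""
    for start in range(0, len(candidate_ids), size):
        yield candidate_ids[start:start + size]


ids = [f"t{i}" for i in range(1000)]
chunks = list(batches(ids))
# 1000 candidates split into slices of 450, 450, and 100 ids
```

Under this assumption a fusion pool of 1000 candidates needs three edge lookups, and a typical pool of a few hundred needs one.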

Observability

When the graph signal contributes to at least one candidate, "graph" appears in HybridSearchResult.backends_used.

Per-Query Overrides

All weights can be overridden per call via keyword arguments:

result = await store.search_hybrid(
    "quantum computing",
    query_vector=embedding,
    fts_weight=0.4,
    vector_weight=0.4,
    recency_weight=0.1,
    priority_weight=0.05,
    graph_weight=0.05,
    current_cycle=42,
)

Configuration Reference

See Configuration for the full YAML reference of SearchConfig fields.

Querying Reflections

After DreamingExtension.run_consolidation() runs its clustering phase, ThoughtType.REFLECTION meta-thoughts exist in the store. Three knobs control how hybrid search handles them.

`include_reflections` (default `True`)

When False, REFLECTION thoughts are excluded from search_hybrid() results. Useful when you want raw observations / insights without higher-order aggregates:

result = await store.search_hybrid(
    "machine learning",
    query_vector=embedding,
    include_reflections=False,
)

`reflection_boost` (default `SearchConfig.reflection_boost = 1.0`)

When REFLECTIONs are included, their final score is multiplied by this factor. The default 1.0 leaves REFLECTIONs on equal footing with regular thoughts; raise it above 1.0 to give high-level abstractions a modest upranking so they surface for broad queries without dominating narrow ones.

# Stronger boost -- reflections rank near the top for broad queries
result = await store.search_hybrid(
    "patterns in memory",
    query_vector=embedding,
    reflection_boost=1.5,
)

# Disable boost -- reflections compete on equal footing
result = await store.search_hybrid(
    "specific fact",
    query_vector=embedding,
    reflection_boost=1.0,
)

Configure the default in YAML:

search:
  reflection_boost: 1.0   # applies when reflection_boost not overridden per-call
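The effect of the multiplier is easiest to see on a small scored list. The sketch below applies it after fusion and re-sorts. The `apply_reflection_boost` helper and the example ids and scores are made up for illustration.

```python
def apply_reflection_boost(scored, reflection_ids, reflection_boost=1.0):
    """scored: list of (thought_id, final_score). Returns a re-sorted copy."""
    boosted = [
        (tid, score * reflection_boost if tid in reflection_ids else score)
        for tid, score in scored
    ]
    return sorted(boosted, key=lambda pair: pair[1], reverse=True)


scored = [("obs-1", 0.60), ("refl-1", 0.45), ("obs-2", 0.40)]
neutral = apply_reflection_boost(scored, {"refl-1"}, 1.0)
raised = apply_reflection_boost(scored, {"refl-1"}, 1.5)
```

At 1.0 the reflection stays in second place with 0.45. At 1.5 its score becomes 0.675 and it moves ahead of `obs-1`.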

`search_reflections_only()`

Convenience helper that returns only REFLECTION thoughts, scored by cosine similarity to the query vector (plus optional recency blend). Designed for queries like “what themes exist in my memory?”:

result = await store.search_reflections_only(
    "recurring ideas about learning",
    query_vector=embedding,
    top_k=5,
    current_cycle=42,   # optional recency blend
)
for thought_id, score in result.results:
    ref = await store.get_thought(thought_id)
    print(ref.content)  # JSON with member_ids + keywords

Key difference from search_hybrid(include_reflections=True): search_reflections_only() fetches all REFLECTIONs directly from the store (no pagination gap) and scores them by cosine similarity to the query vector, blended with recency only when current_cycle is supplied. REFLECTIONs therefore do not compete against regular thoughts for result slots.
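The cosine-only part of that scoring can be sketched as follows. The `rank_reflections` function and the toy two-dimensional embeddings are assumptions for illustration, and the optional recency blend is left out because its formula is not given here.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def rank_reflections(query_vector, reflections, top_k=5):
    """reflections: dict of thought_id -> embedding. Returns (id, score) pairs."""
    scored = [(tid, cosine(query_vector, vec)) for tid, vec in reflections.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


reflections = {"r1": [1.0, 0.0], "r2": [0.0, 1.0], "r3": [1.0, 1.0]}
ranked = rank_reflections([1.0, 0.0], reflections, top_k=2)
```

Because every reflection is scored before the `top_k` cut, none is lost to an earlier candidate limit, which is the "no pagination gap" property described above.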