Configuration

engrava supports YAML-based configuration for production deployments. This document covers all configuration options.

Configuration File

Create an engrava.yaml file:

database:
  path: "./engrava.db"
  wal_mode: true

search:
  default_fts_weight: 0.30
  default_vector_weight: 0.55
  default_recency_weight: 0.10
  default_priority_weight: 0.05
  default_graph_weight: 0.00       # opt-in graph signal
  recency_half_life: 50
  priority_boost_p1: 1.0
  priority_boost_p2: 0.6
  priority_boost_p3: 0.3
  priority_boost_p4: 0.0
  graph_edge_decay: 0.5            # 1-hop distance penalty
  max_neighbors_per_candidate: 5   # safety cap

extensions:
  vector:
    backend: numpy
    dimension: 384

  dreaming:
    enabled: true
    schedule_every_n_cycles: 100
    promote_threshold: 0.7
    candidates_limit: 200
    gates:
      min_confirmations: 2
      min_age_cycles: 1
      max_promoted_per_run: 20
      allow_zero_confirmation: true

Loading Configuration

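import asyncio
from engrava import load_config, SqliteEngravaCore

config = load_config("engrava.yaml")  # parse the YAML file on its own

async def main() -> None:
    # from_config is awaited, then the returned store is used as an async context manager
    async with await SqliteEngravaCore.from_config("engrava.yaml") as store:
        thought = await store.get_thought("abc")

asyncio.run(main())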

Full Factory Method

from engrava.config import load_config, resolve_embedding_provider

config = load_config("engrava.yaml")
# resolve_embedding_provider takes the EmbeddingConfig, i.e. config.embeddings
provider = resolve_embedding_provider(config.embeddings)

Configuration Reference

database

| Key | Type | Default | Description |
| --- | --- | --- | --- |
| database.path | str | required | Path to the SQLite database file (no default; omitting it raises ConfigError) |
| database.wal_mode | bool | true | Enable WAL journal mode for concurrent reads |

search

Controls hybrid search behavior (FTS5 + vector + recency + priority, plus an opt-in graph signal).

| Key | Type | Default | Description |
| --- | --- | --- | --- |
| default_fts_weight | float | 0.30 | Weight for FTS5/BM25 text score |
| default_vector_weight | float | 0.55 | Weight for vector similarity score |
| default_recency_weight | float | 0.10 | Weight for recency-based score |
| default_priority_weight | float | 0.05 | Weight for priority signal |
| default_graph_weight | float | 0.0 | Weight for 1-hop graph signal (opt-in) |
| recency_half_life | int | 50 | Cycles for recency score to halve |
| priority_boost_p1 | float | 1.0 | Score multiplier for P1 thoughts |
| priority_boost_p2 | float | 0.6 | Score multiplier for P2 thoughts |
| priority_boost_p3 | float | 0.3 | Score multiplier for P3 thoughts |
| priority_boost_p4 | float | 0.0 | Score multiplier for P4 thoughts |
| graph_edge_decay | float | 0.5 | Decay factor for 1-hop neighbour boost |
| max_neighbors_per_candidate | int | 5 | Max neighbours considered per candidate |
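
To make recency_half_life concrete: a value of 50 means the recency score halves every 50 cycles. The sketch below assumes plain exponential decay, which is one common way to get that behaviour; the function name recency_score is hypothetical and not part of engrava.

```python
def recency_score(current_cycle: int, created_cycle: int, half_life: int = 50) -> float:
    """Assumed form: exponential decay that halves every `half_life` cycles."""
    age = max(current_cycle - created_cycle, 0)  # clamp so future-dated rows score 1.0
    return 0.5 ** (age / half_life)


for age in (0, 50, 100):
    print(age, recency_score(current_cycle=100 + age, created_cycle=100))
```

With the default half-life, a thought that is 100 cycles old would score a quarter of a fresh one under this assumption.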

Weights are redistributed proportionally when a signal is unavailable (e.g. no current_cycle → recency skipped). Set any weight to 0.0 to disable that signal entirely.
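
Proportional redistribution can be sketched as follows. This is an illustration under assumptions, not engrava's code: redistribute_weights is a hypothetical helper, and it assumes the surviving weights are rescaled so that they keep the original total.

```python
def redistribute_weights(weights: dict[str, float], available: set[str]) -> dict[str, float]:
    """Rescale the weights of the available signals so they keep the original total."""
    total = sum(weights.values())
    # A weight of 0.0 disables a signal, so it never receives a share.
    active = {name: w for name, w in weights.items() if name in available and w > 0.0}
    active_total = sum(active.values())
    if active_total == 0.0:
        return {}
    return {name: w * total / active_total for name, w in active.items()}


defaults = {"fts": 0.30, "vector": 0.55, "recency": 0.10, "priority": 0.05, "graph": 0.0}
# No current_cycle was supplied, so recency is skipped and its share is spread out.
print(redistribute_weights(defaults, {"fts", "vector", "priority", "graph"}))
```

The ratios between the remaining signals stay the same; only their absolute values grow to absorb the skipped weight.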

See Hybrid Search for the full 5-signal ranking model.

embeddings

Embedding provider configuration. (The YAML key is embeddings, plural.) The vector dimension lives under extensions.vector.dimension, not here.

| Key | Type | Default | Description |
| --- | --- | --- | --- |
| provider | str | null | Provider type: "sentence-transformer", "openai-compatible", "ollama", "huggingface" |
| model | str | null | Model name or identifier |
| auto_embed | bool | false | Auto-embed on create_thought / update_thought |
| device | str | "cpu" | Compute device for local providers ("cpu", "cuda") |
| batch_size | int | 32 | Batch encoding size for local providers |
| base_url | str | null | Base URL for remote providers |
| api_key | str | null | API key for remote providers (supports ${ENV_VAR}) |
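
The ${ENV_VAR} form keeps secrets out of the YAML file. How engrava expands it internally is not described here, so the following is only a sketch of the general technique: expand_env is a hypothetical helper, EXAMPLE_API_KEY is a made-up variable, and raising KeyError for an unset variable is an assumption.

```python
import os
import re

# Matches ${NAME} where NAME is a conventional environment-variable identifier.
_ENV_REF = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)\}")


def expand_env(value: str) -> str:
    """Replace each ${NAME} with os.environ[NAME]; raises KeyError if NAME is unset."""
    return _ENV_REF.sub(lambda match: os.environ[match.group(1)], value)


os.environ["EXAMPLE_API_KEY"] = "sk-demo"  # hypothetical variable, set for the demo only
print(expand_env("${EXAMPLE_API_KEY}"))
```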

dreaming

Memory consolidation configuration.

| Key | Type | Default | Description |
| --- | --- | --- | --- |
| enabled | bool | false | Enable dreaming consolidation |
| schedule_every_n_cycles | int | 100 | Consolidation cadence (every N cycles) |
| promote_threshold | float | 0.7 | Weighted-score cutoff for promotion |
| candidates_limit | int | 200 | Max thoughts to evaluate per pass |

dreaming.gates

Gate thresholds — a thought must pass all active gates to be scored.

| Key | Type | Default | Description |
| --- | --- | --- | --- |
| min_confirmations | int | 2 | Minimum confirmation count. Bypassed when allow_zero_confirmation is true. |
| min_age_cycles | int | 1 | Minimum current_cycle - created_cycle. Always enforced. |
| max_promoted_per_run | int | 20 | Cap on promotions per consolidation run |
| allow_zero_confirmation | bool | true | Bypass the confirmation gate for single-write batches. Set to false only when your application explicitly tracks confirmations. |
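
The gate rules in the table can be read as a single predicate. This is a minimal sketch rather than engrava's implementation: Gates and passes_gates are hypothetical names, and the sketch treats allow_zero_confirmation as a blanket bypass of the confirmation gate, which is a simplification. max_promoted_per_run is a per-run cap rather than a per-thought check, so it is left out.

```python
from dataclasses import dataclass


@dataclass
class Gates:
    min_confirmations: int = 2
    min_age_cycles: int = 1
    allow_zero_confirmation: bool = True


def passes_gates(confirmations: int, created_cycle: int, current_cycle: int, gates: Gates) -> bool:
    # Age gate: always enforced.
    if current_cycle - created_cycle < gates.min_age_cycles:
        return False
    # Confirmation gate: only checked when the bypass is switched off.
    if not gates.allow_zero_confirmation and confirmations < gates.min_confirmations:
        return False
    return True


strict = Gates(allow_zero_confirmation=False)
print(passes_gates(0, created_cycle=5, current_cycle=6, gates=Gates()))  # bypass on
print(passes_gates(1, created_cycle=0, current_cycle=10, gates=strict))  # bypass off
```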

dreaming.edges

Edge creation from dreaming. Promoted thoughts create ASSOCIATED edges to their nearest neighbours.

| Key | Type | Default | Description |
| --- | --- | --- | --- |
| enabled | bool | true | Create edges on promotion |
| top_k | int | 1 | Max neighbours to link per promoted thought |
| min_similarity | float | 0.7 | Cosine threshold for edge creation |
| edge_weight_factor | float | 0.5 | edge.weight = factor * similarity |
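
How the three numeric settings interact can be sketched like this. plan_edges is a hypothetical helper, the neighbour similarities are made-up inputs, and treating the threshold as inclusive (>=) is an assumption.

```python
def plan_edges(
    similarities: dict[str, float],
    top_k: int = 1,
    min_similarity: float = 0.7,
    edge_weight_factor: float = 0.5,
) -> list[tuple[str, float]]:
    """Return (neighbour_id, edge_weight) for the top_k neighbours at or above the threshold."""
    eligible = [(nid, sim) for nid, sim in similarities.items() if sim >= min_similarity]
    eligible.sort(key=lambda pair: pair[1], reverse=True)  # most similar first
    return [(nid, edge_weight_factor * sim) for nid, sim in eligible[:top_k]]


# t3 falls below min_similarity; t2 passes but is cut by top_k=1.
print(plan_edges({"t1": 0.9, "t2": 0.8, "t3": 0.4}))
```

Raising top_k links more neighbours per promoted thought, while min_similarity still filters out weak matches first.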

See Dreaming for details.

services

Multi-service isolation (one database file per named service, stored under a shared data_dir as <name>.db).

| Key | Type | Default | Description |
| --- | --- | --- | --- |
| data_dir | str | required | Directory holding the per-service <name>.db files |
| default_service | str | "main" | Default service name when --service is omitted |
| configs | dict | {} | Map of service name → per-service config |

Each service entry under configs supports a single optional override (there is no per-service db_path — the file is derived as <data_dir>/<name>.db):

| Key | Type | Default | Description |
| --- | --- | --- | --- |
| embeddings | dict | | Per-service embedding-provider override (same shape as the top-level embeddings section) |
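
Because the database file is derived rather than configured, the mapping from service name to file is easy to reason about. A small sketch, with service_db_path as a hypothetical helper name:

```python
from pathlib import Path


def service_db_path(data_dir: str, service: str) -> Path:
    """Derive the database file for a service as <data_dir>/<name>.db."""
    return Path(data_dir) / f"{service}.db"


print(service_db_path("./data", "main"))
```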

journal

The hash-chain audit trail. Off by default; when enabled, every thought/edge mutation is recorded as a hash-linked journal entry you can later verify for tamper-evidence.

| Key | Type | Default | Description |
| --- | --- | --- | --- |
| enabled | bool | false | Record every thought/edge mutation as a hash-linked journal entry |

journal:
  enabled: true
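
For readers unfamiliar with hash chains, the sketch below shows the general idea behind a tamper-evident journal. It is not engrava's journal format: the entry fields, the SHA-256 choice, the JSON canonicalisation, and the all-zero genesis value are all assumptions made for illustration.

```python
import hashlib
import json

GENESIS = "0" * 64  # assumed placeholder for the first entry's prev_hash


def entry_hash(prev_hash: str, payload: dict) -> str:
    """Hash the previous entry's hash together with a canonical form of the payload."""
    body = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + body).encode("utf-8")).hexdigest()


def append(chain: list[dict], payload: dict) -> None:
    prev = chain[-1]["hash"] if chain else GENESIS
    chain.append({"prev_hash": prev, "payload": payload, "hash": entry_hash(prev, payload)})


def verify(chain: list[dict]) -> bool:
    """Walk the chain and recompute every hash; any edit to an entry breaks the check."""
    prev = GENESIS
    for entry in chain:
        if entry["prev_hash"] != prev or entry["hash"] != entry_hash(prev, entry["payload"]):
            return False
        prev = entry["hash"]
    return True


journal: list[dict] = []
append(journal, {"op": "create_thought", "id": "abc"})
append(journal, {"op": "update_thought", "id": "abc"})
print(verify(journal))               # True
journal[0]["payload"]["id"] = "xyz"  # tamper with an earlier entry
print(verify(journal))               # False
```

Since each hash covers the previous one, changing any entry invalidates it and everything recorded after it.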

For the integrity-verification API, see Observability → What to alert on. The full audit trail reference lives in the OSS docs (audit-trail.md).

ttl

Time-to-live / auto-expiry of thoughts.

| Key | Type | Default | Description |
| --- | --- | --- | --- |
| strategy | str | "archive" | What cleanup_expired does to expired thoughts: "archive" (soft, marks ARCHIVED) or "delete" (hard) |
| check_every_n_operations | int | 0 | Run auto-cleanup every N store operations (0 = manual only, via cleanup_expired() / engrava gc --expired) |
| default_ttl_seconds | int \| null | null | Default TTL applied to new thoughts with no explicit expires_at (null = no default) |

ttl:
  strategy: archive          # or "delete"
  check_every_n_operations: 100
  default_ttl_seconds: 2592000   # 30 days
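
The two numeric settings can be read as the following rules. This is a sketch under assumptions: resolve_expires_at and should_run_cleanup are hypothetical helpers, and counting operations with a simple modulo is one plausible reading of "every N store operations".

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def resolve_expires_at(
    created_at: datetime,
    explicit_expires_at: Optional[datetime] = None,
    default_ttl_seconds: Optional[int] = None,
) -> Optional[datetime]:
    """An explicit expires_at wins; otherwise apply the default TTL if one is set."""
    if explicit_expires_at is not None:
        return explicit_expires_at
    if default_ttl_seconds is None:
        return None  # null default: the thought never expires on its own
    return created_at + timedelta(seconds=default_ttl_seconds)


def should_run_cleanup(operation_count: int, check_every_n_operations: int) -> bool:
    """0 means manual only; otherwise trigger on every Nth operation."""
    return check_every_n_operations > 0 and operation_count % check_every_n_operations == 0


created = datetime(2024, 1, 1, tzinfo=timezone.utc)
print(resolve_expires_at(created, default_ttl_seconds=2592000))  # 30 days after creation
print(should_run_cleanup(operation_count=200, check_every_n_operations=100))
```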

ingest

Ingest-layer behaviour (content-hash deduplication).

| Key | Type | Default | Description |
| --- | --- | --- | --- |
| deduplication_enabled | bool | true | Whether ingest pipelines should pass deduplicate=True so identical content collapses into one thought (bumping confirmation_count) instead of a duplicate row |

This flag advises ingest-layer callers; the persistence-layer create_thought still defaults to deduplicate=False, so existing callers keep their behaviour unless they read this flag and forward it.
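
The split between the advisory flag and the persistence-layer default can be illustrated with a toy store. Nothing below is engrava's API: ToyStore and ingest are hypothetical, the SHA-256 content hash is an assumption, and so is starting confirmation_count at 0.

```python
import hashlib


class ToyStore:
    """In-memory stand-in used only to illustrate content-hash deduplication."""

    def __init__(self) -> None:
        self.rows: list[dict] = []

    def create_thought(self, content: str, deduplicate: bool = False) -> dict:
        digest = hashlib.sha256(content.encode("utf-8")).hexdigest()
        if deduplicate:
            for row in self.rows:
                if row["content_hash"] == digest:
                    row["confirmation_count"] += 1  # collapse into the existing thought
                    return row
        row = {"content": content, "content_hash": digest, "confirmation_count": 0}
        self.rows.append(row)
        return row


def ingest(store: ToyStore, content: str, deduplication_enabled: bool = True) -> dict:
    # The ingest layer reads the config flag and forwards it explicitly.
    return store.create_thought(content, deduplicate=deduplication_enabled)


store = ToyStore()
ingest(store, "the sky is blue")
ingest(store, "the sky is blue")
print(len(store.rows), store.rows[0]["confirmation_count"])
```

A caller that invokes create_thought directly, without forwarding the flag, would still get a second row here, mirroring the behaviour described above.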

hooks

Wire a custom EngravaHooksProtocol implementation by dotted path. See Extensions.

| Key | Type | Default | Description |
| --- | --- | --- | --- |
| class | str \| null | null | Dotted import path to a hooks class, last segment is the class name (e.g. "my_package.hooks.MyHooks"), instantiated and used by from_config |

hooks:
  class: "my_package.hooks.MyHooks"

The path is split on the final dot (module.path + ClassName) — this is a plain dotted path, not the module.path:ATTRIBUTE colon form used by manifests.paths below.
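
The difference between the two reference styles is easiest to see side by side. The resolvers below are a sketch using importlib, with hypothetical names load_dotted and load_colon; standard-library targets stand in for a real hooks class or manifest so the example runs anywhere.

```python
import importlib


def load_dotted(path: str):
    """Resolve "pkg.module.ClassName" by splitting on the final dot."""
    module_path, _, name = path.rpartition(".")
    if not module_path:
        raise ValueError(f"expected 'module.path.ClassName', got {path!r}")
    return getattr(importlib.import_module(module_path), name)


def load_colon(ref: str):
    """Resolve "pkg.module:ATTRIBUTE" by splitting on the colon."""
    module_path, sep, attr = ref.partition(":")
    if not sep or not attr:
        raise ValueError(f"expected 'module.path:ATTRIBUTE', got {ref!r}")
    return getattr(importlib.import_module(module_path), attr)


print(load_dotted("collections.OrderedDict"))  # plain dotted form, as used by hooks.class
print(load_colon("json:dumps"))                # colon form, as used by manifests.paths
```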

manifests

Load extension manifests (their hooks + schema migrations). Accepts a plain list of dotted paths, or a mapping with discover / paths. See Extensions.

| Key | Type | Default | Description |
| --- | --- | --- | --- |
| paths | list[str] | [] | Dotted module.path:ATTRIBUTE references to ExtensionManifest objects |
| discover | bool | false | Also scan the engrava.extensions entry-point group for manifests |

# list form
manifests:
  - "my_plugin.manifest:MANIFEST"

# or mapping form
manifests:
  discover: true
  paths:
    - "my_plugin.manifest:MANIFEST"

The metrics: section (latency window size, enable/disable) is documented in Observability.

Environment Variables

Both variables below are read only by the engrava CLI, as fallbacks for the --config / --db flags. They do not affect load_config() or SqliteEngravaCore.from_config(), which read configuration solely from the YAML file you pass them.

| Variable | Description |
| --- | --- |
| ENGRAVA_CONFIG | Fallback path to the YAML config file when --config is omitted (--config > ENGRAVA_CONFIG > none) |
| ENGRAVA_DB | Fallback database-file path for CLI commands when --db is omitted (--db > ENGRAVA_DB > ./engrava.db) |
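
The precedence chains in the table amount to a simple fallback. A sketch of that logic, with hypothetical helper names and an injectable env mapping so it can be exercised without touching the real environment:

```python
import os
from typing import Mapping, Optional


def resolve_config_path(cli_config: Optional[str], env: Mapping[str, str] = os.environ) -> Optional[str]:
    """--config > ENGRAVA_CONFIG > none"""
    return cli_config or env.get("ENGRAVA_CONFIG")


def resolve_db_path(cli_db: Optional[str], env: Mapping[str, str] = os.environ) -> str:
    """--db > ENGRAVA_DB > ./engrava.db"""
    return cli_db or env.get("ENGRAVA_DB") or "./engrava.db"


fake_env = {"ENGRAVA_DB": "/srv/engrava/prod.db"}
print(resolve_db_path(None, fake_env))         # falls back to the environment variable
print(resolve_db_path("./local.db", fake_env)) # the flag wins
print(resolve_db_path(None, {}))               # built-in default
```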

Multi-Service Usage

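import asyncio
from engrava import EngravaManager, load_config

config = load_config("engrava.yaml")

async def main() -> None:
    # The manager is built from the services section of the loaded config
    async with EngravaManager.from_config(config.services) as mgr:
        store = await mgr.get_store("main")
        # Use store normally...

asyncio.run(main())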

See the CLI --service flag for command-line multi-service access.