Configuration Reference¶
How Configuration Works¶
DocBrain uses a config-first architecture with a layered YAML + environment variable system. Understanding this prevents confusion about why a value isn't taking effect.
Loading Order (later = higher priority)¶
1. `config/default.yaml` ← committed to repo — all non-secret defaults
2. `config/{APP_ENV}.yaml` ← environment-specific overrides (`development` | `production`)
3. `config/local.yaml` ← gitignored — your secrets and local overrides
4. Environment variables / `.env` ← always win — highest priority
Set APP_ENV=production for the production profile (this is the default in the Docker image). The server defaults to APP_ENV=development when running locally without Docker.
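For example, the same key can be set at several layers, and the highest-priority source wins. A sketch with illustrative values (the keys are real, the numbers are not recommendations):

```yaml
# config/default.yaml (committed)
rag:
  cache_ttl_hours: 24

# config/local.yaml (gitignored — overrides default.yaml)
rag:
  cache_ttl_hours: 1
```

If `RAG_CACHE_TTL_HOURS=6` is also set in the environment, the effective value is 6 — environment variables always win.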
What Goes Where¶
| Type | Where to put it |
|---|---|
| Infrastructure secrets (DB URL, LLM API keys, Redis, OpenSearch) | .env or environment variables |
| Ingest source credentials (Confluence token, GitHub token, Slack token, Jira token) | config/local.yaml (gitignored) |
| Deployment-specific values (URLs, ports, CORS origins) | .env or environment variables |
| Tuning (thresholds, intervals, cache TTLs) | config/local.yaml or env vars |
| Team-wide defaults you want committed | config/default.yaml (no secrets!) |
The key distinction: .env is for infrastructure secrets that the runtime environment must inject (container orchestration, CI/CD, secrets managers). config/local.yaml is for user-managed source credentials and personal overrides — it's gitignored so it never gets committed, but it lives alongside the project where you can edit it easily.
Example config/local.yaml¶
```yaml
# config/local.yaml — never committed (gitignored)
# Configure ingest sources and personal overrides here.
ingest:
  ingest_sources: confluence,github_pr

confluence:
  base_url: https://acme.atlassian.net/wiki
  user_email: you@acme.com
  api_token: ATATT3x...
  space_keys: DOCS,ENG

github_pr:
  token: ghp_...
  repo: acme/platform
  lookback_days: 180

# Local tuning overrides (optional)
autopilot:
  enabled: true
  cluster_threshold: 0.78

rag:
  cache_ttl_hours: 1
```
YAML Config Structure¶
Every YAML value supports ${ENV_VAR} and ${ENV_VAR:-default} substitution:
```yaml
database:
  url: "${DATABASE_URL}"  # required — must come from env
  max_connections: "${DB_MAX_CONNECTIONS:-10}"
```
Custom Config Directory¶
```bash
# Mount a ConfigMap in Kubernetes
DOCBRAIN_CONFIG_DIR=/etc/docbrain docbrain-server

# Or pass as CLI argument
docbrain-server --config-dir /etc/docbrain
```
All configuration is also available via environment variables, set in .env for Docker Compose or via ConfigMap/Secret for Kubernetes. Environment variables always override YAML values.
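A minimal Kubernetes sketch of the ConfigMap approach — assuming a ConfigMap named `docbrain-config` already holds your YAML files; names are illustrative, not a definitive manifest:

```yaml
# deployment.yaml (fragment)
spec:
  containers:
    - name: docbrain
      env:
        - name: DOCBRAIN_CONFIG_DIR  # point the server at the mounted config
          value: /etc/docbrain
      volumeMounts:
        - name: config
          mountPath: /etc/docbrain
  volumes:
    - name: config
      configMap:
        name: docbrain-config
```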
Infrastructure¶
| Variable | Default | Description |
|---|---|---|
| `DATABASE_URL` | — | PostgreSQL connection string |
| `OPENSEARCH_URL` | `http://localhost:9200` | OpenSearch endpoint |
| `REDIS_URL` | `redis://localhost:6379` | Redis connection string |
| `SERVER_PORT` | `3000` | API server listen port |
| `SERVER_BIND` | `0.0.0.0` | API server bind address |
| `LOG_LEVEL` | `info` | Log verbosity: `trace`, `debug`, `info`, `warn`, `error` |
| `DB_MAX_CONNECTIONS` | `10` | Maximum PostgreSQL connection pool size |
| `DB_CONNECT_TIMEOUT_SECS` | `10` | Timeout (seconds) for the initial PostgreSQL connection |
| `DB_ACQUIRE_TIMEOUT_SECS` | `10` | Timeout (seconds) to acquire a connection from the pool |
| `DB_IDLE_TIMEOUT_SECS` | `300` | Idle connection lifetime (seconds) before cleanup |
LLM Provider¶
| Variable | Default | Description |
|---|---|---|
| `LLM_PROVIDER` | `bedrock` | Provider: `bedrock`, `anthropic`, `openai`, `ollama`, `groq`, `openrouter`, `together`, `deepseek`, `mistral`, `xai`, `gemini`, `azure_openai`, `vertex_ai`, `cohere` |
| `LLM_MODEL_ID` | varies | Model identifier (provider-specific) |
| `FAST_MODEL_ID` | — | Fast/cheap model for background side-calls: intent classification, query rewriting, entity extraction. Falls back to `LLM_MODEL_ID` if not set. Recommended: Haiku (Bedrock/Anthropic), `gpt-4o-mini` (OpenAI), `qwen2.5:7b` (Ollama). Alias: `HAIKU_MODEL_ID` (deprecated). |
| `INGEST_LLM_MODEL_ID` | — | Model used during ingest, only for image extraction. Falls back to `LLM_MODEL_ID` if not set. Set this to a cheaper model — image extraction fires for every page with images. Using Opus 4 with `LLM_THINKING_BUDGET` without this override will cause throttling errors during ingest. |
| `DRAFT_MODEL_ID` | — | Model used for autopilot draft generation (two-phase reasoning + writing). Falls back to `LLM_MODEL_ID` if not set. Use a high-capability model here — drafts benefit from stronger reasoning. |
| `DRAFT_LLM_PROVIDER` | — | Provider for draft generation. Falls back to `LLM_PROVIDER` if not set. Allows cross-provider drafting — e.g. Gemini Flash for Q&A but Anthropic Claude for drafts. |
| `LLM_THINKING_BUDGET` | — | Extended thinking token budget (tokens). Unset or `0` = disabled. Only applies to the primary `LLM_MODEL_ID`, never to `FAST_MODEL_ID` or `INGEST_LLM_MODEL_ID`. |
| `ANTHROPIC_API_KEY` | — | API key (if `LLM_PROVIDER=anthropic`) |
| `OPENAI_API_KEY` | — | API key (if `LLM_PROVIDER=openai`) |
| `OLLAMA_BASE_URL` | `http://localhost:11434` | Ollama server URL |
| `OLLAMA_TIMEOUT_SECS` | `120` | HTTP timeout in seconds for Ollama requests. Increase for large/slow models (e.g. 70B) to avoid "error decoding response body" when the model takes longer than 2 minutes. Example: `300` or `600`. Allowed range: 60–900. |
| `OLLAMA_TLS_VERIFY` | `false` | Set to `true` to enforce TLS certificate validation for Ollama |
| `OLLAMA_VISION_ENABLED` | `true` | Set to `false` if your Ollama model doesn't support vision (skips image calls) |
| `AWS_REGION` | — | AWS region for Bedrock (e.g. `us-east-1`) |
| `AWS_ACCESS_KEY_ID` | — | AWS access key (optional — see credential chain below) |
| `AWS_SECRET_ACCESS_KEY` | — | AWS secret key (optional — see credential chain below) |
| `GROQ_API_KEY` | — | API key (if `LLM_PROVIDER=groq`) |
| `OPENROUTER_API_KEY` | — | API key (if `LLM_PROVIDER=openrouter`) |
| `TOGETHER_API_KEY` | — | API key (if `LLM_PROVIDER=together`) |
| `DEEPSEEK_API_KEY` | — | API key (if `LLM_PROVIDER=deepseek`) |
| `MISTRAL_API_KEY` | — | API key (if `LLM_PROVIDER=mistral`) |
| `XAI_API_KEY` | — | API key (if `LLM_PROVIDER=xai`) |
| `GEMINI_API_KEY` | — | API key (if `LLM_PROVIDER=gemini`) |
| `AZURE_OPENAI_API_KEY` | — | API key (if `LLM_PROVIDER=azure_openai`) |
| `AZURE_OPENAI_ENDPOINT` | — | Resource endpoint (if `LLM_PROVIDER=azure_openai`), e.g. `https://my-resource.openai.azure.com` |
| `AZURE_OPENAI_API_VERSION` | `2024-02-01` | API version (if `LLM_PROVIDER=azure_openai`) |
| `VERTEX_PROJECT` | — | GCP project ID (if `LLM_PROVIDER=vertex_ai`). Required. |
| `VERTEX_REGION` | `us-central1` | GCP region (if `LLM_PROVIDER=vertex_ai`) |
| `COHERE_API_KEY` | — | API key (if `LLM_PROVIDER=cohere`) |
AWS Credential Chain: Bedrock uses the AWS SDK default credential chain: env vars → `~/.aws/credentials` → IRSA (EKS) → EC2 Instance Profile → ECS Task Role. In production, use IRSA or instance profiles — no keys in env. Set `serviceAccount.create=true` and `serviceAccount.annotations.eks.amazonaws.com/role-arn` in Helm. The IAM role needs `bedrock:InvokeModel` and `bedrock:InvokeModelWithResponseStream` permissions. See providers.md for full setup details.

GCP Credential Chain: Vertex AI uses `gcp_auth`, which resolves credentials in this order: `GOOGLE_APPLICATION_CREDENTIALS` (service account key file) → Application Default Credentials (`gcloud auth application-default login`) → GKE Workload Identity → GCE/Cloud Run metadata service. In production on GKE, use Workload Identity — no keys needed in the cluster. See providers.md for Workload Identity setup details.
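Putting the model tiers together, a hedged `.env` sketch for Bedrock — the model IDs below are deliberately elided placeholders; substitute the exact IDs enabled in your AWS account:

```bash
LLM_PROVIDER=bedrock
AWS_REGION=us-east-1
LLM_MODEL_ID=anthropic.claude-...        # primary model — final answers
FAST_MODEL_ID=anthropic.claude-haiku-... # side-calls: intent, query rewriting
INGEST_LLM_MODEL_ID=anthropic.claude-haiku-...  # cheap model for image extraction
```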
Ollama: model selection and tuning¶
Only use models with strong instruction-following capabilities. DocBrain's RAG pipeline requires the LLM to stay strictly grounded in retrieved documents. Models that default to training data instead of provided context will produce fabricated answers. Recommended: command-r:35b (purpose-built for RAG). See providers.md for the full model comparison table.
- Recommended config: `LLM_MODEL_ID=command-r:35b` and `FAST_MODEL_ID=qwen2.5:7b`. The fast model handles intent classification and query rewriting; only the final answer uses the primary model.
- "Error decoding response body" after 2–3 minutes: The default HTTP timeout is 120 seconds. If the model takes longer to generate the full response, the connection is cut and you get a decode error. Set `OLLAMA_TIMEOUT_SECS=300` (or `600`) so the client waits long enough.
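The recommendations above translate to a `.env` along these lines (a sketch, not a definitive setup):

```bash
LLM_PROVIDER=ollama
OLLAMA_BASE_URL=http://localhost:11434
LLM_MODEL_ID=command-r:35b   # final answers — strong grounding in retrieved docs
FAST_MODEL_ID=qwen2.5:7b     # intent classification, query rewriting
OLLAMA_TIMEOUT_SECS=300      # headroom for slow generations
```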
Embedding Provider¶
Set EMBED_PROVIDER to choose your embedding model. One of: openai, bedrock, ollama.
| Variable | Default | Description |
|---|---|---|
| `EMBED_PROVIDER` | `bedrock` | Provider: `bedrock`, `openai`, `ollama` |
| `EMBED_MODEL_ID` | varies | Embedding model identifier (e.g. `text-embedding-3-small`, `cohere.embed-v4:0`) |
Switching Embedding Models¶
When you change `EMBED_PROVIDER` or `EMBED_MODEL_ID` to a model with different vector dimensions (e.g. Bedrock Cohere/1024 → Ollama nomic-embed-text/768), the server will refuse to start and report the dimension mismatch.
To migrate:
1. Set `FORCE_REINDEX=true` in your environment
2. Restart the server and run ingest — the old indexes are deleted and recreated
3. Remove `FORCE_REINDEX` after the migration completes
| Variable | Default | Description |
|---|---|---|
| `FORCE_REINDEX` | `false` | Delete and recreate OpenSearch indexes when embedding dimensions change. Set once during migration, then remove. |
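The migration steps can be expressed as a temporary `.env` fragment — the Ollama target model here is just an example of a different-dimension model:

```bash
# Temporary — remove FORCE_REINDEX once the migration completes
FORCE_REINDEX=true
EMBED_PROVIDER=ollama
EMBED_MODEL_ID=nomic-embed-text
```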
Document Ingestion¶
Configure sources in config/local.yaml (gitignored). Put only infrastructure secrets in .env.
General¶
| Setting (`config/local.yaml` key) | Env var equivalent | Default | Description |
|---|---|---|---|
| `ingest.ingest_sources` | `INGEST_SOURCES` | `local` | Comma-separated list of active sources: `local`, `confluence`, `github`, `github_pr`, `gitlab_mr`, `slack_thread`, `jira` |
| `ingest.self_ingest` | `DOCBRAIN_SELF_INGEST` | `true` | Auto-ingest DocBrain's own docs |
| `ingest.image_extraction_enabled` | `IMAGE_EXTRACTION_ENABLED` | `true` | Extract and describe images using a vision LLM |
Local Files¶
| Variable | Default | Description |
|---|---|---|
| `LOCAL_DOCS_PATH` | — | Directory path for local file ingestion (set in `.env` or as an env var) |
Confluence¶
Set credentials in config/local.yaml:
```yaml
confluence:
  base_url: https://yourco.atlassian.net/wiki
  user_email: you@yourco.com
  api_token: ATATT3x...
  space_keys: ENG,DOCS
```
| Key | Env var | Default | Description |
|---|---|---|---|
| `confluence.base_url` | `CONFLUENCE_BASE_URL` | — | Atlassian instance URL (must include `/wiki`) |
| `confluence.user_email` | `CONFLUENCE_USER_EMAIL` | — | Auth email (not required for v1 Data Center) |
| `confluence.api_token` | `CONFLUENCE_API_TOKEN` | — | API token (Cloud) or Personal Access Token (Data Center) |
| `confluence.space_keys` | `CONFLUENCE_SPACE_KEYS` | — | Comma-separated space keys to ingest |
| `confluence.page_limit` | `CONFLUENCE_PAGE_LIMIT` | `0` (unlimited) | Max pages per space. `0` = all pages. |
| `confluence.api_version` | `CONFLUENCE_API_VERSION` | `v2` | `v2` for Cloud, `v1` for Data Center 7.x+ |
| `confluence.tls_verify` | `CONFLUENCE_TLS_VERIFY` | `true` | Set to `false` for self-signed certs |
| `confluence.webhook_secret` | `CONFLUENCE_WEBHOOK_SECRET` | — | HMAC secret for real-time webhook sync (set as env var) |
GitHub Repository¶
```yaml
# config/local.yaml
github:
  repo_url: https://github.com/your-org/your-docs
  token: ghp_...  # only for private repos
  branch: main
```
| Key | Env var | Default | Description |
|---|---|---|---|
| `github.repo_url` | `GITHUB_REPO_URL` | — | Repository URL to clone and ingest |
| `github.token` | `GITHUB_TOKEN` | — | Personal access token (optional for public repos) |
| `github.branch` | `GITHUB_BRANCH` | `main` | Branch to ingest from |
GitHub Pull Requests¶
Ingest PR titles, descriptions, and review discussions as searchable knowledge.
```yaml
# config/local.yaml
github_pr:
  token: ghp_...
  repo: acme/platform
  lookback_days: 365
  min_comments: 1
```
| Key | Env var | Default | Description |
|---|---|---|---|
| `github_pr.token` | `GITHUB_PR_TOKEN` | — | GitHub personal access token (secret — set in `config/local.yaml`) |
| `github_pr.repo` | `GITHUB_PR_REPO` | — | Owner/repo (e.g. `acme/platform`) — set in `config/local.yaml` |
| `github_pr.lookback_days` | `GITHUB_PR_LOOKBACK_DAYS` | `365` | How far back to fetch PRs |
| `github_pr.min_comments` | `GITHUB_PR_MIN_COMMENTS` | `1` | Minimum comments for a PR to be ingested |
| `github_pr.labels` | `GITHUB_PR_LABELS` | — | Comma-separated label filter (optional) |
| `github_pr.api_url` | `GITHUB_PR_API_URL` | — | Override for GitHub Enterprise (optional) |
GitLab Merge Requests¶
Ingest MR titles, descriptions, and discussion threads.
```yaml
# config/local.yaml
gitlab_mr:
  token: glpat-...
  project_ids: acme/platform,acme/infra
  lookback_days: 365
```
| Key | Env var | Default | Description |
|---|---|---|---|
| `gitlab_mr.token` | `GITLAB_TOKEN` | — | GitLab personal access token (secret — set in `config/local.yaml`) |
| `gitlab_mr.base_url` | `GITLAB_BASE_URL` | `https://gitlab.com` | GitLab instance URL |
| `gitlab_mr.project_ids` | `GITLAB_PROJECT_IDS` | — | Comma-separated namespace/repo paths — set in `config/local.yaml` |
| `gitlab_mr.lookback_days` | `GITLAB_MR_LOOKBACK_DAYS` | `365` | How far back to fetch MRs |
| `gitlab_mr.min_notes` | `GITLAB_MR_MIN_NOTES` | `1` | Minimum notes/comments for an MR to be ingested |
| `gitlab_mr.labels` | `GITLAB_MR_LABELS` | — | Comma-separated label filter (optional) |
| `gitlab_mr.tls_verify` | `GITLAB_TLS_VERIFY` | `true` | Set to `false` for self-signed certs (batch ingest) |
| `gitlabCapture.tlsInsecure` | `GITLAB_CAPTURE_TLS_INSECURE` | `false` | Set to `true` for self-signed certs (real-time capture) |
Slack Threads¶
Ingest high-signal Slack threads (by reaction count or reply threshold).
```yaml
# config/local.yaml
slack_ingest:
  token: xoxb-...
  channels: C01234567,C09876543
  min_replies: 3
  reactions: "white_check_mark,bookmark"
  lookback_days: 90
```
| Key | Env var | Default | Description |
|---|---|---|---|
| `slack_ingest.token` | `SLACK_INGEST_TOKEN` | — | Slack bot token (secret — set in `config/local.yaml`) |
| `slack_ingest.channels` | `SLACK_INGEST_CHANNELS` | — | Comma-separated channel IDs — set in `config/local.yaml` |
| `slack_ingest.min_replies` | `SLACK_MIN_REPLIES` | `3` | Minimum thread replies for a thread to be ingested |
| `slack_ingest.reactions` | `SLACK_INGEST_REACTIONS` | `white_check_mark,bookmark` | Comma-separated reaction names that flag a thread for ingest |
| `slack_ingest.lookback_days` | `SLACK_LOOKBACK_DAYS` | `90` | How far back to scan channels |
Jira¶
Ingest Jira issues (bugs, stories, tasks, epics) as searchable knowledge.
```yaml
# config/local.yaml
jira_ingest:
  base_url: https://yourcompany.atlassian.net
  user_email: you@yourcompany.com
  api_token: your-token
  projects: ENG,OPS
  lookback_days: 365
```
| Key | Env var | Default | Description |
|---|---|---|---|
| `jira_ingest.base_url` | `JIRA_BASE_URL` | — | Jira instance URL — set in `config/local.yaml` |
| `jira_ingest.user_email` | `JIRA_USER_EMAIL` | — | Jira account email — set in `config/local.yaml` |
| `jira_ingest.api_token` | `JIRA_API_TOKEN` | — | Jira API token (secret — set in `config/local.yaml`) |
| `jira_ingest.projects` | `JIRA_PROJECTS` | — | Comma-separated project keys — set in `config/local.yaml` |
| `jira_ingest.jql_filter` | `JIRA_JQL_FILTER` | — | Additional JQL filter (optional) |
| `jira_ingest.lookback_days` | `JIRA_LOOKBACK_DAYS` | `365` | How far back to fetch issues |
| `jira_ingest.issue_types` | `JIRA_ISSUE_TYPES` | `Bug,Story,Task,Epic` | Comma-separated issue types to ingest |
Rate Limiting¶
DocBrain applies per-IP rate limiting to unauthenticated routes and per-API-key rate limiting to authenticated routes. Rate limiting is enabled by default.
| Variable | Default | Description |
|---|---|---|
| `RATE_LIMIT_ENABLED` | `true` | Set to `false` to disable all rate limiting (not recommended for production) |
| `RATE_LIMIT_RPM` | `60` | Requests per minute per IP on unauthenticated routes |
| `RATE_LIMIT_AUTH_RPM` | `120` | Requests per minute per API key on authenticated routes |
| `RATE_LIMIT_WEBHOOK_RPM` | `30` | Requests per minute per IP on webhook endpoints (`/github/events`, `/gitlab/events`) |
When a rate limit is exceeded, DocBrain returns 429 Too Many Requests with a Retry-After header.
GitLab MR Capture Webhook¶
The GitLab capture feature lets engineers trigger immediate ingestion by commenting `@docbrain capture` on any merge request.
| Variable | Default | Description |
|---|---|---|
| `GITLAB_CAPTURE_WEBHOOK_SECRET` | — | HMAC secret shared with GitLab for webhook signature verification |
| `GITLAB_CAPTURE_TOKEN` | — | GitLab personal access token with `api` scope (fetches MR notes and posts reply comments) |
| `GITLAB_CAPTURE_BASE_URL` | `https://gitlab.com` | GitLab instance base URL (override for self-hosted) |
| `GITLAB_CAPTURE_ALLOWED_USERS` | — | Comma-separated GitLab usernames allowed to trigger capture. Empty = all users. |
| `GITLAB_CAPTURE_ALLOWED_PROJECTS` | — | Comma-separated project paths allowed to trigger capture. Empty = all projects. e.g. `myorg/myrepo` |
See Ingestion Guide for full setup instructions.
GitHub Capture Security¶
These optional variables restrict which repos and users can trigger real-time GitHub PR/issue capture via `@docbrain capture` comments.
| Variable | Default | Description |
|---|---|---|
| `GITHUB_CAPTURE_ALLOWED_REPOS` | — | Comma-separated owner/repo pairs allowed to trigger capture. Empty = all repos. e.g. `myorg/backend,myorg/frontend` |
| `GITHUB_CAPTURE_ALLOWED_USERS` | — | Comma-separated GitHub usernames allowed to trigger capture. Empty = all users. e.g. `alice,bob` |
A 500KB content size guard applies to all capture requests. Oversized threads are rejected with a reply comment.
Confluence Webhooks (Real-Time Sync)¶
| Variable | Default | Description |
|---|---|---|
| `CONFLUENCE_WEBHOOK_SECRET` | — | HMAC secret shared with Confluence. When set, DocBrain mounts `POST /confluence/events` and auto-ingests page changes in real time. Set as an environment variable (not in `config/local.yaml`). |
When configured, DocBrain receives page_created, page_updated, page_restored, page_removed, and page_trashed events from Confluence and syncs changes automatically — no scheduled re-ingest needed.
Requires confluence.base_url and confluence.api_token to also be set in config/local.yaml (DocBrain needs API access to fetch the page content when a webhook fires).
See the Ingestion Guide for setup instructions.
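A sketch of the two pieces working together — the HMAC secret in the environment, API access in `config/local.yaml` (values are placeholders):

```bash
# .env — injected by the runtime environment
CONFLUENCE_WEBHOOK_SECRET=your-shared-hmac-secret
```

```yaml
# config/local.yaml — needed so DocBrain can fetch page content on webhook events
confluence:
  base_url: https://yourco.atlassian.net/wiki
  api_token: ATATT3x...
```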
Image Extraction¶
| Variable | Default | Description |
|---|---|---|
| `IMAGE_EXTRACTION_ENABLED` | `true` | Extract and describe images from Confluence pages using a vision LLM. Set to `false` to disable. |
| `INGEST_LLM_MODEL_ID` | — | Model used for image extraction during ingest. Falls back to `LLM_MODEL_ID` if not set. Set this to a cheaper model (Haiku, `gpt-4o-mini`) to avoid throttling and reduce cost. |
| `IMAGE_MAX_PER_PAGE` | `20` | Maximum images to process per Confluence page |
| `IMAGE_MIN_SIZE_BYTES` | `5120` | Skip images smaller than this in bytes (default: 5 KB) — filters out icons and decorative images |
| `IMAGE_MAX_SIZE_BYTES` | `10485760` | Skip images larger than this in bytes (default: 10 MB) |
| `IMAGE_DOWNLOAD_TIMEOUT` | `30` | HTTP download timeout in seconds per image |
| `IMAGE_LLM_TIMEOUT` | `120` | LLM vision call timeout in seconds (needs more time than download) |
Image extraction requires a vision-capable LLM. Supported providers: Bedrock, Anthropic, OpenAI, and Ollama (with vision models like llava, llama3.2-vision, moondream). Text-only models (e.g. llama3.1) are auto-detected and images are skipped gracefully — no failures, no errors.
Web UI / CORS¶
| Variable | Default | Description |
|---|---|---|
| `CORS_ALLOWED_ORIGINS` | `http://localhost:3001` | Comma-separated origins allowed to call the API. Only needed if the web UI is served from a non-default origin (e.g. `http://10.0.0.5:3001`, `https://docbrain.internal`) |
Note: The default works out of the box for Docker Compose. You only need this if you access the web UI via a different hostname or port — for example, `http://127.0.0.1:3001` is a different origin than `http://localhost:3001`.
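For example, to serve the web UI from both a LAN address and an internal hostname (values illustrative):

```bash
CORS_ALLOWED_ORIGINS=http://localhost:3001,http://10.0.0.5:3001,https://docbrain.internal
```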
Auth / Sessions¶
| Variable | Default | Description |
|---|---|---|
| `LOGIN_SESSION_TTL_HOURS` | `720` | Session lifetime after email/password login (default: 720 hours = 30 days). Set to `0` for no expiry. |
| `MAX_QUERY_LENGTH` | `4000` | Maximum characters allowed for question and description inputs |
Slack Integration (Optional)¶
| Variable | Default | Description |
|---|---|---|
| `SLACK_BOT_TOKEN` | — | Slack bot OAuth token (`xoxb-...`) |
| `SLACK_SIGNING_SECRET` | — | Slack app signing secret |
| `SLACK_GAP_NOTIFICATION_CHANNEL` | — | Channel to post critical gap alerts after each analysis run (e.g. `#docs-alerts`). Only fires when new critical-severity gaps are found. Requires `SLACK_BOT_TOKEN`. |
Notifications (Optional)¶
| Variable | Default | Description |
|---|---|---|
| `NOTIFICATION_INTERVAL_HOURS` | `24` | How often to check for stale docs and send owner DMs |
| `NOTIFICATION_SPACE_FILTER` | — | Comma-separated spaces to limit notifications (e.g. `PLATFORM,SRE`). Empty = all spaces. |
Documentation Autopilot (Optional)¶
| Variable | Default | Description |
|---|---|---|
| `AUTOPILOT_ENABLED` | `false` | Enable the Documentation Autopilot (gap detection + draft generation) |
| `AUTOPILOT_GAP_ANALYSIS_INTERVAL_HOURS` | `6` | How often the background scheduler runs gap analysis |
| `AUTOPILOT_LOOKBACK_DAYS` | `30` | Days of query history to analyse for gaps |
| `AUTOPILOT_CLUSTER_THRESHOLD` | `0.82` | Cosine similarity threshold for grouping queries into a gap cluster (0.65 = loose, 0.85 = strict) |
| `AUTOPILOT_MIN_CLUSTER_SIZE` | `3` | Minimum episodes in a cluster to be considered a real gap |
| `AUTOPILOT_MIN_UNIQUE_USERS` | `2` | Minimum distinct users that must hit the same gap topic |
| `AUTOPILOT_MIN_NEGATIVE_RATIO` | `0.15` | Minimum fraction of queries on a topic that must have negative feedback |
| `AUTOPILOT_MAX_CLUSTERS` | `50` | Maximum gap clusters to persist per analysis run |
| `AUTOPILOT_MAX_EPISODES` | `500` | Maximum negative episodes to load per analysis run |
| `AUTOPILOT_AUTO_DRAFT` | `false` | Automatically generate drafts for qualifying gaps (no human trigger). Set to `true` to enable. |
| `AUTOPILOT_AUTO_DRAFT_SEVERITY` | `critical` | Minimum gap severity for auto-drafting: `critical`, `high`, `medium`, or `low` |
| `AUTOPILOT_CRITICAL_USERS` | `5` | Unique users needed for the breadth score to reach 1.0. Lower for small teams. |
| `AUTOPILOT_CRITICAL_SIGNALS` | `15` | Negative signals needed for the volume score to reach 1.0. Lower for low-traffic deployments. |
| `AUTOPILOT_CRITICAL_THRESHOLD` | `0.75` | Composite score cutoff for "critical" severity |
| `AUTOPILOT_HIGH_THRESHOLD` | `0.55` | Composite score cutoff for "high" severity |
| `AUTOPILOT_MEDIUM_THRESHOLD` | `0.35` | Composite score cutoff for "medium" severity |
When enabled, Autopilot runs on the configured schedule, exposes management endpoints at /api/v1/autopilot/*, and posts critical gap alerts to SLACK_GAP_NOTIFICATION_CHANNEL if configured. See the API Reference for endpoint details.
Small teams / dev environments: Set `AUTOPILOT_CRITICAL_USERS=1`, `AUTOPILOT_CRITICAL_SIGNALS=3`, `AUTOPILOT_CRITICAL_THRESHOLD=0.3` to see critical gaps with minimal signal. See autopilot.md for a full tuning guide.
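A low-traffic development sketch combining those settings (all variables are documented above; whether auto-drafting suits your workflow is your call):

```bash
AUTOPILOT_ENABLED=true
AUTOPILOT_AUTO_DRAFT=true          # optional — drafts without a human trigger
AUTOPILOT_CRITICAL_USERS=1
AUTOPILOT_CRITICAL_SIGNALS=3
AUTOPILOT_CRITICAL_THRESHOLD=0.3
```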
Freshness Scoring¶
| Variable | Default | Description |
|---|---|---|
| `FRESHNESS_SCHEDULER_INTERVAL_HOURS` | `24` | How often freshness scores are recalculated for all documents |
| `CONTRADICTION_CHECKS_PER_PASS` | `10` | Max documents checked for contradictions per freshness run (LLM cost) |
| `CONTRADICTION_INCLUDE_RECENT_EVENT_DOCS` | `true` | Include recent Slack/PR/Jira docs in the contradiction pass alongside the stalest docs |
| `CONTRADICTION_EVENT_DOC_MAX_AGE_DAYS` | `90` | Only event-based docs edited within this many days are eligible for contradiction checks |
Semantic Quality Scoring¶
LLM-based quality assessment that evaluates documents on four dimensions: accuracy, completeness, clarity, and actionability (each scored 0–25, total 0–100). Runs as a background sweep on documents that have already been structurally scored.
| Variable | Default | Description |
|---|---|---|
| `SEMANTIC_QUALITY_ENABLED` | `true` | Enable LLM-based semantic quality scoring |
| `SEMANTIC_QUALITY_INTERVAL_HOURS` | `24` | How often the semantic scoring sweep runs |
| `SEMANTIC_QUALITY_BUDGET` | `50` | Maximum documents scored per sweep (controls LLM cost) |
| `SEMANTIC_QUALITY_STRUCTURAL_THRESHOLD` | `40.0` | Minimum structural score required before a document is eligible for semantic scoring |
The composite quality score blends structural and semantic scores at 50/50 weighting. Documents below the structural threshold are skipped to avoid wasting LLM calls on obviously poor content.
Capture Lifecycle¶
Captured content (GitHub PRs/issues, GitLab MRs, Slack threads) decays with age — unlike incident records (Jira, PagerDuty, Zendesk) which are permanent historical events. A 5-year-old PR discussing a replaced architecture should score low in freshness; a 2-week-old incident thread is always valid.
Cross-document references: During capture, DocBrain automatically extracts URLs from the description and comments — GitHub PRs, GitLab MRs, Jira tickets, Confluence pages, and other linked resources. These are stored as a reference graph in PostgreSQL and used to enrich RAG context at query time by fetching chunks from referenced documents. GitLab shorthand references (!123 for MRs, #123 for issues) are resolved to full URLs within the same project.
Space assignment: Captures are stored under a meaningful space name derived from the source:
- GitHub captures → owner/repo (e.g., myorg/backend)
- GitLab captures → group/project (e.g., platform/api)
- Slack captures → channel name (e.g., platform-incidents)
This makes allowed_spaces ACL filtering work correctly — a key scoped to ["myorg/backend"] will include GitHub captures from that repo.
Age baseline: Freshness is calculated from the original content creation date (when the PR was opened, when the Slack thread started) — not the time DocBrain captured it. Re-capturing the same thread updates its content but preserves the original creation date as the staleness baseline.
Memory Consolidation¶
| Variable | Default | Description |
|---|---|---|
| `CONSOLIDATION_INTERVAL_HOURS` | `6` | How often the memory consolidation job runs (merges episodic patterns into semantic/procedural memory) |
RAG Pipeline¶
| Variable | Default | Description |
|---|---|---|
| `RAG_TOP_K` | `10` | Chunks retrieved per query. Higher = more context passed to the LLM, at the cost of more tokens per call. Raise to 15–20 if answers are missing obvious information; lower to 5 to reduce cost on simple corpora. |
| `RAG_BM25_BOOST` | `1.0` | Weight of keyword (BM25) search relative to vector search in hybrid retrieval. Raise to 2.0–3.0 for corpora heavy with exact-match queries — error codes, CLI commands, ticket IDs, specific tool names. Leave at 1.0 for general prose documentation. |
| `SEARCH_MIN_SCORE` | `0.0` | Drop retrieved chunks below this relevance score before sending context to the LLM. `0.0` keeps everything. Set to 0.3–0.4 if you notice irrelevant chunks contaminating answers; leave at 0.0 for small corpora where recall matters more than precision. |
| `RAG_CACHE_TTL_HOURS` | `24` | How long to cache semantically identical answers |
| `RAG_CACHE_THRESHOLD` | `0.95` | Cosine similarity threshold for a query to count as a cache hit |
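For a corpus heavy with exact-match lookups (error codes, CLI flags, ticket IDs), the guidance above suggests a tuning sketch like this — illustrative values, to be adjusted against your own answer quality:

```bash
RAG_TOP_K=15          # wider retrieval for fragmented knowledge
RAG_BM25_BOOST=2.5    # favour keyword matches over semantic similarity
SEARCH_MIN_SCORE=0.3  # drop weakly-relevant chunks before the LLM call
```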
Chunking¶
Controls how documents are split before embedding. See Ingestion Guide for re-ingest instructions.
| Variable | Default | Description |
|---|---|---|
| `CHUNK_SIZE` | `1500` | Target chunk size in characters. Dense API refs: 800–1200. General docs: 1500. Long-form prose: 2000–2500. |
| `CHUNK_OVERLAP` | `200` | Overlap between adjacent paragraph-split chunks, in characters |
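For a dense API reference, the ranges above suggest something like the following (illustrative values — re-ingest after changing chunking settings):

```bash
CHUNK_SIZE=1000
CHUNK_OVERLAP=150
```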
OpenSearch Index Names¶
| Variable | Default | Description |
|---|---|---|
| `OPENSEARCH_INDEX` | `docbrain-chunks` | Index name for document chunks (vectors + BM25) |
| `OPENSEARCH_EPISODE_INDEX` | `docbrain-episodes` | Index name for episode vectors (used in episodic memory recall) |
Only change these if you run multiple DocBrain instances sharing the same OpenSearch cluster, to avoid index collisions.
Data Retention¶
| Variable | Default | Description |
|---|---|---|
| `EPISODE_RETENTION_DAYS` | `90` | Episode (query history) rows older than this are pruned daily. Set to `0` to disable pruning. |
| `AUDIT_RETENTION_DAYS` | `365` | Audit log rows older than this are pruned daily. Set to `0` to disable pruning. |
Self-Ingest (Optional)¶
| Variable | Default | Description |
|---|---|---|
| `DOCBRAIN_SELF_INGEST` | `true` | Auto-ingest DocBrain's own docs so it can answer configuration questions about itself |
| `DOCBRAIN_DOCS_PATH` | `./docs` | Path to DocBrain's own documentation directory |
SSO / OIDC (Enterprise)¶
| Variable | Default | Description |
|---|---|---|
| `OIDC_ISSUER_URL` | — | OIDC provider URL (e.g. `https://accounts.google.com`) |
| `OIDC_CLIENT_ID` | — | OAuth client ID |
| `OIDC_CLIENT_SECRET` | — | OAuth client secret |
| `OIDC_REDIRECT_URI` | — | Callback URI (e.g. `https://docbrain.example.com/api/v1/auth/oidc/callback`) |
| `OIDC_WEB_UI_URL` | `http://localhost:3001` | Where to redirect after successful login |
| `OIDC_ACCEPT_INVALID_CERTS` | `false` | Set to `true` to skip TLS verification — use for corporate/self-signed CAs |
GitLab OIDC¶
| Variable | Default | Description |
|---|---|---|
| `GITLAB_OIDC_ISSUER_URL` | — | GitLab instance URL (e.g. `https://gitlab.com` or `https://gitlab.corp.example.com`) |
| `GITLAB_CLIENT_ID` | — | GitLab OAuth application client ID |
| `GITLAB_CLIENT_SECRET` | — | GitLab OAuth application client secret |
| `GITLAB_REDIRECT_URI` | — | Callback URL (e.g. `https://docbrain.example.com/api/v1/auth/gitlab/callback`) |
Corporate GitLab: If your self-hosted GitLab uses an internal CA, set `OIDC_ACCEPT_INVALID_CERTS=true`.
RBAC Role Assignment¶
Role is computed at login time and stored on the user record. The hierarchy is: viewer (1) < editor (2) < analyst (3) < admin (4). Higher-priority rules win.
| Variable | Helm key | Description |
|---|---|---|
| `OIDC_DEFAULT_ROLE` | `rbac.defaultRole` | Role assigned to new SSO users who match no group rule. Default: `viewer`. |
| `OIDC_ADMIN_EMAILS` | `rbac.adminEmails` | Comma-separated emails that always receive `admin` |
| `OIDC_ADMIN_DOMAIN` | `rbac.adminDomain` | Email domain whose users receive `admin` (e.g. `acme.com`) |
| `OIDC_ADMIN_GROUPS` | `rbac.adminGroups` | Comma-separated IdP group names → `admin` role |
| `OIDC_EDITOR_GROUPS` | `rbac.editorGroups` | Comma-separated IdP group names → `editor` role |
| `OIDC_ALLOWED_GROUPS` | `rbac.allowedGroups` | Access gate: only these groups may log in (all others get 403) |
| `OIDC_ALLOWED_DOMAINS` | `rbac.allowedDomains` | Access gate: only these email domains may log in |
What every engineer can see¶
All authenticated users (including viewer) have full access to the intelligence dashboards:
| Page | What it shows |
|---|---|
| Velocity | Documentation ROI — queries deflected, hours saved, cost saved, per-team breakdown |
| Predictive | Predicted documentation gaps from code changes, cascade staleness, seasonal patterns, onboarding risks |
| Maintenance | AI-generated fix proposals with apply/reject workflow |
| Stream | Live knowledge event feed — incident warnings, freshness decay alerts, trending gaps |
These dashboards are visible to every engineer. The insight loop only works if the people who can act on it — the engineers — can actually see it.
Example — typical multi-team setup:
```bash
# Equivalent env vars
OIDC_DEFAULT_ROLE=viewer
OIDC_ADMIN_GROUPS=platform-team
OIDC_EDITOR_GROUPS=docs-writers
```
Note: Role is evaluated at login time. Group changes in your IdP take effect on next login.
Documentation Analytics¶
| Variable | Default | Description |
|---|---|---|
| `VELOCITY_MINUTES_SAVED_PER_QUERY` | `15` | Estimated minutes saved per deflected query |
| `VELOCITY_HOURLY_RATE` | `75` | Effective hourly rate (USD) for the ROI calculation |
Knowledge Stream¶
| Variable | Default | Description |
|---|---|---|
| `STREAM_ENABLED` | `false` | Enable background knowledge stream emission |
| `STREAM_INTERVAL_MINUTES` | `30` | How often the stream background task runs |
| `STREAM_INCIDENT_WARNING_MIN_USERS` | `2` | Minimum unique users hitting an unanswered question to emit an incident warning |
| `STREAM_DECAY_THRESHOLD` | `0.5` | Freshness score below which a decay alert is emitted |
Event Bus¶
The event bus is internal pub/sub infrastructure — always enabled, no opt-in required. Every significant action (document ingest, gap detection, draft generation, etc.) emits a typed event that subscribers can react to.
| Variable | Default | Description |
|---|---|---|
| `EVENT_BUS_CAPACITY` | `4096` | Broadcast channel buffer size. Increase if subscribers lag under high event volume. Max: `65536`. |
| `EVENT_LOG_RETENTION_DAYS` | `90` | Days to retain events in the `event_log` table before purging |
Admin API endpoints:
| Method | Path | Description |
|---|---|---|
| `GET` | `/api/v1/events` | Query the persistent event log. Supports `?type=gap.detected&since=2026-03-01&limit=100&offset=0`. |
| `GET` | `/api/v1/events/stream` | SSE stream of real-time events. Max 10 concurrent connections. |
Both endpoints require the admin role.
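For scripting against the event log, the filter parameters shown above are ordinary query-string parameters. A small sketch of building such a URL (the base host here is a placeholder):

```python
# Hypothetical helper for building a /api/v1/events query string
# from the supported filter parameters.
from urllib.parse import urlencode

def events_url(base: str, **filters) -> str:
    return f"{base}/api/v1/events?{urlencode(filters)}"

url = events_url("https://docbrain.internal",
                 type="gap.detected", since="2026-03-01", limit=100, offset=0)
print(url)
```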
Knowledge Fragments¶
Knowledge fragments are first-class units of knowledge — smaller than documents, richer than chunks. They capture decisions, facts, caveats, procedures, and context from PRs, commits, IDE annotations, conversations, CI/CD pipelines, and manual entry.
Fragments are routed by confidence score: high-confidence fragments are auto-indexed into search, medium-confidence go to a review queue, and low-confidence are auto-discarded.
| Variable | Default | Description |
|---|---|---|
| `FRAGMENT_AUTO_INDEX_THRESHOLD` | `0.7` | Minimum confidence score to auto-index a fragment into OpenSearch. |
| `FRAGMENT_REVIEW_THRESHOLD` | `0.4` | Minimum confidence for the review queue. Fragments below this are auto-discarded. |
| `FRAGMENT_MAX_CONTENT_LENGTH` | `10000` | Maximum fragment content length in characters. |
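The confidence routing described above can be sketched as a simple three-way split (illustrative only; thresholds mirror the defaults):

```python
# Sketch of fragment routing by confidence score.
AUTO_INDEX = 0.7  # FRAGMENT_AUTO_INDEX_THRESHOLD
REVIEW = 0.4      # FRAGMENT_REVIEW_THRESHOLD

def route(confidence: float) -> str:
    if confidence >= AUTO_INDEX:
        return "auto_index"     # indexed into search immediately
    if confidence >= REVIEW:
        return "review_queue"   # waits for human review
    return "discard"            # auto-discarded

print(route(0.85), route(0.55), route(0.2))  # auto_index review_queue discard
```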
Fragment Clustering & Auto-Composition¶
Semantic clustering groups related fragments by topic using embedding similarity (DBSCAN-style greedy algorithm). When a cluster meets composability criteria (5+ fragments, diverse sources, 500+ words), it can be auto-composed into a documentation draft via the API.
| Variable | Default | Description |
|---|---|---|
| `FRAGMENT_CLUSTERING_ENABLED` | `true` | Enable or disable the fragment clustering endpoint. |
| `FRAGMENT_CLUSTER_THRESHOLD` | `0.80` | Cosine similarity threshold for grouping fragments (0.60 = loose, 0.90 = strict). |
| `FRAGMENT_MIN_CLUSTER_SIZE` | `3` | Minimum fragments required to form a cluster. |
| `FRAGMENT_MIN_SOURCE_DIVERSITY` | `2` | Minimum distinct source types for a cluster to be composable. |
| `FRAGMENT_MAX_PER_CLUSTERING_RUN` | `2000` | Maximum fragments loaded per clustering run (memory/cost control). |
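The composability criteria named above (5+ fragments, diverse sources, 500+ words) can be sketched as a predicate. Field names (`content`, `source_type`) are assumptions, not DocBrain's actual schema:

```python
# Illustrative composability check for a fragment cluster.
def is_composable(fragments: list[dict],
                  min_size: int = 5,     # "5+ fragments" per the text
                  min_sources: int = 2,  # FRAGMENT_MIN_SOURCE_DIVERSITY
                  min_words: int = 500) -> bool:
    total_words = sum(len(f["content"].split()) for f in fragments)
    source_types = {f["source_type"] for f in fragments}
    return (len(fragments) >= min_size
            and len(source_types) >= min_sources
            and total_words >= min_words)
```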
CI/CD Pipeline Capture¶
Automated knowledge extraction from merged PRs and deployments. When enabled, DocBrain provides API endpoints that CI/CD pipelines can call to extract knowledge fragments from pull requests and deployment events. Uses the fast/cheap LLM model to keep costs low at high volume.
| Variable | Default | Description |
|---|---|---|
| `CI_ANALYZE_ENABLED` | `true` | Enable or disable the CI/CD capture endpoints (`/api/v1/ci/analyze` and `/api/v1/ci/deploy-capture`). |
See the API Reference for endpoint details and the GitHub Action setup guide.
Conversation Auto-Distillation¶
Automatically extracts structured knowledge fragments from captured conversations — Slack threads (via /docbrain sync) and GitHub PR discussions (via @docbrain capture). After a successful capture, DocBrain runs LLM-powered distillation in the background to identify decisions, facts, caveats, procedures, and context embedded in the conversation.
Distillation is fire-and-forget: it never affects capture response time. Failures are logged and counted in metrics but don't block the capture path.
| Variable | Default | Description |
|---|---|---|
| `DISTILLATION_ENABLED` | `true` | Enable or disable conversation auto-distillation. |
| `DISTILLATION_MAX_CONCURRENT` | `3` | Maximum concurrent LLM distillation calls (bounded by semaphore). |
| `DISTILLATION_MAX_CONTENT_CHARS` | `8000` | Maximum conversation characters sent to the LLM. Longer conversations are truncated (tail-biased — keeps the most recent messages). |
| `DISTILLATION_MAX_FRAGMENTS` | `5` | Maximum knowledge fragments extracted per conversation. |
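"Tail-biased" truncation means the oldest messages are dropped first so the most recent ones survive. A minimal sketch of that behavior (illustrative, not the actual implementation; it truncates at message boundaries for simplicity):

```python
# Sketch of tail-biased truncation: walk the conversation newest-first,
# keep messages until the DISTILLATION_MAX_CONTENT_CHARS budget is spent,
# then restore chronological order.
def truncate_tail_biased(messages: list[str], max_chars: int = 8000) -> list[str]:
    kept: list[str] = []
    total = 0
    for msg in reversed(messages):      # newest first
        if total + len(msg) > max_chars:
            break
        kept.append(msg)
        total += len(msg)
    return list(reversed(kept))         # back to chronological order

msgs = ["old " * 10, "mid " * 10, "new " * 10]   # 40 chars each
print(truncate_tail_biased(msgs, max_chars=85))  # drops the oldest message
```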
Governance SLA Checker¶
The SLA checker runs as a periodic background task that detects breaches across four entity types: gap acknowledgment, gap resolution, draft review, and document freshness. SLA thresholds are stored in the database (per-space overridable via the API) — these settings control the checker's operational behavior.
| Variable | Default | Description |
|---|---|---|
| `SLA_CHECKER_INTERVAL_HOURS` | `1` | How often the SLA breach checker runs (hours). |
| `SLA_CHECKER_QUERY_TIMEOUT_SECS` | `30` | Per-entity-type query timeout in seconds. |
| `SLA_CHECKER_MAX_CANDIDATES` | `5000` | Maximum candidate entities scanned per type per run. |
| `SLA_CHECKER_MAX_EVENTS_PER_RUN` | `50` | Maximum `SlaBreached` events emitted per run (prevents webhook flooding). |
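One pass of the checker can be sketched as: scan candidates for a given entity type, flag those older than the SLA threshold, and cap emissions per run. Names here are hypothetical, not DocBrain internals:

```python
# Illustrative sketch of one SLA checker pass.
from datetime import datetime, timedelta, timezone

MAX_EVENTS_PER_RUN = 50  # SLA_CHECKER_MAX_EVENTS_PER_RUN

def find_breaches(candidates, sla_hours, now=None, max_events=MAX_EVENTS_PER_RUN):
    """candidates: (entity_id, created_at) pairs; breach if older than the SLA."""
    now = now or datetime.now(timezone.utc)
    cutoff = now - timedelta(hours=sla_hours)
    breaches = [eid for eid, created in candidates if created < cutoff]
    return breaches[:max_events]  # cap emissions to avoid webhook flooding
```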
See the API Reference — Governance SLAs for endpoint documentation.
External Connectors (HTTP Connector Protocol)¶
External connectors are stateless HTTP servers that implement a simple REST contract (GET /health, POST /documents/list, POST /documents/fetch). DocBrain calls them on a configurable cron schedule to ingest documents from external systems. Connectors are registered and managed via the admin API.
The connector scheduler runs as a background task, polling every 60 seconds for connectors whose cron schedule is due. A circuit breaker automatically disables connectors after repeated failures.
| Variable | Default | Description |
|---|---|---|
| `CONNECTOR_ENABLED` | `true` | Enable/disable the connector scheduler |
| `CONNECTOR_MAX_CONCURRENT_SYNCS` | `3` | Max connectors syncing simultaneously (1-20) |
| `CONNECTOR_MAX_PAGES_PER_SYNC` | `200` | Max list pages fetched per sync |
| `CONNECTOR_MAX_DOCUMENTS_PER_SYNC` | `5000` | Max documents ingested per sync |
| `CONNECTOR_FETCH_BATCH_SIZE` | `50` | Documents fetched per batch (1-200) |
| `CONNECTOR_REQUEST_TIMEOUT_SECS` | `30` | HTTP timeout for individual connector requests (5-300 seconds) |
| `CONNECTOR_SYNC_TIMEOUT_SECS` | `3600` | Overall sync timeout per connector (60-7200 seconds) |
| `CONNECTOR_MAX_RESPONSE_BYTES` | `10485760` | Max response body size from connector (10 MB) |
| `CONNECTOR_CIRCUIT_BREAKER_THRESHOLD` | `5` | Consecutive failures before auto-disabling a connector |
| `CONNECTOR_ALLOW_INTERNAL` | `false` | Allow connector URLs on private/internal IP addresses. Not recommended for production. |
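A sketch of the kind of guard `CONNECTOR_ALLOW_INTERNAL` controls: reject connector URLs pointing at private or loopback addresses unless the override is set. This is illustrative; DocBrain's real check presumably also covers DNS resolution, which this sketch skips:

```python
# Illustrative private-address guard for connector URLs.
import ipaddress
from urllib.parse import urlparse

def url_allowed(url: str, allow_internal: bool = False) -> bool:
    host = urlparse(url).hostname or ""
    try:
        addr = ipaddress.ip_address(host)
    except ValueError:
        return True  # hostname, not an IP literal; real code would resolve it
    return allow_internal or not (addr.is_private or addr.is_loopback)

print(url_allowed("http://10.0.0.5:8080"))                        # False by default
print(url_allowed("http://10.0.0.5:8080", allow_internal=True))   # True
```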
See the API Reference — Connectors for endpoint documentation and the connector protocol spec.
Webhooks (Outbound)¶
Outbound webhook subscriptions let you push DocBrain events to external systems — Slack bots, CI/CD pipelines, PagerDuty, custom dashboards, etc. DocBrain signs every delivery with HMAC-SHA256, retries with exponential backoff, and automatically disables subscriptions that fail repeatedly (circuit breaker).
| Variable | Default | Description |
|---|---|---|
| `WEBHOOK_DELIVERY_TIMEOUT_SECONDS` | `10` | HTTP timeout per webhook delivery attempt (1-60 seconds) |
| `WEBHOOK_MAX_RETRIES` | `4` | Maximum delivery attempts before giving up (1-10) |
| `WEBHOOK_CIRCUIT_BREAKER_THRESHOLD` | `10` | Consecutive failures before auto-disabling a subscription (3-100) |
| `ALLOW_INTERNAL_WEBHOOKS` | `false` | Allow delivery to private/internal IP addresses (10.x, 172.16.x, 192.168.x). Not recommended for production. |
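On the receiving side, you verify the HMAC-SHA256 signature against your shared secret. A minimal sketch, assuming hex-encoded signatures; the exact header name and encoding are not specified here, so check the Webhooks API reference for the actual delivery format:

```python
# Sketch of verifying an HMAC-SHA256-signed webhook delivery.
import hashlib
import hmac

def verify_signature(secret: bytes, body: bytes, signature_hex: str) -> bool:
    expected = hmac.new(secret, body, hashlib.sha256).hexdigest()
    # compare_digest avoids leaking timing information
    return hmac.compare_digest(expected, signature_hex)

body = b'{"type":"gap.detected"}'
sig = hmac.new(b"shared-secret", body, hashlib.sha256).hexdigest()
print(verify_signature(b"shared-secret", body, sig))  # True
```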
See the API Reference — Webhooks for endpoint documentation and event types.
Style Rules Engine¶
The style rules engine provides configurable linting for documentation consistency. Rules are always enabled — no opt-in required. Rules are managed via the API (CRUD + YAML import/export) and stored in PostgreSQL.
Rules are scoped either globally (space = null) or per-space. When linting, global rules apply to all content, and space-specific rules override global rules with the same (rule_type, name) key.
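The override semantics can be sketched as a dictionary merge keyed by `(rule_type, name)`, with space-scoped rules winning. Illustrative only; field names are assumptions:

```python
# Illustrative merge of global and space-scoped style rules.
def effective_rules(global_rules: list[dict], space_rules: list[dict]) -> dict:
    merged = {(r["rule_type"], r["name"]): r for r in global_rules}
    # Space rules with the same (rule_type, name) key override global ones.
    merged.update({(r["rule_type"], r["name"]): r for r in space_rules})
    return merged

g = [{"rule_type": "terminology", "name": "avoid-simple", "severity": "warning"}]
s = [{"rule_type": "terminology", "name": "avoid-simple", "severity": "error"}]
print(effective_rules(g, s)[("terminology", "avoid-simple")]["severity"])  # error
```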
Five default rules are seeded on first migration:
| Rule | Type | Default Severity |
|---|---|---|
| `avoid-simple` | terminology | warning |
| `avoid-just` | terminology | warning |
| `max-heading-depth` (H4) | formatting | warning |
| `max-sentence-length` (40 words) | formatting | info |
| `require-intro` | structure | warning |
API endpoints: See API Reference — Style Rules Engine for full endpoint documentation.
There are no environment variables for the style rules engine — all limits are compile-time constants. Custom rules are created and managed entirely through the API.