Configuration Reference

How Configuration Works

DocBrain uses a config-first architecture with a layered YAML + environment variable system. Understanding this prevents confusion about why a value isn't taking effect.

Loading Order (later = higher priority)

config/default.yaml         ← committed to repo — all non-secret defaults
config/{APP_ENV}.yaml       ← environment-specific overrides (development | production)
config/local.yaml           ← gitignored — your secrets and local overrides
Environment variables / .env ← always win — highest priority

Set APP_ENV=production for the production profile (this is the default in the Docker image). The server defaults to APP_ENV=development when running locally without Docker.
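For example, the same setting can appear at several layers; the highest layer wins. The values below are illustrative, using the real rag.cache_ttl_hours key documented later on this page:

```yaml
# config/default.yaml (committed)
rag:
  cache_ttl_hours: 24

# config/local.yaml (gitignored): overrides default.yaml
rag:
  cache_ttl_hours: 1

# Setting RAG_CACHE_TTL_HOURS=6 in the environment would override both
# YAML layers, so the effective value would be 6.
```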

What Goes Where

| Type | Where to put it |
|---|---|
| Infrastructure secrets (DB URL, LLM API keys, Redis, OpenSearch) | .env or environment variables |
| Ingest source credentials (Confluence token, GitHub token, Slack token, Jira token) | config/local.yaml (gitignored) |
| Deployment-specific values (URLs, ports, CORS origins) | .env or environment variables |
| Tuning (thresholds, intervals, cache TTLs) | config/local.yaml or env vars |
| Team-wide defaults you want committed | config/default.yaml (no secrets!) |

The key distinction: .env is for infrastructure secrets that the runtime environment must inject (container orchestration, CI/CD, secrets managers). config/local.yaml is for user-managed source credentials and personal overrides — it's gitignored so it never gets committed, but it lives alongside the project where you can edit it easily.

Example config/local.yaml

# config/local.yaml — never committed (gitignored)
# Configure ingest sources and personal overrides here.

ingest:
  ingest_sources: confluence,github_pr

confluence:
  base_url: https://acme.atlassian.net/wiki
  user_email: you@acme.com
  api_token: ATATT3x...
  space_keys: DOCS,ENG

github_pr:
  token: ghp_...
  repo: acme/platform
  lookback_days: 180

# Local tuning overrides (optional)
autopilot:
  enabled: true
  cluster_threshold: 0.78

rag:
  cache_ttl_hours: 1

YAML Config Structure

Every YAML value supports ${ENV_VAR} and ${ENV_VAR:-default} substitution:

database:
  url: "${DATABASE_URL}"     # required — must come from env
  max_connections: "${DB_MAX_CONNECTIONS:-10}"

Custom Config Directory

# Mount a ConfigMap in Kubernetes
DOCBRAIN_CONFIG_DIR=/etc/docbrain docbrain-server

# Or pass as CLI argument
docbrain-server --config-dir /etc/docbrain

All configuration is also available via environment variables, set in .env for Docker Compose or via ConfigMap/Secret for Kubernetes. Environment variables always override YAML values.

Infrastructure

| Variable | Default | Description |
|---|---|---|
| DATABASE_URL | | PostgreSQL connection string |
| OPENSEARCH_URL | http://localhost:9200 | OpenSearch endpoint |
| REDIS_URL | redis://localhost:6379 | Redis connection string |
| SERVER_PORT | 3000 | API server listen port |
| SERVER_BIND | 0.0.0.0 | API server bind address |
| LOG_LEVEL | info | Log verbosity: trace, debug, info, warn, error |
| DB_MAX_CONNECTIONS | 10 | Maximum PostgreSQL connection pool size |
| DB_CONNECT_TIMEOUT_SECS | 10 | Timeout (seconds) for the initial PostgreSQL connection |
| DB_ACQUIRE_TIMEOUT_SECS | 10 | Timeout (seconds) to acquire a connection from the pool |
| DB_IDLE_TIMEOUT_SECS | 300 | Idle connection lifetime (seconds) before cleanup |

LLM Provider

| Variable | Default | Description |
|---|---|---|
| LLM_PROVIDER | bedrock | Provider: bedrock, anthropic, openai, ollama, groq, openrouter, together, deepseek, mistral, xai, gemini, azure_openai, vertex_ai, cohere |
| LLM_MODEL_ID | varies | Model identifier (provider-specific) |
| FAST_MODEL_ID | | Fast/cheap model for background side-calls: intent classification, query rewriting, entity extraction. Falls back to LLM_MODEL_ID if not set. Recommended: Haiku (Bedrock/Anthropic), gpt-4o-mini (OpenAI), qwen2.5:7b (Ollama). Alias: HAIKU_MODEL_ID (deprecated). |
| INGEST_LLM_MODEL_ID | | Model used during ingest, for image extraction only. Falls back to LLM_MODEL_ID if not set. Set this to a cheaper model — image extraction fires for every page with images. Using Opus 4 with LLM_THINKING_BUDGET without this override will cause throttling errors during ingest. |
| DRAFT_MODEL_ID | | Model used for autopilot draft generation (two-phase reasoning + writing). Falls back to LLM_MODEL_ID if not set. Use a high-capability model here — drafts benefit from stronger reasoning. |
| DRAFT_LLM_PROVIDER | | Provider for draft generation. Falls back to LLM_PROVIDER if not set. Allows cross-provider drafting — e.g. use Gemini Flash for Q&A but Anthropic Claude for drafts. |
| LLM_THINKING_BUDGET | | Extended thinking budget in tokens. Unset or 0 = disabled. Only applies to the primary LLM_MODEL_ID, never to FAST_MODEL_ID or INGEST_LLM_MODEL_ID. |
| ANTHROPIC_API_KEY | | API key (if LLM_PROVIDER=anthropic) |
| OPENAI_API_KEY | | API key (if LLM_PROVIDER=openai) |
| OLLAMA_BASE_URL | http://localhost:11434 | Ollama server URL |
| OLLAMA_TIMEOUT_SECS | 120 | HTTP timeout in seconds for Ollama requests. Increase for large/slow models (e.g. 70B) to avoid "error decoding response body" when the model takes longer than 2 minutes. Example: 300 or 600. Allowed range: 60–900. |
| OLLAMA_TLS_VERIFY | false | Set to true to enforce TLS certificate validation for Ollama |
| OLLAMA_VISION_ENABLED | true | Set to false if your Ollama model doesn't support vision (skips image calls) |
| AWS_REGION | | AWS region for Bedrock (e.g. us-east-1) |
| AWS_ACCESS_KEY_ID | | AWS access key (optional — see credential chain below) |
| AWS_SECRET_ACCESS_KEY | | AWS secret key (optional — see credential chain below) |
| GROQ_API_KEY | | API key (if LLM_PROVIDER=groq) |
| OPENROUTER_API_KEY | | API key (if LLM_PROVIDER=openrouter) |
| TOGETHER_API_KEY | | API key (if LLM_PROVIDER=together) |
| DEEPSEEK_API_KEY | | API key (if LLM_PROVIDER=deepseek) |
| MISTRAL_API_KEY | | API key (if LLM_PROVIDER=mistral) |
| XAI_API_KEY | | API key (if LLM_PROVIDER=xai) |
| GEMINI_API_KEY | | API key (if LLM_PROVIDER=gemini) |
| AZURE_OPENAI_API_KEY | | API key (if LLM_PROVIDER=azure_openai) |
| AZURE_OPENAI_ENDPOINT | | Resource endpoint (if LLM_PROVIDER=azure_openai), e.g. https://my-resource.openai.azure.com |
| AZURE_OPENAI_API_VERSION | 2024-02-01 | API version (if LLM_PROVIDER=azure_openai) |
| VERTEX_PROJECT | | GCP project ID (required if LLM_PROVIDER=vertex_ai) |
| VERTEX_REGION | us-central1 | GCP region (if LLM_PROVIDER=vertex_ai) |
| COHERE_API_KEY | | API key (if LLM_PROVIDER=cohere) |
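Putting the model-tier variables together, a tiered setup might look like the following .env fragment. The angle-bracket values are placeholders, not real model IDs; substitute your provider's current identifiers:

```shell
# .env: model tiering (placeholder IDs, replace with real ones)
LLM_PROVIDER=bedrock
LLM_MODEL_ID=<primary-model-id>              # main Q&A answers
FAST_MODEL_ID=<haiku-class-model-id>         # intent classification, query rewriting
INGEST_LLM_MODEL_ID=<cheap-vision-model-id>  # image extraction during ingest
DRAFT_MODEL_ID=<high-capability-model-id>    # autopilot draft generation
```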

AWS Credential Chain: Bedrock uses the AWS SDK default credential chain: env vars → ~/.aws/credentials → IRSA (EKS) → EC2 Instance Profile → ECS Task Role. In production, use IRSA or instance profiles — no keys in env. Set serviceAccount.create=true and serviceAccount.annotations.eks.amazonaws.com/role-arn in Helm. The IAM role needs bedrock:InvokeModel and bedrock:InvokeModelWithResponseStream permissions. See providers.md for full setup details.

GCP Credential Chain: Vertex AI uses gcp_auth which resolves credentials in this order: GOOGLE_APPLICATION_CREDENTIALS (service account key file) → Application Default Credentials (gcloud auth application-default login) → GKE Workload Identity → GCE/Cloud Run metadata service. In production on GKE, use Workload Identity — no keys needed in the cluster. See providers.md for Workload Identity setup details.

Ollama: model selection and tuning

Only use models with strong instruction-following capabilities. DocBrain's RAG pipeline requires the LLM to stay strictly grounded in retrieved documents. Models that default to training data instead of provided context will produce fabricated answers. Recommended: command-r:35b (purpose-built for RAG). See providers.md for the full model comparison table.

  • Recommended config: LLM_MODEL_ID=command-r:35b and FAST_MODEL_ID=qwen2.5:7b. The fast model handles intent classification and query rewriting; only the final answer uses the primary model.
  • "Error decoding response body" after 2–3 minutes: The default HTTP timeout is 120 seconds. If the model takes longer to generate the full response, the connection is cut and you get a decode error. Set OLLAMA_TIMEOUT_SECS=300 (or 600) so the client waits long enough.
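Putting the recommendations above together, a working Ollama profile looks like:

```shell
# .env: Ollama with a RAG-tuned primary model and a fast side-call model
LLM_PROVIDER=ollama
LLM_MODEL_ID=command-r:35b
FAST_MODEL_ID=qwen2.5:7b
OLLAMA_BASE_URL=http://localhost:11434
OLLAMA_TIMEOUT_SECS=300   # raise from the 120 s default for large models
```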

Embedding Provider

Set EMBED_PROVIDER to choose your embedding model. One of: openai, bedrock, ollama.

| Variable | Default | Description |
|---|---|---|
| EMBED_PROVIDER | bedrock | Provider: bedrock, openai, ollama |
| EMBED_MODEL_ID | varies | Embedding model identifier (e.g. text-embedding-3-small, cohere.embed-v4:0) |

Switching Embedding Models

When you change EMBED_PROVIDER or EMBED_MODEL_ID to a model with different vector dimensions (e.g. Bedrock Cohere/1024 → Ollama nomic-embed-text/768), the server will refuse to start with a clear error:

Embedding dimension mismatch on index 'docbrain-chunks': existing=1024, required=768.

To migrate:

  1. Set FORCE_REINDEX=true in your environment
  2. Restart the server and run ingest — the old indexes are deleted and recreated
  3. Remove FORCE_REINDEX after the migration completes

| Variable | Default | Description |
|---|---|---|
| FORCE_REINDEX | false | Delete and recreate OpenSearch indexes when embedding dimensions change. Set once during migration, then remove. |
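For example, migrating to the 768-dimension Ollama model mentioned in the error above might look like this .env fragment:

```shell
# .env: one-time embedding migration
EMBED_PROVIDER=ollama
EMBED_MODEL_ID=nomic-embed-text
FORCE_REINDEX=true   # step 1; remove this line once re-ingest completes (step 3)
```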

Document Ingestion

Configure sources in config/local.yaml (gitignored). Put only infrastructure secrets in .env.

General

| Setting (config/local.yaml key) | Env var equivalent | Default | Description |
|---|---|---|---|
| ingest.ingest_sources | INGEST_SOURCES | local | Comma-separated list of active sources: local, confluence, github, github_pr, gitlab_mr, slack_thread, jira |
| ingest.self_ingest | DOCBRAIN_SELF_INGEST | true | Auto-ingest DocBrain's own docs |
| ingest.image_extraction_enabled | IMAGE_EXTRACTION_ENABLED | true | Extract and describe images using a vision LLM |

Local Files

| Variable | Default | Description |
|---|---|---|
| LOCAL_DOCS_PATH | | Directory path for local file ingestion (set in .env or as an env var) |

Confluence

Set credentials in config/local.yaml:

confluence:
  base_url: https://yourco.atlassian.net/wiki
  user_email: you@yourco.com
  api_token: ATATT3x...
  space_keys: ENG,DOCS

| Key | Env var | Default | Description |
|---|---|---|---|
| confluence.base_url | CONFLUENCE_BASE_URL | | Atlassian instance URL (must include /wiki) |
| confluence.user_email | CONFLUENCE_USER_EMAIL | | Auth email (not required for v1 Data Center) |
| confluence.api_token | CONFLUENCE_API_TOKEN | | API token (Cloud) or Personal Access Token (Data Center) |
| confluence.space_keys | CONFLUENCE_SPACE_KEYS | | Comma-separated space keys to ingest |
| confluence.page_limit | CONFLUENCE_PAGE_LIMIT | 0 (unlimited) | Max pages per space. 0 = all pages. |
| confluence.api_version | CONFLUENCE_API_VERSION | v2 | v2 for Cloud, v1 for Data Center 7.x+ |
| confluence.tls_verify | CONFLUENCE_TLS_VERIFY | true | Set to false for self-signed certs |
| confluence.webhook_secret | CONFLUENCE_WEBHOOK_SECRET | | HMAC secret for real-time webhook sync (set as env var) |

GitHub Repository

# config/local.yaml
github:
  repo_url: https://github.com/your-org/your-docs
  token: ghp_...    # only for private repos
  branch: main

| Key | Env var | Default | Description |
|---|---|---|---|
| github.repo_url | GITHUB_REPO_URL | | Repository URL to clone and ingest |
| github.token | GITHUB_TOKEN | | Personal access token (optional for public repos) |
| github.branch | GITHUB_BRANCH | main | Branch to ingest from |

GitHub Pull Requests

Ingest PR titles, descriptions, and review discussions as searchable knowledge.

# config/local.yaml
github_pr:
  token: ghp_...
  repo: acme/platform
  lookback_days: 365
  min_comments: 1

| Key | Env var | Default | Description |
|---|---|---|---|
| github_pr.token | GITHUB_PR_TOKEN | | GitHub personal access token (secret — set in config/local.yaml) |
| github_pr.repo | GITHUB_PR_REPO | | Owner/repo (e.g. acme/platform) — set in config/local.yaml |
| github_pr.lookback_days | GITHUB_PR_LOOKBACK_DAYS | 365 | How far back to fetch PRs |
| github_pr.min_comments | GITHUB_PR_MIN_COMMENTS | 1 | Minimum comments for a PR to be ingested |
| github_pr.labels | GITHUB_PR_LABELS | | Comma-separated label filter (optional) |
| github_pr.api_url | GITHUB_PR_API_URL | | Override for GitHub Enterprise (optional) |

GitLab Merge Requests

Ingest MR titles, descriptions, and discussion threads.

# config/local.yaml
gitlab_mr:
  token: glpat-...
  project_ids: acme/platform,acme/infra
  lookback_days: 365

| Key | Env var | Default | Description |
|---|---|---|---|
| gitlab_mr.token | GITLAB_TOKEN | | GitLab personal access token (secret — set in config/local.yaml) |
| gitlab_mr.base_url | GITLAB_BASE_URL | https://gitlab.com | GitLab instance URL |
| gitlab_mr.project_ids | GITLAB_PROJECT_IDS | | Comma-separated namespace/repo paths — set in config/local.yaml |
| gitlab_mr.lookback_days | GITLAB_MR_LOOKBACK_DAYS | 365 | How far back to fetch MRs |
| gitlab_mr.min_notes | GITLAB_MR_MIN_NOTES | 1 | Minimum notes/comments for an MR to be ingested |
| gitlab_mr.labels | GITLAB_MR_LABELS | | Comma-separated label filter (optional) |
| gitlab_mr.tls_verify | GITLAB_TLS_VERIFY | true | Set to false for self-signed certs (batch ingest) |
| gitlabCapture.tlsInsecure | GITLAB_CAPTURE_TLS_INSECURE | false | Set to true for self-signed certs (real-time capture) |

Slack Threads

Ingest high-signal Slack threads (by reaction count or reply threshold).

# config/local.yaml
slack_ingest:
  token: xoxb-...
  channels: C01234567,C09876543
  min_replies: 3
  reactions: "white_check_mark,bookmark"
  lookback_days: 90

| Key | Env var | Default | Description |
|---|---|---|---|
| slack_ingest.token | SLACK_INGEST_TOKEN | | Slack bot token (secret — set in config/local.yaml) |
| slack_ingest.channels | SLACK_INGEST_CHANNELS | | Comma-separated channel IDs — set in config/local.yaml |
| slack_ingest.min_replies | SLACK_MIN_REPLIES | 3 | Minimum thread replies for a thread to be ingested |
| slack_ingest.reactions | SLACK_INGEST_REACTIONS | white_check_mark,bookmark | Comma-separated reaction names that flag a thread for ingest |
| slack_ingest.lookback_days | SLACK_LOOKBACK_DAYS | 90 | How far back to scan channels |

Jira

Ingest Jira issues (bugs, stories, tasks, epics) as searchable knowledge.

# config/local.yaml
jira_ingest:
  base_url: https://yourcompany.atlassian.net
  user_email: you@yourcompany.com
  api_token: your-token
  projects: ENG,OPS
  lookback_days: 365

| Key | Env var | Default | Description |
|---|---|---|---|
| jira_ingest.base_url | JIRA_BASE_URL | | Jira instance URL — set in config/local.yaml |
| jira_ingest.user_email | JIRA_USER_EMAIL | | Jira account email — set in config/local.yaml |
| jira_ingest.api_token | JIRA_API_TOKEN | | Jira API token (secret — set in config/local.yaml) |
| jira_ingest.projects | JIRA_PROJECTS | | Comma-separated project keys — set in config/local.yaml |
| jira_ingest.jql_filter | JIRA_JQL_FILTER | | Additional JQL filter (optional) |
| jira_ingest.lookback_days | JIRA_LOOKBACK_DAYS | 365 | How far back to fetch issues |
| jira_ingest.issue_types | JIRA_ISSUE_TYPES | Bug,Story,Task,Epic | Comma-separated issue types to ingest |

Rate Limiting

DocBrain applies per-IP rate limiting to unauthenticated routes and per-API-key rate limiting to authenticated routes. Rate limiting is enabled by default.

| Variable | Default | Description |
|---|---|---|
| RATE_LIMIT_ENABLED | true | Set to false to disable all rate limiting (not recommended for production) |
| RATE_LIMIT_RPM | 60 | Requests per minute per IP on unauthenticated routes |
| RATE_LIMIT_AUTH_RPM | 120 | Requests per minute per API key on authenticated routes |
| RATE_LIMIT_WEBHOOK_RPM | 30 | Requests per minute per IP on webhook endpoints (/github/events, /gitlab/events) |

When a rate limit is exceeded, DocBrain returns 429 Too Many Requests with a Retry-After header.

GitLab MR Capture Webhook

The GitLab capture feature lets engineers trigger immediate ingestion by commenting @docbrain capture on any merge request.

| Variable | Default | Description |
|---|---|---|
| GITLAB_CAPTURE_WEBHOOK_SECRET | | HMAC secret shared with GitLab for webhook signature verification |
| GITLAB_CAPTURE_TOKEN | | GitLab personal access token with api scope (fetches MR notes and posts reply comments) |
| GITLAB_CAPTURE_BASE_URL | https://gitlab.com | GitLab instance base URL (override for self-hosted) |
| GITLAB_CAPTURE_ALLOWED_USERS | | Comma-separated GitLab usernames allowed to trigger capture. Empty = all users. |
| GITLAB_CAPTURE_ALLOWED_PROJECTS | | Comma-separated project paths allowed to trigger capture (e.g. myorg/myrepo). Empty = all projects. |

See Ingestion Guide for full setup instructions.

GitHub Capture Security

These optional variables restrict which repos and users can trigger real-time GitHub PR/issue capture via @docbrain capture comments.

| Variable | Default | Description |
|---|---|---|
| GITHUB_CAPTURE_ALLOWED_REPOS | | Comma-separated owner/repo pairs allowed to trigger capture (e.g. myorg/backend,myorg/frontend). Empty = all repos. |
| GITHUB_CAPTURE_ALLOWED_USERS | | Comma-separated GitHub usernames allowed to trigger capture (e.g. alice,bob). Empty = all users. |

A 500KB content size guard applies to all capture requests. Oversized threads are rejected with a reply comment.

Confluence Webhooks (Real-Time Sync)

| Variable | Default | Description |
|---|---|---|
| CONFLUENCE_WEBHOOK_SECRET | | HMAC secret shared with Confluence. When set, DocBrain mounts POST /confluence/events and auto-ingests page changes in real time. Set as an environment variable (not in config/local.yaml). |

When configured, DocBrain receives page_created, page_updated, page_restored, page_removed, and page_trashed events from Confluence and syncs changes automatically — no scheduled re-ingest needed.

Requires confluence.base_url and confluence.api_token to also be set in config/local.yaml (DocBrain needs API access to fetch the page content when a webhook fires).
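A minimal working combination, sketched with placeholder values:

```yaml
# config/local.yaml: API access used when a webhook event arrives
confluence:
  base_url: https://yourco.atlassian.net/wiki
  user_email: you@yourco.com
  api_token: ATATT3x...

# And in the environment (not in local.yaml):
#   CONFLUENCE_WEBHOOK_SECRET=<shared-hmac-secret>
```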

See the Ingestion Guide for setup instructions.

Image Extraction

| Variable | Default | Description |
|---|---|---|
| IMAGE_EXTRACTION_ENABLED | true | Extract and describe images from Confluence pages using a vision LLM. Set to false to disable. |
| INGEST_LLM_MODEL_ID | | Model used for image extraction during ingest. Falls back to LLM_MODEL_ID if not set. Set this to a cheaper model (Haiku, gpt-4o-mini) to avoid throttling and reduce cost. |
| IMAGE_MAX_PER_PAGE | 20 | Maximum images to process per Confluence page |
| IMAGE_MIN_SIZE_BYTES | 5120 | Skip images smaller than this in bytes (default: 5 KB) — filters out icons and decorative images |
| IMAGE_MAX_SIZE_BYTES | 10485760 | Skip images larger than this in bytes (default: 10 MB) |
| IMAGE_DOWNLOAD_TIMEOUT | 30 | HTTP download timeout in seconds per image |
| IMAGE_LLM_TIMEOUT | 120 | LLM vision call timeout in seconds (needs more time than download) |

Image extraction requires a vision-capable LLM. Supported providers: Bedrock, Anthropic, OpenAI, and Ollama (with vision models like llava, llama3.2-vision, moondream). Text-only models (e.g. llama3.1) are auto-detected and images are skipped gracefully — no failures, no errors.

Web UI / CORS

| Variable | Default | Description |
|---|---|---|
| CORS_ALLOWED_ORIGINS | http://localhost:3001 | Comma-separated origins allowed to call the API. Only needed if the web UI is served from a non-default origin (e.g. http://10.0.0.5:3001, https://docbrain.internal) |

Note: The default works out of the box for Docker Compose. You only need this if you access the web UI via a different hostname or port — for example, http://127.0.0.1:3001 is a different origin than http://localhost:3001.

Auth / Sessions

| Variable | Default | Description |
|---|---|---|
| LOGIN_SESSION_TTL_HOURS | 720 | Session lifetime after email/password login (default: 720 hours = 30 days). Set to 0 for no expiry. |
| MAX_QUERY_LENGTH | 4000 | Maximum characters allowed for question and description inputs |

Slack Integration (Optional)

| Variable | Default | Description |
|---|---|---|
| SLACK_BOT_TOKEN | | Slack bot OAuth token (xoxb-...) |
| SLACK_SIGNING_SECRET | | Slack app signing secret |
| SLACK_GAP_NOTIFICATION_CHANNEL | | Channel to post critical gap alerts after each analysis run (e.g. #docs-alerts). Only fires when new critical-severity gaps are found. Requires SLACK_BOT_TOKEN. |

Notifications (Optional)

| Variable | Default | Description |
|---|---|---|
| NOTIFICATION_INTERVAL_HOURS | 24 | How often to check for stale docs and send owner DMs |
| NOTIFICATION_SPACE_FILTER | | Comma-separated spaces to limit notifications (e.g. PLATFORM,SRE). Empty = all spaces. |

Documentation Autopilot (Optional)

| Variable | Default | Description |
|---|---|---|
| AUTOPILOT_ENABLED | false | Enable the Documentation Autopilot (gap detection + draft generation) |
| AUTOPILOT_GAP_ANALYSIS_INTERVAL_HOURS | 6 | How often the background scheduler runs gap analysis |
| AUTOPILOT_LOOKBACK_DAYS | 30 | Days of query history to analyse for gaps |
| AUTOPILOT_CLUSTER_THRESHOLD | 0.82 | Cosine similarity threshold for grouping queries into a gap cluster (0.65 = loose, 0.85 = strict) |
| AUTOPILOT_MIN_CLUSTER_SIZE | 3 | Minimum episodes in a cluster to be considered a real gap |
| AUTOPILOT_MIN_UNIQUE_USERS | 2 | Minimum distinct users that must hit the same gap topic |
| AUTOPILOT_MIN_NEGATIVE_RATIO | 0.15 | Minimum fraction of queries on a topic that must have negative feedback |
| AUTOPILOT_MAX_CLUSTERS | 50 | Maximum gap clusters to persist per analysis run |
| AUTOPILOT_MAX_EPISODES | 500 | Maximum negative episodes to load per analysis run |
| AUTOPILOT_AUTO_DRAFT | false | Automatically generate drafts for qualifying gaps (no human trigger). Set to true to enable. |
| AUTOPILOT_AUTO_DRAFT_SEVERITY | critical | Minimum gap severity for auto-drafting: critical, high, medium, or low |
| AUTOPILOT_CRITICAL_USERS | 5 | Unique users needed for the breadth score to reach 1.0. Lower for small teams. |
| AUTOPILOT_CRITICAL_SIGNALS | 15 | Negative signals needed for the volume score to reach 1.0. Lower for low-traffic deployments. |
| AUTOPILOT_CRITICAL_THRESHOLD | 0.75 | Composite score cutoff for "critical" severity |
| AUTOPILOT_HIGH_THRESHOLD | 0.55 | Composite score cutoff for "high" severity |
| AUTOPILOT_MEDIUM_THRESHOLD | 0.35 | Composite score cutoff for "medium" severity |

When enabled, Autopilot runs on the configured schedule, exposes management endpoints at /api/v1/autopilot/*, and posts critical gap alerts to SLACK_GAP_NOTIFICATION_CHANNEL if configured. See the API Reference for endpoint details.

Small teams / dev environments: Set AUTOPILOT_CRITICAL_USERS=1, AUTOPILOT_CRITICAL_SIGNALS=3, AUTOPILOT_CRITICAL_THRESHOLD=0.3 to see critical gaps with minimal signal. See autopilot.md for a full tuning guide.
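As an .env fragment, that small-team profile is:

```shell
# .env: development tuning so critical gaps surface with minimal traffic
AUTOPILOT_ENABLED=true
AUTOPILOT_CRITICAL_USERS=1
AUTOPILOT_CRITICAL_SIGNALS=3
AUTOPILOT_CRITICAL_THRESHOLD=0.3
```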

Freshness Scoring

| Variable | Default | Description |
|---|---|---|
| FRESHNESS_SCHEDULER_INTERVAL_HOURS | 24 | How often freshness scores are recalculated for all documents |
| CONTRADICTION_CHECKS_PER_PASS | 10 | Max documents checked for contradictions per freshness run (LLM cost) |
| CONTRADICTION_INCLUDE_RECENT_EVENT_DOCS | true | Include recent Slack/PR/Jira docs in the contradiction pass alongside the stalest docs |
| CONTRADICTION_EVENT_DOC_MAX_AGE_DAYS | 90 | Only event-based docs edited within this many days are eligible for contradiction checks |

Semantic Quality Scoring

LLM-based quality assessment that evaluates documents on four dimensions: accuracy, completeness, clarity, and actionability (each scored 0-25, total 0-100). Runs as a background sweep on documents that have already been structurally scored.

| Variable | Default | Description |
|---|---|---|
| SEMANTIC_QUALITY_ENABLED | true | Enable LLM-based semantic quality scoring |
| SEMANTIC_QUALITY_INTERVAL_HOURS | 24 | How often the semantic scoring sweep runs |
| SEMANTIC_QUALITY_BUDGET | 50 | Maximum documents scored per sweep (controls LLM cost) |
| SEMANTIC_QUALITY_STRUCTURAL_THRESHOLD | 40.0 | Minimum structural score required before a document is eligible for semantic scoring |

The composite quality score blends structural and semantic scores at 50/50 weighting. Documents below the structural threshold are skipped to avoid wasting LLM calls on obviously poor content.
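The 50/50 blend is a simple average of the two scores. For example, with made-up scores of 62 (structural) and 78 (semantic):

```shell
structural=62
semantic=78
composite=$(( (structural + semantic) / 2 ))   # 50/50 blend of the two scores
echo "$composite"                              # prints 70
```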

Capture Lifecycle

Captured content (GitHub PRs/issues, GitLab MRs, Slack threads) decays with age — unlike incident records (Jira, PagerDuty, Zendesk) which are permanent historical events. A 5-year-old PR discussing a replaced architecture should score low in freshness; a 2-week-old incident thread is always valid.

Cross-document references: During capture, DocBrain automatically extracts URLs from the description and comments — GitHub PRs, GitLab MRs, Jira tickets, Confluence pages, and other linked resources. These are stored as a reference graph in PostgreSQL and used to enrich RAG context at query time by fetching chunks from referenced documents. GitLab shorthand references (!123 for MRs, #123 for issues) are resolved to full URLs within the same project.

Space assignment: Captures are stored under a meaningful space name derived from the source:

  • GitHub captures → owner/repo (e.g., myorg/backend)
  • GitLab captures → group/project (e.g., platform/api)
  • Slack captures → channel name (e.g., platform-incidents)

This makes allowed_spaces ACL filtering work correctly — a key scoped to ["myorg/backend"] will include GitHub captures from that repo.

Age baseline: Freshness is calculated from the original content creation date (when the PR was opened, when the Slack thread started) — not the time DocBrain captured it. Re-capturing the same thread updates its content but preserves the original creation date as the staleness baseline.

Memory Consolidation

| Variable | Default | Description |
|---|---|---|
| CONSOLIDATION_INTERVAL_HOURS | 6 | How often the memory consolidation job runs (merges episodic patterns into semantic/procedural memory) |

RAG Pipeline

| Variable | Default | Description |
|---|---|---|
| RAG_TOP_K | 10 | Chunks retrieved per query. Higher = more context passed to the LLM, at the cost of more tokens per call. Raise to 15–20 if answers are missing obvious information; lower to 5 to reduce cost on simple corpora. |
| RAG_BM25_BOOST | 1.0 | Weight of keyword (BM25) search relative to vector search in hybrid retrieval. Raise to 2.0–3.0 for corpora heavy with exact-match queries — error codes, CLI commands, ticket IDs, specific tool names. Leave at 1.0 for general prose documentation. |
| SEARCH_MIN_SCORE | 0.0 | Drop retrieved chunks below this relevance score before sending context to the LLM. 0.0 keeps everything. Set to 0.3–0.4 if you notice irrelevant chunks contaminating answers; leave at 0.0 for small corpora where recall matters more than precision. |
| RAG_CACHE_TTL_HOURS | 24 | How long to cache semantically identical answers |
| RAG_CACHE_THRESHOLD | 0.95 | Cosine similarity threshold for a query to count as a cache hit |

Chunking

Controls how documents are split before embedding. See Ingestion Guide for re-ingest instructions.

| Variable | Default | Description |
|---|---|---|
| CHUNK_SIZE | 1500 | Target chunk size in characters. Dense API refs: 800–1200. General docs: 1500. Long-form prose: 2000–2500. |
| CHUNK_OVERLAP | 200 | Overlap between adjacent paragraph-split chunks, in characters. |
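Under a simple sliding-window reading of these two settings (an approximation; the actual splitter is paragraph-aware), each new chunk starts CHUNK_SIZE minus CHUNK_OVERLAP characters after the previous one:

```shell
CHUNK_SIZE=1500
CHUNK_OVERLAP=200
STRIDE=$(( CHUNK_SIZE - CHUNK_OVERLAP ))
echo "$STRIDE"   # prints 1300: the offset between consecutive chunk starts
```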

OpenSearch Index Names

| Variable | Default | Description |
|---|---|---|
| OPENSEARCH_INDEX | docbrain-chunks | Index name for document chunks (vectors + BM25) |
| OPENSEARCH_EPISODE_INDEX | docbrain-episodes | Index name for episode vectors (used in episodic memory recall) |

Only change these if you run multiple DocBrain instances sharing the same OpenSearch cluster, to avoid index collisions.
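For example, a second instance pointed at the same cluster could use suffixed index names (the -staging suffix below is just an illustration):

```shell
# .env for a second DocBrain instance sharing one OpenSearch cluster
OPENSEARCH_URL=http://opensearch:9200
OPENSEARCH_INDEX=docbrain-chunks-staging
OPENSEARCH_EPISODE_INDEX=docbrain-episodes-staging
```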

Data Retention

| Variable | Default | Description |
|---|---|---|
| EPISODE_RETENTION_DAYS | 90 | Episode (query history) rows older than this are pruned daily. Set to 0 to disable pruning. |
| AUDIT_RETENTION_DAYS | 365 | Audit log rows older than this are pruned daily. Set to 0 to disable pruning. |

Self-Ingest (Optional)

| Variable | Default | Description |
|---|---|---|
| DOCBRAIN_SELF_INGEST | true | Auto-ingest DocBrain's own docs so it can answer configuration questions about itself |
| DOCBRAIN_DOCS_PATH | ./docs | Path to DocBrain's own documentation directory |

SSO / OIDC (Enterprise)

| Variable | Default | Description |
|---|---|---|
| OIDC_ISSUER_URL | | OIDC provider URL (e.g. https://accounts.google.com) |
| OIDC_CLIENT_ID | | OAuth client ID |
| OIDC_CLIENT_SECRET | | OAuth client secret |
| OIDC_REDIRECT_URI | | Callback URI (e.g. https://docbrain.example.com/api/v1/auth/oidc/callback) |
| OIDC_WEB_UI_URL | http://localhost:3001 | Where to redirect after successful login |
| OIDC_ACCEPT_INVALID_CERTS | false | Set to true to skip TLS verification — use for corporate/self-signed CAs |

GitLab OIDC

| Variable | Default | Description |
|---|---|---|
| GITLAB_OIDC_ISSUER_URL | | GitLab instance URL (e.g. https://gitlab.com or https://gitlab.corp.example.com) |
| GITLAB_CLIENT_ID | | GitLab OAuth application client ID |
| GITLAB_CLIENT_SECRET | | GitLab OAuth application client secret |
| GITLAB_REDIRECT_URI | | Callback URL (e.g. https://docbrain.example.com/api/v1/auth/gitlab/callback) |

Corporate GitLab: If your self-hosted GitLab uses an internal CA, set OIDC_ACCEPT_INVALID_CERTS=true.


RBAC Role Assignment

Role is computed at login time and stored on the user record. The hierarchy is: viewer (1) < editor (2) < analyst (3) < admin (4). Higher-priority rules win.

| Variable | Helm key | Description |
|---|---|---|
| OIDC_DEFAULT_ROLE | rbac.defaultRole | Role assigned to new SSO users who match no group rule. Default: viewer. |
| OIDC_ADMIN_EMAILS | rbac.adminEmails | Comma-separated emails that always receive admin. |
| OIDC_ADMIN_DOMAIN | rbac.adminDomain | Email domain whose users receive admin (e.g. acme.com). |
| OIDC_ADMIN_GROUPS | rbac.adminGroups | Comma-separated IdP group names → admin role. |
| OIDC_EDITOR_GROUPS | rbac.editorGroups | Comma-separated IdP group names → editor role. |
| OIDC_ALLOWED_GROUPS | rbac.allowedGroups | Access gate: only these groups may log in (all others get 403). |
| OIDC_ALLOWED_DOMAINS | rbac.allowedDomains | Access gate: only these email domains may log in. |

What every engineer can see

All authenticated users (including viewer) have full access to the intelligence dashboards:

| Page | What it shows |
|---|---|
| Velocity | Documentation ROI — queries deflected, hours saved, cost saved, per-team breakdown |
| Predictive | Predicted documentation gaps from code changes, cascade staleness, seasonal patterns, onboarding risks |
| Maintenance | AI-generated fix proposals with apply/reject workflow |
| Stream | Live knowledge event feed — incident warnings, freshness decay alerts, trending gaps |

These dashboards are visible to every engineer. The insight loop only works if the people who can act on it — the engineers — can actually see it.

Example — typical multi-team setup:

rbac:
  defaultRole: "viewer"
  adminGroups: "platform-team"
  editorGroups: "docs-writers"
# Equivalent env vars
OIDC_DEFAULT_ROLE=viewer
OIDC_ADMIN_GROUPS=platform-team
OIDC_EDITOR_GROUPS=docs-writers

Note: Role is evaluated at login time. Group changes in your IdP take effect on next login.


Documentation Analytics

| Variable | Default | Description |
|---|---|---|
| VELOCITY_MINUTES_SAVED_PER_QUERY | 15 | Estimated minutes saved per deflected query |
| VELOCITY_HOURLY_RATE | 75 | Effective hourly rate (USD) for ROI calculation |

Knowledge Stream

| Variable | Default | Description |
|---|---|---|
| STREAM_ENABLED | false | Enable background knowledge stream emission |
| STREAM_INTERVAL_MINUTES | 30 | How often the stream background task runs |
| STREAM_INCIDENT_WARNING_MIN_USERS | 2 | Minimum unique users hitting an unanswered question to emit an incident warning |
| STREAM_DECAY_THRESHOLD | 0.5 | Freshness score below which a decay alert is emitted |

Event Bus

The event bus is internal pub/sub infrastructure — always enabled, no opt-in required. Every significant action (document ingest, gap detection, draft generation, etc.) emits a typed event that subscribers can react to.

| Variable | Default | Description |
|---|---|---|
| EVENT_BUS_CAPACITY | 4096 | Broadcast channel buffer size. Increase if subscribers lag under high event volume. Max: 65536. |
| EVENT_LOG_RETENTION_DAYS | 90 | Days to retain events in the event_log table before purging. |

Admin API endpoints:

| Method | Path | Description |
|---|---|---|
| GET | /api/v1/events | Query the persistent event log. Supports ?type=gap.detected&since=2026-03-01&limit=100&offset=0. |
| GET | /api/v1/events/stream | SSE stream of real-time events. Max 10 concurrent connections. |

Both endpoints require admin role.

Knowledge Fragments

Knowledge fragments are first-class units of knowledge — smaller than documents, richer than chunks. They capture decisions, facts, caveats, procedures, and context from PRs, commits, IDE annotations, conversations, CI/CD pipelines, and manual entry.

Fragments are routed by confidence score: high-confidence fragments are auto-indexed into search, medium-confidence go to a review queue, and low-confidence are auto-discarded.

| Variable | Default | Description |
|---|---|---|
| FRAGMENT_AUTO_INDEX_THRESHOLD | 0.7 | Minimum confidence score to auto-index a fragment into OpenSearch. |
| FRAGMENT_REVIEW_THRESHOLD | 0.4 | Minimum confidence for the review queue. Fragments below this are auto-discarded. |
| FRAGMENT_MAX_CONTENT_LENGTH | 10000 | Maximum fragment content length in characters. |
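The routing reduces to two threshold comparisons. This is a hypothetical sketch of that logic using the default thresholds, not DocBrain's actual code:

```shell
route_fragment() {
  # Classify a fragment by confidence score using the default 0.7/0.4 thresholds
  score=$1
  if awk -v s="$score" 'BEGIN { exit !(s >= 0.7) }'; then
    echo "auto-index"     # high confidence: indexed into OpenSearch
  elif awk -v s="$score" 'BEGIN { exit !(s >= 0.4) }'; then
    echo "review-queue"   # medium confidence: queued for human review
  else
    echo "discard"        # low confidence: dropped
  fi
}

route_fragment 0.85   # prints auto-index
```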

Fragment Clustering & Auto-Composition

Semantic clustering groups related fragments by topic using embedding similarity (DBSCAN-style greedy algorithm). When a cluster meets composability criteria (5+ fragments, diverse sources, 500+ words), it can be auto-composed into a documentation draft via the API.

| Variable | Default | Description |
| --- | --- | --- |
| FRAGMENT_CLUSTERING_ENABLED | true | Enable or disable the fragment clustering endpoint. |
| FRAGMENT_CLUSTER_THRESHOLD | 0.80 | Cosine similarity threshold for grouping fragments (0.60 = loose, 0.90 = strict). |
| FRAGMENT_MIN_CLUSTER_SIZE | 3 | Minimum fragments required to form a cluster. |
| FRAGMENT_MIN_SOURCE_DIVERSITY | 2 | Minimum distinct source types for a cluster to be composable. |
| FRAGMENT_MAX_PER_CLUSTERING_RUN | 2000 | Maximum fragments loaded per clustering run (memory/cost control). |
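To make the threshold and minimum-size settings concrete, here is one plausible shape of a DBSCAN-style greedy pass: each embedding joins the first cluster whose seed it matches within the similarity threshold, otherwise it seeds a new cluster, and undersized clusters are discarded. This is a sketch of the general technique, not DocBrain's actual implementation.

```python
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def greedy_cluster(embeddings, threshold=0.80, min_size=3):
    """Greedy single-pass grouping: join the first cluster whose seed is
    within `threshold` cosine similarity, else start a new cluster.
    Clusters smaller than `min_size` are dropped."""
    clusters = []  # each: {"seed": vector, "members": [index, ...]}
    for idx, vec in enumerate(embeddings):
        for c in clusters:
            if cosine(vec, c["seed"]) >= threshold:
                c["members"].append(idx)
                break
        else:
            clusters.append({"seed": vec, "members": [idx]})
    return [c["members"] for c in clusters if len(c["members"]) >= min_size]

# Three near-identical embeddings plus one outlier: the outlier seeds a
# singleton cluster, which falls below min_size and is discarded.
vecs = [(1.0, 0.0), (0.99, 0.14), (0.97, 0.24), (0.0, 1.0)]
print(greedy_cluster(vecs))  # → [[0, 1, 2]]
```

Raising the threshold toward 0.90 makes the `break` fire less often, so fragments split into smaller, tighter clusters; lowering it toward 0.60 merges looser topics together.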

CI/CD Pipeline Capture

Automated knowledge extraction from merged PRs and deployments. When enabled, DocBrain provides API endpoints that CI/CD pipelines can call to extract knowledge fragments from pull requests and deployment events. Uses the fast/cheap LLM model to keep costs low at high volume.

| Variable | Default | Description |
| --- | --- | --- |
| CI_ANALYZE_ENABLED | true | Enable or disable the CI/CD capture endpoints (/api/v1/ci/analyze and /api/v1/ci/deploy-capture). |

See the API Reference for endpoint details and the GitHub Action setup guide.
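As a rough sketch of how a pipeline might call the capture endpoint, here is a hypothetical GitHub Actions job that fires on merged PRs. The endpoint path comes from the table above, but the payload fields, secret names, and auth header are our assumptions; consult the API Reference and the GitHub Action setup guide for the actual contract.

```yaml
# Hypothetical workflow -- payload fields and secret names are illustrative.
name: docbrain-capture
on:
  pull_request:
    types: [closed]
jobs:
  capture:
    if: github.event.pull_request.merged == true
    runs-on: ubuntu-latest
    steps:
      - name: Send merged PR to DocBrain
        env:
          DOCBRAIN_URL: ${{ secrets.DOCBRAIN_URL }}
          DOCBRAIN_TOKEN: ${{ secrets.DOCBRAIN_TOKEN }}
        run: |
          curl -sS -X POST "$DOCBRAIN_URL/api/v1/ci/analyze" \
            -H "Authorization: Bearer $DOCBRAIN_TOKEN" \
            -H "Content-Type: application/json" \
            -d "{\"repo\": \"${{ github.repository }}\", \"pr_number\": ${{ github.event.pull_request.number }}}"
```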

Conversation Auto-Distillation

Automatically extracts structured knowledge fragments from captured conversations — Slack threads (via /docbrain sync) and GitHub PR discussions (via @docbrain capture). After a successful capture, DocBrain runs LLM-powered distillation in the background to identify decisions, facts, caveats, procedures, and context embedded in the conversation.

Distillation is fire-and-forget: it never affects capture response time. Failures are logged and recorded in metrics but don't block the capture path.

| Variable | Default | Description |
| --- | --- | --- |
| DISTILLATION_ENABLED | true | Enable or disable conversation auto-distillation. |
| DISTILLATION_MAX_CONCURRENT | 3 | Maximum concurrent LLM distillation calls (bounded by semaphore). |
| DISTILLATION_MAX_CONTENT_CHARS | 8000 | Maximum conversation characters sent to the LLM. Longer conversations are truncated (tail-biased — keeps the most recent messages). |
| DISTILLATION_MAX_FRAGMENTS | 5 | Maximum knowledge fragments extracted per conversation. |
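One plausible reading of "tail-biased" truncation is keeping the most recent whole messages that fit within the character budget. A minimal sketch under that assumption (the exact truncation strategy is not specified here):

```python
MAX_CONTENT_CHARS = 8000  # DISTILLATION_MAX_CONTENT_CHARS

def truncate_tail_biased(messages, limit=MAX_CONTENT_CHARS):
    """Keep the newest whole messages that fit within `limit` characters,
    returned in chronological order."""
    kept, used = [], 0
    for msg in reversed(messages):       # walk newest-first
        if used + len(msg) > limit:
            break                        # budget exhausted; drop older tail
        kept.append(msg)
        used += len(msg)
    return list(reversed(kept))          # restore chronological order

convo = ["old preamble " * 500, "recent question?", "recent answer."]
print(truncate_tail_biased(convo, limit=50))
# → ['recent question?', 'recent answer.']
```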

Governance SLA Checker

The SLA checker runs as a periodic background task that detects breaches across four entity types: gap acknowledgment, gap resolution, draft review, and document freshness. SLA thresholds are stored in the database (per-space overridable via the API) — these settings control the checker's operational behavior.

| Variable | Default | Description |
| --- | --- | --- |
| SLA_CHECKER_INTERVAL_HOURS | 1 | How often the SLA breach checker runs (hours). |
| SLA_CHECKER_QUERY_TIMEOUT_SECS | 30 | Per-entity-type query timeout in seconds. |
| SLA_CHECKER_MAX_CANDIDATES | 5000 | Maximum candidate entities scanned per type per run. |
| SLA_CHECKER_MAX_EVENTS_PER_RUN | 50 | Maximum SlaBreached events emitted per run (prevents webhook flooding). |
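The checker's core logic can be sketched as an age comparison plus the per-run event cap. The function and field names below are illustrative, not DocBrain's schema:

```python
from datetime import datetime, timedelta, timezone

MAX_EVENTS_PER_RUN = 50  # SLA_CHECKER_MAX_EVENTS_PER_RUN

def find_breaches(candidates, sla_hours, now, cap=MAX_EVENTS_PER_RUN):
    """Illustrative breach scan: a candidate breaches when its pending age
    exceeds the SLA threshold; at most `cap` breach events are emitted
    per run to avoid webhook flooding."""
    deadline = now - timedelta(hours=sla_hours)
    breaches = [c for c in candidates if c["opened_at"] < deadline]
    return breaches[:cap]

now = datetime(2026, 3, 10, tzinfo=timezone.utc)
gaps = [
    {"id": 1, "opened_at": now - timedelta(hours=80)},  # past a 72h SLA
    {"id": 2, "opened_at": now - timedelta(hours=10)},  # within SLA
]
print([g["id"] for g in find_breaches(gaps, sla_hours=72, now=now)])  # → [1]
```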

See the API Reference — Governance SLAs for endpoint documentation.

External Connectors (HTTP Connector Protocol)

External connectors are stateless HTTP servers that implement a simple REST contract (GET /health, POST /documents/list, POST /documents/fetch). DocBrain calls them on a configurable cron schedule to ingest documents from external systems. Connectors are registered and managed via the admin API.

The connector scheduler runs as a background task, polling every 60 seconds for connectors whose cron schedule is due. A circuit breaker automatically disables connectors after repeated failures.
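The circuit breaker is a consecutive-failure counter: a successful sync resets it, and reaching the threshold disables the connector. A minimal sketch of that state machine (class and method names are ours):

```python
CIRCUIT_BREAKER_THRESHOLD = 5  # CONNECTOR_CIRCUIT_BREAKER_THRESHOLD

class ConnectorState:
    """Sketch of a consecutive-failure circuit breaker: any success
    resets the counter; hitting the threshold disables the connector."""

    def __init__(self, threshold=CIRCUIT_BREAKER_THRESHOLD):
        self.threshold = threshold
        self.consecutive_failures = 0
        self.enabled = True

    def record_sync(self, ok: bool):
        if ok:
            self.consecutive_failures = 0
        else:
            self.consecutive_failures += 1
            if self.consecutive_failures >= self.threshold:
                self.enabled = False  # auto-disabled until re-enabled by an admin

state = ConnectorState()
for outcome in [True, False, False, False, False, False]:
    state.record_sync(outcome)
print(state.enabled)  # → False (5 consecutive failures after the success)
```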

| Variable | Default | Description |
| --- | --- | --- |
| CONNECTOR_ENABLED | true | Enable/disable the connector scheduler. |
| CONNECTOR_MAX_CONCURRENT_SYNCS | 3 | Max connectors syncing simultaneously (1-20). |
| CONNECTOR_MAX_PAGES_PER_SYNC | 200 | Max list pages fetched per sync. |
| CONNECTOR_MAX_DOCUMENTS_PER_SYNC | 5000 | Max documents ingested per sync. |
| CONNECTOR_FETCH_BATCH_SIZE | 50 | Documents fetched per batch (1-200). |
| CONNECTOR_REQUEST_TIMEOUT_SECS | 30 | HTTP timeout for individual connector requests (5-300 seconds). |
| CONNECTOR_SYNC_TIMEOUT_SECS | 3600 | Overall sync timeout per connector (60-7200 seconds). |
| CONNECTOR_MAX_RESPONSE_BYTES | 10485760 | Max response body size from connector (10 MB). |
| CONNECTOR_CIRCUIT_BREAKER_THRESHOLD | 5 | Consecutive failures before auto-disabling a connector. |
| CONNECTOR_ALLOW_INTERNAL | false | Allow connector URLs on private/internal IP addresses. Not recommended for production. |

See the API Reference — Connectors for endpoint documentation and the connector protocol spec.

Webhooks (Outbound)

Outbound webhook subscriptions let you push DocBrain events to external systems — Slack bots, CI/CD pipelines, PagerDuty, custom dashboards, etc. DocBrain signs every delivery with HMAC-SHA256, retries with exponential backoff, and automatically disables subscriptions that fail repeatedly (circuit breaker).
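On the receiving side, verify the HMAC-SHA256 signature before trusting a delivery. The sketch below assumes a hex-encoded signature over the raw request body; the actual header name and encoding DocBrain uses are documented in the API Reference, so treat these specifics as assumptions:

```python
import hashlib
import hmac

def verify_signature(secret: bytes, body: bytes, signature_hex: str) -> bool:
    """Recompute HMAC-SHA256 over the raw body and compare in constant
    time (hex encoding of the signature is our assumption)."""
    expected = hmac.new(secret, body, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, signature_hex)

secret = b"whsec_example"                      # hypothetical shared secret
body = b'{"type": "gap.detected", "id": 42}'
good_sig = hmac.new(secret, body, hashlib.sha256).hexdigest()

print(verify_signature(secret, body, good_sig))   # → True
print(verify_signature(secret, body, "deadbeef")) # → False
```

Using `hmac.compare_digest` rather than `==` avoids leaking signature prefixes through timing differences.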

| Variable | Default | Description |
| --- | --- | --- |
| WEBHOOK_DELIVERY_TIMEOUT_SECONDS | 10 | HTTP timeout per webhook delivery attempt (1-60 seconds). |
| WEBHOOK_MAX_RETRIES | 4 | Maximum delivery attempts before giving up (1-10). |
| WEBHOOK_CIRCUIT_BREAKER_THRESHOLD | 10 | Consecutive failures before auto-disabling a subscription (3-100). |
| ALLOW_INTERNAL_WEBHOOKS | false | Allow delivery to private/internal IP addresses (10.x, 172.16.x, 192.168.x). Not recommended for production. |

See the API Reference — Webhooks for endpoint documentation and event types.

Style Rules Engine

The style rules engine provides configurable linting for documentation consistency. Rules are always enabled — no opt-in required. Rules are managed via the API (CRUD + YAML import/export) and stored in PostgreSQL.

Rules are scoped either globally (space = null) or per-space. When linting, global rules apply to all content, and space-specific rules override global rules with the same (rule_type, name) key.
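The override semantics amount to a keyed merge: start from the global rule set, then let space-specific rules replace any entry sharing the same (rule_type, name) key. A minimal sketch (the dict shapes are illustrative, not DocBrain's schema):

```python
def effective_rules(global_rules, space_rules):
    """Merge rule sets: space-specific rules override global rules that
    share the same (rule_type, name) key."""
    merged = {(r["rule_type"], r["name"]): r for r in global_rules}
    for r in space_rules:
        merged[(r["rule_type"], r["name"])] = r  # space rule wins
    return list(merged.values())

glob = [
    {"rule_type": "terminology", "name": "avoid-simple", "severity": "warning"},
    {"rule_type": "structure", "name": "require-intro", "severity": "warning"},
]
space = [
    {"rule_type": "terminology", "name": "avoid-simple", "severity": "error"},
]

rules = effective_rules(glob, space)
print(len(rules))             # → 2 (one override, one inherited global)
print(rules[0]["severity"])   # → error
```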

Five default rules are seeded on first migration:

| Rule | Type | Default Severity |
| --- | --- | --- |
| avoid-simple | terminology | warning |
| avoid-just | terminology | warning |
| max-heading-depth (H4) | formatting | warning |
| max-sentence-length (40 words) | formatting | info |
| require-intro | structure | warning |

API endpoints: See API Reference — Style Rules Engine for full endpoint documentation.

There are no environment variables for the style rules engine — all limits are compile-time constants. Custom rules are created and managed entirely through the API.