Configuration Reference¶
How Configuration Works¶
DocBrain uses a config-first architecture with a layered YAML + environment variable system. Understanding this prevents confusion about why a value isn't taking effect.
Loading Order (later = higher priority)¶
1. `config/default.yaml` ← committed to repo — all non-secret defaults
2. `config/{APP_ENV}.yaml` ← environment-specific overrides (`development` | `production`)
3. `config/local.yaml` ← gitignored — your secrets and local overrides
4. Environment variables / `.env` ← always win — highest priority
Set APP_ENV=production for the production profile (this is the default in the Docker image). The server defaults to APP_ENV=development when running locally without Docker.
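For example, the same key can be set at several layers, and the highest-priority source wins. A sketch with illustrative values (the keys are real, the numbers are not recommendations):

```yaml
# config/default.yaml (committed)
rag:
  cache_ttl_hours: 24

# config/local.yaml (gitignored — overrides default.yaml)
rag:
  cache_ttl_hours: 1
```

If `RAG_CACHE_TTL_HOURS=6` is also set in the environment, the effective value is 6 — environment variables always win.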
What Goes Where¶
| Type | Where to put it |
|---|---|
| Infrastructure secrets (DB URL, LLM API keys, Redis, OpenSearch) | .env or environment variables |
| Ingest source credentials (Confluence token, GitHub token, Slack token, Jira token) | config/local.yaml (gitignored) |
| Deployment-specific values (URLs, ports, CORS origins) | .env or environment variables |
| Tuning (thresholds, intervals, cache TTLs) | config/local.yaml or env vars |
| Team-wide defaults you want committed | config/default.yaml (no secrets!) |
The key distinction: .env is for infrastructure secrets that the runtime environment must inject (container orchestration, CI/CD, secrets managers). config/local.yaml is for user-managed source credentials and personal overrides — it's gitignored so it never gets committed, but it lives alongside the project where you can edit it easily.
Example config/local.yaml¶
```yaml
# config/local.yaml — never committed (gitignored)
# Configure ingest sources and personal overrides here.
ingest:
  ingest_sources: confluence,github_pr

confluence:
  base_url: https://acme.atlassian.net/wiki
  user_email: you@acme.com
  api_token: ATATT3x...
  space_keys: DOCS,ENG

github_pr:
  token: ghp_...
  repo: acme/platform
  lookback_days: 180

# Local tuning overrides (optional)
autopilot:
  enabled: true
  cluster_threshold: 0.78

rag:
  cache_ttl_hours: 1
```
YAML Config Structure¶
Every YAML value supports ${ENV_VAR} and ${ENV_VAR:-default} substitution:
```yaml
database:
  url: "${DATABASE_URL}"  # required — must come from env
  max_connections: "${DB_MAX_CONNECTIONS:-10}"
```
Custom Config Directory¶
```bash
# Mount a ConfigMap in Kubernetes
DOCBRAIN_CONFIG_DIR=/etc/docbrain docbrain-server

# Or pass as CLI argument
docbrain-server --config-dir /etc/docbrain
```
All configuration is also available via environment variables, set in .env for Docker Compose or via ConfigMap/Secret for Kubernetes. Environment variables always override YAML values.
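A minimal Kubernetes sketch of the ConfigMap approach — assuming a ConfigMap named `docbrain-config` already holds your YAML files; names are illustrative, not a definitive manifest:

```yaml
# deployment.yaml (fragment)
spec:
  containers:
    - name: docbrain
      env:
        - name: DOCBRAIN_CONFIG_DIR  # point the server at the mounted config
          value: /etc/docbrain
      volumeMounts:
        - name: config
          mountPath: /etc/docbrain
  volumes:
    - name: config
      configMap:
        name: docbrain-config
```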
Infrastructure¶
| Variable | Default | Description |
|---|---|---|
| `DATABASE_URL` | — | PostgreSQL connection string |
| `OPENSEARCH_URL` | `http://localhost:9200` | OpenSearch endpoint |
| `REDIS_URL` | `redis://localhost:6379` | Redis connection string |
| `SERVER_PORT` | `3000` | API server listen port |
| `SERVER_BIND` | `0.0.0.0` | API server bind address |
| `LOG_LEVEL` | `info` | Log verbosity: `trace`, `debug`, `info`, `warn`, `error` |
| `DB_MAX_CONNECTIONS` | `10` | Maximum PostgreSQL connection pool size |
| `DB_CONNECT_TIMEOUT_SECS` | `10` | Timeout (seconds) for the initial PostgreSQL connection |
| `DB_ACQUIRE_TIMEOUT_SECS` | `10` | Timeout (seconds) to acquire a connection from the pool |
| `DB_IDLE_TIMEOUT_SECS` | `300` | Idle connection lifetime (seconds) before cleanup |
LLM Provider¶
| Variable | Default | Description |
|---|---|---|
| `LLM_PROVIDER` | `bedrock` | Provider: `bedrock`, `anthropic`, `openai`, `ollama`, `groq`, `openrouter`, `together`, `deepseek`, `mistral`, `xai`, `gemini`, `azure_openai`, `vertex_ai`, `cohere` |
| `LLM_MODEL_ID` | varies | Model identifier (provider-specific) |
| `FAST_MODEL_ID` | — | Fast/cheap model for background side-calls: intent classification, query rewriting, entity extraction. Falls back to `LLM_MODEL_ID` if not set. Recommended: Haiku (Bedrock/Anthropic), `gpt-4o-mini` (OpenAI), `qwen2.5:7b` (Ollama). Alias: `HAIKU_MODEL_ID` (deprecated). |
| `INGEST_LLM_MODEL_ID` | — | Model used during ingest, only for image extraction. Falls back to `LLM_MODEL_ID` if not set. Set this to a cheaper model — image extraction fires for every page with images. Using Opus 4 with `LLM_THINKING_BUDGET` without this override will cause throttling errors during ingest. |
| `DRAFT_MODEL_ID` | — | Model used for autopilot draft generation (two-phase reasoning + writing). Falls back to `LLM_MODEL_ID` if not set. Use a high-capability model here — drafts benefit from stronger reasoning. |
| `DRAFT_LLM_PROVIDER` | — | Provider for draft generation. Falls back to `LLM_PROVIDER` if not set. Allows cross-provider drafting — e.g. Gemini Flash for Q&A but Anthropic Claude for drafts. |
| `LLM_THINKING_BUDGET` | — | Extended thinking token budget (tokens). Unset or `0` = disabled. Only applies to the primary `LLM_MODEL_ID`, never to `FAST_MODEL_ID` or `INGEST_LLM_MODEL_ID`. |
| `ANTHROPIC_API_KEY` | — | API key (if `LLM_PROVIDER=anthropic`) |
| `OPENAI_API_KEY` | — | API key (if `LLM_PROVIDER=openai`) |
| `OLLAMA_BASE_URL` | `http://localhost:11434` | Ollama server URL |
| `OLLAMA_TIMEOUT_SECS` | `120` | HTTP timeout in seconds for Ollama requests. Increase for large/slow models (e.g. 70B) to avoid "error decoding response body" when the model takes longer than 2 minutes. Example: `300` or `600`. Allowed range: 60–900. |
| `OLLAMA_TLS_VERIFY` | `false` | Set to `true` to enforce TLS certificate validation for Ollama |
| `OLLAMA_VISION_ENABLED` | `true` | Set to `false` if your Ollama model doesn't support vision (skips image calls) |
| `AWS_REGION` | — | AWS region for Bedrock (e.g. `us-east-1`) |
| `AWS_ACCESS_KEY_ID` | — | AWS access key (optional — see credential chain below) |
| `AWS_SECRET_ACCESS_KEY` | — | AWS secret key (optional — see credential chain below) |
| `GROQ_API_KEY` | — | API key (if `LLM_PROVIDER=groq`) |
| `OPENROUTER_API_KEY` | — | API key (if `LLM_PROVIDER=openrouter`) |
| `TOGETHER_API_KEY` | — | API key (if `LLM_PROVIDER=together`) |
| `DEEPSEEK_API_KEY` | — | API key (if `LLM_PROVIDER=deepseek`) |
| `MISTRAL_API_KEY` | — | API key (if `LLM_PROVIDER=mistral`) |
| `XAI_API_KEY` | — | API key (if `LLM_PROVIDER=xai`) |
| `GEMINI_API_KEY` | — | API key (if `LLM_PROVIDER=gemini`) |
| `AZURE_OPENAI_API_KEY` | — | API key (if `LLM_PROVIDER=azure_openai`) |
| `AZURE_OPENAI_ENDPOINT` | — | Resource endpoint (if `LLM_PROVIDER=azure_openai`), e.g. `https://my-resource.openai.azure.com` |
| `AZURE_OPENAI_API_VERSION` | `2024-02-01` | API version (if `LLM_PROVIDER=azure_openai`) |
| `VERTEX_PROJECT` | — | GCP project ID (if `LLM_PROVIDER=vertex_ai`). Required. |
| `VERTEX_REGION` | `us-central1` | GCP region (if `LLM_PROVIDER=vertex_ai`) |
| `COHERE_API_KEY` | — | API key (if `LLM_PROVIDER=cohere`) |
AWS Credential Chain: Bedrock uses the AWS SDK default credential chain: env vars → `~/.aws/credentials` → IRSA (EKS) → EC2 Instance Profile → ECS Task Role. In production, use IRSA or instance profiles — no keys in env. Set `serviceAccount.create=true` and `serviceAccount.annotations.eks.amazonaws.com/role-arn` in Helm. The IAM role needs `bedrock:InvokeModel` and `bedrock:InvokeModelWithResponseStream` permissions. See providers.md for full setup details.

GCP Credential Chain: Vertex AI uses `gcp_auth`, which resolves credentials in this order: `GOOGLE_APPLICATION_CREDENTIALS` (service account key file) → Application Default Credentials (`gcloud auth application-default login`) → GKE Workload Identity → GCE/Cloud Run metadata service. In production on GKE, use Workload Identity — no keys needed in the cluster. See providers.md for Workload Identity setup details.
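Putting the model tiers together, a hedged `.env` sketch for Bedrock — the model IDs below are deliberately elided placeholders; substitute the exact IDs enabled in your AWS account:

```bash
LLM_PROVIDER=bedrock
AWS_REGION=us-east-1
LLM_MODEL_ID=anthropic.claude-...        # primary model — final answers
FAST_MODEL_ID=anthropic.claude-haiku-... # side-calls: intent, query rewriting
INGEST_LLM_MODEL_ID=anthropic.claude-haiku-...  # cheap model for image extraction
```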
Ollama: model selection and tuning¶
Only use models with strong instruction-following capabilities. DocBrain's RAG pipeline requires the LLM to stay strictly grounded in retrieved documents. Models that default to training data instead of provided context will produce fabricated answers. Recommended: command-r:35b (purpose-built for RAG). See providers.md for the full model comparison table.
- Recommended config: `LLM_MODEL_ID=command-r:35b` and `FAST_MODEL_ID=qwen2.5:7b`. The fast model handles intent classification and query rewriting; only the final answer uses the primary model.
- "Error decoding response body" after 2–3 minutes: The default HTTP timeout is 120 seconds. If the model takes longer to generate the full response, the connection is cut and you get a decode error. Set `OLLAMA_TIMEOUT_SECS=300` (or `600`) so the client waits long enough.
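The recommendations above translate to a `.env` along these lines (a sketch, not a definitive setup):

```bash
LLM_PROVIDER=ollama
OLLAMA_BASE_URL=http://localhost:11434
LLM_MODEL_ID=command-r:35b   # final answers — strong grounding in retrieved docs
FAST_MODEL_ID=qwen2.5:7b     # intent classification, query rewriting
OLLAMA_TIMEOUT_SECS=300      # headroom for slow generations
```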
Embedding Provider¶
Set EMBED_PROVIDER to choose your embedding model. One of: openai, bedrock, ollama.
| Variable | Default | Description |
|---|---|---|
| `EMBED_PROVIDER` | `bedrock` | Provider: `bedrock`, `openai`, `ollama` |
| `EMBED_MODEL_ID` | varies | Embedding model identifier (e.g. `text-embedding-3-small`, `cohere.embed-v4:0`) |
Switching Embedding Models¶
When you change `EMBED_PROVIDER` or `EMBED_MODEL_ID` to a model with different vector dimensions (e.g. Bedrock Cohere/1024 → Ollama nomic-embed-text/768), the server will refuse to start and report the dimension mismatch.
To migrate:
1. Set `FORCE_REINDEX=true` in your environment
2. Restart the server and run ingest — the old indexes are deleted and recreated
3. Remove `FORCE_REINDEX` after the migration completes
| Variable | Default | Description |
|---|---|---|
| `FORCE_REINDEX` | `false` | Delete and recreate OpenSearch indexes when embedding dimensions change. Set once during migration, then remove. |
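The migration steps can be expressed as a temporary `.env` fragment — the Ollama target model here is just an example of a different-dimension model:

```bash
# Temporary — remove FORCE_REINDEX once the migration completes
FORCE_REINDEX=true
EMBED_PROVIDER=ollama
EMBED_MODEL_ID=nomic-embed-text
```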
Document Ingestion¶
Configure sources in config/local.yaml (gitignored). Put only infrastructure secrets in .env.
General¶
| Setting (`config/local.yaml` key) | Env var equivalent | Default | Description |
|---|---|---|---|
| `ingest.ingest_sources` | `INGEST_SOURCES` | `local` | Comma-separated list of active sources: `local`, `confluence`, `github`, `github_pr`, `gitlab_mr`, `slack_thread`, `jira` |
| `ingest.self_ingest` | `DOCBRAIN_SELF_INGEST` | `true` | Auto-ingest DocBrain's own docs |
| `ingest.image_extraction_enabled` | `IMAGE_EXTRACTION_ENABLED` | `true` | Extract and describe images using a vision LLM |
Local Files¶
| Variable | Default | Description |
|---|---|---|
| `LOCAL_DOCS_PATH` | — | Directory path for local file ingestion (set in `.env` or as an env var) |
Confluence¶
Set credentials in config/local.yaml:
```yaml
confluence:
  base_url: https://yourco.atlassian.net/wiki
  user_email: you@yourco.com
  api_token: ATATT3x...
  space_keys: ENG,DOCS
```
| Key | Env var | Default | Description |
|---|---|---|---|
| `confluence.base_url` | `CONFLUENCE_BASE_URL` | — | Atlassian instance URL (must include `/wiki`) |
| `confluence.user_email` | `CONFLUENCE_USER_EMAIL` | — | Auth email (not required for v1 Data Center) |
| `confluence.api_token` | `CONFLUENCE_API_TOKEN` | — | API token (Cloud) or Personal Access Token (Data Center) |
| `confluence.space_keys` | `CONFLUENCE_SPACE_KEYS` | — | Comma-separated space keys to ingest |
| `confluence.page_limit` | `CONFLUENCE_PAGE_LIMIT` | `0` (unlimited) | Max pages per space. `0` = all pages. |
| `confluence.api_version` | `CONFLUENCE_API_VERSION` | `v2` | `v2` for Cloud, `v1` for Data Center 7.x+ |
| `confluence.tls_verify` | `CONFLUENCE_TLS_VERIFY` | `true` | Set to `false` for self-signed certs |
| `confluence.webhook_secret` | `CONFLUENCE_WEBHOOK_SECRET` | — | HMAC secret for real-time webhook sync (set as env var) |
GitHub Repository¶
```yaml
# config/local.yaml
github:
  repo_url: https://github.com/your-org/your-docs
  token: ghp_...  # only for private repos
  branch: main
```
| Key | Env var | Default | Description |
|---|---|---|---|
| `github.repo_url` | `GITHUB_REPO_URL` | — | Repository URL to clone and ingest |
| `github.token` | `GITHUB_TOKEN` | — | Personal access token (optional for public repos) |
| `github.branch` | `GITHUB_BRANCH` | `main` | Branch to ingest from |
GitHub Pull Requests¶
Ingest PR titles, descriptions, and review discussions as searchable knowledge.
```yaml
# config/local.yaml
github_pr:
  token: ghp_...
  repo: acme/platform
  lookback_days: 365
  min_comments: 1
```
| Key | Env var | Default | Description |
|---|---|---|---|
| `github_pr.token` | `GITHUB_PR_TOKEN` | — | GitHub personal access token (secret — set in `config/local.yaml`) |
| `github_pr.repo` | `GITHUB_PR_REPO` | — | Owner/repo (e.g. `acme/platform`) — set in `config/local.yaml` |
| `github_pr.lookback_days` | `GITHUB_PR_LOOKBACK_DAYS` | `365` | How far back to fetch PRs |
| `github_pr.min_comments` | `GITHUB_PR_MIN_COMMENTS` | `1` | Minimum comments for a PR to be ingested |
| `github_pr.labels` | `GITHUB_PR_LABELS` | — | Comma-separated label filter (optional) |
| `github_pr.api_url` | `GITHUB_PR_API_URL` | — | Override for GitHub Enterprise (optional) |
GitLab Merge Requests¶
Ingest MR titles, descriptions, and discussion threads.
```yaml
# config/local.yaml
gitlab_mr:
  token: glpat-...
  project_ids: acme/platform,acme/infra
  lookback_days: 365
```
| Key | Env var | Default | Description |
|---|---|---|---|
| `gitlab_mr.token` | `GITLAB_TOKEN` | — | GitLab personal access token (secret — set in `config/local.yaml`) |
| `gitlab_mr.base_url` | `GITLAB_BASE_URL` | `https://gitlab.com` | GitLab instance URL |
| `gitlab_mr.project_ids` | `GITLAB_PROJECT_IDS` | — | Comma-separated namespace/repo paths — set in `config/local.yaml` |
| `gitlab_mr.lookback_days` | `GITLAB_MR_LOOKBACK_DAYS` | `365` | How far back to fetch MRs |
| `gitlab_mr.min_notes` | `GITLAB_MR_MIN_NOTES` | `1` | Minimum notes/comments for an MR to be ingested |
| `gitlab_mr.labels` | `GITLAB_MR_LABELS` | — | Comma-separated label filter (optional) |
| `gitlab_mr.tls_verify` | `GITLAB_TLS_VERIFY` | `true` | Set to `false` for self-signed certs (batch ingest) |
| `gitlabCapture.tlsInsecure` | `GITLAB_CAPTURE_TLS_INSECURE` | `false` | Set to `true` for self-signed certs (real-time capture) |
Slack Threads¶
Ingest high-signal Slack threads (by reaction count or reply threshold).
```yaml
# config/local.yaml
slack_ingest:
  token: xoxb-...
  channels: C01234567,C09876543
  min_replies: 3
  reactions: "white_check_mark,bookmark"
  lookback_days: 90
```
| Key | Env var | Default | Description |
|---|---|---|---|
| `slack_ingest.token` | `SLACK_INGEST_TOKEN` | — | Slack bot token (secret — set in `config/local.yaml`) |
| `slack_ingest.channels` | `SLACK_INGEST_CHANNELS` | — | Comma-separated channel IDs — set in `config/local.yaml` |
| `slack_ingest.min_replies` | `SLACK_MIN_REPLIES` | `3` | Minimum thread replies for a thread to be ingested |
| `slack_ingest.reactions` | `SLACK_INGEST_REACTIONS` | `white_check_mark,bookmark` | Comma-separated reaction names that flag a thread for ingest |
| `slack_ingest.lookback_days` | `SLACK_LOOKBACK_DAYS` | `90` | How far back to scan channels |
Jira¶
Ingest Jira issues (bugs, stories, tasks, epics) as searchable knowledge.
```yaml
# config/local.yaml
jira_ingest:
  base_url: https://yourcompany.atlassian.net
  user_email: you@yourcompany.com
  api_token: your-token
  projects: ENG,OPS
  lookback_days: 365
```
| Key | Env var | Default | Description |
|---|---|---|---|
| `jira_ingest.base_url` | `JIRA_BASE_URL` | — | Jira instance URL — set in `config/local.yaml` |
| `jira_ingest.user_email` | `JIRA_USER_EMAIL` | — | Jira account email — set in `config/local.yaml` |
| `jira_ingest.api_token` | `JIRA_API_TOKEN` | — | Jira API token (secret — set in `config/local.yaml`) |
| `jira_ingest.projects` | `JIRA_PROJECTS` | — | Comma-separated project keys — set in `config/local.yaml` |
| `jira_ingest.jql_filter` | `JIRA_JQL_FILTER` | — | Additional JQL filter (optional) |
| `jira_ingest.lookback_days` | `JIRA_LOOKBACK_DAYS` | `365` | How far back to fetch issues |
| `jira_ingest.issue_types` | `JIRA_ISSUE_TYPES` | `Bug,Story,Task,Epic` | Comma-separated issue types to ingest |
Rate Limiting¶
DocBrain applies per-IP rate limiting to unauthenticated routes and per-API-key rate limiting to authenticated routes. Rate limiting is enabled by default.
| Variable | Default | Description |
|---|---|---|
| `RATE_LIMIT_ENABLED` | `true` | Set to `false` to disable all rate limiting (not recommended for production) |
| `RATE_LIMIT_RPM` | `60` | Requests per minute per IP on unauthenticated routes |
| `RATE_LIMIT_AUTH_RPM` | `120` | Requests per minute per API key on authenticated routes |
| `RATE_LIMIT_WEBHOOK_RPM` | `30` | Requests per minute per IP on webhook endpoints (`/github/events`, `/gitlab/events`) |
When a rate limit is exceeded, DocBrain returns 429 Too Many Requests with a Retry-After header.
GitLab MR Capture Webhook¶
The GitLab capture feature lets engineers trigger immediate ingestion by commenting `@docbrain capture` on any merge request.
| Variable | Default | Description |
|---|---|---|
| `GITLAB_CAPTURE_WEBHOOK_SECRET` | — | HMAC secret shared with GitLab for webhook signature verification |
| `GITLAB_CAPTURE_TOKEN` | — | GitLab personal access token with `api` scope (fetches MR notes and posts reply comments) |
| `GITLAB_CAPTURE_BASE_URL` | `https://gitlab.com` | GitLab instance base URL (override for self-hosted) |
| `GITLAB_CAPTURE_ALLOWED_USERS` | — | Comma-separated GitLab usernames allowed to trigger capture. Empty = all users. |
| `GITLAB_CAPTURE_ALLOWED_PROJECTS` | — | Comma-separated project paths allowed to trigger capture. Empty = all projects. e.g. `myorg/myrepo` |
See Ingestion Guide for full setup instructions.
GitHub Capture Security¶
These optional variables restrict which repos and users can trigger real-time GitHub PR/issue capture via `@docbrain capture` comments.
| Variable | Default | Description |
|---|---|---|
| `GITHUB_CAPTURE_ALLOWED_REPOS` | — | Comma-separated owner/repo pairs allowed to trigger capture. Empty = all repos. e.g. `myorg/backend,myorg/frontend` |
| `GITHUB_CAPTURE_ALLOWED_USERS` | — | Comma-separated GitHub usernames allowed to trigger capture. Empty = all users. e.g. `alice,bob` |
A 500KB content size guard applies to all capture requests. Oversized threads are rejected with a reply comment.
Confluence Webhooks (Real-Time Sync)¶
| Variable | Default | Description |
|---|---|---|
| `CONFLUENCE_WEBHOOK_SECRET` | — | HMAC secret shared with Confluence. When set, DocBrain mounts `POST /confluence/events` and auto-ingests page changes in real time. Set as an environment variable (not in `config/local.yaml`). |
When configured, DocBrain receives page_created, page_updated, page_restored, page_removed, and page_trashed events from Confluence and syncs changes automatically — no scheduled re-ingest needed.
Requires confluence.base_url and confluence.api_token to also be set in config/local.yaml (DocBrain needs API access to fetch the page content when a webhook fires).
See the Ingestion Guide for setup instructions.
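A sketch of the two pieces working together — the HMAC secret in the environment, API access in `config/local.yaml` (values are placeholders):

```bash
# .env — injected by the runtime environment
CONFLUENCE_WEBHOOK_SECRET=your-shared-hmac-secret
```

```yaml
# config/local.yaml — needed so DocBrain can fetch page content on webhook events
confluence:
  base_url: https://yourco.atlassian.net/wiki
  api_token: ATATT3x...
```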
Image Extraction¶
| Variable | Default | Description |
|---|---|---|
| `IMAGE_EXTRACTION_ENABLED` | `true` | Extract and describe images from Confluence pages using a vision LLM. Set to `false` to disable. |
| `INGEST_LLM_MODEL_ID` | — | Model used for image extraction during ingest. Falls back to `LLM_MODEL_ID` if not set. Set this to a cheaper model (Haiku, `gpt-4o-mini`) to avoid throttling and reduce cost. |
| `IMAGE_MAX_PER_PAGE` | `20` | Maximum images to process per Confluence page |
| `IMAGE_MIN_SIZE_BYTES` | `5120` | Skip images smaller than this in bytes (default: 5 KB) — filters out icons and decorative images |
| `IMAGE_MAX_SIZE_BYTES` | `10485760` | Skip images larger than this in bytes (default: 10 MB) |
| `IMAGE_DOWNLOAD_TIMEOUT` | `30` | HTTP download timeout in seconds per image |
| `IMAGE_LLM_TIMEOUT` | `120` | LLM vision call timeout in seconds (needs more time than download) |
Image extraction requires a vision-capable LLM. Supported providers: Bedrock, Anthropic, OpenAI, and Ollama (with vision models like llava, llama3.2-vision, moondream). Text-only models (e.g. llama3.1) are auto-detected and images are skipped gracefully — no failures, no errors.
Web UI / CORS¶
| Variable | Default | Description |
|---|---|---|
| `CORS_ALLOWED_ORIGINS` | `http://localhost:3001` | Comma-separated origins allowed to call the API. Only needed if the web UI is served from a non-default origin (e.g. `http://10.0.0.5:3001`, `https://docbrain.internal`) |
Note: The default works out of the box for Docker Compose. You only need this if you access the web UI via a different hostname or port — for example, `http://127.0.0.1:3001` is a different origin than `http://localhost:3001`.
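For example, to serve the web UI from both a LAN address and an internal hostname (values illustrative):

```bash
CORS_ALLOWED_ORIGINS=http://localhost:3001,http://10.0.0.5:3001,https://docbrain.internal
```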
Auth / Sessions¶
| Variable | Default | Description |
|---|---|---|
| `LOGIN_SESSION_TTL_HOURS` | `720` | Session lifetime after email/password login (default: 720 hours = 30 days). Set to `0` for no expiry. |
| `MAX_QUERY_LENGTH` | `4000` | Maximum characters allowed for question and description inputs |
Slack Integration (Optional)¶
| Variable | Default | Description |
|---|---|---|
| `SLACK_BOT_TOKEN` | — | Slack bot OAuth token (`xoxb-...`) |
| `SLACK_SIGNING_SECRET` | — | Slack app signing secret |
| `SLACK_GAP_NOTIFICATION_CHANNEL` | — | Channel to post critical gap alerts after each analysis run (e.g. `#docs-alerts`). Only fires when new critical-severity gaps are found. Requires `SLACK_BOT_TOKEN`. |
Notifications (Optional)¶
| Variable | Default | Description |
|---|---|---|
| `NOTIFICATION_INTERVAL_HOURS` | `24` | How often to check for stale docs and send owner DMs |
| `NOTIFICATION_SPACE_FILTER` | — | Comma-separated spaces to limit notifications (e.g. `PLATFORM,SRE`). Empty = all spaces. |
Documentation Autopilot (Optional)¶
| Variable | Default | Description |
|---|---|---|
| `AUTOPILOT_ENABLED` | `false` | Enable the Documentation Autopilot (gap detection + draft generation) |
| `AUTOPILOT_GAP_ANALYSIS_INTERVAL_HOURS` | `6` | How often the background scheduler runs gap analysis |
| `AUTOPILOT_LOOKBACK_DAYS` | `30` | Days of query history to analyse for gaps |
| `AUTOPILOT_CLUSTER_THRESHOLD` | `0.82` | Cosine similarity threshold for grouping queries into a gap cluster (0.65 = loose, 0.85 = strict) |
| `AUTOPILOT_MIN_CLUSTER_SIZE` | `3` | Minimum episodes in a cluster to be considered a real gap |
| `AUTOPILOT_MIN_UNIQUE_USERS` | `2` | Minimum distinct users that must hit the same gap topic |
| `AUTOPILOT_MIN_NEGATIVE_RATIO` | `0.15` | Minimum fraction of queries on a topic that must have negative feedback |
| `AUTOPILOT_MAX_CLUSTERS` | `50` | Maximum gap clusters to persist per analysis run |
| `AUTOPILOT_MAX_EPISODES` | `500` | Maximum negative episodes to load per analysis run |
| `AUTOPILOT_AUTO_DRAFT` | `false` | Automatically generate drafts for qualifying gaps (no human trigger). Set to `true` to enable. |
| `AUTOPILOT_AUTO_DRAFT_SEVERITY` | `critical` | Minimum gap severity for auto-drafting: `critical`, `high`, `medium`, or `low` |
| `AUTOPILOT_CRITICAL_USERS` | `5` | Unique users needed for the breadth score to reach 1.0. Lower for small teams. |
| `AUTOPILOT_CRITICAL_SIGNALS` | `15` | Negative signals needed for the volume score to reach 1.0. Lower for low-traffic deployments. |
| `AUTOPILOT_CRITICAL_THRESHOLD` | `0.75` | Composite score cutoff for "critical" severity |
| `AUTOPILOT_HIGH_THRESHOLD` | `0.55` | Composite score cutoff for "high" severity |
| `AUTOPILOT_MEDIUM_THRESHOLD` | `0.35` | Composite score cutoff for "medium" severity |
When enabled, Autopilot runs on the configured schedule, exposes management endpoints at /api/v1/autopilot/*, and posts critical gap alerts to SLACK_GAP_NOTIFICATION_CHANNEL if configured. See the API Reference for endpoint details.
Small teams / dev environments: Set `AUTOPILOT_CRITICAL_USERS=1`, `AUTOPILOT_CRITICAL_SIGNALS=3`, `AUTOPILOT_CRITICAL_THRESHOLD=0.3` to see critical gaps with minimal signal. See autopilot.md for a full tuning guide.
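A low-traffic development sketch combining those settings (all variables are documented above; whether auto-drafting suits your workflow is your call):

```bash
AUTOPILOT_ENABLED=true
AUTOPILOT_AUTO_DRAFT=true          # optional — drafts without a human trigger
AUTOPILOT_CRITICAL_USERS=1
AUTOPILOT_CRITICAL_SIGNALS=3
AUTOPILOT_CRITICAL_THRESHOLD=0.3
```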
Freshness Scoring¶
| Variable | Default | Description |
|---|---|---|
| `FRESHNESS_SCHEDULER_INTERVAL_HOURS` | `24` | How often freshness scores are recalculated for all documents |
| `CONTRADICTION_CHECKS_PER_PASS` | `10` | Max documents checked for contradictions per freshness run (LLM cost) |
| `CONTRADICTION_INCLUDE_RECENT_EVENT_DOCS` | `true` | Include recent Slack/PR/Jira docs in the contradiction pass alongside the stalest docs |
| `CONTRADICTION_EVENT_DOC_MAX_AGE_DAYS` | `90` | Only event-based docs edited within this many days are eligible for contradiction checks |
Semantic Quality Scoring¶
LLM-based quality assessment that evaluates documents on four dimensions: accuracy, completeness, clarity, and actionability (each scored 0–25, total 0–100). Runs as a background sweep on documents that have already been structurally scored.
| Variable | Default | Description |
|---|---|---|
| `SEMANTIC_QUALITY_ENABLED` | `true` | Enable LLM-based semantic quality scoring |
| `SEMANTIC_QUALITY_INTERVAL_HOURS` | `24` | How often the semantic scoring sweep runs |
| `SEMANTIC_QUALITY_BUDGET` | `50` | Maximum documents scored per sweep (controls LLM cost) |
| `SEMANTIC_QUALITY_STRUCTURAL_THRESHOLD` | `40.0` | Minimum structural score required before a document is eligible for semantic scoring |
The composite quality score blends structural and semantic scores at 50/50 weighting. Documents below the structural threshold are skipped to avoid wasting LLM calls on obviously poor content.
Capture Lifecycle¶
Captured content (GitHub PRs/issues, GitLab MRs, Slack threads) decays with age — unlike incident records (Jira, PagerDuty, Zendesk) which are permanent historical events. A 5-year-old PR discussing a replaced architecture should score low in freshness; a 2-week-old incident thread is always valid.
Cross-document references: During capture, DocBrain automatically extracts URLs from the description and comments — GitHub PRs, GitLab MRs, Jira tickets, Confluence pages, and other linked resources. These are stored as a reference graph in PostgreSQL and used to enrich RAG context at query time by fetching chunks from referenced documents. GitLab shorthand references (!123 for MRs, #123 for issues) are resolved to full URLs within the same project.
Space assignment: Captures are stored under a meaningful space name derived from the source:
- GitHub captures → owner/repo (e.g., myorg/backend)
- GitLab captures → group/project (e.g., platform/api)
- Slack captures → channel name (e.g., platform-incidents)
This makes allowed_spaces ACL filtering work correctly — a key scoped to ["myorg/backend"] will include GitHub captures from that repo.
Age baseline: Freshness is calculated from the original content creation date (when the PR was opened, when the Slack thread started) — not the time DocBrain captured it. Re-capturing the same thread updates its content but preserves the original creation date as the staleness baseline.
Memory Consolidation¶
| Variable | Default | Description |
|---|---|---|
| `CONSOLIDATION_INTERVAL_HOURS` | `6` | How often the memory consolidation job runs (merges episodic patterns into semantic/procedural memory) |
RAG Pipeline¶
| Variable | Default | Description |
|---|---|---|
| `RAG_TOP_K` | `10` | Chunks retrieved per query. Higher = more context passed to the LLM, at the cost of more tokens per call. Raise to 15–20 if answers are missing obvious information; lower to 5 to reduce cost on simple corpora. |
| `RAG_BM25_BOOST` | `1.0` | Weight of keyword (BM25) search relative to vector search in hybrid retrieval. Raise to 2.0–3.0 for corpora heavy with exact-match queries — error codes, CLI commands, ticket IDs, specific tool names. Leave at 1.0 for general prose documentation. |
| `SEARCH_MIN_SCORE` | `0.0` | Drop retrieved chunks below this relevance score before sending context to the LLM. `0.0` keeps everything. Set to 0.3–0.4 if you notice irrelevant chunks contaminating answers; leave at 0.0 for small corpora where recall matters more than precision. |
| `RAG_CACHE_TTL_HOURS` | `24` | How long to cache semantically identical answers |
| `RAG_CACHE_THRESHOLD` | `0.95` | Cosine similarity threshold for a query to count as a cache hit |
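For a corpus heavy with exact-match lookups (error codes, CLI flags, ticket IDs), the guidance above suggests a tuning sketch like this — illustrative values, to be adjusted against your own answer quality:

```bash
RAG_TOP_K=15          # wider retrieval for fragmented knowledge
RAG_BM25_BOOST=2.5    # favour keyword matches over semantic similarity
SEARCH_MIN_SCORE=0.3  # drop weakly-relevant chunks before the LLM call
```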
Chunking¶
Controls how documents are split before embedding. See Ingestion Guide for re-ingest instructions.
| Variable | Default | Description |
|---|---|---|
| `CHUNK_SIZE` | `1500` | Target chunk size in characters. Dense API refs: 800–1200. General docs: 1500. Long-form prose: 2000–2500. |
| `CHUNK_OVERLAP` | `200` | Overlap between adjacent paragraph-split chunks, in characters |
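For a dense API reference, the ranges above suggest something like the following (illustrative values — re-ingest after changing chunking settings):

```bash
CHUNK_SIZE=1000
CHUNK_OVERLAP=150
```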
OpenSearch Index Names¶
| Variable | Default | Description |
|---|---|---|
| `OPENSEARCH_INDEX` | `docbrain-chunks` | Index name for document chunks (vectors + BM25) |
| `OPENSEARCH_EPISODE_INDEX` | `docbrain-episodes` | Index name for episode vectors (used in episodic memory recall) |
Only change these if you run multiple DocBrain instances sharing the same OpenSearch cluster, to avoid index collisions.
Data Retention¶
| Variable | Default | Description |
|---|---|---|
| `EPISODE_RETENTION_DAYS` | `90` | Episode (query history) rows older than this are pruned daily. Set to `0` to disable pruning. |
| `AUDIT_RETENTION_DAYS` | `365` | Audit log rows older than this are pruned daily. Set to `0` to disable pruning. |
Self-Ingest (Optional)¶
| Variable | Default | Description |
|---|---|---|
| `DOCBRAIN_SELF_INGEST` | `true` | Auto-ingest DocBrain's own docs so it can answer configuration questions about itself |
| `DOCBRAIN_DOCS_PATH` | `./docs` | Path to DocBrain's own documentation directory |
SSO / OIDC (Enterprise)¶
| Variable | Default | Description |
|---|---|---|
| `OIDC_ISSUER_URL` | — | OIDC provider URL (e.g. `https://accounts.google.com`) |
| `OIDC_CLIENT_ID` | — | OAuth client ID |
| `OIDC_CLIENT_SECRET` | — | OAuth client secret |
| `OIDC_REDIRECT_URI` | — | Callback URI (e.g. `https://docbrain.example.com/api/v1/auth/oidc/callback`) |
| `OIDC_WEB_UI_URL` | `http://localhost:3001` | Where to redirect after successful login |
| `OIDC_ACCEPT_INVALID_CERTS` | `false` | Set to `true` to skip TLS verification — use for corporate/self-signed CAs |
GitLab OIDC¶
| Variable | Default | Description |
|---|---|---|
| `GITLAB_OIDC_ISSUER_URL` | — | GitLab instance URL (e.g. `https://gitlab.com` or `https://gitlab.corp.example.com`) |
| `GITLAB_CLIENT_ID` | — | GitLab OAuth application client ID |
| `GITLAB_CLIENT_SECRET` | — | GitLab OAuth application client secret |
| `GITLAB_REDIRECT_URI` | — | Callback URL (e.g. `https://docbrain.example.com/api/v1/auth/gitlab/callback`) |
Corporate GitLab: If your self-hosted GitLab uses an internal CA, set `OIDC_ACCEPT_INVALID_CERTS=true`.
RBAC Role Assignment¶
Role is computed at login time and stored on the user record. The hierarchy is: viewer (1) < editor (2) < analyst (3) < admin (4). Higher-priority rules win.
| Variable | Helm key | Description |
|---|---|---|
| `OIDC_DEFAULT_ROLE` | `rbac.defaultRole` | Role assigned to new SSO users who match no group rule. Default: `viewer`. |
| `OIDC_ADMIN_EMAILS` | `rbac.adminEmails` | Comma-separated emails that always receive `admin` |
| `OIDC_ADMIN_DOMAIN` | `rbac.adminDomain` | Email domain whose users receive `admin` (e.g. `acme.com`) |
| `OIDC_ADMIN_GROUPS` | `rbac.adminGroups` | Comma-separated IdP group names → `admin` role |
| `OIDC_EDITOR_GROUPS` | `rbac.editorGroups` | Comma-separated IdP group names → `editor` role |
| `OIDC_ALLOWED_GROUPS` | `rbac.allowedGroups` | Access gate: only these groups may log in (all others get 403) |
| `OIDC_ALLOWED_DOMAINS` | `rbac.allowedDomains` | Access gate: only these email domains may log in |
What every engineer can see¶
All authenticated users (including viewer) have full access to the intelligence dashboards:
| Page | What it shows |
|---|---|
| Velocity | Documentation ROI — queries deflected, hours saved, cost saved, per-team breakdown |
| Predictive | Predicted documentation gaps from code changes, cascade staleness, seasonal patterns, onboarding risks |
| Maintenance | AI-generated fix proposals with apply/reject workflow |
| Stream | Live knowledge event feed — incident warnings, freshness decay alerts, trending gaps |
These dashboards are visible to every engineer. The insight loop only works if the people who can act on it — the engineers — can actually see it.
Example — typical multi-team setup:
```bash
# Equivalent env vars
OIDC_DEFAULT_ROLE=viewer
OIDC_ADMIN_GROUPS=platform-team
OIDC_EDITOR_GROUPS=docs-writers
```
Note: Role is evaluated at login time. Group changes in your IdP take effect on next login.
Documentation Analytics¶
| Variable | Default | Description |
|---|---|---|
| `VELOCITY_MINUTES_SAVED_PER_QUERY` | `15` | Estimated minutes saved per deflected query |
| `VELOCITY_HOURLY_RATE` | `75` | Effective hourly rate (USD) for the ROI calculation |
Knowledge Stream¶
| Variable | Default | Description |
|---|---|---|
| `STREAM_ENABLED` | `false` | Enable background knowledge stream emission |
| `STREAM_INTERVAL_MINUTES` | `30` | How often the stream background task runs |
| `STREAM_INCIDENT_WARNING_MIN_USERS` | `2` | Minimum unique users hitting an unanswered question to emit an incident warning |
| `STREAM_DECAY_THRESHOLD` | `0.5` | Freshness score below which a decay alert is emitted |
Event Bus¶
The event bus is internal pub/sub infrastructure — always enabled, no opt-in required. Every significant action (document ingest, gap detection, draft generation, etc.) emits a typed event that subscribers can react to.
| Variable | Default | Description |
|---|---|---|
| `EVENT_BUS_CAPACITY` | `4096` | Broadcast channel buffer size. Increase if subscribers lag under high event volume. Max: `65536`. |
| `EVENT_LOG_RETENTION_DAYS` | `90` | Days to retain events in the `event_log` table before purging |
Admin API endpoints:
| Method | Path | Description |
|---|---|---|
| `GET` | `/api/v1/events` | Query the persistent event log. Supports `?type=gap.detected&since=2026-03-01&limit=100&offset=0`. |
| `GET` | `/api/v1/events/stream` | SSE stream of real-time events. Max 10 concurrent connections. |
Both endpoints require the admin role.
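For scripting against the event log, the filter parameters shown above are ordinary query-string parameters. A small sketch of building such a URL (the base host here is a placeholder):

```python
# Hypothetical helper for building a /api/v1/events query string
# from the supported filter parameters.
from urllib.parse import urlencode

def events_url(base: str, **filters) -> str:
    return f"{base}/api/v1/events?{urlencode(filters)}"

url = events_url("https://docbrain.internal",
                 type="gap.detected", since="2026-03-01", limit=100, offset=0)
print(url)
```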
Knowledge Fragments¶
Knowledge fragments are first-class units of knowledge — smaller than documents, richer than chunks. They capture decisions, facts, caveats, procedures, and context from PRs, commits, IDE annotations, conversations, CI/CD pipelines, and manual entry.
Fragments are routed by confidence score: high-confidence fragments are auto-indexed into search, medium-confidence go to a review queue, and low-confidence are auto-discarded.
| Variable | Default | Description |
|---|---|---|
| `FRAGMENT_AUTO_INDEX_THRESHOLD` | `0.7` | Minimum confidence score to auto-index a fragment into OpenSearch. |
| `FRAGMENT_REVIEW_THRESHOLD` | `0.4` | Minimum confidence for the review queue. Fragments below this are auto-discarded. |
| `FRAGMENT_MAX_CONTENT_LENGTH` | `10000` | Maximum fragment content length in characters. |
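The confidence routing described above can be sketched as a simple three-way split (illustrative only; thresholds mirror the defaults):

```python
# Sketch of fragment routing by confidence score.
AUTO_INDEX = 0.7  # FRAGMENT_AUTO_INDEX_THRESHOLD
REVIEW = 0.4      # FRAGMENT_REVIEW_THRESHOLD

def route(confidence: float) -> str:
    if confidence >= AUTO_INDEX:
        return "auto_index"     # indexed into search immediately
    if confidence >= REVIEW:
        return "review_queue"   # waits for human review
    return "discard"            # auto-discarded

print(route(0.85), route(0.55), route(0.2))  # auto_index review_queue discard
```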
Fragment Clustering & Auto-Composition¶
Semantic clustering groups related fragments by topic using embedding similarity (DBSCAN-style greedy algorithm). When a cluster meets composability criteria (5+ fragments, diverse sources, 500+ words), it can be auto-composed into a documentation draft via the API.
| Variable | Default | Description |
|---|---|---|
| `FRAGMENT_CLUSTERING_ENABLED` | `true` | Enable or disable the fragment clustering endpoint. |
| `FRAGMENT_CLUSTER_THRESHOLD` | `0.80` | Cosine similarity threshold for grouping fragments (0.60 = loose, 0.90 = strict). |
| `FRAGMENT_MIN_CLUSTER_SIZE` | `3` | Minimum fragments required to form a cluster. |
| `FRAGMENT_MIN_SOURCE_DIVERSITY` | `2` | Minimum distinct source types for a cluster to be composable. |
| `FRAGMENT_MAX_PER_CLUSTERING_RUN` | `2000` | Maximum fragments loaded per clustering run (memory/cost control). |
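The composability criteria named above (5+ fragments, diverse sources, 500+ words) can be sketched as a predicate. Field names (`content`, `source_type`) are assumptions, not DocBrain's actual schema:

```python
# Illustrative composability check for a fragment cluster.
def is_composable(fragments: list[dict],
                  min_size: int = 5,     # "5+ fragments" per the text
                  min_sources: int = 2,  # FRAGMENT_MIN_SOURCE_DIVERSITY
                  min_words: int = 500) -> bool:
    total_words = sum(len(f["content"].split()) for f in fragments)
    source_types = {f["source_type"] for f in fragments}
    return (len(fragments) >= min_size
            and len(source_types) >= min_sources
            and total_words >= min_words)
```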
CI/CD Pipeline Capture¶
Automated knowledge extraction from merged PRs and deployments. When enabled, DocBrain provides API endpoints that CI/CD pipelines can call to extract knowledge fragments from pull requests and deployment events. Uses the fast/cheap LLM model to keep costs low at high volume.
| Variable | Default | Description |
|---|---|---|
| `CI_ANALYZE_ENABLED` | `true` | Enable or disable the CI/CD capture endpoints (`/api/v1/ci/analyze` and `/api/v1/ci/deploy-capture`). |
See the API Reference for endpoint details and the GitHub Action setup guide.
Conversation Auto-Distillation¶
Automatically extracts structured knowledge fragments from captured conversations — Slack threads (via /docbrain sync) and GitHub PR discussions (via @docbrain capture). After a successful capture, DocBrain runs LLM-powered distillation in the background to identify decisions, facts, caveats, procedures, and context embedded in the conversation.
Distillation is fire-and-forget: it never affects capture response time. Failures are logged and counted in metrics but don't block the capture path.
| Variable | Default | Description |
|---|---|---|
| `DISTILLATION_ENABLED` | `true` | Enable or disable conversation auto-distillation. |
| `DISTILLATION_MAX_CONCURRENT` | `3` | Maximum concurrent LLM distillation calls (bounded by semaphore). |
| `DISTILLATION_MAX_CONTENT_CHARS` | `8000` | Maximum conversation characters sent to the LLM. Longer conversations are truncated (tail-biased — keeps the most recent messages). |
| `DISTILLATION_MAX_FRAGMENTS` | `5` | Maximum knowledge fragments extracted per conversation. |
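"Tail-biased" truncation means the oldest messages are dropped first so the most recent ones survive. A minimal sketch of that behavior (illustrative, not the actual implementation; it truncates at message boundaries for simplicity):

```python
# Sketch of tail-biased truncation: walk the conversation newest-first,
# keep messages until the DISTILLATION_MAX_CONTENT_CHARS budget is spent,
# then restore chronological order.
def truncate_tail_biased(messages: list[str], max_chars: int = 8000) -> list[str]:
    kept: list[str] = []
    total = 0
    for msg in reversed(messages):      # newest first
        if total + len(msg) > max_chars:
            break
        kept.append(msg)
        total += len(msg)
    return list(reversed(kept))         # back to chronological order

msgs = ["old " * 10, "mid " * 10, "new " * 10]   # 40 chars each
print(truncate_tail_biased(msgs, max_chars=85))  # drops the oldest message
```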
Governance SLA Checker¶
The SLA checker runs as a periodic background task that detects breaches across four entity types: gap acknowledgment, gap resolution, draft review, and document freshness. SLA thresholds are stored in the database (per-space overridable via the API) — these settings control the checker's operational behavior.
| Variable | Default | Description |
|---|---|---|
| `SLA_CHECKER_INTERVAL_HOURS` | `1` | How often the SLA breach checker runs (hours). |
| `SLA_CHECKER_QUERY_TIMEOUT_SECS` | `30` | Per-entity-type query timeout in seconds. |
| `SLA_CHECKER_MAX_CANDIDATES` | `5000` | Maximum candidate entities scanned per type per run. |
| `SLA_CHECKER_MAX_EVENTS_PER_RUN` | `50` | Maximum `SlaBreached` events emitted per run (prevents webhook flooding). |
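One pass of the checker can be sketched as: scan candidates for a given entity type, flag those older than the SLA threshold, and cap emissions per run. Names here are hypothetical, not DocBrain internals:

```python
# Illustrative sketch of one SLA checker pass.
from datetime import datetime, timedelta, timezone

MAX_EVENTS_PER_RUN = 50  # SLA_CHECKER_MAX_EVENTS_PER_RUN

def find_breaches(candidates, sla_hours, now=None, max_events=MAX_EVENTS_PER_RUN):
    """candidates: (entity_id, created_at) pairs; breach if older than the SLA."""
    now = now or datetime.now(timezone.utc)
    cutoff = now - timedelta(hours=sla_hours)
    breaches = [eid for eid, created in candidates if created < cutoff]
    return breaches[:max_events]  # cap emissions to avoid webhook flooding
```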
See the API Reference — Governance SLAs for endpoint documentation.
External Connectors (HTTP Connector Protocol)¶
External connectors are stateless HTTP servers that implement a simple REST contract (GET /health, POST /documents/list, POST /documents/fetch). DocBrain calls them on a configurable cron schedule to ingest documents from external systems. Connectors are registered and managed via the admin API.
The connector scheduler runs as a background task, polling every 60 seconds for connectors whose cron schedule is due. A circuit breaker automatically disables connectors after repeated failures.
| Variable | Default | Description |
|---|---|---|
| `CONNECTOR_ENABLED` | `true` | Enable/disable the connector scheduler |
| `CONNECTOR_MAX_CONCURRENT_SYNCS` | `3` | Max connectors syncing simultaneously (1-20) |
| `CONNECTOR_MAX_PAGES_PER_SYNC` | `200` | Max list pages fetched per sync |
| `CONNECTOR_MAX_DOCUMENTS_PER_SYNC` | `5000` | Max documents ingested per sync |
| `CONNECTOR_FETCH_BATCH_SIZE` | `50` | Documents fetched per batch (1-200) |
| `CONNECTOR_REQUEST_TIMEOUT_SECS` | `30` | HTTP timeout for individual connector requests (5-300 seconds) |
| `CONNECTOR_SYNC_TIMEOUT_SECS` | `3600` | Overall sync timeout per connector (60-7200 seconds) |
| `CONNECTOR_MAX_RESPONSE_BYTES` | `10485760` | Max response body size from connector (10 MB) |
| `CONNECTOR_CIRCUIT_BREAKER_THRESHOLD` | `5` | Consecutive failures before auto-disabling a connector |
| `CONNECTOR_ALLOW_INTERNAL` | `false` | Allow connector URLs on private/internal IP addresses. Not recommended for production. |
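A sketch of the kind of guard `CONNECTOR_ALLOW_INTERNAL` controls: reject connector URLs pointing at private or loopback addresses unless the override is set. This is illustrative; DocBrain's real check presumably also covers DNS resolution, which this sketch skips:

```python
# Illustrative private-address guard for connector URLs.
import ipaddress
from urllib.parse import urlparse

def url_allowed(url: str, allow_internal: bool = False) -> bool:
    host = urlparse(url).hostname or ""
    try:
        addr = ipaddress.ip_address(host)
    except ValueError:
        return True  # hostname, not an IP literal; real code would resolve it
    return allow_internal or not (addr.is_private or addr.is_loopback)

print(url_allowed("http://10.0.0.5:8080"))                        # False by default
print(url_allowed("http://10.0.0.5:8080", allow_internal=True))   # True
```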
See the API Reference — Connectors for endpoint documentation and the connector protocol spec.
Webhooks (Outbound)¶
Outbound webhook subscriptions let you push DocBrain events to external systems — Slack bots, CI/CD pipelines, PagerDuty, custom dashboards, etc. DocBrain signs every delivery with HMAC-SHA256, retries with exponential backoff, and automatically disables subscriptions that fail repeatedly (circuit breaker).
| Variable | Default | Description |
|---|---|---|
| `WEBHOOK_DELIVERY_TIMEOUT_SECONDS` | `10` | HTTP timeout per webhook delivery attempt (1-60 seconds) |
| `WEBHOOK_MAX_RETRIES` | `4` | Maximum delivery attempts before giving up (1-10) |
| `WEBHOOK_CIRCUIT_BREAKER_THRESHOLD` | `10` | Consecutive failures before auto-disabling a subscription (3-100) |
| `ALLOW_INTERNAL_WEBHOOKS` | `false` | Allow delivery to private/internal IP addresses (10.x, 172.16.x, 192.168.x). Not recommended for production. |
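On the receiving side, you verify the HMAC-SHA256 signature against your shared secret. A minimal sketch, assuming hex-encoded signatures; the exact header name and encoding are not specified here, so check the Webhooks API reference for the actual delivery format:

```python
# Sketch of verifying an HMAC-SHA256-signed webhook delivery.
import hashlib
import hmac

def verify_signature(secret: bytes, body: bytes, signature_hex: str) -> bool:
    expected = hmac.new(secret, body, hashlib.sha256).hexdigest()
    # compare_digest avoids leaking timing information
    return hmac.compare_digest(expected, signature_hex)

body = b'{"type":"gap.detected"}'
sig = hmac.new(b"shared-secret", body, hashlib.sha256).hexdigest()
print(verify_signature(b"shared-secret", body, sig))  # True
```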
See the API Reference — Webhooks for endpoint documentation and event types.
Style Rules Engine¶
The style rules engine provides configurable linting for documentation consistency. Rules are always enabled — no opt-in required. Rules are managed via the API (CRUD + YAML import/export) and stored in PostgreSQL.
Rules are scoped either globally (space = null) or per-space. When linting, global rules apply to all content, and space-specific rules override global rules with the same (rule_type, name) key.
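The override semantics can be sketched as a dictionary merge keyed by `(rule_type, name)`, with space-scoped rules winning. Illustrative only; field names are assumptions:

```python
# Illustrative merge of global and space-scoped style rules.
def effective_rules(global_rules: list[dict], space_rules: list[dict]) -> dict:
    merged = {(r["rule_type"], r["name"]): r for r in global_rules}
    # Space rules with the same (rule_type, name) key override global ones.
    merged.update({(r["rule_type"], r["name"]): r for r in space_rules})
    return merged

g = [{"rule_type": "terminology", "name": "avoid-simple", "severity": "warning"}]
s = [{"rule_type": "terminology", "name": "avoid-simple", "severity": "error"}]
print(effective_rules(g, s)[("terminology", "avoid-simple")]["severity"])  # error
```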
Five default rules are seeded on first migration:
| Rule | Type | Default Severity |
|---|---|---|
| `avoid-simple` | terminology | warning |
| `avoid-just` | terminology | warning |
| `max-heading-depth` (H4) | formatting | warning |
| `max-sentence-length` (40 words) | formatting | info |
| `require-intro` | structure | warning |
API endpoints: See API Reference — Style Rules Engine for full endpoint documentation.
There are no environment variables for the style rules engine — all limits are compile-time constants. Custom rules are created and managed entirely through the API.