Configuration Guide
This document describes the OpenSage-ADK configuration system, including all configuration fields, their purposes, and how to write configuration files.
Overview
OpenSage-ADK uses the TOML (Tom's Obvious, Minimal Language) format for configuration files. The configuration system supports:
- Template Variables: Use ${VAR_NAME} syntax for reusable values
- Nested Sections: Organize related settings into logical groups
- Environment Variable Support: Template variables can reference environment variables
- Type Safety: Automatic conversion to Python dataclasses with type checking
Configuration File Location
Configuration files are loaded in the following order:
- Default Configuration: src/opensage/templates/configs/default_config.toml (used when no config is specified)
- Custom Configuration: Path specified via the config_path parameter when creating a session
Configuration Structure
The configuration is organized into several main sections:
# Top-level template variables (optional)
VARIABLE_NAME = "value"
# Root-level fields
task_name = "my_task"
src_dir_in_sandbox = "/shared/code"
default_host = "127.0.0.1"
auto_cleanup = true
# Section-based configuration
[neo4j]
# Neo4j database configuration
[sandbox]
# Sandbox configuration
[llm]
# LLM model configuration
[history]
# History and tool response configuration
[plugins]
# Plugin configuration
[agent_ensemble]
# Agent ensemble configuration
[build]
# Build and execution configuration
[mcp]
# Model Context Protocol services configuration
Template Variables
OpenSage-ADK supports template variable expansion using ${VAR_NAME} syntax.
Rules:
- Top-level UPPERCASE variables automatically become template variables
- Variables can be referenced anywhere using ${VAR_NAME}
- Variables are expanded recursively throughout the configuration
- Undefined variables cause an error at load time
Example:
# Define template variables (UPPERCASE)
DEFAULT_IMAGE = "ubuntu:20.04"
MAIN_MODEL = "openai/gpt-4"
NEO4J_PASSWORD = "mypassword123"
# Use template variables
[sandbox.sandboxes.main]
image = "${DEFAULT_IMAGE}"
[llm.model_configs.main]
model_name = "${MAIN_MODEL}"
[neo4j]
password = "${NEO4J_PASSWORD}"
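The expansion rules above can be sketched in a few lines of Python. This is only an illustration of the described behaviour, not OpenSage-ADK's actual loader: the function names collect_template_vars and expand are hypothetical, the input is assumed to be an already-parsed TOML dictionary, and environment-variable lookup is left out.

```python
import re
from typing import Any

# Matches ${VAR_NAME} references inside string values.
_VAR_PATTERN = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)\}")


def collect_template_vars(config: dict[str, Any]) -> dict[str, str]:
    """Treat top-level UPPERCASE string entries as template variables."""
    return {k: v for k, v in config.items() if k.isupper() and isinstance(v, str)}


def expand(value: Any, variables: dict[str, str]) -> Any:
    """Walk nested dicts and lists, replacing ${VAR_NAME} in every string."""
    if isinstance(value, str):
        def substitute(match: re.Match) -> str:
            name = match.group(1)
            if name not in variables:
                # Assumed to mirror the error text shown under Troubleshooting.
                raise KeyError(f"Template variable '{name}' not found")
            return variables[name]
        return _VAR_PATTERN.sub(substitute, value)
    if isinstance(value, dict):
        return {k: expand(v, variables) for k, v in value.items()}
    if isinstance(value, list):
        return [expand(v, variables) for v in value]
    return value


# Usage with a dictionary shaped like the example above.
parsed = {
    "DEFAULT_IMAGE": "ubuntu:20.04",
    "sandbox": {"sandboxes": {"main": {"image": "${DEFAULT_IMAGE}"}}},
}
resolved = expand(parsed, collect_template_vars(parsed))
print(resolved["sandbox"]["sandboxes"]["main"]["image"])
```

Raising on an unknown name rather than leaving the placeholder in place matches the rule that undefined variables fail at load time.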
Configuration Sections
Root-Level Fields
These fields are defined at the top level of the configuration file:
| Field | Type | Description | Default |
|---|---|---|---|
| task_name | string | Name identifier for the current task/session | None |
| src_dir_in_sandbox | string | Path to source code directory within sandbox containers | "/shared/code" |
| agent_storage_path | string | Path where dynamically created agents are stored | None |
| default_host | string | Default hostname for services (used by Neo4j and MCP services) | None (falls back to 127.0.0.1) |
| auto_cleanup | boolean | Whether to automatically clean up resources when the session ends | true |
Example:
task_name = "vulnerability_analysis"
src_dir_in_sandbox = "/shared/code"
agent_storage_path = "/tmp/agents"
default_host = "localhost"
auto_cleanup = true
Neo4j Configuration
Configures the Neo4j graph database connection.
Section: [neo4j]
| Field | Type | Description | Default |
|---|---|---|---|
| user | string | Neo4j username | None |
| password | string | Neo4j password | None |
| bolt_port | integer | Neo4j Bolt protocol port | 7687 |
| neo4j_http_port | integer | Neo4j HTTP port | 7474 |
Note: The uri property is dynamically constructed as neo4j://{default_host}:{bolt_port}. If default_host is not set, it defaults to 127.0.0.1.
Example:
[neo4j]
user = "neo4j"
password = "secure_password"
bolt_port = 7687
neo4j_http_port = 7474
Sandbox Images & Requirements (Practical Notes)
Some sandboxes require Python tooling inside their Docker images. In the default configuration template (src/opensage/templates/configs/default_config.toml):
- sandbox.sandboxes.main
  - Built from src/opensage/templates/dockerfiles/main/Dockerfile
  - Provides python3 via /app/.venv/bin/python
  - Installs Python package neo4j (used by src/opensage/sandbox/initializers/main.py)
- sandbox.sandboxes.joern
  - Built from src/opensage/templates/dockerfiles/joern/Dockerfile
  - Provides python3 via /app/.venv/bin/python
  - Installs Python packages httpx and websockets (used by Joern query helper scripts)
These images install their Python dependencies with uv at image build time, in the Dockerfile (create /app/.venv and run uv pip install ...), rather than at runtime inside a running container.
Sandbox Configuration
Configures sandbox environments (Docker containers or Kubernetes pods).
Section: [sandbox]
Top-Level Sandbox Settings
| Field | Type | Description | Default |
|---|---|---|---|
| default_image | string | Default Docker image for sandboxes | None |
| backend | string | Sandbox backend type. "native" is supported; "remotedocker", "opensandbox", "agentdocker-lite", "local", and "k8s" are currently under development. | "native" |
| project_relative_shared_data_path | string | Path relative to project root for shared data (will be mounted as /shared in containers) | None |
| absolute_shared_data_path | string | Absolute path for shared data | None |
| host_shared_mem_dir | string | Absolute host path mounted into all sandboxes as /mem/shared (for file-based shared knowledge) | None |
| mount_host_paths | list[string] | Global host bind mounts injected into all sandboxes. Format: "/abs/host:/abs/container[:ro\|rw]" | [] |
| tolerations | list[dict] | Kubernetes tolerations applied to all pods (k8s; under development) | None |
Per-Sandbox Configuration
Each sandbox type is configured under [sandbox.sandboxes.<sandbox_type>].
Common Sandbox Types:
- main: Primary analysis sandbox
- joern: Joern static analysis sandbox
- codeql: CodeQL analysis sandbox
- neo4j: Neo4j database container
- gdb_mcp: GDB debugger MCP service
- pdb_mcp: PDB debugger MCP service
- fuzz: Fuzzing environment
Supported Backend Values:
- native: local Docker backend
- remotedocker: remote Docker daemon over SSH/TCP (under development)
- opensandbox: OpenSandbox-managed execution backend (under development)
- agentdocker-lite: namespace-based lightweight local isolation backend (under development)
- local: direct host execution without containers (under development)
- k8s: Kubernetes backend (under development)
Container Configuration Fields:
| Field | Type | Description | Default |
|---|---|---|---|
| image | string | Docker image name/tag | None |
| container_id | string | Connect to an existing container (instead of creating a new one) | None |
| timeout | integer | Container operation timeout in seconds | 300 |
| project_relative_dockerfile_path | string | Path to Dockerfile relative to project root | None |
| absolute_dockerfile_path | string | Absolute path to Dockerfile | None |
| command | string | Override container command (empty string = use Dockerfile default, None = use bash) | None |
| platform | string | Platform architecture (e.g., "linux/amd64") | None |
| network | string | Docker network name | None |
| privileged | boolean | Run container in privileged mode | false |
| security_opt | list[string] | Security options | [] |
| cap_add | list[string] | Additional capabilities | [] |
| gpus | string | GPU allocation (e.g., "all" or "device=GPU-UUID") | None |
| shm_size | string | Shared memory size (e.g., "2g") | None |
| mem_limit | string | Memory limit (e.g., "4g") | None |
| cpus | string | CPU limit (e.g., "2") | None |
| user | string | User to run as (e.g., "1000:1000") | None |
| working_dir | string | Working directory in container | None |
Build Configuration:
| Field | Type | Description |
|---|---|---|
| build_args | dict[string, string] | Docker build arguments |
| using_cached | boolean | Whether to use cached image (internal flag) |
Environment, Volumes, and Ports:
| Field | Type | Description |
|---|---|---|
| environment | dict[string, any] | Environment variables |
| volumes | list[string] | Volume mounts in format "/host:/container:ro" |
| mounts | list[string] | Docker mount specifications |
| ports | dict[string, int\|string] | Port mappings in format {"port/tcp" = host_port} |
| docker_args | list[string] | Raw arguments passed through to Docker CLI |
Extra Configuration:
| Field | Type | Description |
|---|---|---|
| extra | dict[string, any] | Additional custom configuration (e.g., initializer_timeout_sec) |
Kubernetes-Specific Fields:
| Field | Type | Description |
|---|---|---|
| pod_name | string | Connect to an existing Pod instead of creating a new one |
| container_name | string | Name of container within the Pod |
Example:
[sandbox]
backend = "native"
project_relative_shared_data_path = "data/my_project.tar.gz"
host_shared_mem_dir = "/home/you/.local/opensage/shared-memory"
mount_host_paths = [
"/data/datasets:/workspace/datasets:ro",
"/tmp/run-cache:/workspace/run-cache:rw",
]
[sandbox.sandboxes.main]
image = "ubuntu:20.04"
project_relative_dockerfile_path = "dockerfiles/main/Dockerfile"
timeout = 300
[sandbox.sandboxes.main.build_args]
BASE_IMAGE = "ubuntu:20.04"
[sandbox.sandboxes.main.environment]
PYTHONPATH = "/shared/code"
[sandbox.sandboxes.main.ports]
"8080/tcp" = 8080
[sandbox.sandboxes.main.extra]
initializer_timeout_sec = 1800
[sandbox.sandboxes.joern]
image = "opensage/joern"
project_relative_dockerfile_path = "dockerfiles/joern/Dockerfile"
command = ""
[sandbox.sandboxes.joern.environment]
JAVA_OPTS = "-Xmx16G -Xms4G"
[sandbox.sandboxes.joern.ports]
"8081/tcp" = 18087
Backend-specific notes:
- native is the default and recommended local-development backend.
- remotedocker is currently under development and additionally uses docker_host/docker_remote_host.
- opensandbox is currently under development and additionally requires sandbox.opensandbox provider settings.
- agentdocker-lite is currently under development and commonly uses sandbox.sandboxes.<name>.extra for namespace/cgroup options such as fs_backend, cpu_max, or memory_max.
- local is currently under development, is mainly for debugging, and supports only a single sandbox without shared volumes.
- k8s exists in code, but should still be considered under development.
mount_host_paths is appended to every sandbox's volumes. Semantics:
- absolute source path (/abs/path) is treated as host mount source
- mode defaults to rw when omitted
host_shared_mem_dir is also injected into every sandbox volume as:
- "<host_shared_mem_dir>:/mem/shared:rw"
- the host directory is created automatically if it does not exist
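As a rough sketch of these mount semantics, the Python below parses a "/abs/host:/abs/container[:ro|rw]" entry and assembles the volume list a sandbox would end up with. The names HostMount, parse_host_mount, and effective_volumes are hypothetical, and the order in which entries are appended is an assumption; the real backend may differ.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class HostMount:
    source: str       # absolute path on the host
    target: str       # absolute path inside the sandbox
    mode: str = "rw"  # "ro" or "rw"; rw when omitted


def parse_host_mount(spec: str) -> HostMount:
    """Parse one mount_host_paths entry."""
    parts = spec.split(":")
    if len(parts) == 2:
        source, target = parts
        mode = "rw"  # documented default when the mode is omitted
    elif len(parts) == 3:
        source, target, mode = parts
    else:
        raise ValueError(f"expected '/abs/host:/abs/container[:ro|rw]', got {spec!r}")
    if not source.startswith("/") or not target.startswith("/"):
        raise ValueError(f"both paths must be absolute: {spec!r}")
    if mode not in ("ro", "rw"):
        raise ValueError(f"mode must be 'ro' or 'rw': {spec!r}")
    return HostMount(source, target, mode)


def effective_volumes(
    sandbox_volumes: list[str],
    mount_host_paths: list[str],
    host_shared_mem_dir: Optional[str] = None,
) -> list[str]:
    """Append global mounts, then the shared-memory mount, to a sandbox's volumes."""
    volumes = list(sandbox_volumes)
    for spec in mount_host_paths:
        mount = parse_host_mount(spec)
        volumes.append(f"{mount.source}:{mount.target}:{mount.mode}")
    if host_shared_mem_dir:
        volumes.append(f"{host_shared_mem_dir}:/mem/shared:rw")
    return volumes


print(effective_volumes([], ["/tmp/run-cache:/workspace/run-cache"], "/srv/shared-mem"))
```

Validating the mode and the absolute paths up front keeps a malformed entry from reaching Docker, where the resulting error tends to be harder to trace back to the configuration file.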
LLM Configuration
Configures language models used by agents.
Section: [llm]
Models are configured under [llm.model_configs.<model_name>].
When using provider-backed models (for example openai/... or anthropic/...), make sure provider credentials are available in environment variables before runtime (for example OPENAI_API_KEY or ANTHROPIC_API_KEY).
Common Model Names:
- main: Primary model for agent reasoning
- summarize: Model for summarization and context compression
- flag_claims: Model for flag claims processing
Model Configuration Fields:
| Field | Type | Description | Default |
|---|---|---|---|
| model_name | string | Model identifier (e.g., "openai/gpt-4", "anthropic/claude-3") | Required |
| temperature | float | Sampling temperature (0.0-2.0) | None |
| max_tokens | integer | Maximum tokens in response | None |
| rpm | integer | Rate limit: requests per minute | None |
| tpm | integer | Rate limit: tokens per minute | None |
Example:
[llm]
[llm.model_configs.main]
model_name = "openai/gpt-4"
temperature = 0.7
max_tokens = 4096
rpm = 60
tpm = 60000
[llm.model_configs.summarize]
model_name = "openai/gpt-3.5-turbo"
temperature = 0.3
max_tokens = 2048
rpm = 30
tpm = 30000
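Because provider credentials must be in the environment before runtime, a small pre-flight check can catch a missing key early. The sketch below is not part of OpenSage-ADK: missing_credentials is a hypothetical helper, and the provider-to-variable mapping covers only the two examples named above.

```python
import os
from typing import Mapping, Optional

# Only the two providers mentioned in this guide; extend as needed.
PROVIDER_ENV_VARS = {
    "openai": "OPENAI_API_KEY",
    "anthropic": "ANTHROPIC_API_KEY",
}


def missing_credentials(
    model_names: list[str],
    environ: Optional[Mapping[str, str]] = None,
) -> list[str]:
    """Return the environment variables that the given models need but that are unset."""
    env = os.environ if environ is None else environ
    missing: list[str] = []
    for model_name in model_names:
        provider = model_name.split("/", 1)[0]  # "openai/gpt-4" -> "openai"
        env_var = PROVIDER_ENV_VARS.get(provider)
        if env_var and not env.get(env_var) and env_var not in missing:
            missing.append(env_var)
    return missing


# Usage: check the model_name values from [llm.model_configs.*] before starting a session.
problems = missing_credentials(["openai/gpt-4", "openai/gpt-3.5-turbo"])
if problems:
    print("Missing credentials:", ", ".join(problems))
```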
History Configuration
Configures tool response handling and event history management.
Section: [history]
| Field | Type | Description | Default |
|---|---|---|---|
| max_tool_response_length | integer | Maximum length of a single tool response before special handling | 10000 |
| enable_quota_countdown | boolean | Show remaining LLM call quota after each tool response | false |
Events Compaction Configuration:
Section: [history.events_compaction]
| Field | Type | Description | Default |
|---|---|---|---|
| max_history_summary_length | integer | Character budget threshold for triggering compaction | 100000 |
| compaction_percent | integer | Percentage of history to compress (0-100) | 50 |
Example:
[history]
max_tool_response_length = 10000
enable_quota_countdown = true
[history.events_compaction]
max_history_summary_length = 100000
compaction_percent = 50
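To make the interaction between the two compaction settings concrete, here is a minimal sketch. It rests on several assumptions not confirmed by this guide: events are modelled as plain strings, the budget is compared against their total character count, the percentage is applied to the number of events, and the oldest events are compacted first. The function names are hypothetical.

```python
def needs_compaction(events: list[str], max_history_summary_length: int = 100000) -> bool:
    """True once the combined character count exceeds the budget."""
    return sum(len(event) for event in events) > max_history_summary_length


def split_for_compaction(
    events: list[str], compaction_percent: int = 50
) -> tuple[list[str], list[str]]:
    """Split history into (events to compress, events to keep verbatim)."""
    if not 0 <= compaction_percent <= 100:
        raise ValueError("compaction_percent must be between 0 and 100")
    cut = len(events) * compaction_percent // 100
    # Assumption: the oldest events are the ones handed to the summarizer.
    return events[:cut], events[cut:]


# Usage with four 30,000-character events (120,000 characters in total).
history = ["x" * 30000 for _ in range(4)]
if needs_compaction(history):
    to_compress, to_keep = split_for_compaction(history, compaction_percent=50)
    print(len(to_compress), len(to_keep))
```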
Plugins Configuration
Configures which plugins are enabled and where to find them. See Plugins for full documentation.
Section: [plugins]
| Field | Type | Description | Default |
|---|---|---|---|
| enabled | list[string] | List of enabled plugin names (or regex patterns) | [] |
| extra_plugin_dirs | list[string] | Additional directories to search for plugins | [] |
| adk_plugin_params | dict[string, dict] | Per-ADK-plugin constructor kwargs, keyed by plugin name | {} |
Default plugin discovery paths (no extra config required):
- Built-in plugins: src/opensage/plugins/default/adk_plugins/
- Built-in Claude hook plugins: src/opensage/plugins/default/claude_code_hooks/
- User-local plugins: ~/.local/opensage/plugins/ (.py and .json)
You can still add additional directories via extra_plugin_dirs if needed.
Example:
[plugins]
enabled = [
"doom_loop_detector_plugin",
"history_summarizer_plugin",
"tool_response_summarizer_plugin",
"quota_after_tool_plugin",
]
extra_plugin_dirs = ["/path/to/shared/plugins"]
[plugins.adk_plugin_params.doom_loop_detector_plugin]
threshold = 5
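Since entries in enabled may be plain names or regex patterns, the selection step might look like the sketch below. It assumes each pattern must match the whole plugin name (re.fullmatch); whether OpenSage-ADK uses full or partial matching is not stated here, and select_enabled is a hypothetical helper.

```python
import re


def select_enabled(available: list[str], enabled: list[str]) -> list[str]:
    """Return the discovered plugins that match any entry in `enabled`."""
    selected = []
    for name in available:
        # An entry matches either literally or as a regex over the whole name.
        if any(name == entry or re.fullmatch(entry, name) for entry in enabled):
            selected.append(name)
    return selected


discovered = [
    "doom_loop_detector_plugin",
    "history_summarizer_plugin",
    "tool_response_summarizer_plugin",
    "quota_after_tool_plugin",
]
# One literal name plus one pattern covering both summarizer plugins.
print(select_enabled(discovered, ["doom_loop_detector_plugin", r".*_summarizer_plugin"]))
```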
Agent Ensemble Configuration
Configures multi-agent ensemble execution.
Section: [agent_ensemble]
| Field | Type | Description | Default |
|---|---|---|---|
| thread_safe_tools | list[string] | List of tool names that are thread-safe (can be called in parallel) | [] |
| available_models_for_ensemble | list[string] or string | List of model names available for ensemble (can be comma-separated string) | [] |
Example:
[agent_ensemble]
thread_safe_tools = ["google_search", "read_file"]
available_models_for_ensemble = ["openai/gpt-4", "anthropic/claude-3"]
Or as comma-separated string:
[agent_ensemble]
thread_safe_tools = ["google_search", "read_file"]
available_models_for_ensemble = "openai/gpt-4,anthropic/claude-3"
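Both forms are meant to describe the same list of models. A minimal sketch of that normalization, using a hypothetical normalize_models helper (whitespace trimming and dropping empty entries are assumptions):

```python
from typing import Union


def normalize_models(value: Union[str, list[str]]) -> list[str]:
    """Accept a list or a comma-separated string and return a list of model names."""
    if isinstance(value, str):
        return [item.strip() for item in value.split(",") if item.strip()]
    return list(value)


# Both spellings from the examples above yield the same result.
print(normalize_models("openai/gpt-4,anthropic/claude-3"))
print(normalize_models(["openai/gpt-4", "anthropic/claude-3"]))
```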
Build Configuration
Configures build and execution commands for target programs.
Section: [build]
| Field | Type | Description | Default |
|---|---|---|---|
| poc_dir | string | Directory path for proof-of-concept code | None |
| compile_command | string | Command to compile the target program | None |
| run_command | string | Command to run the target program | None |
| target_type | string | Type of target (e.g., "default", "binary") | None |
| target_binary | string | Path to target binary | None |
Example:
[build]
poc_dir = "/tmp/poc"
compile_command = "gcc -o target target.c"
run_command = "./target"
target_type = "binary"
target_binary = "/tmp/poc/target"
MCP Configuration
Configures Model Context Protocol (MCP) services.
Section: [mcp]
MCP services are configured under [mcp.services.<service_name>].
Common Service Names:
- gdb_mcp: GDB debugger MCP service
- pdb_mcp: PDB debugger MCP service
MCP Service Configuration Fields:
| Field | Type | Description |
|---|---|---|
| sse_port | integer | Server-Sent Events (SSE) server port |
| sse_host | string | SSE server host (if None, uses default_host from root config) |
Note: The sse_host property dynamically uses default_host from the root configuration if not explicitly set.
Example:
[mcp]
[mcp.services.gdb_mcp]
sse_port = 1111
[mcp.services.pdb_mcp]
sse_port = 1112
sse_host = "localhost" # Optional, defaults to root config's default_host
Complete Example
Here's a complete configuration file example:
# Template Variables
DEFAULT_IMAGE = "ubuntu:20.04"
MAIN_MODEL = "openai/gpt-4"
NEO4J_PASSWORD = "secure_password"
TASK_NAME = "security_analysis"
# Root Configuration
task_name = "${TASK_NAME}"
src_dir_in_sandbox = "/shared/code"
default_host = "localhost"
auto_cleanup = true
# Neo4j Configuration
[neo4j]
user = "neo4j"
password = "${NEO4J_PASSWORD}"
bolt_port = 7687
neo4j_http_port = 7474
# Sandbox Configuration
[sandbox]
backend = "native"
project_relative_shared_data_path = "data/project.tar.gz"
[sandbox.sandboxes.main]
image = "${DEFAULT_IMAGE}"
project_relative_dockerfile_path = "dockerfiles/main/Dockerfile"
timeout = 300
[sandbox.sandboxes.main.environment]
PYTHONPATH = "/shared/code"
[sandbox.sandboxes.joern]
image = "opensage/joern"
project_relative_dockerfile_path = "dockerfiles/joern/Dockerfile"
command = ""
[sandbox.sandboxes.joern.ports]
"8081/tcp" = 18087
# LLM Configuration
[llm]
[llm.model_configs.main]
model_name = "${MAIN_MODEL}"
temperature = 0.7
max_tokens = 4096
[llm.model_configs.summarize]
model_name = "${MAIN_MODEL}"
temperature = 0.3
max_tokens = 2048
# History Configuration
[history]
max_tool_response_length = 10000
enable_quota_countdown = true
[history.events_compaction]
max_history_summary_length = 100000
compaction_percent = 50
# Plugins Configuration
[plugins]
enabled = [
"doom_loop_detector_plugin",
"history_summarizer_plugin",
"tool_response_summarizer_plugin",
"quota_after_tool_plugin",
]
extra_plugin_dirs = []
# Agent Ensemble Configuration
[agent_ensemble]
thread_safe_tools = ["google_search"]
available_models_for_ensemble = "${MAIN_MODEL}"
# Build Configuration
[build]
compile_command = "make"
run_command = "./target"
# MCP Configuration
[mcp]
[mcp.services.gdb_mcp]
sse_port = 1111
Loading Configuration in Code
Using Default Configuration
import opensage
# Uses default config from src/opensage/templates/configs/default_config.toml
session = opensage.get_session("my_session")
Using Custom Configuration
import opensage
# Load custom configuration file
session = opensage.get_session(
"my_session",
config_path="/path/to/my_config.toml",
)
Accessing Configuration
# Access configuration through session
config = session.config
# Access specific sections
neo4j_config = config.neo4j
sandbox_config = config.sandbox
llm_config = config.llm
# Access nested configurations
main_sandbox = config.get_sandbox_config("main")
main_model = config.get_llm_config("main")
Best Practices
- Use Template Variables: Define reusable values as UPPERCASE template variables at the top
- Organize by Section: Group related settings into logical sections
- Document Custom Fields: Add comments for non-standard or custom configuration
- Version Control: Keep configuration files in version control, but exclude sensitive values (passwords, API keys)
- Environment-Specific Configs: Create separate config files for development, testing, and production
- Validate Early: Test configuration files before deploying to catch errors early
Troubleshooting
Template Variable Not Found
If you see KeyError: Template variable 'VAR_NAME' not found, ensure:
- The variable is defined as an UPPERCASE top-level variable
- The variable name matches exactly (case-sensitive)
- There are no typos in ${VAR_NAME} references
Configuration Not Loading
- Verify the TOML file syntax is correct
- Check file path is correct (use absolute paths if relative paths don't work)
- Ensure all required fields are present (check error messages)
Dynamic Host Resolution
If default_host is not set, services like Neo4j and MCP will default to 127.0.0.1. Set default_host at the root level for Kubernetes deployments or remote services.
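The fallback chain described in this guide can be summarized in a short sketch. The function names are hypothetical and the code only mirrors the documented behaviour (the neo4j://{default_host}:{bolt_port} URI, sse_host falling back to default_host, and 127.0.0.1 as the last resort); it is not the library's implementation.

```python
from typing import Optional

FALLBACK_HOST = "127.0.0.1"


def resolve_host(default_host: Optional[str]) -> str:
    """Use the root-level default_host, or 127.0.0.1 when it is unset."""
    return default_host or FALLBACK_HOST


def neo4j_uri(default_host: Optional[str], bolt_port: int = 7687) -> str:
    """Build the Neo4j URI as neo4j://{default_host}:{bolt_port}."""
    return f"neo4j://{resolve_host(default_host)}:{bolt_port}"


def mcp_sse_host(sse_host: Optional[str], default_host: Optional[str]) -> str:
    """An explicit sse_host wins; otherwise fall back to default_host."""
    return sse_host or resolve_host(default_host)


print(neo4j_uri(None))
print(neo4j_uri("neo4j.internal", 7687))
print(mcp_sse_host(None, "neo4j.internal"))
```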