Backend Management

The backend subsystem turns a config file into running MCP transports, restart behavior, and a unified registry.

Backend lifecycle

Public backend states

The public enum in src/backend/mod.rs is intentionally small:

  • Starting
  • Healthy
  • Unhealthy
  • Stopped

There are no public states such as "Degraded", "Restarting", or "Circuit Open". Circuit-breaker timing exists internally in the health checker and is reflected through Unhealthy and Stopped.

BackendManager

BackendManager owns:

  • the live backend map
  • the backend config map
  • per-backend semaphores
  • per-backend retry configs
  • rate limiters
  • dynamic backend tracking
  • managed prerequisite PIDs
  • in-flight call draining
  • optional call tracking hooks

The manager is transport-agnostic; it delegates concrete behavior to Backend implementations.
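
The ownership list above can be sketched as a struct. Field names and types here are assumptions for illustration, not the actual definitions in src/backend/mod.rs:

```rust
use std::collections::HashMap;
use std::sync::{Arc, Mutex};

// Hypothetical stand-ins for the real types.
struct Backend;       // a live transport (stdio, streamable-http, cli-adapter)
struct BackendConfig; // parsed config entry
struct Semaphore;     // per-backend concurrency cap
struct RetryConfig;
struct RateLimiter;

// Rough shape of the manager's state; real field names and types differ.
struct BackendManager {
    backends: Mutex<HashMap<String, Arc<Backend>>>, // live backend map
    configs: HashMap<String, BackendConfig>,        // backend config map
    semaphores: HashMap<String, Arc<Semaphore>>,    // per-backend semaphores
    retry: HashMap<String, RetryConfig>,            // per-backend retry configs
    rate_limiters: HashMap<String, RateLimiter>,
    dynamic: Mutex<Vec<String>>,                    // dynamic backend tracking
    prerequisite_pids: Mutex<Vec<u32>>,             // managed prerequisite PIDs
    in_flight: Mutex<HashMap<String, usize>>,       // in-flight call draining
}

impl BackendManager {
    fn new() -> Self {
        Self {
            backends: Mutex::new(HashMap::new()),
            configs: HashMap::new(),
            semaphores: HashMap::new(),
            retry: HashMap::new(),
            rate_limiters: HashMap::new(),
            dynamic: Mutex::new(Vec::new()),
            prerequisite_pids: Mutex::new(Vec::new()),
            in_flight: Mutex::new(HashMap::new()),
        }
    }
}
```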

Supported transports

stdio

Child process backends communicate over stdin/stdout using rmcp.

Key details from src/backend/stdio.rs:

  • stdin and stdout are piped
  • stderr is piped to a 200-line ring buffer (exposed via gatemini://backend/{name} and gatemini://health)
  • Unix builds place the child in a new process group
  • a reaper task watches for unexpected exit and marks the backend stopped

That process-group isolation is what lets shutdown send SIGTERM to the whole backend tree instead of only the parent process.
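
A minimal sketch of that spawn pattern, assuming a Unix target and Rust 1.64+ (std's `CommandExt::process_group`); the real stdio.rs also wires the pipes into rmcp and starts the reaper task:

```rust
use std::io;
use std::process::{Child, Command, Stdio};

#[cfg(unix)]
use std::os::unix::process::CommandExt;

// Spawn a backend child with piped stdio; on Unix, place it in a fresh
// process group so shutdown signals can reach the whole backend tree.
fn spawn_in_group(program: &str) -> io::Result<Child> {
    let mut cmd = Command::new(program);
    cmd.stdin(Stdio::piped())
        .stdout(Stdio::piped())
        .stderr(Stdio::piped()); // stderr feeds the 200-line ring buffer
    #[cfg(unix)]
    cmd.process_group(0); // pgid 0: new group whose id is the child's pid
    cmd.spawn()
}
```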

streamable-http

Remote HTTP backends are implemented in src/backend/http.rs.

Use this transport in config:

backends:
  github:
    transport: streamable-http
    url: "https://api.githubcopilot.com/mcp/"
    headers:
      Authorization: "Bearer ${GITHUB_PAT_TOKEN}"

The lenient client wrapper exists to tolerate imperfect servers that omit expected response headers.

cli-adapter

CLI adapter backends let you publish tools without writing a separate MCP server.

You can either define tools inline:

backends:
  jq-tools:
    transport: cli-adapter
    timeout: 30s
    tools:
      filter:
        description: "Apply a jq filter to JSON input"
        input_schema:
          type: object
          properties:
            filter: { type: string }
            input: { type: string }
          required: [filter, input]
        command: "jq '{{filter}}'"
        stdin: "{{input}}"
        output: json
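
The {{filter}} and {{input}} placeholders suggest straightforward template substitution. A minimal sketch of that idea (the real adapter presumably also handles shell quoting and escaping, which this omits):

```rust
use std::collections::HashMap;

// Substitute each {{name}} placeholder with its value from the tool arguments.
// Illustration only: shell quoting/escaping of values is deliberately omitted.
fn render(template: &str, args: &HashMap<&str, &str>) -> String {
    let mut out = template.to_string();
    for (key, value) in args {
        out = out.replace(&format!("{{{{{key}}}}}"), value);
    }
    out
}
```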

Or point to an external adapter file:

backends:
  ffmpeg-tools:
    transport: cli-adapter
    adapter_file: ~/.config/gatemini/adapters/ffmpeg.yaml

The adapter file path supports ~ expansion in the CLI adapter loader.
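
A minimal version of that expansion, assuming only a leading ~/ is handled and $HOME is set; the actual loader may cover more cases:

```rust
use std::env;

// Expand a leading "~/" using $HOME; other paths pass through unchanged.
fn expand_tilde(path: &str) -> String {
    if let (Some(rest), Ok(home)) = (path.strip_prefix("~/"), env::var("HOME")) {
        return format!("{home}/{rest}");
    }
    path.to_string()
}
```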

Dedicated instance mode

By default, all proxy sessions share a single backend instance (instance_mode: shared). For stateful backends like sequential-thinking that maintain per-session state, this causes state bleed across sessions.

Setting instance_mode: dedicated gives each proxy session its own isolated backend instance from an autoscaling pool:

backends:
  sequential-thinking:
    command: mcp-server-sequential-thinking
    timeout: 120s
    instance_mode: dedicated
    pool:
      min_idle: 1
      max_instances: 10
      acquire_timeout: 30s

Pool lifecycle

Pool behavior:

  • pre-warms min_idle instances at startup (default: 1)
  • lazily spawns new instances on demand up to max_instances (default: 20)
  • on session disconnect, the assigned instance is stopped and a fresh one is spawned to maintain the idle pool
  • if all instances are busy, new sessions wait up to acquire_timeout (default: 30s) before failing
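
The acquire decision described above can be sketched as follows; the struct and names are assumptions, not the actual src/backend/pool.rs types:

```rust
use std::time::Duration;

// Hypothetical pool state; the real src/backend/pool.rs types differ.
struct Pool {
    idle: Vec<u32>, // ids of pre-warmed, unassigned instances
    busy: usize,    // instances currently bound to sessions
    max_instances: usize,
    acquire_timeout: Duration,
}

// What a session gets when it asks for a dedicated instance.
enum Acquire {
    Reuse(u32),     // hand out a warm idle instance
    Spawn,          // under max_instances: lazily spawn a fresh one
    Wait(Duration), // saturated: wait up to acquire_timeout, then fail
}

impl Pool {
    fn acquire(&mut self) -> Acquire {
        if let Some(id) = self.idle.pop() {
            self.busy += 1;
            return Acquire::Reuse(id);
        }
        if self.busy < self.max_instances {
            self.busy += 1;
            return Acquire::Spawn;
        }
        Acquire::Wait(self.acquire_timeout)
    }
}
```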

Only stdio and cli-adapter transports support dedicated mode. HTTP backends ignore the setting.

The pool implementation lives in src/backend/pool.rs. The health checker calls restart_pool_primary() instead of restart_backend() for dedicated backends.

Setting               Default
pool.min_idle         1
pool.max_instances    20
pool.acquire_timeout  30s

Concurrency, retries, and fallback

Per-backend limits come from config:

  • max_concurrent_calls
  • semaphore_timeout
  • retry
  • rate_limit
  • fallback_chain

Retry behavior only applies to the Starting state, where the manager waits briefly for a backend that is still connecting. Calls to Unhealthy or Stopped backends fail immediately unless the manager routes into a fallback backend for a transient error.
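
That fail-fast-or-fallback routing can be sketched with a hypothetical outcome type; the real manager works with concrete error types and backend handles:

```rust
// Hypothetical call outcome used to illustrate the routing decision.
enum Outcome {
    Ok,
    Transient, // e.g. timeout: eligible for fallback routing
    Fatal,     // target is Unhealthy/Stopped: fail immediately
}

// Try the primary backend first, then walk the fallback chain on
// transient errors; return the name of the backend that answered.
fn call_with_fallback(
    chain: &[&str],
    mut call: impl FnMut(&str) -> Outcome,
) -> Result<String, &'static str> {
    for &name in chain {
        match call(name) {
            Outcome::Ok => return Ok(name.to_string()),
            Outcome::Transient => continue,
            Outcome::Fatal => return Err("failed fast"),
        }
    }
    Err("chain exhausted")
}
```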

Health checker

The health loop in src/backend/health.rs runs in three phases:

  1. ping healthy backends
  2. handle unhealthy and stopped backends
  3. retry pending configured backends that never became live

Current defaults from src/config.rs:

Setting                         Default
health.interval                 30s
health.timeout                  5s
health.failure_threshold        3
health.max_restarts             5
health.restart_window           60s
health.restart_initial_backoff  1s
health.restart_max_backoff      30s
health.restart_timeout          30s
health.recovery_multiplier      3
health.drain_timeout            10s
health.memory_check_interval    30s
health.memory_restart_cooldown  60s

Internal circuit-breaker behavior:

  • healthy backends are pinged
  • failures increment consecutive_failures
  • once the threshold is reached, the backend is marked Unhealthy
  • the health checker records circuit_open_since
  • after interval * recovery_multiplier, a half-open probe is attempted
  • if the probe fails, restart logic or another recovery window applies
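
The counters and timing above can be sketched like this; field names mirror the bullets, but the actual health-checker types may differ:

```rust
use std::time::{Duration, Instant};

// Hypothetical circuit state; field names mirror the description above.
struct Circuit {
    consecutive_failures: u32,
    failure_threshold: u32,
    circuit_open_since: Option<Instant>,
    interval: Duration,
    recovery_multiplier: u32,
}

impl Circuit {
    // Record a ping result; returns true while the circuit is open.
    fn record(&mut self, ok: bool) -> bool {
        if ok {
            self.consecutive_failures = 0;
            self.circuit_open_since = None;
            return false;
        }
        self.consecutive_failures += 1;
        if self.consecutive_failures >= self.failure_threshold
            && self.circuit_open_since.is_none()
        {
            self.circuit_open_since = Some(Instant::now());
        }
        self.circuit_open_since.is_some()
    }

    // A half-open probe becomes due after interval * recovery_multiplier.
    fn probe_due(&self, now: Instant) -> bool {
        match self.circuit_open_since {
            Some(opened) => now >= opened + self.interval * self.recovery_multiplier,
            None => false,
        }
    }
}
```

With the defaults above (interval 30s, recovery_multiplier 3), the half-open probe is attempted 90 seconds after the circuit opens.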

Prerequisites

Some backends depend on another process already running. That is handled by src/backend/prerequisite.rs.

Features:

  • optional pgrep -f dedup via process_match
  • optional managed lifecycle on shutdown
  • startup delay before backend connect

If managed: true, Gatemini records the spawned prerequisite PID and terminates the process group during shutdown.

Process supervision

Backend child processes are supervised with configurable shutdown behavior and memory monitoring.

Graceful shutdown

When a stdio backend is stopped, Gatemini sends SIGTERM to the process group (or taskkill /T on Windows), then polls try_wait() every 100ms for up to shutdown_grace_period (default 5s). If the child hasn't exited by the deadline, SIGKILL is sent (or taskkill /F /T on Windows).
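
The poll-then-escalate loop can be sketched with std alone. Note that std's Child::kill signals only the child itself with SIGKILL; it stands in here for the group-wide signaling the real shutdown performs:

```rust
use std::io;
use std::process::Child;
use std::thread::sleep;
use std::time::{Duration, Instant};

// Poll try_wait() every 100ms until the grace period expires, then escalate.
// Returns Ok(true) if the child exited within the grace period. The real
// shutdown first sends SIGTERM to the whole process group; Child::kill
// (SIGKILL, child only) stands in for the forced step in this sketch.
fn stop_with_grace(child: &mut Child, grace: Duration) -> io::Result<bool> {
    let deadline = Instant::now() + grace;
    while Instant::now() < deadline {
        if child.try_wait()?.is_some() {
            return Ok(true);
        }
        sleep(Duration::from_millis(100));
    }
    child.kill()?;
    child.wait()?;
    Ok(false)
}
```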

Prerequisite processes follow the same pattern with a fixed 5s grace period.

Stderr capture

Backend stderr is piped to a 200-line ring buffer per backend. Recent lines are exposed in gatemini://backend/{name} and gatemini://health. When a backend exits unexpectedly, the last stderr lines are logged at warn level.

Memory monitoring

The health checker samples RSS for all backends every memory_check_interval (default 30s) via a single ps call. Stats are exposed in gatemini://health. If a backend's RSS exceeds max_memory_mb, it is restarted with a cooldown of memory_restart_cooldown (default 60s). A warning is logged at 80% of the limit.
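
A sketch of that sampling, split into a parser and a single batched ps invocation; error handling is simplified:

```rust
use std::process::Command;

// Parse `ps -o rss=` output: one RSS value per line, reported in KB.
fn parse_rss_mb(ps_output: &str) -> Vec<u64> {
    ps_output
        .lines()
        .filter_map(|line| line.trim().parse::<u64>().ok())
        .map(|kb| kb / 1024) // KB -> MB
        .collect()
}

// Sample every backend PID with a single ps invocation.
fn sample_rss_mb(pids: &[u32]) -> std::io::Result<Vec<u64>> {
    let list = pids
        .iter()
        .map(|p| p.to_string())
        .collect::<Vec<_>>()
        .join(",");
    let out = Command::new("ps")
        .args(["-o", "rss=", "-p", list.as_str()])
        .output()?;
    Ok(parse_rss_mb(&String::from_utf8_lossy(&out.stdout)))
}
```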

Setting                Default
shutdown_grace_period  5s
max_memory_mb          none
pool.replenish_delay   2s

Output processing pipeline

Tool call responses pass through a three-stage pipeline before being returned to the client.

Stage 1 — Intent filtering: if the caller passes an intent string to call_tool_chain, the raw output is filtered to sections relevant to that intent before any further processing.

Stage 2 — Auto-chunk: if output_config.auto_chunk_json is enabled and the output is parseable JSON above output_config.chunk_threshold, the response is decomposed. Uniform arrays are collapsed to the first 3 items plus a count summary; non-uniform objects are rendered as a key-path summary.
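
The uniform-array collapse can be sketched on a plain string list; the real stage operates on parsed JSON values:

```rust
// Collapse a uniform array to its first 3 items plus a count summary.
// Shown on plain strings; the real stage operates on parsed JSON values.
fn collapse_uniform(items: &[String]) -> Vec<String> {
    if items.len() <= 3 {
        return items.to_vec();
    }
    let mut out = items[..3].to_vec();
    out.push(format!("... ({} items total)", items.len()));
    out
}
```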

Stage 3 — Truncation: if the output after the previous stages exceeds max_output_size, it is truncated using a head-60%/tail-40% split to preserve both the beginning and end of the response.
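
A byte-level sketch of that split, assuming ASCII input; the real pipeline must also respect UTF-8 boundaries:

```rust
// Keep ~60% of the byte budget from the head and ~40% from the tail.
// ASCII-only sketch: slicing at arbitrary offsets panics on multi-byte
// UTF-8, and the marker makes the result slightly exceed max_size.
fn truncate_head_tail(s: &str, max_size: usize) -> String {
    if s.len() <= max_size {
        return s.to_string();
    }
    let head = max_size * 60 / 100;
    let tail = max_size * 40 / 100;
    format!("{}\n...[truncated]...\n{}", &s[..head], &s[s.len() - tail..])
}
```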

The tracker records bytes_returned (after the pipeline) and bytes_processed (raw bytes before) per tool call. These are exposed through gatemini://stats as a savings ratio and reduction percentage.

Composite tools

Composite tools are not a separate transport. They are registered under the virtual __composite backend and executed through the sandbox layer.

Important limitation:

  • config watcher notices composite tool changes
  • those changes are logged
  • they are not hot-reloaded; daemon restart is required