Version: Latest

Configuration

Semantic Router v0.3 uses one canonical YAML contract across local CLI, dashboard, Helm, and the operator:

version:
listeners:
providers:
routing:
global:

The detailed background is in Unified Config Contract v0.3. This page is the practical guide for using the contract.

Canonical contract

version: schema version. Use v0.3.
listeners: router listener ports and timeouts.
providers: deployment bindings and provider defaults.
routing: routing semantics.
global: sparse runtime overrides. If you omit a field here, the router's built-in default is used.
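
The sparse-override behavior of global can be pictured as a recursive merge of the user's global block onto the built-in defaults. The following is a minimal Python sketch of that idea, not the router's implementation. BUILTIN_DEFAULTS and apply_overrides are hypothetical names, and the default values are illustrative, except top_k: 1, which this page states as the built-in default.

```python
from copy import deepcopy

# Hypothetical stand-in for the router's built-in defaults (illustration only).
# Only top_k: 1 is a default stated on this page; the rest are assumed values.
BUILTIN_DEFAULTS = {
    "router": {"config_source": "file"},
    "services": {"observability": {"metrics": {"enabled": False}}},
    "model_catalog": {
        "embeddings": {"semantic": {"embedding_config": {"top_k": 1}}}
    },
}

def apply_overrides(defaults, overrides):
    """Return a copy of defaults with overrides merged in.

    Nested mappings are merged key by key; any field the user omits
    keeps its default value.
    """
    merged = deepcopy(defaults)
    for key, value in overrides.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = apply_overrides(merged[key], value)
        else:
            merged[key] = deepcopy(value)
    return merged

# A sparse global block that only turns metrics on.
user_global = {"services": {"observability": {"metrics": {"enabled": True}}}}
effective = apply_overrides(BUILTIN_DEFAULTS, user_global)
```

In this sketch, effective keeps config_source and top_k from the defaults and only changes the metrics flag, which is the intent of writing only the fields you need under global.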

Ownership by section

routing is the DSL-owned surface.
- routing.modelCards
- routing.modelCards[].loras
- routing.signals
- routing.projections for partitions plus derived routing outputs
- routing.decisions
providers owns deployment and default-selection metadata.
- defaults
- models
- providers.defaults holds default_model, reasoning_families, and default_reasoning_effort
- providers.models[*] holds provider_model_id, backend_refs, pricing, api_format, and external_model_ids
global owns router-wide runtime overrides.
- global.router groups router-engine control knobs such as config-source selection, route-cache, and model-selection defaults
- global.router.config_source selects whether runtime config comes from the canonical YAML file (file) or from in-process Kubernetes CRD reconciliation (kubernetes)
- global.services groups shared APIs and control-plane services such as response_api, router_replay, observability, authz, and ratelimit
- global.stores groups shared storage-backed services such as semantic_cache, memory, and vector_store
- global.integrations groups helper runtime integrations such as tools and looper
- global.model_catalog groups router-owned model assets such as embeddings, system models, external models, reusable classifiers, and model-backed modules
- global.model_catalog.embeddings.semantic.embedding_config.top_k limits how many ranked embedding rules are emitted for routing after scoring; the built-in default is 1
- global.model_catalog.classifiers[] is the reusable registry for startup-loaded classifier packages such as taxonomy classifiers
- global.model_catalog.modules groups capability modules such as prompt_guard, classifier, and hallucination_mitigation
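
To make the top_k setting concrete, here is a small Python sketch of "rank, then keep the first top_k". It assumes rules are ranked by score in descending order; the function name emit_top_k and the scores are hypothetical, and the router's actual scoring and filtering are not shown.

```python
def emit_top_k(scored_rules, top_k=1):
    """Keep the top_k highest-scoring embedding rules, best first.

    scored_rules maps rule name -> score. top_k defaults to 1, matching
    the built-in default described above.
    """
    ranked = sorted(scored_rules.items(), key=lambda item: item[1], reverse=True)
    return [name for name, _score in ranked[:top_k]]

# Illustrative scores for the two embedding rules used in the example config.
scores = {"technical_support": 0.81, "account_management": 0.77}

print(emit_top_k(scores))           # ['technical_support']
print(emit_top_k(scores, top_k=2))  # ['technical_support', 'account_management']
```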

Canonical example

version: v0.3

listeners:
  - name: http-8899
    address: 0.0.0.0
    port: 8899
    timeout: 300s

providers:
  defaults:
    default_model: qwen3-8b
    reasoning_families:
      qwen3:
        type: chat_template_kwargs
        parameter: enable_thinking
    default_reasoning_effort: medium
  models:
    - name: qwen3-8b
      reasoning_family: qwen3
      provider_model_id: qwen3-8b
      backend_refs:
        - name: primary
          endpoint: host.docker.internal:8000
          protocol: http
          weight: 100
          api_key_env: OPENAI_API_KEY

routing:
  modelCards:
    - name: qwen3-8b
      modality: text
      capabilities: [chat, reasoning]
      loras:
        - name: math-adapter
          description: Adapter used for symbolic math and proof-style prompts.

  signals:
    keywords:
      - name: math_terms
        operator: OR
        keywords: ["algebra", "calculus"]
    structure:
      - name: many_questions
        feature:
          type: count
          source:
            type: regex
            pattern: '[?？]'
        predicate:
          gte: 3
    embeddings:
      - name: technical_support
        threshold: 0.75
        candidates: ["installation guide", "troubleshooting steps"]
      - name: account_management
        threshold: 0.72
        candidates: ["billing information", "subscription management"]

  projections:
    partitions:
      - name: support_intents
        semantics: exclusive
        temperature: 0.3
        members: [technical_support, account_management]
        default: technical_support
    scores:
      - name: request_difficulty
        method: weighted_sum
        inputs:
          - type: embedding
            name: technical_support
            weight: 0.18
            value_source: confidence
          - type: context
            name: long_context
            weight: 0.18
          - type: structure
            name: many_questions
            weight: 0.12
    mappings:
      - name: request_band
        source: request_difficulty
        method: threshold_bands
        outputs:
          - name: support_fast
            lt: 0.25
          - name: support_escalated
            gte: 0.25

  decisions:
    - name: support_route
      description: Route support requests that need an escalated answer
      priority: 100
      rules:
        operator: AND
        conditions:
          - type: embedding
            name: technical_support
          - type: projection
            name: support_escalated
      modelRefs:
        - model: qwen3-8b
          use_reasoning: true
          lora_name: math-adapter

global:
  router:
    config_source: file
  services:
    observability:
      metrics:
        enabled: true

For routing.signals.structure, feature.type: density now uses built-in multilingual text-unit normalization. The router counts each CJK character as one unit, counts each contiguous run of other letters and digits as one unit, and ignores punctuation. As a result, the same density rule shape behaves consistently across English, Chinese, and mixed-script prompts without a separate normalize_by field.
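
The counting rule can be sketched in a few lines of Python. This is an illustration of the rule as described, not the router's code. It assumes "CJK character" can be approximated by the CJK Unified Ideographs block plus Extension A, and the names is_cjk and count_text_units are hypothetical.

```python
# Assumption: approximate "CJK character" with two Han ideograph ranges.
CJK_RANGES = ((0x3400, 0x4DBF), (0x4E00, 0x9FFF))

def is_cjk(ch):
    """True if ch falls in one of the assumed CJK ideograph ranges."""
    code = ord(ch)
    return any(low <= code <= high for low, high in CJK_RANGES)

def count_text_units(text):
    """Count text units: one per CJK character, one per run of other
    letters/digits; punctuation and whitespace only separate runs."""
    units = 0
    in_run = False  # inside a run of non-CJK letters or digits
    for ch in text:
        if is_cjk(ch):
            units += 1
            in_run = False
        elif ch.isalnum():
            if not in_run:
                units += 1
                in_run = True
        else:
            in_run = False
    return units

print(count_text_units("How are you?"))  # 3
print(count_text_units("你好吗？"))        # 3
print(count_text_units("GPT4 是什么"))     # 4
```

Under this rule a three-word English question and a three-character Chinese question have the same length in units, which is why one density threshold can serve both.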

Repository config assets

The repository now separates the exhaustive canonical reference config from reusable routing fragments:

config/config.yaml: exhaustive canonical reference config
config/signal/: reusable routing.signals fragments
config/decision/: reusable routing.decisions rule-shape fragments
config/algorithm/: reusable decision.algorithm snippets
config/plugin/: reusable route-plugin snippets

config/decision/ is organized by boolean case shape: single/, and/, or/, not/, and composite/. config/algorithm/ is organized by routing policy family: looper/ and selection/. config/plugin/ is organized with one plugin or reusable bundle per directory. The repository enforces this fragment catalog in go test ./pkg/config/..., so routing-surface changes must update the config/ tree in the same change.

Latest tutorials follow the same taxonomy:

tutorials/signal/overview plus tutorials/signal/heuristic/ and tutorials/signal/learned/ for config/signal/
tutorials/decision/ for config/decision/
tutorials/algorithm/ for config/algorithm/, with one page per algorithm
tutorials/plugin/ for config/plugin/, with one page per plugin
tutorials/global/ for sparse router-wide overrides under global:

Repo-owned runtime and harness assets now live outside config/:

deploy/examples/runtime/semantic-cache/
deploy/examples/runtime/response-api/
deploy/examples/runtime/tools/
e2e/config/
deploy/local/envoy.yaml

Test-only ONNX binding assets now live under e2e/config/onnx-binding/.

Those directories are support assets, not the main user-facing config contract. For hand-authored config, start from config/config.yaml or the fragment directories above. In this repository, the exhaustive reference config points global.integrations.tools.tools_db_path at deploy/examples/runtime/tools/tools_db.json for local development.

config/config.yaml is not just a sample anymore. The repository enforces it as the exhaustive public-contract reference:

go test ./pkg/config/... checks that it stays aligned to the canonical schema and routing surface catalog
make agent-lint runs the same reference-config contract check at lint level, so config/schema drift is blocked before merge
maintained deploy/ and e2e/ router config assets are checked against the same canonical contract, so repo-owned examples and harness profiles cannot drift back to legacy steady-state fields

Projection Workflow

Use routing.projections when the raw signal catalog is not enough on its own:

routing.signals defines reusable detectors.
routing.projections.partitions resolves one winner inside an exclusive domain or embedding family.
routing.projections.scores combines learned and heuristic signals into a weighted score.
routing.projections.mappings turns that score into named routing bands.
routing.decisions[*].rules.conditions[*] can reference those bands with type: projection.
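
The score and mapping steps can be sketched in Python using the weights and bands from the canonical example. This is a rough illustration under stated assumptions: the weighted sum is not normalized, a matched context or structure signal contributes 1.0, an unmatched signal contributes 0, and an embedding input with value_source: confidence contributes its confidence. The function names and the sample confidences are hypothetical.

```python
# Inputs and bands copied from the canonical example on this page.
SCORE_INPUTS = [
    {"type": "embedding", "name": "technical_support", "weight": 0.18},
    {"type": "context", "name": "long_context", "weight": 0.18},
    {"type": "structure", "name": "many_questions", "weight": 0.12},
]
BAND_OUTPUTS = [
    {"name": "support_fast", "lt": 0.25},
    {"name": "support_escalated", "gte": 0.25},
]

def weighted_sum(inputs, values):
    """Sum weight * value; signals missing from values contribute 0."""
    return sum(
        item["weight"] * values.get((item["type"], item["name"]), 0.0)
        for item in inputs
    )

def threshold_band(score, outputs):
    """Return the first band whose lt/gte bounds contain score, else None."""
    for band in outputs:
        if "lt" in band and not score < band["lt"]:
            continue
        if "gte" in band and not score >= band["gte"]:
            continue
        return band["name"]
    return None

# Request A: confident embedding match plus a long context.
request_a = {("embedding", "technical_support"): 0.9, ("context", "long_context"): 1.0}
# Request B: embedding match only.
request_b = {("embedding", "technical_support"): 0.8}

print(threshold_band(weighted_sum(SCORE_INPUTS, request_a), BAND_OUTPUTS))  # support_escalated
print(threshold_band(weighted_sum(SCORE_INPUTS, request_b), BAND_OUTPUTS))  # support_fast
```

A decision condition of type: projection with name: support_escalated would then match request A but not request B.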

The dashboard mirrors the same contract:

Config -> Projections edits partitions, scores, and mappings
Config -> Decisions can reference mapping outputs with condition type projection
DSL -> Visual manages PROJECTION partition, PROJECTION score, and PROJECTION mapping entities directly

For a focused tutorial, read Projections. For a maintained end-to-end example, use:

How to use it

Python CLI

Use the canonical YAML directly.

vllm-sr serve --config config.yaml

To migrate an older config first:

vllm-sr config migrate --config old-config.yaml
vllm-sr validate config.yaml

vllm-sr init was removed in v0.3. The steady-state file is config.yaml. Inside this repository, the default exhaustive reference file is config/config.yaml.

Router local / YAML-first

For local Docker or direct router development, hand-author config.yaml in canonical form and validate it before serving:

vllm-sr validate config.yaml
vllm-sr serve --config config.yaml

If you only need to override a few runtime defaults, write those under global: and leave the rest unset.

Dashboard / onboarding

Use the dashboard when you want to import or edit the full canonical YAML directly.

onboarding remote import accepts a complete version/listeners/providers/routing/global file
the config page edits the same canonical contract
the DSL editor can import the same YAML, but it only decompiles routing
decision model refs can carry lora_name, and those names resolve against routing.modelCards[].loras
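
The lora_name resolution in the last point can be checked with a short script before importing a file. The sketch below is a hypothetical helper, not part of vllm-sr. It assumes a lora_name resolves against the loras of the model card whose name equals the model ref's model field, as in the canonical example.

```python
def unresolved_loras(routing):
    """List (decision, model, lora_name) tuples that no model card declares."""
    loras_by_model = {
        card["name"]: {lora["name"] for lora in card.get("loras", [])}
        for card in routing.get("modelCards", [])
    }
    missing = []
    for decision in routing.get("decisions", []):
        for ref in decision.get("modelRefs", []):
            lora = ref.get("lora_name")
            if lora and lora not in loras_by_model.get(ref["model"], set()):
                missing.append((decision["name"], ref["model"], lora))
    return missing

# Routing section reduced to the fields this check reads.
routing = {
    "modelCards": [{"name": "qwen3-8b", "loras": [{"name": "math-adapter"}]}],
    "decisions": [
        {
            "name": "support_route",
            "modelRefs": [{"model": "qwen3-8b", "lora_name": "math-adapter"}],
        }
    ],
}

print(unresolved_loras(routing))  # []
```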

Helm

Helm values now mirror the same canonical contract under config.

config:
  version: v0.3
  providers:
    defaults:
      default_model: qwen3-8b
    models:
      - name: qwen3-8b
        provider_model_id: qwen3-8b
        backend_refs:
          - name: primary
            endpoint: semantic-router-vllm.default.svc.cluster.local:8000
            protocol: http
  routing:
    modelCards:
      - name: qwen3-8b

Then install or upgrade normally:

helm upgrade --install semantic-router deploy/helm/semantic-router -f values.yaml

Operator

The operator keeps the same logical contract, but it wraps it inside the CRD:

spec.config.providers
spec.config.routing
spec.config.global

spec.vllmEndpoints is still the Kubernetes-native backend discovery adapter. The controller projects that data into canonical providers.models[].backend_refs[] and routing.modelCards entries, including any declared loras, when it renders the router config.

See Kubernetes Operator.

DSL

DSL only owns the routing surface.

Author MODEL, SIGNAL, and ROUTE
Compile to a routing fragment
Keep providers and global in YAML

The DSL compiler emits:

routing:
  modelCards:
  signals:
  decisions:

It does not emit listeners, providers, or global.
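
Because the compiler output covers only routing, a full config is the compiled fragment plus the YAML-owned sections. The Python sketch below shows one way to combine the two as plain dictionaries. attach_routing is a hypothetical helper, and the rejection of non-routing keys is an assumption made for safety in this sketch, not documented compiler behavior.

```python
def attach_routing(base_config, fragment):
    """Combine YAML-owned sections with a compiled routing fragment.

    base_config holds version/listeners/providers/global; fragment is
    expected to contain only the routing key.
    """
    if "routing" not in fragment:
        raise ValueError("fragment has no routing section")
    unexpected = set(fragment) - {"routing"}
    if unexpected:
        raise ValueError(f"fragment carries non-routing sections: {sorted(unexpected)}")
    combined = {key: value for key, value in base_config.items() if key != "routing"}
    combined["routing"] = fragment["routing"]
    return combined

# YAML-owned sections, reduced to placeholders for the sketch.
base = {
    "version": "v0.3",
    "listeners": [{"name": "http-8899", "port": 8899}],
    "providers": {"defaults": {"default_model": "qwen3-8b"}},
    "global": {"router": {"config_source": "file"}},
}
# Stand-in for compiler output.
fragment = {"routing": {"modelCards": [{"name": "qwen3-8b"}], "signals": {}, "decisions": []}}

full = attach_routing(base, fragment)
print(sorted(full))  # ['global', 'listeners', 'providers', 'routing', 'version']
```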

Import and migration

Onboarding remote import

The setup wizard can import a full canonical YAML file from a URL and apply the complete config, including providers, routing, and global.

DSL import

The DSL editor can import:

a full router config YAML
a routing-only YAML fragment

In both cases, only the routing section is decompiled into DSL.

Migrate old configs

Use the CLI migration command for older flat or mixed configs:

vllm-sr config migrate --config old-config.yaml

This migrates legacy shapes such as:

top-level signals, flat keyword_rules/categories/other signal blocks, and decisions
top-level model_config
top-level vllm_endpoints and provider_profiles
providers.models[].endpoints
inline access_key

into canonical providers/routing/global.

Import OpenClaw model providers

Use the CLI import command when you already have an openclaw.json with supported OpenAI-compatible provider endpoints and want VSR to take over model routing while rewriting OpenClaw to the first VSR listener:

vllm-sr config import --from openclaw --source openclaw.json --target config.yaml

When --source is omitted, the importer checks OPENCLAW_CONFIG_PATH, ./openclaw.json, and ~/.openclaw/openclaw.json in that order.
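
The lookup order can be sketched in Python as follows. This is an illustration of the documented order, not the importer's code. It assumes the first candidate that exists as a file wins and that a set but missing OPENCLAW_CONFIG_PATH falls through to the next candidate; resolve_openclaw_source and its parameters are hypothetical.

```python
import os
from pathlib import Path

def resolve_openclaw_source(source=None, env=None, cwd=None, home=None):
    """Return the openclaw.json path to import from, or None if not found.

    An explicit source always wins. Otherwise check OPENCLAW_CONFIG_PATH,
    ./openclaw.json, and ~/.openclaw/openclaw.json in that order.
    """
    if source:
        return Path(source)
    env = os.environ if env is None else env
    cwd = Path.cwd() if cwd is None else Path(cwd)
    home = Path.home() if home is None else Path(home)

    candidates = []
    env_path = env.get("OPENCLAW_CONFIG_PATH")
    if env_path:
        candidates.append(Path(env_path))
    candidates.append(cwd / "openclaw.json")
    candidates.append(home / ".openclaw" / "openclaw.json")

    for candidate in candidates:
        if candidate.is_file():
            return candidate
    return None
```

The env, cwd, and home parameters exist only so the order can be exercised in a test without touching the real environment.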

Quick guides by environment

Python CLI

Write config.yaml in canonical form.
Run vllm-sr validate config.yaml.
Run vllm-sr serve --config config.yaml.

Router local

Keep provider-wide defaults in providers.defaults and deployment bindings in providers.models[].backend_refs[].
Keep routing semantics in routing.modelCards/signals/decisions.
Put only runtime overrides you actually need under global.router/services/stores/integrations/model_catalog, and keep model-backed module settings under global.model_catalog.modules.
Use global.router.config_source: kubernetes only when the in-process IntelligentPool / IntelligentRoute controller is the active source of truth. Leave it as file for normal local, CLI, dashboard, Helm, and operator-authored canonical YAML.

Helm

Put the same canonical config under values.yaml -> config.
Use helm upgrade --install ... -f values.yaml.
Treat Helm as a deployment wrapper, not a second config schema.

Operator

Put portable config under spec.config.
Use spec.vllmEndpoints only when you want Kubernetes-native backend discovery.
Expect the operator to render canonical router config from that adapter layer.

DSL

Use DSL for routing.modelCards, routing.signals, and routing.decisions.
Importing a full YAML file still works, but only routing is decompiled into DSL.
Keep endpoints, API keys, listeners, and global in YAML.
Reusable routing fragments now live under config/signal/, config/decision/, config/algorithm/, and config/plugin/.
