# Telemetry and metrics
Virtual MCP Server (vMCP) provides comprehensive observability through OpenTelemetry instrumentation. You can export traces and metrics to monitor backend operations and workflow executions.
## Telemetry types
vMCP supports two types of telemetry:
- Traces: Track requests across vMCP and its backends, showing the full path of tool calls, resource reads, and workflow executions
- Metrics: Counters and histograms for backend request rates, error rates, and latency distributions
For general ToolHive observability concepts including trace structure and metrics, see the observability overview.
## Enable telemetry
There are two ways to configure telemetry for a VirtualMCPServer:
- Shared config reference (recommended): Reference a shared `MCPTelemetryConfig` resource using `spec.telemetryConfigRef`. This enables reuse across multiple resources, Kubernetes-native secret references for OTLP auth headers, CA bundle ConfigMap references, and per-server `serviceName` overrides.
- Inline config: Define telemetry settings directly in `spec.config.telemetry`. This approach is deprecated in favor of the shared config reference.
You cannot use both `spec.telemetryConfigRef` and `spec.config.telemetry` on the same VirtualMCPServer; CEL validation enforces mutual exclusivity.
## Shared telemetry config reference (recommended)
`spec.telemetryConfigRef` on VirtualMCPServer is a v0.20.0 feature. The CRD reference will be updated with the v0.20.0 operator release.
Create a shared `MCPTelemetryConfig` resource, then reference it from your VirtualMCPServer:
```yaml
apiVersion: toolhive.stacklok.dev/v1alpha1
kind: MCPTelemetryConfig
metadata:
  name: shared-otel
  namespace: toolhive-system
spec:
  openTelemetry:
    enabled: true
    endpoint: otel-collector.monitoring:4318
    insecure: true
    tracing:
      enabled: true
      samplingRate: '0.05'
    metrics:
      enabled: true
  prometheus:
    enabled: true
```
```yaml
apiVersion: toolhive.stacklok.dev/v1alpha1
kind: VirtualMCPServer
metadata:
  name: my-vmcp
  namespace: toolhive-system
spec:
  groupRef:
    name: my-group
  telemetryConfigRef:
    name: shared-otel
    serviceName: my-vmcp
  incomingAuth:
    type: anonymous
```
The `serviceName` field on `telemetryConfigRef` overrides the telemetry service name for this specific VirtualMCPServer, giving it a distinct identity in your observability backend. If omitted, it defaults to the server name with a `thv-` prefix.
When you update the MCPTelemetryConfig resource, the operator detects the change and triggers a rolling update of affected deployments.
## Inline telemetry config (deprecated)
Inline telemetry configuration via `spec.config.telemetry` is deprecated. Use `spec.telemetryConfigRef` instead for Kubernetes-native secret references and shared configuration across resources.
Configure telemetry directly in the VirtualMCPServer resource using the `spec.config.telemetry` field:
```yaml
apiVersion: toolhive.stacklok.dev/v1alpha1
kind: VirtualMCPServer
metadata:
  name: my-vmcp
  namespace: toolhive-system
spec:
  groupRef:
    name: my-group
  config:
    telemetry:
      endpoint: 'otel-collector:4318'
      serviceName: 'my-vmcp'
      insecure: true
      tracingEnabled: true
      samplingRate: '0.05'
      metricsEnabled: true
      enablePrometheusMetricsPath: true
  incomingAuth:
    type: anonymous
```
## Configuration options
| Field | Description | Default |
|---|---|---|
| `endpoint` | OTLP collector endpoint (`hostname:port`) | - |
| `serviceName` | Service name in traces and metrics | VirtualMCPServer name |
| `tracingEnabled` | Enable tracing | `false` |
| `metricsEnabled` | Enable OTLP metrics export | `false` |
| `samplingRate` | Trace sampling rate (0.0-1.0) | `"0.05"` |
| `insecure` | Use HTTP instead of HTTPS | `false` |
| `enablePrometheusMetricsPath` | Expose `/metrics` endpoint | `false` |
## Export to observability backends
### Export to Jaeger via OpenTelemetry Collector
Deploy an OpenTelemetry Collector configured to export to Jaeger:
```yaml
receivers:
  otlp:
    protocols:
      http:
        endpoint: 0.0.0.0:4318

processors:
  batch:
    timeout: 10s
    send_batch_size: 1024

exporters:
  otlp/jaeger:
    endpoint: jaeger:4317
    tls:
      insecure: true

service:
  pipelines:
    traces:
      receivers: [otlp]
      processors: [batch]
      exporters: [otlp/jaeger]
```
Then configure vMCP to send telemetry to the collector:
```yaml
spec:
  config:
    telemetry:
      endpoint: 'otel-collector:4318'
      serviceName: 'production-vmcp'
      tracingEnabled: true
      metricsEnabled: true
      insecure: true
```
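For the `otel-collector:4318` endpoint above to resolve, the collector pods need a Kubernetes Service with that name reachable from the vMCP namespace. A minimal sketch, assuming the collector Deployment's pods carry the label `app: otel-collector` (adjust the selector and namespace to match your deployment):

```yaml
apiVersion: v1
kind: Service
metadata:
  name: otel-collector
  namespace: toolhive-system
spec:
  selector:
    app: otel-collector # assumed label on the collector pods
  ports:
    - name: otlp-http
      port: 4318
      targetPort: 4318
```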
## Metrics collection
vMCP supports two methods for collecting metrics:
- Push via OpenTelemetry: Set `metricsEnabled: true` to push metrics to your OTel Collector via OTLP
- Pull via Prometheus: Set `enablePrometheusMetricsPath: true` to expose a `/metrics` endpoint on the vMCP service port (4483) for Prometheus to scrape
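For the pull method, a Prometheus scrape job might look like this sketch, assuming the vMCP Service is named `my-vmcp` in the `toolhive-system` namespace (both names are placeholders; substitute your own):

```yaml
scrape_configs:
  - job_name: 'vmcp'
    metrics_path: /metrics
    static_configs:
      # assumed Service name and namespace; port 4483 is the vMCP service port
      - targets: ['my-vmcp.toolhive-system:4483']
```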
### Backend metrics
These metrics track requests to individual MCP server backends:
| Metric | Type | Description |
|---|---|---|
| `toolhive_vmcp_backends_discovered` | Gauge | Number of backends discovered |
| `toolhive_vmcp_backend_requests` | Counter | Total requests per backend |
| `toolhive_vmcp_backend_errors` | Counter | Total errors per backend |
| `toolhive_vmcp_backend_requests_duration` | Histogram | Duration of backend requests |
| `mcp.client.operation.duration` | Histogram | MCP client operation duration (`mcp_client_operation_duration` on `/metrics`) |
### Workflow metrics
These metrics track workflow execution across backends:
| Metric | Type | Description |
|---|---|---|
| `toolhive_vmcp_workflow_executions` | Counter | Total workflow executions |
| `toolhive_vmcp_workflow_errors` | Counter | Total workflow execution errors |
| `toolhive_vmcp_workflow_duration` | Histogram | Duration of workflow executions |
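As an illustration, the counters above can be combined in PromQL to compute a workflow error ratio. This is a sketch: depending on your exporter configuration, Prometheus may append a `_total` suffix to counter names, so verify the exact names on your `/metrics` endpoint first.

```promql
# Fraction of workflow executions that errored over the last 5 minutes
rate(toolhive_vmcp_workflow_errors[5m])
  / rate(toolhive_vmcp_workflow_executions[5m])
```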
### Optimizer metrics
When the vMCP optimizer is enabled, these metrics track tool-finding and tool-calling performance:
| Metric | Type | Description |
|---|---|---|
| `toolhive_vmcp_optimizer_find_tool_requests` | Counter | Total FindTool calls |
| `toolhive_vmcp_optimizer_find_tool_errors` | Counter | Total FindTool errors |
| `toolhive_vmcp_optimizer_find_tool_duration` | Histogram | Duration of FindTool calls |
| `toolhive_vmcp_optimizer_find_tool_results` | Histogram | Number of tools returned per call |
| `toolhive_vmcp_optimizer_token_savings_percent` | Histogram | Token savings percentage per call |
| `toolhive_vmcp_optimizer_call_tool_requests` | Counter | Total CallTool calls |
| `toolhive_vmcp_optimizer_call_tool_errors` | Counter | Total CallTool errors |
| `toolhive_vmcp_optimizer_call_tool_not_found` | Counter | CallTool calls where tool was not found |
| `toolhive_vmcp_optimizer_call_tool_duration` | Histogram | Duration of CallTool calls |
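Histogram metrics like `toolhive_vmcp_optimizer_token_savings_percent` can be summarized with `histogram_quantile` over the `_bucket` series that Prometheus derives from a histogram. A sketch, assuming standard Prometheus histogram naming on your `/metrics` endpoint:

```promql
# Median token savings per FindTool call over the last 5 minutes
histogram_quantile(0.5,
  rate(toolhive_vmcp_optimizer_token_savings_percent_bucket[5m]))
```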
## Distributed tracing
vMCP creates client-side spans for backend operations with the following span names:
- `tools/call <tool_name>`: Tool calls to backends
- `resources/read`: Resource reads from backends
- `prompts/get <prompt_name>`: Prompt retrieval from backends
- `list_capabilities`: Backend capability discovery
Each span includes attributes for the target backend (`target.workload_id`, `target.workload_name`, `target.base_url`) and the relevant MCP attributes (`mcp.method.name`, `gen_ai.tool.name`, `mcp.resource.uri`).
## Next steps
- Set up audit logging for structured request and authorization event tracking
## Related information
- Observability concepts - overview of ToolHive's observability architecture
- Kubernetes telemetry guide - telemetry for MCPServer resources
- OpenTelemetry tutorial - set up a local observability stack