
Agent Application Metrics

This document describes the AgentSpaceMetrics class, which provides a unified interface to collect, expose, and push runtime metrics in agent-based applications. It supports Prometheus-compatible counters, gauges, and histograms, exposes them over an HTTP endpoint for scraping, and pushes them to Redis.


Introduction

AgentSpaceMetrics enables:

  • Prometheus-compatible metrics registration and management.
  • Push-based metric streaming to Redis for centralized collection.
  • Custom labeling with subjectId, instanceId, and nodeId.
  • Built-in exporter via HTTP for Prometheus scraping.

It is designed for AI agents or modular services operating in distributed systems where real-time and historical visibility into system behavior is required.


Importing

from agent_sdk.metrics import AgentSpaceMetrics

Metric Types

| Metric Type | Description | Use Case |
|-------------|-------------|----------|
| Counter | A cumulative metric that increases over time. It cannot be decreased. | Request counts, error events, retries. |
| Gauge | A metric that represents a value that can go up and down. | Active connections, memory usage, queue length. |
| Histogram | A metric that samples observations into configurable buckets, and provides count, sum, and distribution. | Response latency, payload size, execution duration. |
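To make the histogram row concrete, here is a minimal standard-library sketch of how Prometheus-style buckets count observations: each bucket is cumulative, so an observation is counted in every bucket whose upper bound is greater than or equal to it, plus an implicit +Inf bucket. The bucket_counts helper is hypothetical and is not part of AgentSpaceMetrics.

```python
from bisect import bisect_left

def bucket_counts(observations, buckets):
    """Return cumulative counts per upper bound, in the style of Prometheus 'le' buckets."""
    bounds = sorted(buckets) + [float("inf")]  # Prometheus always adds a +Inf bucket
    counts = [0] * len(bounds)
    for value in observations:
        # Index of the first bucket whose upper bound is >= value.
        first = bisect_left(bounds, value)
        # Cumulative semantics: that bucket and every larger one is incremented.
        for i in range(first, len(bounds)):
            counts[i] += 1
    return dict(zip(bounds, counts))

# Four hypothetical latency samples, using the buckets from the registration example.
latencies = [0.05, 0.9, 0.9, 3.0]
result = bucket_counts(latencies, [0.1, 0.5, 1, 2, 5])
```

With these samples, the 1-second bucket holds three observations (0.05 and the two 0.9 values), while the 5-second and +Inf buckets hold all four.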

Usage Examples

1. Initialize

metrics = AgentSpaceMetrics(subject_id="subject-xyz")

Settings that are not passed explicitly are read from environment variables:

  • Subject ID: SUBJECT_ID, used when subject_id is not provided (default: default_subject)
  • Instance ID: INSTANCE_ID (default: executor)
  • Redis host: METRICS_REDIS_HOST (default: localhost)
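The fallback order can be sketched as follows. This is an illustration of the documented defaults only, not the SDK's actual constructor; resolve_settings and its env parameter are hypothetical names introduced here.

```python
import os

def resolve_settings(subject_id=None, env=None):
    """Resolve settings the way the list above describes (hypothetical helper)."""
    env = os.environ if env is None else env
    return {
        # An explicit argument wins; otherwise fall back to SUBJECT_ID, then the default.
        "subject_id": subject_id or env.get("SUBJECT_ID", "default_subject"),
        "instance_id": env.get("INSTANCE_ID", "executor"),
        "redis_host": env.get("METRICS_REDIS_HOST", "localhost"),
    }

# With an empty environment, every documented default applies.
defaults = resolve_settings(env={})
# An explicit subject_id takes precedence over the environment variable.
explicit = resolve_settings("subject-xyz", env={"SUBJECT_ID": "from-env"})
```

Passing env as a plain dict keeps the sketch easy to test without touching the real process environment.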

2. Register Metrics

metrics.register_counter("block_executions", "Number of executions", labelnames=["blockId"])
metrics.register_gauge("active_sessions", "Active user sessions")
metrics.register_histogram("execution_latency", "Execution latency in seconds", buckets=[0.1, 0.5, 1, 2, 5])

3. Update Metrics

metrics.increment_counter("block_executions", {"blockId": "abc123"})
metrics.set_gauge("active_sessions", 7)
metrics.observe_histogram("execution_latency", 0.9)

Each metric automatically includes the following labels:

| Label | Source | Description |
|-------|--------|-------------|
| subjectId | subject_id argument / SUBJECT_ID env | Identifier for the agent subject |
| instanceId | INSTANCE_ID env | Unique name of the running agent |
| nodeId | detect_node_id() | Identifier of the physical/logical node |
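One way to picture the automatic labels is as a merge between the labels passed by the caller and the three identity labels. The sketch below assumes the identity labels are applied last so that a caller cannot override them; the source does not state which side wins, and with_identity_labels is a hypothetical helper, not an SDK function.

```python
def with_identity_labels(user_labels, subject_id, instance_id, node_id):
    """Merge caller labels with the three automatic labels (hypothetical helper)."""
    labels = dict(user_labels or {})
    # Assumption: identity labels are written last and therefore take precedence.
    labels.update({
        "subjectId": subject_id,
        "instanceId": instance_id,
        "nodeId": node_id,
    })
    return labels

merged = with_identity_labels(
    {"blockId": "abc123"},
    subject_id="subject-xyz",
    instance_id="executor",
    node_id="agent-node-123",
)
```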

Server

Start Prometheus Exporter + Redis Writer

metrics.start_http_server(port=8889)

This call:

  • Starts a Prometheus-compatible HTTP endpoint at http://<host>:8889/metrics
  • Launches a background Redis writer thread that pushes the current metrics every 5 seconds to the Redis list NODE_METRICS
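The periodic push can be sketched with a daemon thread and a stop event. This is a minimal illustration, not the SDK's implementation: start_writer, the snapshot callable, and the FakeRedis stand-in are hypothetical, and the use of rpush (rather than lpush) is an assumption. A real client exposing rpush, such as redis-py's, could be passed in place of FakeRedis.

```python
import json
import threading
import time

class FakeRedis:
    """In-memory stand-in exposing only rpush, so the sketch runs without a server."""
    def __init__(self):
        self.lists = {}

    def rpush(self, key, value):
        self.lists.setdefault(key, []).append(value)

def start_writer(client, snapshot, interval=5.0, key="NODE_METRICS"):
    """Push snapshot() as JSON to the given list every `interval` seconds."""
    stop = threading.Event()

    def loop():
        # Event.wait returns False on timeout and True once stop is set.
        while not stop.wait(interval):
            client.rpush(key, json.dumps(snapshot()))

    thread = threading.Thread(target=loop, daemon=True)
    thread.start()
    return stop, thread

# Short interval so the demonstration finishes quickly.
client = FakeRedis()
stop, thread = start_writer(client, lambda: {"active_sessions": 7}, interval=0.01)
time.sleep(0.2)
stop.set()
thread.join()
```

Using an Event for both the sleep and the shutdown signal lets the thread exit promptly instead of finishing a full sleep interval first.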

Each Redis entry is a JSON string with this structure:

{
  "block_executions": {
    "sdk.instances.adhoc.block_executions_total": 5.0
  },
  "active_sessions": {
    "sdk.instances.adhoc.active_sessions": 3.0
  },
  "execution_latency": {
    "sdk.instances.adhoc.execution_latency_bucket": 2.0,
    ...
  },
  "subjectId": "subject-xyz",
  "instanceId": "executor",
  "nodeId": "agent-node-123"
}
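A consumer reading entries from the list might separate the identity fields from the metric groups as sketched below. The split_entry helper is hypothetical, and the sample payload is a shortened stand-in that follows the structure shown above rather than real output.

```python
import json

IDENTITY_KEYS = ("subjectId", "instanceId", "nodeId")

def split_entry(raw):
    """Split one JSON entry into (identity labels, metric groups)."""
    entry = json.loads(raw)
    identity = {k: entry[k] for k in IDENTITY_KEYS if k in entry}
    metrics = {k: v for k, v in entry.items() if k not in IDENTITY_KEYS}
    return identity, metrics

# Sample entry shaped like the structure above (values are illustrative).
raw = json.dumps({
    "block_executions": {"sdk.instances.adhoc.block_executions_total": 5.0},
    "active_sessions": {"sdk.instances.adhoc.active_sessions": 3.0},
    "subjectId": "subject-xyz",
    "instanceId": "executor",
    "nodeId": "agent-node-123",
})
identity, metrics = split_entry(raw)
```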