Agent Application Metrics

This document describes the AgentSpaceMetrics class, which provides a unified interface for collecting, exposing, and pushing runtime metrics in agent-based applications. It supports Prometheus-compatible counters, gauges, and histograms, exposes them over HTTP for Prometheus scraping, and pushes them to Redis for centralized collection.

Introduction

AgentSpaceMetrics enables:

Prometheus-compatible metrics registration and management.
Push-based metric streaming to Redis for centralized collection.
Custom labeling with subjectId, instanceId, and nodeId.
Built-in exporter via HTTP for Prometheus scraping.

It is designed for AI agents or modular services operating in distributed systems where real-time and historical visibility into system behavior is required.

Importing

from agent_sdk.metrics import AgentSpaceMetrics

Metric Types

Metric Type	Description	Use Case
Counter	A cumulative metric that increases over time. It cannot be decreased.	Request counts, error events, retries.
Gauge	A metric that represents a value that can go up and down.	Active connections, memory usage, queue length.
Histogram	A metric that samples observations into configurable buckets, and provides count, sum, and distribution.	Response latency, payload size, execution duration.
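
To make the histogram row concrete, the sketch below mimics how a Prometheus-style histogram tallies observations: each bucket counts every observation less than or equal to its upper bound, so bucket counts are cumulative, and an implicit +Inf bucket always equals the total count. It uses only the standard library and is purely illustrative; summarize_histogram is a hypothetical helper, not part of AgentSpaceMetrics.

```python
import math


def summarize_histogram(observations, bounds):
    """Tally observations the way a Prometheus-style histogram does.

    Hypothetical helper for illustration; not part of AgentSpaceMetrics.
    """
    # Buckets are cumulative: bucket `le` counts every observation <= le.
    # A final +Inf bucket always equals the total number of observations.
    upper_bounds = sorted(bounds) + [math.inf]
    buckets = {le: sum(1 for v in observations if v <= le) for le in upper_bounds}
    return {"buckets": buckets, "count": len(observations), "sum": sum(observations)}


summary = summarize_histogram([0.05, 0.3, 0.9, 1.5, 7.0], bounds=[0.1, 0.5, 1, 2, 5])
for le, n in summary["buckets"].items():
    print(f"le={le}: {n}")
```

Because the counts are cumulative, an observation larger than every configured bound (7.0 here) is visible only in the +Inf bucket, which is one reason to choose bounds that cover the expected range.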

Usage Examples

1. Initialize

metrics = AgentSpaceMetrics(subject_id="subject-xyz")

If subject_id is not provided, it is read from the SUBJECT_ID environment variable. The instance ID and Redis host are also taken from the environment:

SUBJECT_ID from environment variables (default: default_subject)
INSTANCE_ID from environment variables (default: executor)
Redis host from METRICS_REDIS_HOST (default: localhost)
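
The fallback order above can be sketched in a few lines of standard-library Python. This is an assumption about how such resolution might look, not the SDK's actual code; resolve_settings and its env parameter are hypothetical names introduced here so the logic can be exercised without touching the real environment.

```python
import os


def resolve_settings(subject_id=None, env=None):
    """Hypothetical sketch of the fallback order; not the SDK's implementation."""
    env = os.environ if env is None else env
    return {
        # An explicit argument wins; otherwise fall back to the environment,
        # and finally to the documented default.
        "subject_id": subject_id or env.get("SUBJECT_ID", "default_subject"),
        "instance_id": env.get("INSTANCE_ID", "executor"),
        "redis_host": env.get("METRICS_REDIS_HOST", "localhost"),
    }


print(resolve_settings("subject-xyz", env={}))
print(resolve_settings(env={"SUBJECT_ID": "from-env"}))
```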

2. Register Metrics

metrics.register_counter("block_executions", "Number of executions", labelnames=["blockId"])
metrics.register_gauge("active_sessions", "Active user sessions")
metrics.register_histogram("execution_latency", "Execution latency in seconds", buckets=[0.1, 0.5, 1, 2, 5])

3. Update Metrics

metrics.increment_counter("block_executions", {"blockId": "abc123"})
metrics.set_gauge("active_sessions", 7)
metrics.observe_histogram("execution_latency", 0.9)
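
Latency histograms are usually fed by timing a block of work. The sketch below shows one way to do that with a small context manager; timed is a hypothetical helper (not part of the SDK) that accepts any callable taking a float, so it runs here with a plain list as a stand-in sink.

```python
import time
from contextlib import contextmanager


@contextmanager
def timed(observe):
    """Time the enclosed block and pass the elapsed seconds to `observe`.

    Hypothetical helper; `observe` is any callable that accepts a float.
    """
    start = time.perf_counter()
    try:
        yield
    finally:
        # Record the duration even if the block raises.
        observe(time.perf_counter() - start)


# Stand-in sink so the sketch runs without the SDK installed.
samples = []
with timed(samples.append):
    sum(range(1000))
print(len(samples))
```

With the SDK available, the sink could instead be something like lambda s: metrics.observe_histogram("execution_latency", s), using the call shown above.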

Each metric automatically includes the following labels:

Label	Source	Description
`subjectId`	arg / `SUBJECT_ID`	Identifier for the agent subject
`instanceId`	`INSTANCE_ID` env	Unique name of the running agent
`nodeId`	`detect_node_id()`	Identifier of the physical/logical node

Server

Start Prometheus Exporter + Redis Writer

metrics.start_http_server(port=8889)

This call does two things:

Starts a Prometheus-compatible HTTP endpoint at http://<host>:8889/metrics
Launches a background Redis writer thread that pushes the current metrics every 5 seconds to the Redis list NODE_METRICS

Each Redis entry is a JSON string with this structure:

{
  "block_executions": {
    "sdk.instances.adhoc.block_executions_total": 5.0
  },
  "active_sessions": {
    "sdk.instances.adhoc.active_sessions": 3.0
  },
  "execution_latency": {
    "sdk.instances.adhoc.execution_latency_bucket": 2.0,
    ...
  },
  "subjectId": "subject-xyz",
  "instanceId": "executor",
  "nodeId": "agent-node-123"
}
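
On the consuming side, each entry mixes metric groups with the three identity fields. A minimal sketch of separating them, assuming entries follow the structure shown above, might look like this; split_entry and IDENTITY_KEYS are hypothetical names, and the sample payload is a trimmed copy of the example rather than real output.

```python
import json

IDENTITY_KEYS = ("subjectId", "instanceId", "nodeId")


def split_entry(raw):
    """Split one NODE_METRICS JSON entry into identity labels and metric samples.

    Hypothetical consumer-side helper; assumes the structure shown above.
    """
    entry = json.loads(raw)
    labels = {k: entry[k] for k in IDENTITY_KEYS if k in entry}
    # Every remaining key maps a metric name to its {series_name: value} samples.
    samples = {k: v for k, v in entry.items() if k not in IDENTITY_KEYS}
    return labels, samples


raw = json.dumps({
    "block_executions": {"sdk.instances.adhoc.block_executions_total": 5.0},
    "active_sessions": {"sdk.instances.adhoc.active_sessions": 3.0},
    "subjectId": "subject-xyz",
    "instanceId": "executor",
    "nodeId": "agent-node-123",
})
labels, samples = split_entry(raw)
print(labels["nodeId"], samples["block_executions"])
```

In practice raw would be read from the NODE_METRICS list with a Redis client. How entries are pushed and trimmed is not specified here, so the read side is left out of the sketch.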