Core Concepts

Understanding the fundamental concepts of AgenticAnts will help you make the most of the platform.

Platform Architecture

AgenticAnts is built around these core principles:

1. LLMOps Framework

AgenticAnts implements LLMOps (Large Language Model Operations), the discipline of managing LLM-based systems across their lifecycle, from development to production.

2. Three Pillars Approach

We provide comprehensive LLMOps coverage through three integrated domains:

  • FinOps: Cost optimization and financial management
  • SRE: Reliability engineering and performance
  • Security Posture: Security and compliance

3. Agent-Centric Observability

Everything in AgenticAnts is centered around AI agents - autonomous systems that make decisions and take actions.

4. Credit-Based Economics

Pricing is usage-based: you spend credits on the platform features you actually use, so cost scales with your needs.

5. OpenTelemetry Standard

AgenticAnts is built on the OpenTelemetry standard for maximum compatibility with existing instrumentation and tooling.

Key Concepts

LLMOps Framework

LLMOps encompasses the entire lifecycle of LLM operations:

  • Model Lifecycle Management - Selection, versioning, deployment, and retirement
  • Prompt Operations - Prompt engineering, versioning, and optimization
  • Performance Optimization - Latency, throughput, and cost optimization
  • Model Governance - Policies, compliance, and risk management
  • Versioning & Deployment - CI/CD pipelines and rollback strategies

Learn more about LLMOps →

Agents

An agent is an autonomous AI system that:

  • Receives inputs (user queries, events, data)
  • Makes decisions using LLMs and logic
  • Takes actions (API calls, tool usage, responses)
  • Learns and adapts over time

Learn more about AI agents →

Traces

A trace represents a complete execution path of an agent or application:

code
Trace: Customer Support Request
├─ Span: Query Classification
├─ Span: Retrieve Customer Data
│  └─ Span: Database Query
├─ Span: LLM Processing
│  └─ Span: Token Generation
├─ Span: Response Formatting
└─ Span: Response Delivery

Learn more about tracing →

Spans

A span represents a single unit of work within a trace:

  • Function call
  • API request
  • LLM inference
  • Database query
  • Tool execution
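
To make the relationship between traces and spans concrete, the sketch below models a span as a node that may contain child spans and walks the resulting tree. The `SpanNode` shape and both helper functions are assumptions made for illustration; they are not AgenticAnts SDK types.

```typescript
// Hypothetical shape for illustration only; not an AgenticAnts SDK type.
interface SpanNode {
  name: string;
  children: SpanNode[];
}

// Count every span in the tree, including nested child spans.
function countSpans(span: SpanNode): number {
  let total = 1;
  for (const child of span.children) {
    total += countSpans(child);
  }
  return total;
}

// List each span as "depth:name", parents before their children.
function flattenSpans(span: SpanNode, depth: number = 0): string[] {
  let lines: string[] = [depth + ":" + span.name];
  for (const child of span.children) {
    lines = lines.concat(flattenSpans(child, depth + 1));
  }
  return lines;
}

const retrieveCustomerData: SpanNode = {
  name: "Retrieve Customer Data",
  children: [{ name: "Database Query", children: [] }],
};

const spanCount = countSpans(retrieveCustomerData);
const spanLines = flattenSpans(retrieveCustomerData);
```

Here `spanCount` covers the parent span plus its nested database query, and `spanLines` records how deep each span sits in the tree.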

Metrics

Metrics are numerical measurements collected over time:

  • Latency (p50, p95, p99)
  • Throughput (requests/second)
  • Error rates
  • Token usage
  • Cost per operation
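
As an illustration of what the latency percentiles mean, the sketch below computes them from raw samples using the nearest-rank method. The method choice and the sample values are assumptions; the platform may aggregate percentiles differently.

```typescript
// Nearest-rank percentile: the smallest sample such that at least p percent
// of all samples are less than or equal to it.
function percentile(samples: number[], p: number): number {
  if (samples.length === 0) throw new Error("no samples");
  if (p <= 0 || p > 100) throw new Error("p must be in (0, 100]");
  const sorted = samples.slice().sort((a, b) => a - b);
  const rank = Math.ceil((p / 100) * sorted.length);
  return sorted[rank - 1];
}

// Made-up request latencies in milliseconds, with one slow outlier.
const latenciesMs = [120, 80, 95, 300, 110, 105, 90, 100, 1500, 85];

const p50 = percentile(latenciesMs, 50);
const p95 = percentile(latenciesMs, 95);
const p99 = percentile(latenciesMs, 99);
```

With only ten samples, p95 and p99 both land on the single slowest request, which is why tail percentiles need a reasonably large sample window to be meaningful.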

Events

Events are discrete occurrences in your system:

  • Agent started
  • Error occurred
  • Threshold exceeded
  • User feedback received
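
One way to represent these occurrences in application code is a discriminated union, sketched below. The event type strings and fields are hypothetical and chosen only to mirror the list above; they are not the platform's event schema.

```typescript
// Hypothetical event shapes; type strings and fields are assumptions.
type AgentEvent =
  | { type: "agent.started"; agent: string }
  | { type: "error.occurred"; message: string }
  | { type: "threshold.exceeded"; metric: string; value: number; limit: number }
  | { type: "feedback.received"; rating: number };

// Produce a one-line, human-readable summary for each kind of event.
function describeEvent(e: AgentEvent): string {
  switch (e.type) {
    case "agent.started":
      return `agent ${e.agent} started`;
    case "error.occurred":
      return `error: ${e.message}`;
    case "threshold.exceeded":
      return `${e.metric} exceeded limit: ${e.value} > ${e.limit}`;
    case "feedback.received":
      return `user feedback received: ${e.rating}`;
  }
}

const summary = describeEvent({
  type: "threshold.exceeded",
  metric: "latency_p95_ms",
  value: 1200,
  limit: 1000,
});
```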

The Three Pillars

FinOps - AI Cost Optimization

Control and optimize your AI spending:

Key Features:

  • Token usage tracking
  • Cost attribution (per customer, per agent, per operation)
  • Budget management and alerts
  • Contract optimization recommendations
  • ROI analytics

Use Cases:

  • "How much does our customer support agent cost per query?"
  • "Which customers are driving the most AI costs?"
  • "What's the ROI of our AI investments?"
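
Questions like these reduce to grouping token usage by a key and pricing it. The sketch below shows that arithmetic under stated assumptions: the `UsageRecord` shape and the per-token prices are hypothetical, and real prices depend on the model and provider.

```typescript
// Hypothetical usage record; not an AgenticAnts SDK type.
interface UsageRecord {
  customerId: string;
  agent: string;
  inputTokens: number;
  outputTokens: number;
}

// Assumed prices in USD per 1,000 tokens, for illustration only.
const INPUT_PRICE_PER_1K = 0.01;
const OUTPUT_PRICE_PER_1K = 0.03;

// Cost of a single query from its token counts.
function costOf(r: UsageRecord): number {
  return (r.inputTokens / 1000) * INPUT_PRICE_PER_1K +
    (r.outputTokens / 1000) * OUTPUT_PRICE_PER_1K;
}

// Sum cost per customer or per agent.
function attributeCost(
  records: UsageRecord[],
  key: "customerId" | "agent"
): Record<string, number> {
  const totals: Record<string, number> = {};
  for (const r of records) {
    const k = r[key];
    totals[k] = (totals[k] || 0) + costOf(r);
  }
  return totals;
}

const usageRecords: UsageRecord[] = [
  { customerId: "acme", agent: "support", inputTokens: 2000, outputTokens: 1000 },
  { customerId: "acme", agent: "support", inputTokens: 1000, outputTokens: 1000 },
  { customerId: "globex", agent: "search", inputTokens: 4000, outputTokens: 0 },
];

const costByCustomer = attributeCost(usageRecords, "customerId");
const costByAgent = attributeCost(usageRecords, "agent");

// Average cost per query for the support agent (two queries in the sample).
const supportCostPerQuery = costByAgent["support"] / 2;
```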

Explore FinOps →

SRE - AI Reliability Engineering

Ensure your AI systems are reliable and performant:

Key Features:

  • End-to-end tracing
  • Performance monitoring
  • Automated alerting
  • Incident response
  • SLA tracking

Use Cases:

  • "Why is our agent slow for certain queries?"
  • "What caused the spike in errors yesterday?"
  • "Are we meeting our SLA targets?"
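
The SLA question can be made concrete with a small calculation. In this sketch a request counts as "good" if it succeeded within a latency target; that definition, the 500 ms target, and the 75% objective are all assumptions chosen for the example, not platform defaults.

```typescript
// Hypothetical per-request outcome; not an AgenticAnts SDK type.
interface RequestOutcome {
  latencyMs: number;
  ok: boolean;
}

// Fraction of requests that succeeded within the latency target.
function slaCompliance(outcomes: RequestOutcome[], latencyTargetMs: number): number {
  if (outcomes.length === 0) return 1; // no traffic, nothing violated
  let good = 0;
  for (const o of outcomes) {
    if (o.ok && o.latencyMs <= latencyTargetMs) good += 1;
  }
  return good / outcomes.length;
}

const outcomes: RequestOutcome[] = [
  { latencyMs: 200, ok: true },
  { latencyMs: 450, ok: true },
  { latencyMs: 900, ok: true },  // succeeded, but too slow
  { latencyMs: 100, ok: false }, // fast, but failed
];

const compliance = slaCompliance(outcomes, 500);
const meetsSla = compliance >= 0.75;
```

Two of the four sample requests are good, so this window misses the assumed objective.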

Explore SRE →

Security Posture - AI Security Control

Secure your AI operations and maintain compliance:

Key Features:

  • PII detection and redaction
  • Security guardrails
  • Compliance reporting
  • Audit trails
  • RBAC and access control

Use Cases:

  • "Are we exposing any PII in our agent responses?"
  • "Can we prove GDPR compliance for our AI systems?"
  • "Who accessed sensitive agent data?"
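
To show the general idea behind PII redaction, the sketch below masks email addresses and US-style phone numbers with regular expressions. It is deliberately simplistic and is not how AgenticAnts detects PII; real detection needs far broader coverage (names, addresses, locale-specific formats).

```typescript
// Simple illustrative patterns; they will miss many real-world formats.
const EMAIL_PATTERN = /[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}/g;
const US_PHONE_PATTERN = /\b\d{3}[-.\s]\d{3}[-.\s]\d{4}\b/g;

// Replace each match with a placeholder so the text stays readable.
function redactPii(text: string): string {
  return text
    .replace(EMAIL_PATTERN, "[EMAIL]")
    .replace(US_PHONE_PATTERN, "[PHONE]");
}

const redacted = redactPii("Contact jane.doe@example.com or 555-123-4567.");
// redacted is "Contact [EMAIL] or [PHONE]."
```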

Explore Security Posture →

Credit System

AgenticAnts uses a credit-based pricing model for flexible, usage-based billing.

How Credits Work

Credits are consumed based on platform usage:

| Operation | Credit Cost |
| --- | --- |
| Trace ingestion (per 1000) | 1 credit |
| Span ingestion (per 1000) | 0.1 credit |
| Metric data point (per 1000) | 0.05 credit |
| Data storage (per GB/month) | 5 credits |
| API request (per 1000) | 0.5 credit |
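
As a worked example, the sketch below estimates monthly credit consumption from the rates in the table above. It assumes usage is billed proportionally (no rounding up to the next 1000 units), which this page does not specify; the usage figures are made up.

```typescript
// Rates from the table: credits per 1,000 units, except storage (per GB per month).
const CREDIT_RATES = {
  tracesPer1k: 1,
  spansPer1k: 0.1,
  metricPointsPer1k: 0.05,
  storagePerGbMonth: 5,
  apiRequestsPer1k: 0.5,
};

// Hypothetical monthly usage summary.
interface MonthlyUsage {
  traces: number;
  spans: number;
  metricPoints: number;
  storageGb: number;
  apiRequests: number;
}

// Assumes proportional billing; the actual rounding rules may differ.
function estimateCredits(u: MonthlyUsage): number {
  return (u.traces / 1000) * CREDIT_RATES.tracesPer1k +
    (u.spans / 1000) * CREDIT_RATES.spansPer1k +
    (u.metricPoints / 1000) * CREDIT_RATES.metricPointsPer1k +
    u.storageGb * CREDIT_RATES.storagePerGbMonth +
    (u.apiRequests / 1000) * CREDIT_RATES.apiRequestsPer1k;
}

const monthlyUsage: MonthlyUsage = {
  traces: 500000,
  spans: 4000000,
  metricPoints: 10000000,
  storageGb: 20,
  apiRequests: 200000,
};

const estimatedCredits = estimateCredits(monthlyUsage);
```

For this usage the terms are 500 (traces) + 400 (spans) + 500 (metric points) + 100 (storage) + 100 (API requests), or 1,600 credits for the month.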

Credit Allocation

Credits can be used flexibly across:

  • Observability (traces, metrics, logs)
  • Agents (monitoring, analytics)
  • Policies (evaluation, enforcement)
  • Projects (multi-project organizations)

Learn more about credits →

Observability Model

AgenticAnts provides comprehensive observability for AI systems:

Collection Layer

code
Your Application
      ↓
AgenticAnts SDK / OpenTelemetry
      ↓
Ingestion Pipeline
      ↓
Storage & Indexing

Data Types

  1. Traces: Complete execution paths
  2. Metrics: Time-series measurements
  3. Logs: Discrete events and messages
  4. Metadata: Context and tags

Query Layer

code
Storage & Indexing
      ↓
Query Engine
      ↓
  ├─ Dashboard UI
  ├─ REST API
  ├─ GraphQL API
  └─ Webhooks

Learn more about observability →

Data Model

Hierarchy

code
Organization
  Projects
    Environments
      Agents
        Traces
          Spans
            Events

Relationships

  • Organizations contain multiple Projects
  • Projects have multiple Environments (prod, staging, dev)
  • Environments host multiple Agents
  • Agents generate Traces
  • Traces contain Spans
  • Spans can have Events
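
The containment relationships above can be written down as nested types. The interfaces and field names below are assumptions made for illustration (they are not the platform's schema), and the helper shows how a query such as "traces per environment" follows from walking the hierarchy.

```typescript
// Hypothetical types mirroring the hierarchy; field names are assumptions.
interface SpanEvent { name: string }
interface Span { name: string; events: SpanEvent[] }
interface Trace { id: string; spans: Span[] }
interface Agent { name: string; traces: Trace[] }
interface Environment { name: string; agents: Agent[] } // e.g. prod, staging, dev
interface Project { name: string; environments: Environment[] }
interface Organization { name: string; projects: Project[] }

// Walk Organization -> Project -> Environment -> Agent and count traces
// per environment name.
function tracesPerEnvironment(org: Organization): Record<string, number> {
  const counts: Record<string, number> = {};
  for (const project of org.projects) {
    for (const env of project.environments) {
      for (const agent of env.agents) {
        counts[env.name] = (counts[env.name] || 0) + agent.traces.length;
      }
    }
  }
  return counts;
}

const exampleOrg: Organization = {
  name: "example-org",
  projects: [{
    name: "support",
    environments: [
      {
        name: "prod",
        agents: [{
          name: "support-agent",
          traces: [
            { id: "t1", spans: [{ name: "classify-query", events: [{ name: "agent.started" }] }] },
            { id: "t2", spans: [] },
          ],
        }],
      },
      {
        name: "staging",
        agents: [{ name: "support-agent", traces: [{ id: "t3", spans: [] }] }],
      },
    ],
  }],
};

const traceCounts = tracesPerEnvironment(exampleOrg);
```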

Best Practices

1. Structured Instrumentation

typescript
// Good: Structured and consistent
await ants.trace.create({
  name: 'customer-support-agent',
  input: query,
  metadata: {
    customerId: '123',
    channel: 'web',
    priority: 'high'
  }
})

// Avoid: Unstructured or missing context
await ants.trace.create({
  name: 'agent',
  input: query
})

2. Meaningful Names

typescript
// Good: Descriptive and hierarchical
'customer-support.classify-query'
'customer-support.retrieve-context'
'customer-support.generate-response'

// Avoid: Vague or inconsistent
'function1'
'process'
'handler'
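
A naming convention is easier to keep when it is checked automatically. The sketch below validates names against one possible convention inferred from the examples above: lowercase kebab-case segments joined by dots, with at least two segments. The pattern is an assumption, not a rule the platform enforces.

```typescript
// Assumed convention: "agent-name.operation-name", lowercase kebab-case
// segments separated by dots, at least two segments.
const HIERARCHICAL_NAME = /^[a-z0-9]+(-[a-z0-9]+)*(\.[a-z0-9]+(-[a-z0-9]+)*)+$/;

function isHierarchicalName(name: string): boolean {
  return HIERARCHICAL_NAME.test(name);
}

const goodName = isHierarchicalName("customer-support.classify-query");
const vagueName = isHierarchicalName("function1");
```

A check like this could run in a unit test or lint step so that vague names such as 'process' are caught before they reach production traces.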

3. Rich Metadata

Include relevant context:

typescript
{
  metadata: {
    // Business context
    customerId: '123',
    orderId: 'ORD-456',

    // Technical context
    modelName: 'gpt-4',
    temperature: 0.7,

    // Operational context
    region: 'us-east-1',
    version: '1.2.3'
  }
}

4. Error Handling

Always capture errors:

typescript
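try {
  const result = await agent.run(input)
  await trace.complete({ output: result })
} catch (error) {
  // A caught value may not be an Error instance; normalize it before reading fields
  const err = error instanceof Error ? error : new Error(String(error))
  await trace.error({
    error: err.message,
    stack: err.stack,
    severity: 'error'
  })
  throw error
}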
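
The same pattern can be factored into a reusable helper so that every agent call records its outcome. This is a minimal sketch: `TraceLike` is a hypothetical interface modeled on the `complete` and `error` calls used in this guide, not the SDK's actual type, and `withTrace` and `toErrorDetails` are illustrative names.

```typescript
// Hypothetical minimal trace interface, modeled on the calls shown above.
interface ErrorDetails {
  error: string;
  stack?: string;
  severity: string;
}

interface TraceLike {
  complete(result: { output: unknown }): Promise<void>;
  error(details: ErrorDetails): Promise<void>;
}

// JavaScript allows throwing any value (strings, numbers, objects),
// so normalize before reading message or stack.
function toErrorDetails(thrown: unknown): ErrorDetails {
  if (thrown instanceof Error) {
    return { error: thrown.message, stack: thrown.stack, severity: "error" };
  }
  return { error: String(thrown), severity: "error" };
}

// Run `work`, record success or failure on the trace, and rethrow failures
// so callers still see the original error.
async function withTrace<T>(trace: TraceLike, work: () => Promise<T>): Promise<T> {
  try {
    const result = await work();
    await trace.complete({ output: result });
    return result;
  } catch (thrown) {
    await trace.error(toErrorDetails(thrown));
    throw thrown;
  }
}
```

Usage would look like `await withTrace(trace, () => agent.run(input))`. Note that a failure inside `trace.complete` is also reported through `trace.error` in this sketch; whether that is desirable depends on how you want telemetry failures handled.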

Common Patterns

Pattern 1: Multi-Agent Systems

typescript
const mainTrace = await ants.trace.create({
  name: 'multi-agent-workflow'
})

// Coordinator agent
const coordinatorSpan = mainTrace.span('coordinator-agent')
const plan = await coordinator.plan(query)
coordinatorSpan.end()

// Worker agents (parallel)
const results = await Promise.all(
  plan.tasks.map(task =>
    workerAgent.execute(task, mainTrace)
  )
)

await mainTrace.complete({ output: results })

Pattern 2: RAG Systems

typescript
const trace = await ants.trace.create({
  name: 'rag-query-system'
})

// Retrieval phase
const retrievalSpan = trace.span('document-retrieval')
const docs = await vectorDB.search(query)
retrievalSpan.end({ documents: docs.length })

// Generation phase
const generationSpan = trace.span('llm-generation')
const response = await llm.generate({
  query,
  context: docs
})
generationSpan.end({ tokens: response.usage.total })

await trace.complete({ output: response.text })

Pattern 3: Tool-Using Agents

typescript
const trace = await ants.trace.create({
  name: 'tool-using-agent'
})

// Agent decides which tools to use
const planSpan = trace.span('plan-tools')
const toolPlan = await agent.plan(query)
planSpan.end()

// Execute tools sequentially, one span per tool
for (const tool of toolPlan.tools) {
  const toolSpan = trace.span(`tool:${tool.name}`)
  const result = await executeTool(tool)
  toolSpan.end({ result })
}

await trace.complete()

Next Steps

Explore each concept in detail:

© 2026 ANTS Platform, Inc. · Docs v1.0 · Last updated June 2026