Core Concepts

Understanding the fundamental concepts of AgenticAnts will help you make the most of the platform.

Platform Architecture

AgenticAnts is built around these core principles:

1. LLMOps Framework

AgenticAnts implements LLMOps (Large Language Model Operations): the discipline of running LLM-based systems from development through production.

2. Three Pillars Approach

We provide comprehensive LLMOps coverage through three integrated domains:

FinOps: Cost optimization and financial management
SRE: Reliability engineering and performance
Security Posture: Security and compliance

3. Agent-Centric Observability

Everything in AgenticAnts is centered on AI agents: autonomous systems that make decisions and take actions.

4. Credit-Based Economics

Flexible, usage-based pricing that scales with your needs.

5. OpenTelemetry Standard

Built on the OpenTelemetry standard for maximum compatibility.

Key Concepts

LLMOps Framework

LLMOps encompasses the entire lifecycle of LLM operations:

Model Lifecycle Management - Selection, versioning, deployment, and retirement
Prompt Operations - Prompt engineering, versioning, and optimization
Performance Optimization - Latency, throughput, and cost optimization
Model Governance - Policies, compliance, and risk management
Versioning & Deployment - CI/CD pipelines and rollback strategies

Learn more about LLMOps →

Agents

An agent is an autonomous AI system that:

Receives inputs (user queries, events, data)
Makes decisions using LLMs and logic
Takes actions (API calls, tool usage, responses)
Learns and adapts over time
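
The four behaviours above can be sketched as a small loop. This is an illustration only: `SketchAgent`, `AgentInput`, and `AgentAction` are hypothetical names rather than AgenticAnts SDK types, a keyword rule stands in for the LLM call, and "learning" is reduced to recording a history that later decisions could consult.

```typescript
// Hypothetical types for illustration; not part of the AgenticAnts SDK.
type AgentInput = { kind: 'query' | 'event'; text: string }
type AgentAction =
  | { type: 'respond'; message: string }
  | { type: 'call-tool'; tool: string; args: string }

class SketchAgent {
  // Past inputs and actions, kept so later decisions could adapt to them.
  private history: { input: AgentInput; action: AgentAction }[] = []

  // Decide: a keyword rule stands in for the LLM call a real agent would make.
  decide(input: AgentInput): AgentAction {
    if (input.text.toLowerCase().indexOf('order') !== -1) {
      return { type: 'call-tool', tool: 'lookup-order', args: input.text }
    }
    return { type: 'respond', message: 'How can I help you?' }
  }

  // Receive an input, decide, record the outcome, and return the action to take.
  handle(input: AgentInput): AgentAction {
    const action = this.decide(input)
    this.history.push({ input, action })
    return action
  }

  // Number of inputs handled so far.
  get handledCount(): number {
    return this.history.length
  }
}

const agent = new SketchAgent()
const action = agent.handle({ kind: 'query', text: 'Where is my order?' })
```

Here `action` is a `call-tool` action for the assumed `lookup-order` tool, which is the kind of decision step that tracing is meant to make visible.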

Learn more about AI agents →

Traces

A trace represents a complete execution path of an agent or application:

code

Trace: Customer Support Request
├─ Span: Query Classification
├─ Span: Retrieve Customer Data
│  └─ Span: Database Query
├─ Span: LLM Processing
│  ├─ Span: Token Generation
│  └─ Span: Response Formatting
└─ Span: Response Delivery

Learn more about tracing →

Spans

A span represents a single unit of work within a trace:

Function call
API request
LLM inference
Database query
Tool execution
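
One common way to relate spans to a trace, used by OpenTelemetry among others, is for each span to record the ID of its parent so the tree can be rebuilt from a flat list. The sketch below assumes that layout; `SpanRecord` and its field names are hypothetical and not the AgenticAnts schema.

```typescript
// Hypothetical span record; field names are illustrative only.
interface SpanRecord {
  id: string
  parentId: string | null // null marks a top-level span of the trace
  name: string
  durationMs: number
}

const spans: SpanRecord[] = [
  { id: 'a', parentId: null, name: 'Query Classification', durationMs: 40 },
  { id: 'b', parentId: null, name: 'Retrieve Customer Data', durationMs: 120 },
  { id: 'c', parentId: 'b', name: 'Database Query', durationMs: 95 },
  { id: 'd', parentId: null, name: 'LLM Processing', durationMs: 900 },
]

// Render the spans under `parentId` as indented lines, depth-first.
function renderSpans(all: SpanRecord[], parentId: string | null, depth = 0): string[] {
  const indent = Array(depth + 1).join('  ') // two spaces per nesting level
  const lines: string[] = []
  for (const span of all) {
    if (span.parentId !== parentId) continue
    lines.push(`${indent}${span.name} (${span.durationMs} ms)`)
    for (const line of renderSpans(all, span.id, depth + 1)) lines.push(line)
  }
  return lines
}

const rendered = renderSpans(spans, null)
```

`rendered` holds four lines, with "Database Query" indented beneath "Retrieve Customer Data".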

Metrics

Metrics are numerical measurements collected over time:

Latency (p50, p95, p99)
Throughput (requests/second)
Error rates
Token usage
Cost per operation
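
Percentiles are reported instead of a plain average because a few slow requests can dominate the mean. The sketch below uses the nearest-rank method, which is one of several percentile definitions and is not necessarily the one AgenticAnts applies; the latency samples are invented.

```typescript
// Nearest-rank percentile: the smallest sample with at least p% of samples <= it.
function percentile(samples: number[], p: number): number {
  if (samples.length === 0) throw new Error('no samples')
  const sorted = samples.slice().sort((a, b) => a - b)
  const rank = Math.max(1, Math.ceil((p / 100) * sorted.length))
  return sorted[rank - 1]
}

// Ten made-up request latencies in milliseconds, including one slow outlier.
const latenciesMs = [120, 80, 95, 110, 2400, 100, 90, 105, 130, 85]

const p50 = percentile(latenciesMs, 50) // 100
const p95 = percentile(latenciesMs, 95) // 2400
const meanMs = latenciesMs.reduce((sum, v) => sum + v, 0) / latenciesMs.length // 331.5
```

The single 2400 ms request pulls the mean to 331.5 ms while p50 stays at 100 ms, and p95 surfaces the outlier directly.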

Events

Events are discrete occurrences in your system:

Agent started
Error occurred
Threshold exceeded
User feedback received
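
As a rough illustration of how an event differs from a metric, the sketch below turns a metric reading into a discrete "threshold exceeded" event only when a limit is crossed. `AgentEvent`, its field names, and the event type strings are assumptions made for this example.

```typescript
// Hypothetical event shape; field names and type strings are illustrative only.
interface AgentEvent {
  type: 'agent.started' | 'error.occurred' | 'threshold.exceeded' | 'feedback.received'
  timestamp: string // ISO 8601
  attributes: Record<string, string | number>
}

// Emit a threshold event only when the observed value is above the limit.
function checkThreshold(metric: string, value: number, limit: number, now: Date): AgentEvent | null {
  if (value <= limit) return null
  return {
    type: 'threshold.exceeded',
    timestamp: now.toISOString(),
    attributes: { metric, value, limit },
  }
}

const observedAt = new Date('2024-01-01T00:00:00Z')
const breach = checkThreshold('latency_p95_ms', 2400, 1500, observedAt) // event
const quiet = checkThreshold('latency_p95_ms', 900, 1500, observedAt)   // null
```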

The Three Pillars

FinOps - AI Cost Optimization

Control and optimize your AI spending:

Key Features:

Token usage tracking
Cost attribution (per customer, per agent, per operation)
Budget management and alerts
Contract optimization recommendations
ROI analytics

Use Cases:

"How much does our customer support agent cost per query?"
"Which customers are driving the most AI costs?"
"What's the ROI of our AI investments?"
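
Questions like these come down to attributing token costs to a dimension such as customer or agent. The following is a minimal sketch of that arithmetic; `UsageRecord`, the per-token prices, and the usage figures are all assumptions made for the example, and real model prices vary by provider.

```typescript
// Hypothetical per-call usage record; not the AgenticAnts data model.
interface UsageRecord {
  customerId: string
  agent: string
  inputTokens: number
  outputTokens: number
}

const USD_PER_MILLION_INPUT = 3   // assumed price
const USD_PER_MILLION_OUTPUT = 15 // assumed price

function costUsd(u: UsageRecord): number {
  return (u.inputTokens * USD_PER_MILLION_INPUT + u.outputTokens * USD_PER_MILLION_OUTPUT) / 1e6
}

// Sum cost per value of the chosen attribute (customer or agent).
function attributeCost(records: UsageRecord[], by: 'customerId' | 'agent'): Record<string, number> {
  const totals: Record<string, number> = {}
  for (const r of records) {
    totals[r[by]] = (totals[r[by]] || 0) + costUsd(r)
  }
  return totals
}

const usage: UsageRecord[] = [
  { customerId: 'acme', agent: 'support', inputTokens: 250000, outputTokens: 50000 },   // $1.50
  { customerId: 'acme', agent: 'support', inputTokens: 125000, outputTokens: 25000 },   // $0.75
  { customerId: 'globex', agent: 'support', inputTokens: 62500, outputTokens: 12500 },  // $0.375
]

const byCustomer = attributeCost(usage, 'customerId') // acme: 2.25, globex: 0.375
const byAgent = attributeCost(usage, 'agent')         // support: 2.625
const costPerQuery = byAgent['support'] / usage.length // 0.875 per query
```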

Explore FinOps →

SRE - AI Reliability Engineering

Ensure your AI systems are reliable and performant:

Key Features:

End-to-end tracing
Performance monitoring
Automated alerting
Incident response
SLA tracking

Use Cases:

"Why is our agent slow for certain queries?"
"What caused the spike in errors yesterday?"
"Are we meeting our SLA targets?"
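
The last question can be made concrete with an availability calculation and an error budget. This is a sketch under stated assumptions: `SlaReport` and `evaluateSla` are hypothetical names, and the 99.9% target is an example value rather than a platform default.

```typescript
// Hypothetical SLA report for one measurement window.
interface SlaReport {
  availability: number    // fraction of successful requests
  meetsTarget: boolean
  errorBudgetUsed: number // 1.0 means the whole budget for the window is spent
}

function evaluateSla(totalRequests: number, failedRequests: number, target: number): SlaReport {
  if (totalRequests <= 0) throw new Error('totalRequests must be positive')
  if (target <= 0 || target >= 1) throw new Error('target must be between 0 and 1')
  const availability = (totalRequests - failedRequests) / totalRequests
  // The error budget is the number of failures the target still allows.
  const allowedFailures = totalRequests * (1 - target)
  return {
    availability,
    meetsTarget: availability >= target,
    errorBudgetUsed: failedRequests / allowedFailures,
  }
}

// 400 failures out of 1,000,000 requests against a 99.9% target:
// availability is 0.9996 and about 40% of the error budget is used.
const report = evaluateSla(1000000, 400, 0.999)
```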

Explore SRE →

Security Posture - AI Security Control

Secure your AI operations and maintain compliance:

Key Features:

PII detection and redaction
Security guardrails
Compliance reporting
Audit trails
RBAC and access control

Use Cases:

"Are we exposing any PII in our agent responses?"
"Can we prove GDPR compliance for our AI systems?"
"Who accessed sensitive agent data?"
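
To give a sense of what PII detection and redaction involve, here is a deliberately small sketch that masks email addresses and one phone number format in an agent response. The two regular expressions and the `redactPii` helper are assumptions for illustration; production PII detection needs far broader coverage than this, and nothing here describes how AgenticAnts implements it.

```typescript
// Illustrative patterns only: an email address and a 3-3-4 digit phone number.
const EMAIL = /[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}/g
const PHONE = /\b\d{3}[-.\s]\d{3}[-.\s]\d{4}\b/g

function redactPii(text: string): { redacted: string; findings: number } {
  let findings = 0
  // Returns a replacer that counts each match and substitutes the label.
  const count = (label: string) => () => {
    findings += 1
    return label
  }
  const redacted = text
    .replace(EMAIL, count('[EMAIL]'))
    .replace(PHONE, count('[PHONE]'))
  return { redacted, findings }
}

const sample = redactPii('Reach Ana at ana@example.com or 555-010-1234.')
// sample.redacted is 'Reach Ana at [EMAIL] or [PHONE].' and sample.findings is 2
```

Counting findings alongside redaction is what makes a question like "are we exposing any PII?" answerable as a metric over time.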

Explore Security Posture →

Credit System

AgenticAnts uses a credit-based pricing model for flexible, usage-based billing.

How Credits Work

Credits are consumed based on platform usage:

Operation	Credit Cost
Trace ingestion (per 1000)	1 credit
Span ingestion (per 1000)	0.1 credit
Metric data point (per 1000)	0.05 credit
Data storage (per GB/month)	5 credits
API request (per 1000)	0.5 credit
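
A worked example may help when estimating monthly consumption from the table above. The rates are copied from the table; the usage figures are invented, and the sketch assumes partial thousands are prorated linearly, which the table does not state.

```typescript
// Rates copied from the credit table; `CREDIT_RATES` itself is a hypothetical name.
const CREDIT_RATES = {
  tracesPer1000: 1,
  spansPer1000: 0.1,
  metricPointsPer1000: 0.05,
  storagePerGbMonth: 5,
  apiRequestsPer1000: 0.5,
}

interface MonthlyUsage {
  traces: number
  spans: number
  metricPoints: number
  storageGb: number
  apiRequests: number
}

// Assumes linear proration; actual rounding rules are not specified here.
function estimateCredits(u: MonthlyUsage): number {
  return (
    (u.traces / 1000) * CREDIT_RATES.tracesPer1000 +
    (u.spans / 1000) * CREDIT_RATES.spansPer1000 +
    (u.metricPoints / 1000) * CREDIT_RATES.metricPointsPer1000 +
    u.storageGb * CREDIT_RATES.storagePerGbMonth +
    (u.apiRequests / 1000) * CREDIT_RATES.apiRequestsPer1000
  )
}

const monthlyCredits = estimateCredits({
  traces: 2000000,         // 2,000 credits
  spans: 20000000,         // 2,000 credits
  metricPoints: 100000000, // 5,000 credits
  storageGb: 50,           // 250 credits
  apiRequests: 1000000,    // 500 credits
})
// monthlyCredits comes to about 9,750 credits
```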

Credit Allocation

Credits can be used flexibly across:

Observability (traces, metrics, logs)
Agents (monitoring, analytics)
Policies (evaluation, enforcement)
Projects (multi-project organizations)

Learn more about credits →

Observability Model

AgenticAnts provides comprehensive observability for AI systems:

Collection Layer

code

Your Application
       ↓
AgenticAnts SDK / OpenTelemetry
       ↓
Ingestion Pipeline
       ↓
Storage & Indexing

Data Types

Traces: Complete execution paths
Metrics: Time-series measurements
Logs: Discrete events and messages
Metadata: Context and tags

Query Layer

code

Storage & Indexing
       ↓
Query Engine
       ↓
├─ Dashboard UI
├─ REST API
├─ GraphQL API
└─ Webhooks

Learn more about observability →

Data Model

Hierarchy

code

Organization
└─ Projects
   └─ Environments
      └─ Agents
         └─ Traces
            └─ Spans
               └─ Events

Relationships

Organizations contain multiple Projects
Projects have multiple Environments (prod, staging, dev)
Environments host multiple Agents
Agents generate Traces
Traces contain Spans
Spans can have Events
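
The containment relationships above map naturally onto nested types. The sketch below is one possible rendering; every type name is hypothetical and none of it reflects the actual AgenticAnts API schema. The helper walks the chain to count spans per environment.

```typescript
// Hypothetical types mirroring the hierarchy, one level per interface.
interface SpanNode { name: string; events: string[] }
interface TraceNode { name: string; spans: SpanNode[] }
interface AgentNode { name: string; traces: TraceNode[] }
interface EnvironmentNode { name: 'prod' | 'staging' | 'dev'; agents: AgentNode[] }
interface ProjectNode { name: string; environments: EnvironmentNode[] }
interface OrganizationNode { name: string; projects: ProjectNode[] }

// Walk Organization > Project > Environment > Agent > Trace and count spans.
function spansPerEnvironment(org: OrganizationNode): Record<string, number> {
  const counts: Record<string, number> = {}
  for (const project of org.projects) {
    for (const env of project.environments) {
      for (const agentNode of env.agents) {
        for (const trace of agentNode.traces) {
          counts[env.name] = (counts[env.name] || 0) + trace.spans.length
        }
      }
    }
  }
  return counts
}

const org: OrganizationNode = {
  name: 'acme',
  projects: [{
    name: 'support',
    environments: [
      {
        name: 'prod',
        agents: [{
          name: 'support-agent',
          traces: [{
            name: 'Customer Support Request',
            spans: [
              { name: 'Query Classification', events: [] },
              { name: 'LLM Processing', events: ['Threshold exceeded'] },
            ],
          }],
        }],
      },
      { name: 'dev', agents: [] },
    ],
  }],
}

const spanCounts = spansPerEnvironment(org) // { prod: 2 }
```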

Best Practices

1. Structured Instrumentation

typescript

// Good: Structured and consistent
await ants.trace.create({
  name: 'customer-support-agent',
  input: query,
  metadata: {
    customerId: '123',
    channel: 'web',
    priority: 'high'
  }
})

// Avoid: Unstructured or missing context
await ants.trace.create({
  name: 'agent',
  input: query
})

2. Meaningful Names

typescript

// Good: Descriptive and hierarchical
'customer-support.classify-query'
'customer-support.retrieve-context'
'customer-support.generate-response'

// Avoid: Vague or inconsistent
'function1'
'process'
'handler'
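
One way to keep names consistent is to build them through a single helper instead of writing string literals by hand. The sketch below enforces the dot-separated, kebab-case shape of the good examples; `spanName` is a hypothetical helper, not an SDK function, and the exact rules are an assumption.

```typescript
// A segment is lowercase kebab-case, e.g. 'customer-support' or 'classify-query'.
const SEGMENT = /^[a-z][a-z0-9]*(-[a-z0-9]+)*$/

// Join segments into '<agent>.<step>' form, rejecting flat or malformed names.
function spanName(...segments: string[]): string {
  if (segments.length < 2) {
    throw new Error('use at least "<agent>.<step>"')
  }
  for (const segment of segments) {
    if (!SEGMENT.test(segment)) {
      throw new Error(`invalid segment: "${segment}"`)
    }
  }
  return segments.join('.')
}

const classifyName = spanName('customer-support', 'classify-query')
// classifyName is 'customer-support.classify-query'
```

A flat name such as `spanName('process')` throws, which catches the vague cases at the call site instead of in the dashboard.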

3. Rich Metadata

Include relevant context:

typescript

{
  metadata: {
    // Business context
    customerId: '123',
    orderId: 'ORD-456',

    // Technical context
    modelName: 'gpt-4',
    temperature: 0.7,

    // Operational context
    region: 'us-east-1',
    version: '1.2.3'
  }
}

4. Error Handling

Always capture errors:

typescript

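try {
  const result = await agent.run(input)
  await trace.complete({ output: result })
} catch (error) {
  // A caught value is not guaranteed to be an Error, so normalize it first
  const err = error instanceof Error ? error : new Error(String(error))
  await trace.error({
    error: err.message,
    stack: err.stack,
    severity: 'error'
  })
  // Rethrow the original value so callers still see the failure
  throw error
}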

Common Patterns

Pattern 1: Multi-Agent Systems

typescript

const mainTrace = await ants.trace.create({
  name: 'multi-agent-workflow'
})

// Coordinator agent
const coordinatorSpan = mainTrace.span('coordinator-agent')
const plan = await coordinator.plan(query)
coordinatorSpan.end()

// Worker agents (parallel)
const results = await Promise.all(
  plan.tasks.map(task =>
    workerAgent.execute(task, mainTrace)
  )
)

// Awaited for consistency with the error-handling example
await mainTrace.complete({ output: results })

Pattern 2: RAG Systems

typescript

const trace = await ants.trace.create({
  name: 'rag-query-system'
})

// Retrieval phase
const retrievalSpan = trace.span('document-retrieval')
const docs = await vectorDB.search(query)
retrievalSpan.end({ documents: docs.length })

// Generation phase
const generationSpan = trace.span('llm-generation')
const response = await llm.generate({ query, context: docs })
generationSpan.end({ tokens: response.usage.total })

// Awaited for consistency with the error-handling example
await trace.complete({ output: response.text })

Pattern 3: Tool-Using Agents

typescript

const trace = await ants.trace.create({
  name: 'tool-using-agent'
})

// Agent decides which tools to use
const planSpan = trace.span('plan-tools')
const toolPlan = await agent.plan(query)
planSpan.end()

// Execute tools sequentially, one span per tool
for (const tool of toolPlan.tools) {
  const toolSpan = trace.span(`tool:${tool.name}`)
  const result = await executeTool(tool)
  toolSpan.end({ result })
}

// Awaited for consistency with the error-handling example
await trace.complete()

Next Steps

Explore each concept in detail:

Understanding AI Agents

Learn about AI agents and their characteristics

Three Pillars

Deep dive into FinOps, SRE, and Security Posture

Credit System

Understanding credits and pricing