OmniCoreAgent

The Open Production Agent Harness for Python

Parallel tool batches, structured observations, signature loop detection, MCP tools, memory, workspace files, subagents, background tasks, and REST/SSE serving.

---

What It Is

An LLM is not an agent by itself. The model provides intelligence; the harness

gives that intelligence a working environment.

OmniCoreAgent is the application-facing harness layer around a model:

text

model
  + prompt contract
  + reasoning loop
  + local tools
  + MCP tools
  + parallel tool batches
  + structured observations
  + memory
  + context control
  + workspace files
  + tool-output offloading
  + guardrails
  + events
  + subagents
  + background tasks
  + REST/SSE serving

That is the difference between an agent harness and a generic agent library.

A library gives you pieces to assemble. A harness gives you the runtime boundary

that makes a model usable inside an application.

OmniCoreAgent keeps that boundary explicit:

Layer	What It Owns
Agent harness	Model loop, prompt contract, tools, observations, memory, context, workspace, guardrails, events, subagents
Serving boundary	OmniServe REST/SSE APIs, request lifecycle, readiness, auth, rate limits, metrics
Background boundary	Durable scheduled/manual task execution with task state, run history, leases, retries, and workspace output
External tool boundary	MCP server tools and local Python tools exposed through one runtime surface

Start with the core harness. Turn on heavier production pieces only when the

workload needs them.

If you prefer guided docs, start with the

Quick Start.

If you use AI coding tools, use the

AI tools guide

for Ask AI, /llms.txt, hosted docs MCP, Cursor, VS Code, ChatGPT, Claude, and

Perplexity.

---

Quick Start

bash

pip install omnicoreagent

bash

export LLM_API_KEY=your_api_key

python

import asyncio
from omnicoreagent import OmniCoreAgent

agent = OmniCoreAgent(
    name="assistant",
    system_instruction="You are a helpful assistant.",
    model_config={"provider": "openai", "model": "gpt-4o"},
)

async def main():
    result = await agent.run(
        "Research the top 3 open-source agent runtimes and summarize them.",
        session_id="quickstart",
    )
    print(result["response"])
    await agent.cleanup()

asyncio.run(main())

That is the smallest path: one agent, one model, one stable session, the harness

loop, session memory, guardrails, workspace files, error handling, and metrics

around each run.

Context management, tool output offloading, BM25 tool retrieval, subagents, skills,

cloud workspace storage, and production backends are opt-in so a small agent stays

small.

Ready to go deeper? The Cookbook has progressive examples from
hello world to production deployments.

---

Choose Your Path

Goal	Start Here
Build your first agent	Quick Start
Add Python tools	Local tools cookbook
Connect MCP server tools	MCP tools cookbook
Manage memory and context	Getting started cookbook
Save files, artifacts, and large tool results	Tool offload cookbook
Build a production-shaped app harness	Real applications cookbook
Build multi-step workflows	Workflows cookbook
Serve an agent over HTTP/SSE	OmniServe cookbook
Use the docs inside AI tools	AI tools guide
Debug setup or configuration	Configuration guide
Understand the runtime internals	Implementation Map

---

What You Can Build

OmniCoreAgent is for application builders who need the agent runtime to hold

together after the prototype works.

Build	Harness Pieces You Use
MCP-connected product agents	MCP tools, local tools, structured observations, guardrails, session memory
Research and analysis agents	Parallel tool batches, workspace files, tool offloading, context management, artifact readback
Long-running worker agents	Background tasks, durable task stores, run history, workspace output, retries, cancellation
Multi-agent task systems	Dynamic subagents, shared workspace output, workflow orchestration, telemetry events
Agent APIs	OmniServe REST/SSE, readiness, auth, request timeout, rate limits, metrics
Production app integrations	Optional Redis, MongoDB, SQL, S3, and R2 backends without making the core install heavy

The core idea is simple: one harness entry point, many application membranes.

You bring the domain instructions, tools, and business logic. OmniCoreAgent

provides the execution boundary around them.

---

Why It Matters

Most demos stop at "LLM plus tool loop." Production agents fail in the layer

around that loop: slow sequential tool calls, noisy observations, repeated

actions, context exhaustion, unsafe tool output, missing workspace state,

uninspectable background work, and weak serving boundaries.

OmniCoreAgent exists for that layer.

1. Agents call tools in batches instead of forced sequences

The usual tool loop looks like this:

text

LLM -> call tool A -> wait -> result -> LLM -> call tool B -> wait -> result

OmniCoreAgent lets the model request independent tools together:

text

LLM -> [tool A + tool B + tool C in parallel] -> one structured observation -> LLM

The model gets one complete view of the batch before it reasons again. A failed

tool is represented beside the successful tools instead of silently collapsing the

whole step.

Native function calling alone is not the runtime. OmniCoreAgent uses its own

tool-call contract, parser, resolver, parallel runner, and result formatter so

the harness controls the full execution path.

2. Tool results become structured observations

Raw tool output is often too noisy for the next reasoning step. Large payloads,

errors, irrelevant fields, and prompt-injection content can all distort the loop.

OmniCoreAgent routes tool results through an observation pipeline:

text

tool output -> parse -> format -> guardrail check -> offload when configured -> observation -> model

The model receives the signal it needs to continue the task, not an unbounded dump

of every byte returned by a tool. When tool offloading is enabled, large outputs

are written into the active workspace and the model receives a readable preview

plus a path it can use later.

3. Loop detection uses signatures beyond step counts

max_steps is still useful, but it is a blunt instrument. It stops an agent that

is making progress just as quickly as one that is stuck.

OmniCoreAgent tracks SHA256-backed tool-call signatures across the loop. Each

signature is based on the tool name, input, and output for the call. The runtime

detects:

Consecutive loops: the same tool call returns the same result repeatedly.
Pattern loops: the same tool repeats a small interaction pattern.

When the harness stops a loop, the agent gets a reason. That makes debugging the

agent behavior much easier than "max iterations reached."

4. The harness is already assembled

OmniCoreAgent ships as a working harness, not a bag of disconnected pieces:

text

model + prompt + loop + tools + memory + context + workspace + guardrails + telemetry

Keep it small for simple agents, then turn on the heavier harness pieces when the

workload needs them: MCP tools, BM25 tool retrieval, dynamic subagents, skills,

cloud workspace storage, Redis/Postgres/MongoDB memory, telemetry events, and

OmniServe.

5. Context is managed before the model call

When context management is enabled, OmniCoreAgent checks the active message

history before every LLM request. If the configured threshold is crossed, the

harness automatically applies the selected strategy before calling the model:

text

messages -> threshold check -> truncate or summarize+truncate -> LLM

The system prompt is preserved, recent messages are preserved, and older middle

history is either summarized or removed depending on configuration. If you set

the budget below your model's real context window, the harness acts before the

provider rejects the request.

---

See It In Action

python

import asyncio
from omnicoreagent import MemoryRouter, OmniCoreAgent, ToolRegistry

tools = ToolRegistry()

@tools.register_tool("search_web")
def search_web(query: str) -> dict:
    """Search the web for information."""
    return {"results": [f"Result for: {query}"]}

@tools.register_tool("fetch_document")
def fetch_document(path: str) -> dict:
    """Fetch a domain document from an application-owned source."""
    return {"path": path, "content": f"Contents of {path}"}

agent = OmniCoreAgent(
    name="research-agent",
    system_instruction=(
        "You are a research assistant. Use tools in parallel when the calls are "
        "independent and you can reason over the results together."
    ),
    model_config={"provider": "openai", "model": "gpt-4o"},
    local_tools=tools,
    memory_router=MemoryRouter("in_memory"),
    agent_config={
        "max_steps": 20,
        "context_management": {"enabled": True},
        "tool_offload": {"enabled": True},
        "enable_subagents": True,
        "enable_advanced_tool_use": True,
    },
)

async def main():
    result = await agent.run(
        "Search for recent AI agent papers and fetch notes.md. Do both at once "
        "if neither depends on the other."
    )
    print(result["response"])
    await agent.cleanup()

asyncio.run(main())

The runtime accepts search_web and fetch_document in the same batch, returns both

results together, and continues from one structured observation.

---

Install Only What You Need

bash

pip install omnicoreagent                    # Core runtime
pip install "omnicoreagent[redis]"           # Redis memory backend
pip install "omnicoreagent[postgres]"        # PostgreSQL / SQL memory
pip install "omnicoreagent[mongodb]"         # MongoDB memory
pip install "omnicoreagent[s3]"              # S3 / R2 workspace storage
pip install "omnicoreagent[serve]"           # OmniServe REST/SSE API
pip install "omnicoreagent[tokenizer]"       # Token-aware context budgeting
pip install "omnicoreagent[otel]"            # OTLP trace export
pip install "omnicoreagent[langsmith]"       # LangSmith trace export
pip install "omnicoreagent[opik]"            # Comet Opik trace export
pip install "omnicoreagent[all]"             # Everything

Production backends are installable extras. Install only what the agent actually

uses.

---

Features

Core Runtime

Feature	What It Does
Parallel Batch Tool Execution	Executes independent tool calls concurrently and returns one combined observation to the model.
Structured Observation Pipeline	Parses, formats, guardrail-checks, and offloads tool results when configured before the model sees them.
Signature-Based Loop Detection	Detects repeated SHA256-backed tool-call signatures and repeated tool interaction patterns beyond step-count exhaustion.
Local Tool Registry	Registers Python functions as tools with inferred schemas and async/sync execution support.
Multi-Tier Memory	Uses in-memory, Redis, MongoDB, or SQL-backed session history through the memory router.
Context Engineering	Checks context before each model call and automatically truncates or summarizes when the configured budget threshold is crossed.
Workspace Files	Gives agents a local, S3, or R2-backed file workspace for notes, scratchpads, artifacts, and tool offloads.
Tool Output Offloading	Writes large tool results to workspace files and gives the model a preview plus a file reference.
Guardrails	Adds prompt-injection screening inside the observation path with configurable behavior.

Production Harness

Feature	What It Does
Dynamic Subagents	Lets the main agent spawn focused workers with isolated context and shared workspace output.
Durable Background Tasks	Runs manual or scheduled agent work with task state, run history, retries, cancellation, and workspace output.
Workflow Orchestration	Provides sequential, parallel, and router agents for multi-step application workflows.
Telemetry and Traces	Emits typed telemetry events, retrieves traces by exact `trace_id`, latest session, or `run_id` correlation, and exports traces to OTLP, LangSmith, Opik, or JSONL.
OmniServe	Turns an agent into a REST/SSE service with lifecycle management, auth, rate limits, telemetry APIs, background APIs, and metrics.

Integrations

Feature	What It Does
MCP Native Tools	Connects MCP servers over stdio, SSE, and Streamable HTTP, including OAuth-capable remote servers.
Agent Skills	Loads packaged capabilities implemented with Python, Bash, or Node.js.
BM25 Tool Retrieval	Selects relevant tools from large tool sets so the prompt stays focused.
Runtime Backend Switching	Switches memory backends at runtime when configured.
Universal Models	Supports OpenAI, Anthropic, Gemini, Groq, Ollama, DeepSeek, Mistral, OpenRouter, Azure, and Cencori through the runtime model layer.

---

Implementation Map

OmniCoreAgent's capabilities are backed by concrete runtime modules:

Capability	Where It Lives
Parallel tool batches	`core/tools/tool_batch_runner.py`
XML tool-call contract	`core/agents/xml_parser.py`
Structured observations	`core/tools/tool_observation.py`
Tool output offloading	`core/workspace/artifacts.py`
Automatic context control	`core/agents/llm_step.py`, `core/context_manager.py`
Workspace files	`core/workspace/tools.py`, `core/workspace/storage.py`
Dynamic subagents	`core/subagents.py`
Loop detection	`core/agents/loop_detection.py`
MCP server tools	`mcp_clients_connection/client.py`
OmniServe	`serve/`

See the Agent Harness docs

for the full implementation map.

---

Cookbook

All examples live in the **Cookbook** and are organized by use case.

Category	What You'll Build
Getting Started	First agent, tools, memory, telemetry events, and traces
Real Applications	Due diligence, support operations, and workspace code review harnesses
Workflows	Sequential, Parallel, Router agents
Background Agents	Scheduled autonomous tasks
Production	Guardrails, serving, and production patterns

---

Configuration

Environment Variables

For the first run, most hosted model providers only need LLM_API_KEY.

OmniCoreAgent defaults memory and events to in-memory storage, workspace files to

local disk, and optional production integrations stay off until you configure

them.

bash

export LLM_API_KEY=your_api_key

Add backend-specific variables only when you opt into Redis, MongoDB, SQL

database storage, S3, R2, or OmniServe deployment settings.

Full Harness Config Example

The defaults keep the first agent small: workspace files and guardrails are on,

conversation memory is in-memory, and advanced harness pieces stay off until

you enable them. This example shows the production-style switches together.

python

agent_config = {
    "max_steps": 15,
    "tool_call_timeout": 30,
    "request_limit": 0,                  # 0 = unlimited
    "total_tokens_limit": 0,             # 0 = unlimited
    "memory_config": {
        "mode": "sliding_window",
        "value": 10000,
        "summary": {"enabled": False},
    },
    "enable_workspace_files": True,      # Default on
    "guardrail_mode": "full",            # Default
    "context_management": {"enabled": True},  # Default off
    "tool_offload": {"enabled": True},        # Default off
    "enable_advanced_tool_use": True,         # Default off
    "enable_subagents": True,                 # Default off
    "enable_agent_skills": True,              # Default off
}

When enable_subagents is true, workspace files are enabled automatically so

subagents write outputs, notes, todos, and artifacts into the active

workspace.

Full reference: Configuration Guide

---

Development

bash

git clone https://github.com/omnirexflora-labs/omnicoreagent.git
cd omnicoreagent

uv venv && source .venv/bin/activate
uv sync --dev

pytest tests/ -v
pytest tests/ --cov=src --cov-report=term-missing

---

Troubleshooting

Error	Fix
`Invalid API key`	Export `LLM_API_KEY` with the key for the provider selected in `model_config`.
`ModuleNotFoundError` for Redis / Postgres / MongoDB / S3	Install the matching extra, for example `pip install "omnicoreagent[redis]"`.
`Redis connection failed`	Start Redis or use `MemoryRouter("in_memory")`.
`MCP connection refused`	Ensure the MCP server is running before starting the agent.

More help: Basic Usage Guide

---

Contributing

bash

git clone https://github.com/omnirexflora-labs/omnicoreagent.git
cd omnicoreagent

uv venv && source .venv/bin/activate
uv sync --dev
pre-commit install

See CONTRIBUTING.md for guidelines. PRs are welcome.

---

License

MIT - see LICENSE.

---

Author

**Built by Abiola Adeshina**.

GitHub: @Abiorh001
X (Twitter): @abiorhmangana
Email: abiolaadedayo1993@gmail.com

The OmniRexFlora Ecosystem

Project	Description
OmniMemory	Self-evolving memory for autonomous agents
OmniCoreAgent	Production agent harness (this project)
OmniDaemon	Event-driven runtime for running agents as supervised, autonomous infrastructure services

Built On

LiteLLM - FastAPI - Redis - Pydantic

---

Omnicoreagent

Documentation

What It Is

Quick Start

Choose Your Path

What You Can Build

Why It Matters

1. Agents call tools in batches instead of forced sequences

2. Tool results become structured observations

3. Loop detection uses signatures beyond step counts

4. The harness is already assembled

5. Context is managed before the model call

See It In Action

Install Only What You Need

Features

Core Runtime

Production Harness

Integrations

Implementation Map

Cookbook

Configuration

Environment Variables

Full Harness Config Example

Development

Troubleshooting

Contributing

License

Author

The OmniRexFlora Ecosystem

Built On

Similar MCP

Mcp Aoai Web Browsing

Davinci Resolve Mcp

Aws Mcp Server

Biomcp

Trending MCP

Playwright Mcp

Serena

Mcp Playwright

Mcp Server Cloudflare

Mcp Aoai Web Browsing

Davinci Resolve Mcp

Aws Mcp Server

Biomcp

Playwright Mcp

Serena

Mcp Playwright

Mcp Server Cloudflare