Agent and Application Orchestration

Overview

This document covers the layer above model serving: the software that coordinates agents, tools, memory, sessions, handoffs, guardrails, and workflow state.
It does not cover the runtime layer that actually executes model weights; that belongs to a different document.
A useful stack is:
Model runtime → serving API → agent orchestration SDK → protocol layers → app/backend/frontend → platform
Examples:
The important boundary is this:
Runtimes execute models. Agent frameworks decide how model calls, tools, and state are coordinated.

Mental model

What this layer is responsible for

This layer usually owns:

What this layer usually does not own

This layer usually does not own:
Those belong lower in the stack.

Product categories

Agent orchestration frameworks

These define how agents are composed, routed, resumed, and observed.

Tool-calling and function-execution layer

This is the layer that connects the model to real actions.
Common patterns:

Memory and session layer

This handles continuity across turns and across runs.
Common patterns:

Guardrails and policy layer

This is where agent behavior is constrained or checked.
Common patterns:

Multi-agent routing and handoffs

This is the control-flow layer for specialization.
Common patterns:

Quick comparison

| Name | Primary role | Main strength | State model | Best fit |
| --- | --- | --- | --- | --- |
| OpenAI Agents SDK | Agent orchestration SDK | Clean agent, tool, handoff, guardrail model | Sessions and run-level state | Applications built around tool use and handoffs |
| LangGraph | Graph-based orchestration framework | Durable execution and explicit stateful workflows | Strong explicit graph and checkpoint state | Long-running, resumable, production-style agent workflows |
| AutoGen | Multi-agent framework | Conversational multi-agent patterns | Agent/chat-centric state | Existing AutoGen users and research/prototyping patterns |
| Semantic Kernel | AI middleware + agent framework | Enterprise-oriented integration and plugin model | App/service-oriented state | Teams building AI features into larger business systems |

Tool-calling and execution design

What matters

In practice, tool-calling design is one of the main determinants of whether an agent system is useful or fragile.
The important questions are:

Common implementation patterns

Typed function calling

The model selects from a defined set of functions with structured arguments.
Best when:
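A minimal sketch of typed function calling: tools are registered with a JSON-schema-style argument spec, and a dispatcher validates a model-emitted call before executing it. The tool name `get_weather` and the registry shape are illustrative, not any particular SDK's API.

```python
import json
from typing import Any, Callable

# Registry mapping tool names to (callable, argument schema) pairs.
TOOLS: dict[str, tuple[Callable[..., Any], dict]] = {}

def tool(name: str, schema: dict):
    """Register a function as a callable tool with a typed argument schema."""
    def wrap(fn):
        TOOLS[name] = (fn, schema)
        return fn
    return wrap

@tool("get_weather", {"type": "object",
                      "properties": {"city": {"type": "string"}},
                      "required": ["city"]})
def get_weather(city: str) -> str:
    return f"sunny in {city}"  # stub implementation for illustration

def dispatch(call: dict) -> str:
    """Execute a model-emitted call like {"name": ..., "arguments": "<json>"}."""
    fn, schema = TOOLS[call["name"]]
    args = json.loads(call["arguments"])
    # Minimal structural check: required keys must be present.
    missing = [k for k in schema.get("required", []) if k not in args]
    if missing:
        raise ValueError(f"missing arguments: {missing}")
    return fn(**args)

result = dispatch({"name": "get_weather", "arguments": '{"city": "Oslo"}'})
```

The key design choice is that the model only ever selects a name and arguments; the application owns validation and execution.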

MCP-backed tools

A protocol layer exposes tools from external systems in a standard way.
Best when:
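A simplified sketch of the server side of an MCP-style tool protocol: JSON-RPC requests with `tools/list` and `tools/call` methods against a local registry. This is a toy illustration of the shape of the exchange, not the full protocol (no transport, capability negotiation, or error objects), and the `search_docs` tool is hypothetical.

```python
# Hypothetical tool registry exposed over a JSON-RPC-style interface.
TOOLS = {
    "search_docs": lambda query: f"results for {query!r}",
}

def handle(request: dict) -> dict:
    """Handle a simplified MCP-shaped JSON-RPC request."""
    method = request["method"]
    if method == "tools/list":
        # Advertise available tools so the client can offer them to the model.
        result = {"tools": [{"name": n} for n in TOOLS]}
    elif method == "tools/call":
        params = request["params"]
        out = TOOLS[params["name"]](**params["arguments"])
        result = {"content": [{"type": "text", "text": out}]}
    else:
        raise ValueError(f"unknown method: {method}")
    return {"jsonrpc": "2.0", "id": request["id"], "result": result}

resp = handle({"jsonrpc": "2.0", "id": 1, "method": "tools/list"})
```

The point of the protocol layer is that the orchestration framework never needs to know how a tool is implemented, only how to list and call it.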

Sandboxed execution tools

The agent can run code or shell commands inside an isolated environment.
Best when:
Risk:
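A minimal sketch of sandboxed code execution: run untrusted code in a separate interpreter process with a hard timeout. This shows only the first isolation layer; a production sandbox would also limit memory, CPU, filesystem, and network access (e.g. via containers or seccomp), which this sketch does not do.

```python
import os
import subprocess
import sys
import tempfile

def run_sandboxed(code: str, timeout_s: float = 5.0) -> str:
    """Run untrusted Python code in a separate process and return its stdout."""
    with tempfile.NamedTemporaryFile("w", suffix=".py", delete=False) as f:
        f.write(code)
        path = f.name
    try:
        proc = subprocess.run(
            [sys.executable, "-I", path],  # -I: isolated mode, ignores user site/env
            capture_output=True,
            text=True,
            timeout=timeout_s,  # kills runaway agent-generated code
        )
        return proc.stdout
    finally:
        os.unlink(path)

out = run_sandboxed("print(21 * 2)")
```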

Memory and session layers

Short-term vs long-term memory

A useful distinction:
These should not automatically be treated as the same thing.
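The separation above can be sketched as two distinct stores with different lifetimes: a bounded short-term buffer of recent turns, and a long-term store that facts are promoted into explicitly. The class and field names are illustrative, not any framework's API.

```python
from collections import deque

class SessionMemory:
    """Keeps short-term context separate from long-term, cross-run memory."""

    def __init__(self, max_turns: int = 10):
        # Short-term: recent turns only, oldest dropped automatically.
        self.short_term = deque(maxlen=max_turns)
        # Long-term: explicitly promoted facts that persist across runs.
        self.long_term: dict[str, str] = {}

    def add_turn(self, role: str, text: str) -> None:
        self.short_term.append((role, text))

    def remember(self, key: str, value: str) -> None:
        # Promotion to long-term memory is a deliberate decision,
        # not an automatic side effect of adding a turn.
        self.long_term[key] = value

    def context_window(self) -> list:
        return list(self.short_term)

mem = SessionMemory(max_turns=2)
mem.add_turn("user", "hi")
mem.add_turn("assistant", "hello")
mem.add_turn("user", "my name is Sam")  # evicts the oldest turn
mem.remember("user_name", "Sam")
```

Keeping the two stores distinct makes eviction policy, persistence, and privacy decisions independent of each other.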

Session design questions

Important design questions:

Engineering reality

Most bad agent memory systems fail because they mix all of this together:
Those should be modeled separately.

Guardrails

What guardrails are actually for

Guardrails are not magic safety dust.
They are explicit checks around inputs, outputs, tool calls, and side effects.
Useful guardrails include:
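As a sketch of guardrails as explicit checks rather than safety dust, the following shows one check per stage: input screening, a tool allow-list, and output redaction. The patterns (a blocked SQL phrase, an `sk-`-prefixed secret) are illustrative placeholders, not a real policy.

```python
import re

class GuardrailViolation(Exception):
    """Raised when a check fails; the caller decides how to recover."""

def input_guardrail(text: str) -> str:
    """Reject inputs matching a blocked-pattern list (illustrative pattern)."""
    if re.search(r"(?i)drop\s+table", text):
        raise GuardrailViolation("blocked input pattern")
    return text

def tool_call_guardrail(name: str, args: dict, allowed: set) -> None:
    """Allow-list check that runs before a tool call executes."""
    if name not in allowed:
        raise GuardrailViolation(f"tool {name!r} is not allowed")

def output_guardrail(text: str, max_len: int = 2000) -> str:
    """Bound output size and redact obvious secret-shaped strings."""
    text = re.sub(r"sk-[A-Za-z0-9]+", "[REDACTED]", text)
    return text[:max_len]

safe = output_guardrail("token sk-abc123 leaked")
```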

Where guardrails belong

Good systems usually place guardrails at multiple points:

Multi-agent routing and handoffs

When multi-agent actually helps

Multi-agent designs help when there is real specialization, for example:

When it does not help

It does not help when one agent could do the job and the system is split into many agents just to look advanced.
That usually adds:

Handoff design questions

When one agent transfers control to another, you need to define:
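One way to make those definitions concrete is to model a handoff as an explicit record rather than an implicit call: what context is transferred, who gets control back, and who owns side effects. The field names and the `triage`/`billing_specialist` agents are illustrative, not any SDK's handoff API.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Handoff:
    """Explicit record of one agent transferring control to another."""
    from_agent: str
    to_agent: str
    context: dict                    # exactly what the receiving agent is told
    return_to: Optional[str] = None  # who gets control back, if anyone
    side_effects_owner: str = ""     # who is accountable for writes

def hand_off(current: str, target: str, context: dict) -> Handoff:
    # Default policy: control returns to the sender, and the receiver
    # owns any side effects it performs while in control.
    return Handoff(from_agent=current, to_agent=target,
                   context=context, return_to=current,
                   side_effects_owner=target)

h = hand_off("triage", "billing_specialist", {"ticket_id": "T-42"})
```

Making the handoff a value also gives you something to log, which is what makes multi-agent control flow debuggable.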

Protocol layers

Protocols are not the same thing as orchestration frameworks.
A useful split is:

MCP

Category: Tool and context protocol
What it is
Engineering strengths
Operational concerns
Best fit
Poor fit

A2A

Category: Agent-to-agent protocol
What it is
Engineering strengths
Operational concerns
Best fit
Poor fit

AG-UI

Category: Agent-to-frontend interaction protocol
What it is
Engineering strengths
Operational concerns
Best fit
Poor fit

Framework profiles

OpenAI Agents SDK

Category: Agent orchestration SDK
What it is
Engineering strengths
Operational concerns
Best fit
Poor fit

LangGraph

Category: Graph-based orchestration framework
What it is
Engineering strengths
Operational concerns
Best fit
Poor fit

AutoGen

Category: Multi-agent framework
What it is
Engineering strengths
Operational concerns
Best fit
Poor fit

Semantic Kernel

Category: AI middleware + agent framework
What it is
Engineering strengths
Operational concerns
Best fit
Poor fit

Reference architectures

Thin agent layer over an existing model API

Typical stack:
Good for:

Durable workflow agent system

Typical stack:
Good for:

Multi-agent specialist system

Typical stack:
Good for:

Enterprise app integration pattern

Typical stack:
Good for:

Failure modes

Agent systems usually break in boring ways, not magical ones.
Common failure modes:
In practice, most production pain comes from control-flow ambiguity, state ambiguity, and side-effect ambiguity.
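Side-effect ambiguity in particular often surfaces as duplicate writes when a step is retried. A common mitigation is an idempotency key on mutating tool calls, sketched below with a hypothetical `charge_card` tool and an in-memory store standing in for a database.

```python
# In-memory stand-in for a durable idempotency store.
executed: dict = {}

def charge_card(amount: int, idempotency_key: str) -> str:
    """Hypothetical mutating tool call. Repeating the same key returns
    the original result instead of performing the side effect twice."""
    if idempotency_key in executed:
        return executed[idempotency_key]
    receipt = f"charged {amount} (key={idempotency_key})"
    executed[idempotency_key] = receipt
    return receipt

first = charge_card(100, "run-7/step-3")
retry = charge_card(100, "run-7/step-3")  # the agent retried the same step
```

Keys derived from the run and step identity make retries safe without the agent having to reason about whether a step already happened.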

What actually matters in framework selection

When comparing frameworks, the real engineering questions are: