MCP is written with LLM apps as the clients, not human end-users. It provides a set of conventions on how to agnostically provide context to LLM apps.
MCP 101
There are 3 key participants in the MCP architecture:
- MCP Host: The AI application that manages one or more MCP clients, e.g., Claude Code, Copilot, etc.
- MCP Client: A component that maintains a connection to an MCP server and obtains context from an MCP server for the MCP host to use.
- MCP Server: A program that provides context to MCP clients.
MCP consists of two layers:
- Transport layer: The outer layer that defines communication mechanisms and
channels that enable client-server data exchange, e.g., message framing,
authorization, etc. MCP supports two transport mechanisms:
- Stdio for local processes on the same machine.
- Streamable HTTP for remote server communication with bearer tokens, API keys, and custom headers.
- Data layer: The inner layer that defines the JSON-RPC based protocol for client-server communication, e.g., lifecycle management and core primitives.
MCP primitives define what clients and servers can offer each other. Servers can expose:
- Tools: Executable functions that the MCP host can invoke to perform actions.
- Resources: Data sources that provide contextual information.
- Prompts: Reusable templates to help structure LLM inference.
… where each primitive has associated methods for discovery (*/list) and
retrieval (*/get). Additionally, there’s tools/call for execution.
Clients can expose these MCP primitives:
- Sampling: Allows servers to request LLM inference from the client’s AI application, enabling the MCP server to stay model-independent.
- Elicitation: Allows servers to request additional information from users.
- Logging: Enables servers to send log messages to clients for debugging and monitoring purposes.
- Tasks (Experimental): Durable execution wrappers that enable deferred result retrieval and status tracking for MCP requests.
MCP also supports notifications to enable dynamic updates between servers
and clients. Unlike other messages, notifications lack an id field because the
sender (which can be either the server or the client) doesn’t need a response.
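To make the request/notification distinction concrete, here is a minimal sketch using only Python’s standard library. The helper names (make_request, make_notification, expects_response) are hypothetical and not part of any MCP SDK.

```python
from typing import Any, Dict, Optional


def make_request(msg_id: int, method: str, params: Optional[Dict[str, Any]] = None) -> Dict[str, Any]:
    """Build a JSON-RPC 2.0 request; the id lets the receiver correlate its response."""
    msg: Dict[str, Any] = {"jsonrpc": "2.0", "id": msg_id, "method": method}
    if params is not None:
        msg["params"] = params
    return msg


def make_notification(method: str, params: Optional[Dict[str, Any]] = None) -> Dict[str, Any]:
    """Build a JSON-RPC 2.0 notification; there is deliberately no id field."""
    msg: Dict[str, Any] = {"jsonrpc": "2.0", "method": method}
    if params is not None:
        msg["params"] = params
    return msg


def expects_response(msg: Dict[str, Any]) -> bool:
    """A message with both a method and an id is a request that must be answered."""
    return "method" in msg and "id" in msg
```

A receiver could use something like expects_response to decide whether it owes the sender a reply.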
Sample Flow
The client sends an initialize request, e.g.,
{
"jsonrpc": "2.0",
"id": 1,
"method": "initialize",
"params": {
"protocolVersion": "2025-06-18",
"capabilities": { "elicitation": {} },
"clientInfo": { "name": "example-client", "version": "1.0.0" }
}
}
… and the server responds:
{
"jsonrpc": "2.0",
"id": 1,
"result": {
"protocolVersion": "2025-06-18",
"capabilities": {
"tools": { "listChanged": true },
"resources": {}
},
"serverInfo": { "name": "example-server", "version": "1.0.0" }
}
}
… and the MCP host stores these capabilities for later use.
If a mutually compatible protocolVersion cannot be found, the connection
should be terminated. Otherwise, the client notifies the server that it’s ready:
{ "jsonrpc": "2.0", "method": "notifications/initialized" }
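As I understand the lifecycle rules, the version check could look roughly like the sketch below: the server echoes the requested version if it supports it, otherwise it offers another one, and the client disconnects if it can’t work with the offer. The function names and the lists of supported versions are assumptions for illustration.

```python
from typing import Sequence


def server_pick_version(requested: str, server_supported: Sequence[str]) -> str:
    """Server side: echo the requested version if supported, else offer the newest one."""
    if requested in server_supported:
        return requested
    # Date-formatted version strings sort chronologically, so max() is the newest.
    return max(server_supported)


def client_check_version(offered: str, client_supported: Sequence[str]) -> str:
    """Client side: accept the offered version or give up on the connection."""
    if offered not in client_supported:
        raise ConnectionError(
            f"no mutually compatible protocolVersion (server offered {offered})"
        )
    return offered
```

Only after client_check_version succeeds would the client send notifications/initialized.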
To discover available tools, the client sends {"jsonrpc": "2.0", "id": 2, "method": "tools/list"} and the server responds, e.g.,
{
"jsonrpc": "2.0",
"id": 2,
"result": {
"tools": [
{
"name": "weather_current", // Unique within the server's namespace
"title": "Weather Information", // Human readable name of the tool
"description": "Get current weather information for any location worldwide",
"inputSchema": { // Enables type validation and documentation on required params
"type": "object",
"properties": {
"location": {
"type": "string",
"description": "City name, address, or coordinates (latitude, longitude)"
},
"units": {
"type": "string",
"enum": ["metric", "imperial", "kelvin"],
"description": "Temperature units to use in response",
"default": "metric"
}
},
"required": ["location"]
}
}
]
}
}
… and the MCP host combines all the tools from the MCP servers into a unified tool registry that the LLM can access.
There’s additional bookkeeping needed on the MCP host. While weather_current
is unique in example-server’s namespace, that might not hold in the MCP
host’s tool registry. A potential solution is prefixing with the server’s name,
e.g., example-server.weather_current, and the host strips the prefix at the
host/client boundary.
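A minimal sketch of that bookkeeping, assuming a "." separator and a hypothetical ToolRegistry class (neither is prescribed by MCP). It keeps a lookup table instead of splitting the qualified name, so server or tool names that themselves contain the separator don’t cause ambiguity.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Tuple

SEPARATOR = "."  # assumed convention, not part of MCP


@dataclass
class ToolRegistry:
    # Maps the host-level qualified name to (server name, server-local tool name).
    _routes: Dict[str, Tuple[str, str]] = field(default_factory=dict)

    def register(self, server_name: str, tools: List[Dict[str, Any]]) -> None:
        """Add the tools from one server's tools/list result under a prefix."""
        for tool in tools:
            qualified = f"{server_name}{SEPARATOR}{tool['name']}"
            self._routes[qualified] = (server_name, tool["name"])

    def names(self) -> List[str]:
        """Qualified names, i.e., what the LLM would see."""
        return sorted(self._routes)

    def route(self, qualified_name: str, arguments: Dict[str, Any]) -> Tuple[str, Dict[str, Any]]:
        """Return the target server and tools/call params with the prefix stripped."""
        server_name, tool_name = self._routes[qualified_name]
        return server_name, {"name": tool_name, "arguments": arguments}
```

The host would pass the returned params to the MCP client connected to the returned server.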
When the LLM decides to use a given tool, the MCP host routes it to the
appropriate MCP client which sends a tools/call request to the MCP server,
e.g.,
{
"jsonrpc": "2.0",
"id": 3,
"method": "tools/call",
"params": {
"name": "weather_current", // Must match the name from tools/list
"arguments": { "location": "San Francisco", "units": "imperial" }
}
}
… and the MCP server responds:
{
"jsonrpc": "2.0",
"id": 3,
"result": {
"content": [
{
"type": "text",
"text": "Current weather in San Francisco: 68°F, partly cloudy with light winds from the west at 8 mph. Humidity: 65%"
}
]
}
}
One of the common criticisms of MCP is that cases like weather_current can
already be solved with existing tech, i.e., by exposing the API docs and a
utility for making network calls. Admittedly, if I squint, MCP is the name for
this technique. It’s not a new concept, but the name makes it easier to talk
about unambiguously and succinctly.
For real-time updates, MCP supports notifications. For example, an MCP server
that specified "tools": { "listChanged": true } can send a notification like:
{ "jsonrpc": "2.0", "method": "notifications/tools/list_changed" }. The client
typically reacts by requesting tools/list and refreshing its list of available
tools. That way, the client doesn’t need to poll for changes.
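The refresh-on-notification pattern can be sketched as follows. ToolCache is a hypothetical class, and fetch_tools stands in for a real tools/list round trip over the transport.

```python
from typing import Any, Callable, Dict, List


class ToolCache:
    """Keeps a client-side copy of the server's tools, refreshed only when told to."""

    def __init__(self, fetch_tools: Callable[[], List[Dict[str, Any]]]) -> None:
        self._fetch_tools = fetch_tools  # stand-in for sending tools/list
        self.tools = fetch_tools()  # initial discovery

    def handle_message(self, message: Dict[str, Any]) -> None:
        """React to an incoming message; only list_changed triggers a refetch."""
        if message.get("method") == "notifications/tools/list_changed":
            self.tools = self._fetch_tools()
```

Between notifications the cache is served as-is, which is what removes the need for polling.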
Incomplete messages for simplicity’s sake? One can envision a case where
notifications/tools/list_changed also comes with tools/list’s payload
because that’s what the MCP client is expected to call.
MCP Primitives
Resources
Resources provide structured access to information that the AI app can
retrieve and pass to the model as context, e.g., a server can expose
calendar://events/2026 to return calendar availability for the user. A server
can also expose resource templates, e.g.,
travel://activities/{city}/{category} for flexible queries, e.g.,
travel://activities/barcelona/museums.
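To illustrate how a concrete URI maps back onto a template, here is a small sketch that handles only simple {name} placeholders; real URI templates can be richer than this. The function names are hypothetical.

```python
import re
from typing import Dict, Optional


def compile_template(template: str) -> "re.Pattern[str]":
    """Turn 'travel://activities/{city}/{category}' into a regex with named groups."""
    # Splitting on a capturing group alternates literal text and placeholder names.
    parts = re.split(r"\{(\w+)\}", template)
    pattern = ""
    for index, part in enumerate(parts):
        if index % 2 == 0:
            pattern += re.escape(part)  # literal segment
        else:
            pattern += f"(?P<{part}>[^/]+)"  # placeholder: one path segment
    return re.compile(pattern)


def match_uri(template: str, uri: str) -> Optional[Dict[str, str]]:
    """Return the extracted parameters, or None if the URI doesn't fit the template."""
    match = compile_template(template).fullmatch(uri)
    return match.groupdict() if match else None
```

For example, matching travel://activities/barcelona/museums against the template above yields city and category parameters that a server-side handler could use for the lookup.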
The boundaries here seem pretty fluid. For example, filtering to contextual
data, e.g., travel://activities/barcelona/museums and
file://documents/travel/passport.pdf, may require an inference to map the user
query against the list of available resources. Furthermore, resource templates
seem like they should be behind a tool call, e.g., getAttractions(city: str | None, category: str | None).
Prompts
Prompts provide reusable templates for common scenarios, e.g., a
plan-vacation prompt that accepts a destination, duration, budget, and
interests.
I don’t fully understand the utility here. If the vacation MCP server provides
a plan-vacation prompt, then the customizations are more like a way to ensure
that the user provides all of the required data upfront. Contrast this to a case
where the user has vague vacation plans and the host needs to elicit more
specific information from the user before calling a tool in the vacation MCP
server.
If the plan-vacation prompt contains procedural guidance on how to call the
tools in the vacation MCP server, then why not have those instructions in some
internal system prompt? Ah, we’re assuming that vacation has an LLM, which is
not true in general.
Tools
A simple server implementation:
from fastmcp import FastMCP

mcp = FastMCP(name="CalculatorServer")

@mcp.tool
def add(a: int, b: int) -> int:
    """Adds two integer numbers together."""
    return a + b
The function name, add, becomes the tool name and the docstring becomes the
tool description. The @mcp.tool decorator also accepts arguments such as
name, description, tags, icons, annotations, meta, timeout,
version, and output_schema. The @tool decorator is available for
instance/class methods; @mcp.tool wouldn’t work as it registers the tool
immediately.
Both async def and def are supported. Synchronous tools run in a threadpool so
that they don’t block the event loop. For I/O-bound operations, prefer async
tools, which avoid the threadpool dispatch overhead.
FastMCP generates an input schema based on the function’s params and type
annotations. Given an incoming tools/call message, FastMCP parses the input,
returning errors if validation fails. The function’s output is also validated
against the output schema derived from the return type. FastMCP supports all
Pydantic types as well. bytes parameters are not base64-decoded automatically;
for base64 data, use str and manually call base64.b64decode(). By default,
FastMCP uses strict_input_validation=False, which allows "10" to be coerced
into an int where the tool requires one. However, Pydantic models must be
provided as JSON objects (dicts); strict_input_validation=False won’t work on
stringified JSON.
Because we’re dealing with an LLM, it seems that strict_input_validation=False
is desirable to combat unreliability on the LLM’s part.
The robustness principle states: be conservative in what you send, be liberal in what you accept. However, this principle has been criticized for entrenching flaws as a de facto standard, because ensuring interoperability in such environments means aiming to be bug-for-bug compatible.
Use Pydantic’s Field class with Annotated to impose validation constraints,
e.g.,
from typing import Annotated
from pydantic import Field

@mcp.tool
def search_database(
    query: Annotated[str, Field(description="Search query string")],
    limit: Annotated[int, Field(description="Maximum number of results", ge=1, le=100)] = 10,
) -> list:
    """Search the database with the provided query."""
    # Implementation...
… where Field supports validation and documentation features like
description, ge, gt, le, lt, min_length, max_length, pattern,
and default.
Parameters using Depends() are automatically excluded from the tool schema,
e.g.,
from fastmcp import FastMCP
from fastmcp.dependencies import Depends

mcp = FastMCP()

def get_user_id() -> str:
    return "user_123"  # Injected at runtime

@mcp.tool
def get_user_details(user_id: str = Depends(get_user_id)) -> str:
    # user_id is injected by the server, not provided by the LLM.
    return f"Details for {user_id}"
This gels with Deterministic Control Flows vs. Prompting.
While we could prompt the LLM to give the correct user_id, this is a
deterministic security/privacy boundary that we shouldn’t toss the dice for.
The return type annotation influences the response. For example, -> int makes FastMCP generate a JSON schema for the output and also include
structuredContent in the response, e.g.,
{
"content": [
{
"type": "text",
"text": "8"
}
],
"structuredContent": {
"result": 8
}
}
… That said, with -> ToolResult from fastmcp.tools.tool, one can specify
the traditional content, structured data, and metadata (e.g., execution time).
When the tool raises an exception, FastMCP logs it and sends back an MCP error
response including the details. Specifying mask_error_details=True suppresses
internal error details and uses a generic error message. However, ToolErrors
from fastmcp.exceptions are always sent to clients with their error messages.
@mcp.tool(timeout=30.0) makes FastMCP return an error to the client indicating
that the call timed out. However, tools that run as background tasks execute in
a Docket worker and need a Docket timeout, e.g.,
from datetime import timedelta
from docket import Timeout

@mcp.tool(task=True)
async def long_running_task(
    data: str,
    timeout: Timeout = Timeout(timedelta(minutes=10)),
):
    ...
One can specify tags, e.g., @mcp.tool(tags={"admin"}), to allow batch updates.
For instance, mcp.disable(tags={"admin"}) disables all tools with the admin
tag. Disabled tools don’t appear in list_tools and can’t be called. If there
are any connected clients, then FastMCP sends each of them a
notifications/tools/list_changed notification so that they can refresh their
tool list.
@mcp.tool supports annotations, which are specialized metadata that do not
appear in LLM prompts, but can support appropriate UX patterns. Possible
annotations include title: str, readOnlyHint: bool, destructiveHint: bool,
idempotentHint: bool, and openWorldHint: bool. For example, an MCP host can
choose to skip confirmation prompts for read-only tools, and enable more
aggressive batching and caching.
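A host-side policy built on these hints might look like the sketch below. The function name, the three return levels, and the conservative handling of missing hints are assumptions of this sketch; since hints are only advisory, a host would presumably apply them only to servers it trusts.

```python
from typing import Any, Dict, Optional


def confirmation_level(annotations: Optional[Dict[str, Any]]) -> str:
    """Map tool annotations to a UX decision: 'none', 'confirm', or 'warn'."""
    hints = annotations or {}
    if hints.get("readOnlyHint") is True:
        return "none"  # read-only tools can skip the confirmation prompt
    if hints.get("destructiveHint", True):
        return "warn"  # assume destructive unless the server says otherwise
    return "confirm"  # modifies state, but declared non-destructive
```

For example, a tool annotated with readOnlyHint set to true would be invoked without prompting, while a tool with no annotations at all would get the strongest prompt.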
Tasks
MCP allows requestors (which can be clients or servers) to augment their requests with tasks, e.g.,
{
"jsonrpc": "2.0",
"id": 1,
"method": "tools/call",
"params": {
"name": "get_weather",
"arguments": {"city": "New York"},
"task": {}
}
}
… upon which the receiver immediately returns, e.g.,
{
"jsonrpc": "2.0",
"id": 1,
"result": { // CreateTaskResult
"task": {
"taskId": "task-123", // Receiver-generated ID
"status": "working",
"statusMessage": "The operation is now in progress.",
"createdAt": "2025-11-25T10:30:00Z",
"lastUpdatedAt": "2025-11-25T10:40:00Z",
"ttl": 60000,
"pollInterval": 5000 // As a courtesy, the requestor should respect this.
}
}
}
The requestor then polls tasks/get until the task reaches a terminal status
(completed, failed, or cancelled) or until encountering the input_required
status. Calling tasks/result blocks until the task reaches a terminal status
(but requestors can keep polling tasks/get in parallel). The receiver responds
to tasks/result with a result, e.g.,
{
"jsonrpc": "2.0",
"id": 4,
"result": {
"content": [
{
"type": "text",
"text": "Current weather in New York:\nTemperature: 68°F\nConditions: Partly windy"
}
],
"isError": false,
"_meta": {
"io.modelcontextprotocol/related-task": {
"taskId": "task-123"
}
}
}
}
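The polling side of this exchange can be sketched as a simple loop. Here get_task stands in for a tasks/get round trip, and the function name, the injectable sleep, and the fallback interval are assumptions; the sketch also ignores ttl expiry for brevity.

```python
import time
from typing import Any, Callable, Dict

TERMINAL_STATUSES = {"completed", "failed", "cancelled"}


def poll_until_settled(
    get_task: Callable[[str], Dict[str, Any]],
    task_id: str,
    sleep: Callable[[float], None] = time.sleep,
    default_interval_ms: int = 1000,  # assumed fallback when pollInterval is absent
) -> Dict[str, Any]:
    """Poll until the task is terminal or needs input, honoring pollInterval."""
    while True:
        task = get_task(task_id)
        status = task["status"]
        if status in TERMINAL_STATUSES or status == "input_required":
            return task
        # pollInterval is in milliseconds; time.sleep expects seconds.
        sleep(task.get("pollInterval", default_interval_ms) / 1000)
```

Once the loop returns with a terminal status, the requestor would follow up with tasks/result to fetch the actual payload.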
The tasks.list and tasks.cancel capabilities support the tasks/list and
tasks/cancel operations. The server exposes the tasks.requests.tools.call
capability to express support for task-augmented tools/call requests.
Similarly, the client exposes the tasks.requests.sampling.createMessage and
tasks.requests.elicitation.create capabilities to express support for the
task-augmented sampling/createMessage and elicitation/create requests.
In the result of a tools/list call, servers declare tool-level support via
execution.taskSupport. If omitted or forbidden, then servers return an error
if a client attempts a task-augmented tool call. If required, servers return
an error if the client omits task-augmentation. If optional, then it’s up to
the client to choose the mode.
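These rules amount to a small decision table. The sketch below encodes them with a hypothetical function name and a ValueError standing in for the protocol-level error response.

```python
from typing import Optional


def resolve_execution_mode(task_support: Optional[str], task_requested: bool) -> str:
    """Return 'task' or 'sync' for a tools/call, or raise if the combination is invalid."""
    support = task_support or "forbidden"  # omitted behaves like forbidden
    if support == "forbidden":
        if task_requested:
            raise ValueError("tool does not support task-augmented calls")
        return "sync"
    if support == "required":
        if not task_requested:
            raise ValueError("tool requires task-augmented calls")
        return "task"
    if support == "optional":
        return "task" if task_requested else "sync"
    raise ValueError(f"unknown taskSupport value: {support!r}")
```

Only the optional mode leaves the choice to the client; the other two modes have exactly one valid combination each.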
FastMCP’s @mcp.tool(task=True) is shorthand for
@mcp.tool(task=TaskConfig(mode="optional")). FastMCP has graceful degradation
for requestors sending task-augmented requests for mode="forbidden" so that
clients can always request background execution without worrying about server
capabilities. However, there are no accommodations for requestors failing to meet
mode="required".
When a task status changes, receivers may send a notifications/tasks/status
to inform the requestor. The notification includes the full Task object, and
so an additional tasks/get isn’t necessary (but a tasks/result would be).
The requestor must not rely on notifications/tasks/status and should poll via
tasks/get to ensure they receive status updates.
Why is it optional for receivers to send notifications/tasks/status? Surely,
notifications/tasks/status is more efficient than polling tasks/get? Maybe
some MCP clients have already shipped without notifications/tasks/status and
thus this is a back-compat guidance? But an experimental API shouldn’t try to be
backwards-compatible…
When an authorization context is present, receivers must bind tasks to said
context; requestors shouldn’t be able to view/modify tasks that they did not
create. If auth is not available, receivers must generate cryptographically
secure task IDs and use shorter TTLs to reduce the exposure window. Furthermore,
such receivers should not declare the tasks.list capability.
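A receiver without an authorization context could generate IDs and clamp TTLs along these lines. The "task-" prefix, the function names, and both TTL caps are illustrative assumptions, not values from the spec.

```python
import secrets
from typing import Optional

AUTH_TTL_CAP_MS = 60 * 60 * 1000  # assumed cap when tasks are bound to an auth context
NO_AUTH_TTL_CAP_MS = 5 * 60 * 1000  # assumed shorter cap when the ID is the only secret


def new_task_id() -> str:
    """Generate an unguessable task ID from a cryptographically secure source."""
    return "task-" + secrets.token_urlsafe(32)  # 32 random bytes


def pick_ttl_ms(requested_ttl_ms: Optional[int], has_auth_context: bool) -> int:
    """Clamp the requested TTL; unauthenticated tasks get the shorter cap."""
    cap = AUTH_TTL_CAP_MS if has_auth_context else NO_AUTH_TTL_CAP_MS
    if requested_ttl_ms is None:
        return cap
    return min(requested_ttl_ms, cap)
```

Sequential or timestamp-based IDs would defeat the purpose here, because knowing a task ID is effectively the only credential.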
FastMCP supports two backends for task execution. The in-memory backend has no external dependencies but loses tasks on server restart, has a high ~250ms task pickup time, and has no horizontal scaling. The Redis backend provides persistence across server restarts, sub-10ms task pickup latency, and horizontal scaling.
To report progress back to clients, FastMCP can inject a Progress object, with
APIs for set_total(n: int), increment(amount: int), and set_message(text),
e.g.,
from fastmcp.dependencies import Progress

@mcp.tool(task=True)
async def process_files(files: list[str], progress: Progress = Progress()) -> str:
    ...
References
- Architecture overview - Model Context Protocol. modelcontextprotocol.io. Accessed May 9, 2026.
- Understanding MCP servers - Model Context Protocol. modelcontextprotocol.io. Accessed May 9, 2026.
- Tools - FastMCP. gofastmcp.com. Accessed May 14, 2026.
- Robustness principle - Wikipedia. en.wikipedia.org. Accessed May 14, 2026.
- Tasks - Model Context Protocol. modelcontextprotocol.io. Accessed May 14, 2026.
- Background Tasks - FastMCP. gofastmcp.com. Accessed May 14, 2026.
So typically, the MCP server and the client are implementation details that the end user doesn’t really see. The user’s view is mediated by the MCP host.