Model Context Protocol (MCP)

Dated May 9, 2026; last modified on Thu, 14 May 2026

MCP is written with LLM apps as the clients, not human end-users. It provides a set of conventions for supplying context to LLM apps in a provider-agnostic way.

MCP 101

There are 3 key participants in the MCP architecture:

  • MCP Host: The AI application that manages one or more MCP clients, e.g., Claude Code, Copilot, etc.
  • MCP Client: A component that maintains a connection to an MCP server and obtains context from an MCP server for the MCP host to use.
  • MCP Server: A program that provides context to MCP clients.

So typically, the MCP server and the client are implementation details that the end user doesn’t really see. The user’s view is mediated by the MCP host.

MCP consists of two layers:

  • Transport layer: The outer layer that defines communication mechanisms and channels that enable client-server data exchange, e.g., message framing, authorization, etc. MCP supports two transport mechanisms:
    • Stdio for local processes on the same machine.
    • Streamable HTTP for remote server communication with bearer tokens, API keys, and custom headers.
  • Data layer: The inner layer that defines the JSON-RPC based protocol for client-server communication, e.g., lifecycle management and core primitives.

MCP primitives define what clients and servers can offer each other. Servers can expose:

  • Tools: Executable functions that the MCP host can invoke to perform actions.
  • Resources: Data sources that provide contextual information.
  • Prompts: Reusable templates to help structure LLM inference.

… where each primitive has associated methods for discovery (*/list) and retrieval (*/get). Additionally, there’s tools/call for execution.

Clients can expose these MCP primitives:

  • Sampling: Allows servers to request LLM inference from the client’s AI application, enabling the MCP server to stay model-independent.
  • Elicitation: Allows servers to request additional information from users.
  • Logging: Enables servers to send log messages to clients for debugging and monitoring purposes.
  • Tasks (Experimental): Durable execution wrappers that enable deferred result retrieval and status tracking for MCP requests.

MCP also supports notifications to enable dynamic updates between servers and clients. Unlike other messages, notifications lack an id field as the sender (can be either the server or the client) doesn’t need a response.
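A minimal sketch of how a receiver might tell the three JSON-RPC message kinds apart, using only the id/method rule described above. The classify helper is hypothetical and not part of any MCP SDK.

```python
import json


def classify(raw: str) -> str:
    """Classify a JSON-RPC 2.0 message by which fields it carries."""
    msg = json.loads(raw)
    if "method" in msg:
        # Requests carry an id so the response can be matched to them;
        # notifications omit it because no response is expected.
        return "request" if "id" in msg else "notification"
    # No method: this is a response (result or error) to an earlier request.
    return "response"


kind = classify('{"jsonrpc": "2.0", "method": "notifications/initialized"}')
```

Here kind ends up as "notification" because the message has a method but no id.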

Sample Flow

The client sends an initialize request, e.g.,

{
  "jsonrpc": "2.0",
  "id": 1,
  "method": "initialize",
  "params": {
    "protocolVersion": "2025-06-18",
    "capabilities": { "elicitation": {} },
    "clientInfo": { "name": "example-client", "version": "1.0.0" }
  }
}

… and the server responds:

{
  "jsonrpc": "2.0",
  "id": 1,
  "result": {
    "protocolVersion": "2025-06-18",
    "capabilities": {
      "tools": { "listChanged": true },
      "resources": {}
    },
    "serverInfo": { "name": "example-server", "version": "1.0.0" }
  }
}

… and the MCP host stores these capabilities for later use.

If a mutually compatible protocolVersion cannot be found, the connection should be terminated. Otherwise, the client notifies the server that it’s ready: { "jsonrpc": "2.0", "method": "notifications/initialized" }
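The client side of this handshake could be sketched roughly as follows. SUPPORTED_VERSIONS and both helper names are assumptions for illustration; a real client would take its version list from whichever SDK it uses.

```python
# Hypothetical set of protocol versions this client can speak.
SUPPORTED_VERSIONS = {"2025-06-18", "2025-03-26"}


def check_initialize_result(result: dict) -> str:
    """Return the agreed protocol version, or raise so the caller can disconnect."""
    version = result.get("protocolVersion")
    if version not in SUPPORTED_VERSIONS:
        raise ConnectionError(f"Unsupported protocol version: {version!r}")
    return version


def initialized_notification() -> dict:
    """Build the 'ready' message; it has no id because no response is expected."""
    return {"jsonrpc": "2.0", "method": "notifications/initialized"}
```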

To discover available tools, the client sends {"jsonrpc": "2.0", "id": 2, "method": "tools/list"} and the server responds, e.g.,

{
  "jsonrpc": "2.0",
  "id": 2,
  "result": {
    "tools": [
      {
        "name": "weather_current", // Unique within the server's namespace
        "title": "Weather Information", // Human readable name of the tool
        "description": "Get current weather information for any location worldwide",
        "inputSchema": { // Enables type validation and documentation on required params
          "type": "object",
          "properties": {
            "location": {
              "type": "string",
              "description": "City name, address, or coordinates (latitude, longitude)"
            },
            "units": {
              "type": "string",
              "enum": ["metric", "imperial", "kelvin"],
              "description": "Temperature units to use in response",
              "default": "metric"
            }
          },
          "required": ["location"]
        }
      }
    ]
  }
}

… and the MCP host combines all the tools from the MCP servers into a unified tool registry that the LLM can access.

There’s additional bookkeeping needed on the MCP host. While weather_current is unique in example-server’s namespace, that might not hold in the MCP host’s tool registry. A potential solution is prefixing with the server’s name, e.g., example-server.weather_current, and the host strips the prefix at the host/client boundary.
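A rough sketch of that bookkeeping, assuming the host keeps a lookup table from the prefixed name back to the server and the original tool name. ToolRegistry is a hypothetical class, not something the MCP spec prescribes.

```python
class ToolRegistry:
    """Host-side registry that namespaces tool names by server."""

    def __init__(self) -> None:
        # Maps the LLM-facing name to (server name, original tool name).
        self._routes: dict[str, tuple[str, str]] = {}

    def register(self, server: str, tools: list[dict]) -> None:
        """Add the tools from one server's tools/list result."""
        for tool in tools:
            self._routes[f"{server}.{tool['name']}"] = (server, tool["name"])

    def names(self) -> list[str]:
        """Names to show to the LLM."""
        return sorted(self._routes)

    def route(self, qualified: str) -> tuple[str, str]:
        """Recover which client to use and which name to put in tools/call."""
        return self._routes[qualified]


registry = ToolRegistry()
registry.register("example-server", [{"name": "weather_current"}])
registry.register("other-server", [{"name": "weather_current"}])
```

Storing the route in a table, rather than splitting the prefixed string on the dot, keeps the mapping unambiguous even if a server or tool name itself contains a dot.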

When the LLM decides to use a given tool, the MCP host routes it to the appropriate MCP client which sends a tools/call request to the MCP server, e.g.,

{
  "jsonrpc": "2.0",
  "id": 3,
  "method": "tools/call",
  "params": {
    "name": "weather_current", // Must match the name from tools/list
    "arguments": { "location": "San Francisco", "units": "imperial" }
  }
}

… and the MCP server responds:

{
  "jsonrpc": "2.0",
  "id": 3,
  "result": {
    "content": [
      {
        "type": "text",
        "text": "Current weather in San Francisco: 68°F, partly cloudy with light winds from the west at 8 mph. Humidity: 65%"
      }
    ]
  }
}

One of the common criticisms of MCP is that cases like weather_current can already be solved with existing tech, i.e., expose the API docs and a utility for making network calls. Admittedly, if I squint, MCP is the name for this technique. It’s not a new concept, but the name makes it easier to talk about unambiguously and succinctly.

For real-time updates, MCP supports notifications. For example, an MCP server that specified "tools": { "listChanged": true } can send a notification like: { "jsonrpc": "2.0", "method": "notifications/tools/list_changed" }. The client typically reacts by requesting tools/list and refreshing its list of available tools. That way, the client doesn’t need to poll for changes.
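The refresh-on-notification pattern might look something like this on the client. ToolCache and the send_request hook are hypothetical stand-ins for whatever transport the client actually uses.

```python
from typing import Callable


class ToolCache:
    """Keeps a local copy of a server's tool list and refreshes it on demand."""

    def __init__(self, send_request: Callable[[str], dict]) -> None:
        # send_request is a hypothetical hook: it takes a method name and
        # returns the "result" object of the server's response.
        self._send_request = send_request
        self.tools: list[dict] = []
        self.refresh()

    def refresh(self) -> None:
        self.tools = self._send_request("tools/list")["tools"]

    def on_notification(self, message: dict) -> None:
        # Only re-fetch when the server says the list changed; no polling.
        if message.get("method") == "notifications/tools/list_changed":
            self.refresh()
```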

Incomplete messages for simplicity’s sake? One can envision a case where notifications/tools/list_changed also comes with tools/list’s payload because that’s what the MCP client is expected to call.

MCP Primitives

Resources

Resources provide structured access to information that the AI app can retrieve and pass to the model as context, e.g., a server can expose calendar://events/2026 to return calendar availability for the user. A server can also expose resource templates, e.g., travel://activities/{city}/{category} for flexible queries, e.g., travel://activities/barcelona/museums.

The boundaries here seem pretty fluid. For example, filtering to contextual data, e.g., travel://activities/barcelona/museums and file://documents/travel/passport.pdf, may require an inference to map the user query against the list of available resources. Furthermore, resource templates seem like they should be behind a tool call, e.g., getAttractions(city: str | None, category: str | None).

Prompts

Prompts provide reusable templates for common scenarios, e.g., a plan-vacation prompt that accepts a destination, duration, budget, and interests.

I don’t fully understand the utility here. If the vacation MCP server provides a plan-vacation prompt, then the customizations are more like a way to ensure that the user provides all of the required data upfront. Contrast this to a case where the user has vague vacation plans and the host needs to elicit more specific information from the user before calling a tool in the vacation MCP server.

If the plan-vacation prompt contains procedural guidance on how to call the tools in the vacation MCP server, then why not have those instructions in some internal system prompt? Ah, we’re assuming that vacation has an LLM, which is not true in general.

Tools

A simple server implementation:

from fastmcp import FastMCP

mcp = FastMCP(name="CalculatorServer")

@mcp.tool
def add(a: int, b: int) -> int:
  """Adds two integer numbers together."""
  return a + b

The function name, add, becomes the tool name and the docstring becomes the tool description. The @mcp.tool decorator also accepts arguments such as name, description, tags, icons, annotations, meta, timeout, version, and output_schema. The @tool decorator is available for instance/class methods; @mcp.tool wouldn’t work as it registers the tool immediately.

async def and def are both supported. Synchronous tools run in a threadpool to avoid blocking the event loop. For I/O-bound operations, prefer async tools because they avoid the threadpool dispatch overhead.

FastMCP generates an input schema based on the function’s params and type annotations. Given an incoming tools/call message, FastMCP parses the input, returning errors if validation fails. The function’s output is also validated against the output schema derived from the return type. FastMCP supports all Pydantic types as well. bytes parameters are not base64-decoded automatically; for base64 data, use str and manually call base64.b64decode(). By default, FastMCP uses strict_input_validation=False, which allows "10" to be coerced into an int where the tool requires one. However, Pydantic models must be provided as JSON objects (dicts); strict_input_validation=False won’t work on stringified JSON.
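The difference between lenient and strict validation can be illustrated with a stdlib-only stand-in. This is not how FastMCP or Pydantic implement it; validate_int is a hypothetical helper that only mimics the "10" to 10 coercion described above.

```python
def validate_int(value: object, strict: bool) -> int:
    """Validate an integer argument, optionally coercing numeric strings."""
    if isinstance(value, int):
        return value
    if not strict and isinstance(value, str):
        try:
            # Lenient mode: accept "10" where the schema asks for an int.
            return int(value.strip())
        except ValueError:
            raise TypeError(f"cannot coerce {value!r} to int") from None
    raise TypeError(f"expected int, got {type(value).__name__}")


lenient = validate_int("10", strict=False)
```

In lenient mode the string "10" comes back as the integer 10, while the same call with strict=True raises a TypeError.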

Because we’re dealing with an LLM, it seems that strict_input_validation=False is desirable to combat unreliability on the LLM’s part.

The robustness principle states: be conservative in what you send, be liberal in what you accept. However, this principle has been criticized for entrenching flaws as a de facto standard, because ensuring interoperability in such environments amounts to aiming to be bug-for-bug compatible.

Use Pydantic’s Field class with Annotated to impose validation constraints, e.g.,

from typing import Annotated

from pydantic import Field

@mcp.tool
def search_database(
  query: Annotated[str, Field(description="Search query string")],
  limit: Annotated[int, Field(description="Maximum number of results", ge=1, le=100)] = 10
) -> list:
  """Search the database with the provided query."""
  # Implementation...

… where Field supports validation and documentation features like description, ge, gt, le, lt, min_length, max_length, pattern, and default.

Parameters using Depends() are automatically excluded from the tool schema, e.g.,

from fastmcp import FastMCP
from fastmcp.dependencies import Depends

mcp = FastMCP()

def get_user_id() -> str:
  return "user_123" # Injected at runtime

@mcp.tool
def get_user_details(user_id: str = Depends(get_user_id)) -> str:
  # user_id is injected by the server, not provided by the LLM.
  return f"Details for {user_id}"

This gels with Deterministic Control Flows vs. Prompting. While we could prompt the LLM to give the correct user_id, this is a deterministic security/privacy boundary that we shouldn’t roll the dice on.

The return type annotation influences the response. For example, -> int makes FastMCP generate a JSON schema for the output and also include structuredContent in the response, e.g.,

{
  "content": [
    {
      "type": "text",
      "text": "8"
    }
  ],
  "structuredContent": {
    "result": 8
  }
}

… That said, with -> ToolResult from fastmcp.tools.tool, one can specify the traditional content, structured data, and metadata (e.g., execution time).

When the tool raises an exception, FastMCP logs it and sends back an MCP error response including the details. Specifying mask_error_details=True suppresses internal error details and uses a generic error message. However, ToolErrors from fastmcp.exceptions are always sent to clients with their error messages.
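The masking rule can be summarized in a few lines. The ToolError class below is a local stand-in for fastmcp.exceptions.ToolError so that the sketch runs on its own, and both error_text and the generic message string are assumptions rather than FastMCP’s actual output.

```python
class ToolError(Exception):
    """Local stand-in for fastmcp.exceptions.ToolError."""


def error_text(exc: Exception, mask_error_details: bool) -> str:
    """Decide what error message the client gets to see."""
    if isinstance(exc, ToolError):
        # Deliberately raised by the tool author, so always forwarded as-is.
        return str(exc)
    if mask_error_details:
        # Hypothetical generic message; internal details stay in the server log.
        return "Error calling tool"
    return f"{type(exc).__name__}: {exc}"
```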

With @mcp.tool(timeout=30.0), FastMCP returns an error to the client indicating that the call timed out if the tool runs longer than 30 seconds. However, tools that run as background tasks execute in a Docket worker and need a Docket timeout, e.g.,

from datetime import timedelta
from docket import Timeout

@mcp.tool(task=True)
async def long_running_task(
  data: str,
  timeout: Timeout = Timeout(timedelta(minutes=10)),
):
  ...

One can specify tags, e.g., @mcp.tool(tags={"admin"}), to allow batch updates. For instance, mcp.disable(tags={"admin"}) disables all tools with the admin tag. Disabled tools don’t appear in list_tools and can’t be called. If there are any connected clients, then FastMCP sends each of them a notifications/tools/list_changed notification so that they can refresh their tool list.

@mcp.tool supports annotations, which are specialized metadata that do not appear in LLM prompts, but can support appropriate UX patterns. Possible annotations include title: str, readOnlyHint: bool, destructiveHint: bool, idempotentHint: bool, and openWorldHint: bool. For example, an MCP host can choose to skip confirmation prompts for read-only tools, and enable more aggressive batching and caching.

Tasks

Tasks are still experimental per the MCP specification, having been proposed in Nov 2025.

MCP allows requestors (can be clients or servers) to augment their requests with tasks, e.g.,

{
  "jsonrpc": "2.0",
  "id": 1,
  "method": "tools/call",
  "params": {
    "name": "get_weather",
    "arguments": {"city": "New York"},
    "task": {}
  }
}

… upon which the receiver immediately returns, e.g.,

{
  "jsonrpc": "2.0",
  "id": 1,
  "result": { // CreateTaskResult
    "task": {
      "taskId": "task-123", // Receiver-generated ID
      "status": "working",
      "statusMessage": "The operation is now in progress.",
      "createdAt": "2025-11-25T10:30:00Z",
      "lastUpdatedAt": "2025-11-25T10:40:00Z",
      "ttl": 60000,
      "pollInterval": 5000 // As a courtesy, the requestor should respect this.
    }
  }
}

The requestor then polls tasks/get until the task reaches a terminal status ( completed, failed, or cancelled) or until encountering the input_required status. Calling tasks/result blocks until the task reaches a terminal status (but requestors can keep polling tasks/get in parallel). The receiver responds to tasks/result with a result, e.g.,

{
  "jsonrpc": "2.0",
  "id": 4,
  "result": {
    "content": [
      {
        "type": "text",
        "text": "Current weather in New York:\nTemperature: 68°F\nConditions: Partly windy"
      }
    ],
    "isError": false,
    "_meta": {
      "io.modelcontextprotocol/related-task": {
        "taskId": "task-123"
      }
    }
  }
}
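The polling side of this flow could be sketched as below. get_task is a hypothetical callable that performs a tasks/get round trip and returns the Task object, and the sketch assumes pollInterval is expressed in milliseconds, as the 5000 in the earlier example suggests.

```python
import time
from typing import Callable

TERMINAL_STATUSES = {"completed", "failed", "cancelled"}


def poll_task(
    get_task: Callable[[str], dict],
    task_id: str,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Poll tasks/get until the task is terminal or needs input from the requestor."""
    while True:
        task = get_task(task_id)
        if task["status"] in TERMINAL_STATUSES or task["status"] == "input_required":
            return task
        # Respect the receiver's suggested interval; fall back to 1s if absent.
        sleep(task.get("pollInterval", 1000) / 1000)
```

Once poll_task returns with a terminal status, the requestor would follow up with tasks/result to fetch the actual payload. The sleep parameter exists only so the loop can be exercised without real delays.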

The tasks.list and tasks.cancel capabilities support the tasks/list and tasks/cancel operations. The server exposes the tasks.requests.tools.call capability to express support for task-augmented tools/call requests. Similarly, the client exposes the tasks.requests.sampling.createMessage and tasks.requests.elicitation.create capabilities to express support for the task-augmented sampling/createMessage and elicitation/create requests.

In the result of a tools/list call, servers declare tool-level support via execution.taskSupport. If omitted or forbidden, then servers return an error if a client attempts a task-augmented tool call. If required, servers return an error if the client omits task-augmentation. If optional, then it’s up to the client to choose the mode.

FastMCP’s @mcp.tool(task=True) is shorthand for @mcp.tool(task=TaskConfig(mode="optional")). FastMCP has graceful degradation for requestors sending task-augmented requests for mode="forbidden" so that clients can always request background execution without worrying about server capabilities. However, there are no accommodations for requestors failing to meet mode="required".

In what cases would a client choose not to task-augment a tool call? Maybe the client doesn’t yet have support for showing incremental results?

Re: forking of the implementation for execution.taskSupport = "optional": FastMCP’s task=True shorthand implies that the fork lives in the library, and not in the application code.

FastMCP’s deviation from MCP’s spec for execution.taskSupport = "forbidden" is entrenching a flaw into a de facto standard, which is what the robustness principle was criticized for.

When a task status changes, receivers may send a notifications/tasks/status to inform the requestor. The notification includes the full Task object, and so an additional tasks/get isn’t necessary (but a tasks/result would be). The requestor must not rely on notifications/tasks/status and should poll via tasks/get to ensure they receive status updates.

Why is it optional for receivers to send notifications/tasks/status? Surely, notifications/tasks/status is more efficient than polling tasks/get? Maybe some MCP clients have already shipped without notifications/tasks/status and thus this is a back-compat guidance? But an experimental API shouldn’t try to be backwards-compatible…

When an authorization context is present, receivers must bind tasks to said context; requestors shouldn’t be able to view/modify tasks that they did not create. If auth is not available, receivers must generate cryptographically secure task IDs with shorter TTLs to reduce the exposure window. Furthermore, such receivers should not declare the tasks.list capability.
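For the no-auth case, an unguessable task ID can come straight from Python’s secrets module. The helper names and both TTL values below are assumptions chosen for illustration; the text above only says the TTL should be shorter when there is no auth context.

```python
import secrets


def new_task_id() -> str:
    """Generate an unguessable task ID from the OS's CSPRNG."""
    # 32 random bytes, URL-safe base64 encoded (about 43 characters).
    return "task-" + secrets.token_urlsafe(32)


def task_ttl_ms(has_auth_context: bool) -> int:
    """Pick a TTL; without auth, the ID is the only secret, so expire sooner."""
    return 3_600_000 if has_auth_context else 60_000  # hypothetical values
```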

FastMCP supports two backends for task execution. The in-memory backend has no external dependencies but loses tasks on server restart, has a high ~250ms task pickup time, and has no horizontal scaling. The Redis backend provides persistence across server restarts, sub-10ms task pickup latency, and horizontal scaling.

To report progress back to clients, FastMCP can inject a Progress object, with APIs for set_total(n: int), increment(amount: int), and set_message(text), e.g.,

from fastmcp.dependencies import Progress

@mcp.tool(task=True)
async def process_files(files: list[str], progress: Progress = Progress()) -> str:
  ...

References

  1. Architecture overview - Model Context Protocol. modelcontextprotocol.io. Accessed May 9, 2026.
  2. Understanding MCP servers - Model Context Protocol. modelcontextprotocol.io. Accessed May 9, 2026.
  3. Tools - FastMCP. gofastmcp.com. Accessed May 14, 2026.
  4. Robustness principle - Wikipedia. en.wikipedia.org. Accessed May 14, 2026.
  5. Tasks - Model Context Protocol. modelcontextprotocol.io. Accessed May 14, 2026.
  6. Background Tasks - FastMCP. gofastmcp.com. Accessed May 14, 2026.