Model Context Protocol has become the de facto standard for connecting AI agents to external tools and data. Every serious AI platform has adopted it: Claude, Cursor, Zed, VS Code Copilot. The ecosystem of MCP servers is growing fast — Jira, GitHub, Slack, Linear, Postgres, Stripe, and thousands of others already have production-quality MCP servers available. Discourse has a capable AI agent layer. It should be able to talk to all of them.

This post lays out the design we’re working towards for MCP client support — Discourse as the host that connects to external MCP servers, not Discourse as a server others connect to. That distinction matters. The goal is to let Discourse AI agents call tools from any MCP-compatible service, with minimal configuration and solid security defaults.

What MCP actually is

MCP is a JSON-RPC 2.0 protocol with a defined lifecycle. A client (the host application) connects to a server (the tool provider), negotiates capabilities, and then can call tools/list to discover what tools are available and tools/call to invoke them. Tools are described with a name, a description, and a JSON Schema for their parameters — the same shape Discourse’s own tool system uses internally.
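Concretely, the shapes on the wire look like this, sketched as Ruby hashes (the envelope fields come from JSON-RPC 2.0 and the MCP tools spec; `jira_search_issues` and its schema are invented for illustration):

```ruby
require "json"

# Discover what tools the server offers.
list_request = { jsonrpc: "2.0", id: 1, method: "tools/list" }

# An abridged server answer: each tool carries a name, a description,
# and a JSON Schema for its parameters under inputSchema.
list_response = {
  jsonrpc: "2.0", id: 1,
  result: {
    tools: [
      {
        name: "jira_search_issues",
        description: "Search Jira issues with a JQL query",
        inputSchema: {
          type: "object",
          properties: { jql: { type: "string" } },
          required: ["jql"]
        }
      }
    ]
  }
}

# Invoke one of the discovered tools.
call_request = {
  jsonrpc: "2.0", id: 2, method: "tools/call",
  params: { name: "jira_search_issues", arguments: { jql: "priority = P1" } }
}
```

The `inputSchema` here is exactly the shape Discourse's own tool signatures use, which is what makes the mapping later in this post mechanical.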

MCP’s design draws explicit inspiration from the Language Server Protocol, which standardised how editors talk to language-specific tooling. MCP attempts the same for AI agents and external context providers.

The spec defines transport mechanisms, of which the current standard is Streamable HTTP: a single endpoint handling both POST (sending requests) and optional GET (receiving server-initiated SSE streams). The older HTTP+SSE transport from 2024-11-05 is now deprecated.
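As a sketch, a single tools/list request over Streamable HTTP looks like this (the endpoint URL is hypothetical; the header requirements come from the transport spec):

```ruby
require "net/http"
require "json"
require "uri"

# The client POSTs a JSON-RPC message and must be prepared for either a
# plain JSON body or an SSE stream in response, so it advertises both.
uri = URI("https://mcp.example.com/mcp")
req = Net::HTTP::Post.new(uri)
req["Content-Type"] = "application/json"
req["Accept"] = "application/json, text/event-stream"
req.body = JSON.generate(jsonrpc: "2.0", id: 1, method: "tools/list")

# Sending is one line, omitted here to keep the sketch offline:
# Net::HTTP.start(uri.host, uri.port, use_ssl: true) { |h| h.request(req) }
```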

Beyond tools, MCP also defines resources (contextual data the server exposes) and prompts (templated message flows). We’re not touching those in v0. Tools are where the value is and where we can ship something coherent.

Where it fits in Discourse’s AI stack

Discourse AI has a layered architecture:

  1. lib/completions/ — LLM abstraction. Dialects, endpoints, streaming. This is how Discourse talks to OpenAI, Anthropic, Gemini, etc.
  2. lib/agents/tools/ — Tool framework. ~35 built-in tools (Search, WebBrowser, EditPost, CreateArtifact, etc.), plus Custom < Tool which wraps AiTool records running JavaScript via MiniRacer.
  3. lib/agents/agent.rb — Agent base class. Assembles tool signatures into prompts, routes tool calls to the right class, manages the conversation loop.

MCP belongs at layer 2. An MCP server isn’t a new kind of LLM — it’s a new kind of tool source. The right integration point is lib/agents/tools/mcp.rb, following the exact pattern Tools::Custom already established.

Custom.class_instance(tool_id) dynamically generates a Ruby class per AiTool database record so each custom tool looks like a native tool to the agent system. MCP needs the same trick: one Ruby class per discovered tool, but the “records” come from a remote server’s tools/list response rather than the database. That’s the core mechanism.

Transport

For a web application running multiple worker processes, Streamable HTTP is the right default. The Streamable HTTP transport maps cleanly onto Rails. stdio requires spawning a subprocess per connection; in a multi-worker environment you’d need a broker process or limit stdio servers to background jobs only. That complexity is real and the benefit — running local binaries — is primarily useful for self-hosted deployments.

Tool discovery is not a session concern. The tools/list response is cached globally per MCP server, not per conversation or user. Tools exposed by a Jira or GitHub MCP server are the same regardless of who’s talking to the bot. The cache is warmed on first use, has a configurable TTL, and is refreshed in the background. When a user returns to a conversation three days later, the tool list is fetched fresh if the cache has expired — that’s correct behaviour, not a failure mode. Tools may have been added or removed; the agent works with whatever is current.

Sessions are a separate concern, and simpler than they look. The spec’s session management mechanism is optional — servers MAY issue an Mcp-Session-Id at initialization; many won’t. For those that do, the ID must be included in all subsequent requests; a 404 response means the session has expired and the client must re-initialize.

For v0, we adopt a per-agent-turn session model: initialize at the start of each bot reply, call whatever tools are needed, and let the session lapse. No session IDs are persisted between turns. This works correctly for the vast majority of MCP servers, which are stateless API wrappers (Jira, GitHub, Slack) that don’t maintain any cross-turn server-side state. The session ID in those cases is effectively a handshake artifact.

The spec defines sessions as a transport mechanism — for routing SSE streams and optional server-side bookkeeping — not as a semantic continuity guarantee between tool calls. There is no protocol expectation that a server remembers what happened in a previous tools/call invocation. Per-turn initialization is fully compliant.
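The per-turn lifecycle, including the spec's 404-means-expired rule, can be sketched as follows (the transport object and its return shape are assumptions for illustration, not Discourse's actual client):

```ruby
# Per-turn session sketch. `transport` is a hypothetical object
# responding to post(message, headers) and returning
# { status:, headers:, body: }.
class McpTurnSession
  def initialize(transport)
    @transport = transport
    @next_id = 0
    @session_id = nil
  end

  # Called at the start of each bot reply.
  def start
    res = post("initialize", {})
    # Optional per the spec: many servers never issue a session ID.
    @session_id = res[:headers]["Mcp-Session-Id"]
    res
  end

  def call_tool(name, arguments, retried: false)
    res = post("tools/call", name: name, arguments: arguments)
    if res[:status] == 404 && !retried
      start # session expired: re-initialize, then retry once
      return call_tool(name, arguments, retried: true)
    end
    res[:body]
  end

  private

  def post(method, params)
    headers = @session_id ? { "Mcp-Session-Id" => @session_id } : {}
    message = { jsonrpc: "2.0", id: (@next_id += 1), method: method, params: params }
    @transport.post(message, headers)
  end
end
```

Nothing is persisted when the turn ends; the next reply simply calls `start` again.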

One detail that does need care: when a client POSTs a tools/call request, the server can respond with either Content-Type: application/json (synchronous) or Content-Type: text/event-stream (SSE, for long-running tools). Tool results must be synchronous from the agent’s perspective — they’re fed back into the conversation. For the SSE case, we buffer until the final response event arrives.
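A minimal sketch of that buffering, assuming the whole SSE body is available as a string (a real client would read the stream incrementally and handle non-JSON events):

```ruby
require "json"

# Scan SSE events until the JSON-RPC response whose id matches our
# request arrives; notifications (no id) are skipped.
def buffer_sse_response(sse_body, request_id)
  sse_body.split("\n\n").each do |event|
    data_lines = event.lines.select { |line| line.start_with?("data:") }
    next if data_lines.empty?
    # Per the SSE format, multi-line data fields join with newlines.
    payload = data_lines.map { |line| line.sub(/\Adata: ?/, "").chomp }.join("\n")
    message = JSON.parse(payload)
    return message if message["id"] == request_id &&
                      (message.key?("result") || message.key?("error"))
  end
  nil
end
```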

The dynamic tool problem

This is the interesting engineering challenge. Discourse’s tool model is one-Ruby-class-per-tool. An MCP server exposes N tools dynamically, discovered at runtime via tools/list. You can’t hard-code a class for jira_search_issues — you don’t know it exists until you’ve connected to the server.

The solution, following Custom’s precedent:

# lib/agents/tools/mcp.rb
class Mcp < Tool
  class << self
    attr_reader :server_id, :tool_name, :mcp_schema

    def class_instance(server_id, tool_name, schema)
      klass = Class.new(self)
      klass.instance_variable_set(:@server_id, server_id)
      klass.instance_variable_set(:@tool_name, tool_name)
      klass.instance_variable_set(:@mcp_schema, schema)
      klass
    end

    def signature
      {
        name: @tool_name,
        description: @mcp_schema["description"],
        parameters: convert_json_schema(@mcp_schema["inputSchema"])
      }
    end

    def custom? = true
  end

  def invoke
    server = AiMcpServer.find(self.class.server_id)
    client = DiscourseAi::Mcp::Client.new(server)
    result = client.call_tool(self.class.tool_name, parameters)
    # MCP content is an array: [{type: "text", text: "..."}, ...]
    { result: result["content"].map { _1["text"] }.compact.join("\n") }
  end
end

A ToolRegistry class, backed by Redis, caches tools/list responses per server and generates the dynamic class set. The cache is warmed on first use and refreshed by a background job on a configurable interval. When Agent.all_available_tools is called, it appends the MCP tool classes for any servers the agent has assigned.
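The caching half of that registry can be sketched like this, with an in-memory Hash standing in for Redis (class and method names are illustrative, not Discourse's actual API):

```ruby
# Caches tools/list responses per server with a TTL; the dynamic class
# generation then maps each cached entry through Mcp.class_instance.
class ToolRegistry
  def initialize(client, ttl: 3600)
    @client = client
    @ttl = ttl
    @cache = {} # Redis in the real implementation, keyed per server
  end

  # Cached tools/list response for one server; refetched after TTL.
  def tools_for(server_id)
    entry = @cache[server_id]
    if entry.nil? || Time.now - entry[:fetched_at] > @ttl
      entry = { tools: @client.list_tools(server_id), fetched_at: Time.now }
      @cache[server_id] = entry
    end
    entry[:tools]
  end
end
```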

The MCP tools spec also defines a notifications/tools/list_changed notification for when the tool list changes. We’ll poll for now and handle live notifications in a later phase.

The model layer

One new model: AiMcpServer. The essential fields: a name and endpoint URL, the auth configuration (auth_scheme, auth_header, and a reference into the encrypted secret store), a per-call timeout, and a max-calls-per-turn cap.
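As a sketch, the record shape (auth_header, auth_scheme, the 30s timeout, and the 10-call cap are stated elsewhere in this post; the remaining column names are assumptions):

```ruby
# Illustrative shape only, not the actual migration.
AiMcpServer = Struct.new(
  :name,               # display name in the admin UI
  :url,                # Streamable HTTP endpoint
  :auth_scheme,        # e.g. bearer token vs. custom header
  :auth_header,        # header name when a custom header is used
  :secret_id,          # reference into the encrypted AiSecret store
  :timeout,            # per-call HTTP timeout, default 30s
  :max_calls_per_turn, # default 10
  :enabled,
  keyword_init: true
)

server = AiMcpServer.new(
  name: "Jira", url: "https://mcp.example.com/mcp",
  auth_scheme: "bearer", timeout: 30, max_calls_per_turn: 10
)
```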

Join table: ai_agent_mcp_servers (agent_id, mcp_server_id). Agents carry zero-to-many MCP servers. All tools from an assigned server become available to that agent.

A note on authentication. The MCP spec’s authorization model is OAuth 2.1 — Client Credentials for machine-to-machine calls, Authorization Code when acting on behalf of a user. That’s the right long-term direction. In practice today, virtually every real MCP server (GitHub, Jira, Slack, Linear) uses a static pre-obtained API token sent as Authorization: Bearer <token> or a custom header. The model above covers that case cleanly. Full OAuth Client Credentials support — auto-discover the token endpoint from /.well-known/oauth-authorization-server, exchange credentials, cache and refresh access tokens — is a natural Phase 2 addition once the ecosystem has adopted it more broadly.

UI: one unified Tools tab

The current AI admin navigation has four tabs: LLMs, Custom Tools, Agents, Usage. Adding a fifth tab for MCP Servers was our first instinct, but it’s the wrong call.

Custom Tools and MCP Servers are the same conceptual family: both are ways to give agents access to external functionality. The implementation differs — one runs a JavaScript sandbox, the other calls a remote service — but an admin managing these shouldn’t care about that distinction at the navigation level. They’re both just “tools.”

The nav becomes: LLMs | Tools | Agents | Usage

The unified Tools page has a type filter — Scripts and MCP Servers — and a single “New” entry point that lets you choose which kind to create. The list shows both together, distinguished by an icon. This is consistent with how the rest of the admin UI works: one list, filtered views, not proliferating tabs.

In the agent editor, MCP servers appear as a distinct section below the individual tool picker, since a server assignment is semantically different — you’re assigning a whole capability surface, not a single tool. The agent editor shows each assigned server, how many tools it exposes, its health status, and a remove button. A dropdown lets you assign additional servers or create a new one inline (which saves to the global registry) without leaving the agent editor.

Security

MCP tools represent arbitrary external calls. The security defaults need to be conservative. The MCP spec itself is explicit about this: users must consent to and understand all operations, and hosts must obtain explicit user consent before invoking any tool.

That said, Discourse is async by nature. The MCP spec’s “human in the loop” approval model assumes a synchronous, interactive client — an IDE that can pause mid-flow and show “the agent wants to call this tool, approve?” before proceeding. A forum bot can’t pause a reply mid-thread and wait for interactive confirmation. So we translate the intent rather than the mechanism.

Configuration-time approval, not invocation-time. The meaningful approval gate in Discourse is the admin deciding which MCP servers an agent can use and which user groups can interact with that agent. That assignment is the consent. Admins must explicitly configure access — MCP servers are not available to any agent by default.

Transparency over gating. Where interactive approval isn’t possible, transparency is the substitute. Agents can disclose what external calls they’re making as part of their reply — “Searching Jira for open P1 issues…” — so the conversation record shows what happened and why. This is the meaningful equivalent of “show the user what the agent is doing” in a forum context.

Credentials never reach the LLM. The credential value lives in AiSecret — the same encrypted store used by LLMs and custom tools — and the auth_header / auth_scheme fields tell the client how to transmit it. The resolved header is added to outbound HTTP requests only. It appears nowhere in the conversation context, tool signatures, or prompt construction.
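Resolving those fields into an outbound header is a small pure function. A sketch (the scheme names are assumptions; the secret value would be fetched from AiSecret at request time, never stored in the prompt path):

```ruby
# Turn auth configuration into the headers added to outbound requests.
def resolve_auth_header(auth_scheme, auth_header, secret)
  case auth_scheme
  when "bearer"
    { "Authorization" => "Bearer #{secret}" }
  when "custom_header"
    { auth_header => secret }
  else
    {} # no auth configured
  end
end
```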

Origin validation. When making requests to MCP servers, we validate that the server URL is not a private/loopback address (unless explicitly configured for internal use). This prevents a malicious MCP tool description from redirecting calls to internal services. The transport spec explicitly warns about DNS rebinding attacks — we take this seriously.
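The address check itself is straightforward (a sketch; in the real client the hostname is resolved first and the resolved address pinned for the actual connection, which is the meaningful defence against rebinding):

```ruby
require "ipaddr"

# RFC 1918 ranges plus loopback, link-local, and IPv6 equivalents.
PRIVATE_RANGES = %w[
  127.0.0.0/8 10.0.0.0/8 172.16.0.0/12 192.168.0.0/16
  169.254.0.0/16 ::1/128 fc00::/7
].map { IPAddr.new(_1) }

# True when the resolved address points at internal infrastructure.
def private_address?(ip_string)
  ip = IPAddr.new(ip_string)
  PRIVATE_RANGES.any? { |range| range.include?(ip) }
rescue IPAddr::InvalidAddressError
  true # fail closed on anything unparsable
end
```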

Per-call timeouts and turn caps. Each AiMcpServer has a configurable HTTP timeout (default 30s) and a max calls per turn limit (default 10). Both are enforced before the call is made.

What we’re not building in v0

Resources — MCP’s second primitive lets servers expose contextual data (files, documents, database rows) that can be injected into prompts. This is interesting — it maps reasonably well to Discourse’s existing RAG system — but it’s a separate design problem. v0 is tools only.

Sampling — MCP’s client capability that lets servers trigger LLM inference. We’re not implementing this. It’s architecturally complex (requires the server to trust the client’s LLM choices) and the security implications for a multi-tenant platform are significant.

Prompts — MCP’s third server primitive for templated message flows. Niche, and mapping it to Discourse’s conversation model is awkward.

The phased plan

Phase 1 — HTTP transport, tools only

Phase 2 — Resources as on-demand tools

MCP resources are contextual data that servers expose (files, documents, database rows, live state). The naive approach would be to inject them into the system prompt at craft_prompt time — but that busts Anthropic prompt caching entirely, since the system prompt is the most valuable thing to keep stable across turns.

The right approach is how Claude and Cursor actually handle resources: expose them as tools the LLM can call on demand. If an MCP server advertises the resources capability, synthesise two additional tool classes for it — mcp_list_resources (wraps resources/list, returns available resources with descriptions) and mcp_read_resource (wraps resources/read for a given URI). The LLM calls them when it decides it needs something; resource content arrives as a tool result at the tail of the conversation, where nothing is cached anyway. No automatic injection, no cache impact, and the model only fetches what it actually needs for the task at hand.
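A sketch of the signatures those two synthesised tools would expose (the tool names come from the plan above; the schemas are illustrative):

```ruby
# Build the tool signatures advertised to the LLM for a server that
# supports the resources capability.
def resource_tool_signatures(server_name)
  [
    {
      name: "mcp_list_resources",
      description: "List resources exposed by #{server_name} (wraps resources/list)",
      parameters: { type: "object", properties: {} }
    },
    {
      name: "mcp_read_resource",
      description: "Read one resource from #{server_name} by URI (wraps resources/read)",
      parameters: {
        type: "object",
        properties: { uri: { type: "string", description: "Resource URI" } },
        required: ["uri"]
      }
    }
  ]
end
```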


The real value of MCP for Discourse isn’t any single integration. It’s that once the client infrastructure exists, every MCP server that anyone in the ecosystem builds becomes immediately available to Discourse agents with no additional code. The MCP registry already lists hundreds of servers across every major SaaS category; community directories track thousands more. That’s the USB-C argument: one protocol, universal compatibility. Discourse should be a first-class citizen in it.

We’re aiming to have a working prototype of Phase 1 shipping for community feedback in the coming weeks. Discussion welcome on meta.discourse.org.