March 25, 2026 · Tutorials · 14 min read

Build a Production MCP Server in TypeScript (2026)

By Ergini, Software & AI Developer

TL;DR

Most MCP tutorials stop at stdio. This one goes all the way: typed tools with Zod, resources, OAuth, remote SSE transport, deployment to Vercel, and the testing harness - in TypeScript, wired into Claude Code and Cursor.

Most MCP tutorials stop at "hello world over stdio." That is useful for ten minutes and then you hit the real questions: how do I type my tools, how do I expose resources, how do I run this remotely so my whole team can use it, how do I add OAuth, how do I deploy it, and how do I wire it into Claude Code, Cursor, and Claude Desktop at the same time? This post answers all of it in TypeScript, with code you can copy.

What MCP is and why it matters in 2026

The Model Context Protocol is the "USB-C of LLM tooling." Anthropic shipped it in late 2024 to solve a stupid problem: every chat client, every editor, every coding agent invented its own way to expose tools to a model. If you wrote a GitHub integration for Cursor it did not work in Claude Desktop. If you wrote one for Claude Desktop it did not work in Zed. Every team rebuilt the same five integrations against five different proprietary surfaces.

MCP standardized the wire. A server speaks a JSON-RPC dialect over a transport (stdio or HTTP/SSE). It advertises three kinds of capability - tools, resources, and prompts. A client (the editor or the chat app) connects, introspects the catalog, and hands the model whatever the server offers. Write the server once, plug it into every compliant client. In 2026 the compliant-client list includes Claude Desktop, Claude Code, Cursor, Zed, Continue, Cline, Sourcegraph Cody, and the OpenAI Agents SDK. The bet has paid off.
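To make the wire format concrete, here is a rough sketch of the JSON-RPC 2.0 messages behind a single tool call. The tools/call method name and the params and result field names follow the public spec; the TypeScript types, the buildToolCall helper, and the id value are illustrative only, and in practice the SDK builds and parses these messages for you.

```typescript
// Illustrative shapes only; the SDK defines the real, fuller types.
interface JsonRpcRequest {
  jsonrpc: "2.0";
  id: number | string;
  method: string;
  params?: Record<string, unknown>;
}

interface ToolCallResult {
  content: Array<{ type: string; text?: string }>;
  isError?: boolean;
}

interface JsonRpcResponse {
  jsonrpc: "2.0";
  id: number | string;
  result?: ToolCallResult;
  error?: { code: number; message: string };
}

// Hypothetical helper: build the request a client sends to invoke a tool.
function buildToolCall(id: number, name: string, args: Record<string, unknown>): JsonRpcRequest {
  return { jsonrpc: "2.0", id, method: "tools/call", params: { name, arguments: args } };
}

const request = buildToolCall(1, "echo", { message: "ping" });

// What a server would send back for an echo tool, matched to the request by id.
const response: JsonRpcResponse = {
  jsonrpc: "2.0",
  id: request.id,
  result: { content: [{ type: "text", text: "ping" }] },
};

console.error(JSON.stringify(request));
console.error(JSON.stringify(response));
```

The catalog introspection step works the same way, with tools/list, resources/list, and prompts/list as the method names.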

The protocol itself lives at modelcontextprotocol.io, with reference implementations on github.com/modelcontextprotocol. The spec is short and stable. The SDKs do all the JSON-RPC plumbing for you, which means a real server can be 40 lines of TypeScript.

MCP server concepts

An MCP server deals in four categories of capability: three it exposes (tools, resources, and prompts) and one it requests from the client (sampling). Most servers only implement the first one and that is fine - but you should know what each category is for before you write any code.

Capability	Shape	What it is for	Example
Tool	Typed input and output, side effects allowed	Model takes an action	create_issue, search_docs, send_email
Resource	URI-addressed read-only content	Model pulls in context	file://README.md, db://users/42
Prompt	Reusable parameterized prompt template	User picks a workflow from a menu	summarize_pr, draft_changelog
Sampling	Server asks the client to call its own LLM	Server delegates reasoning to the host model	Rare; agent-style servers only

Transports are the other half of the picture. The default is stdio: the client spawns your server as a subprocess and they talk over stdin and stdout. Cheap, zero auth surface, perfect for personal tools. The remote transports are HTTP+SSE (the original remote spec) and the newer streamable HTTP transport, which works better on serverless platforms. Remote transports unlock multi-user servers, browser-based clients, and team sharing - at the cost of having to think about auth, CORS, and session state.
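Over stdio, each JSON-RPC message travels as a single line of JSON terminated by a newline. The sketch below shows the kind of line buffering a stdio transport has to do. The LineBuffer class is a hypothetical illustration, not the SDK's implementation.

```typescript
// Hypothetical sketch of newline-delimited JSON framing; not the SDK's own code.
class LineBuffer {
  private pending = "";

  // Feed a raw chunk from stdin; returns every complete message it contained.
  push(chunk: string): unknown[] {
    this.pending += chunk;
    const lines = this.pending.split("\n");
    // The last element is an incomplete line (or "" if the chunk ended on a newline).
    this.pending = lines.pop() ?? "";
    return lines
      .filter((line) => line.trim() !== "")
      .map((line) => JSON.parse(line) as unknown);
  }
}

const buffer = new LineBuffer();
// A message can arrive split across chunks; nothing is emitted until its newline shows up.
const first = buffer.push('{"jsonrpc":"2.0","id":1,');
const second = buffer.push('"method":"tools/list"}\n{"jsonrpc":"2.0","id":2,"method":"ping"}\n');
```

Here first is empty and second holds two parsed messages. Any non-JSON line written to the same stream would make JSON.parse throw, which is why a stdio server must keep everything except protocol messages off stdout.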

Setup

You need Node.js 20+, the official TypeScript SDK, and a couple of helpers. Spin up a fresh project:

mkdir my-mcp-server && cd my-mcp-server
npm init -y
npm install @modelcontextprotocol/sdk zod
npm install -D typescript tsx @types/node
npx tsc --init

Set your package.json up for ESM and add a dev script that runs the server with hot reload. Stdio servers are easiest to iterate on with tsx watch, which restarts the subprocess on every save.

// package.json
{
  "name": "my-mcp-server",
  "version": "0.1.0",
  "type": "module",
  "bin": {
    "my-mcp-server": "./dist/server.js"
  },
  "scripts": {
    "dev": "tsx watch src/server.ts",
    "build": "tsc",
    "start": "node dist/server.js",
    "inspect": "npx @modelcontextprotocol/inspector tsx src/server.ts"
  },
  "dependencies": {
    "@modelcontextprotocol/sdk": "^1.12.0",
    "zod": "^3.24.1"
  },
  "devDependencies": {
    "@types/node": "^22.10.0",
    "tsx": "^4.19.0",
    "typescript": "^5.7.0"
  }
}

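Two settings matter in tsconfig.json: "module": "NodeNext" and "target": "ES2022". The examples here use ESM imports (note the .js suffixes on the SDK paths), and modern Node features (top-level await, native fetch) make the server code much shorter. Also set "outDir": "dist" and "rootDir": "src" so that tsc emits dist/server.js, the path the bin and start entries point to.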

Hello-world stdio server

Here is the smallest useful MCP server. It exposes one tool - echo - that returns whatever you give it. The file is about 25 lines and it is the foundation everything else in this post extends.

// src/server.ts
import { McpServer } from "@modelcontextprotocol/sdk/server/mcp.js";
import { StdioServerTransport } from "@modelcontextprotocol/sdk/server/stdio.js";
import { z } from "zod";

const server = new McpServer({
  name: "my-mcp-server",
  version: "0.1.0",
});

server.registerTool(
  "echo",
  {
    title: "Echo",
    description: "Echoes the input string back to the caller.",
    inputSchema: { message: z.string().describe("The message to echo back.") },
  },
  async ({ message }) => ({
    content: [{ type: "text", text: message }],
  })
);

const transport = new StdioServerTransport();
await server.connect(transport);
console.error("my-mcp-server running on stdio");

Two things to notice. First, every log line goes to console.error, never console.log - stdout is reserved for the JSON-RPC protocol, and a stray log line will corrupt the wire and the client will silently disconnect. Second, the tool handler returns a content array, not a raw string. The protocol supports text, images, audio, and resource references in a single tool result.
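One defensive option, sketched here as an assumption rather than anything the SDK provides, is to reroute console.log to stderr before any other code runs. The redirectLogToStderr name and the ConsoleLike interface are hypothetical.

```typescript
// Minimal console surface so the helper can be exercised against a fake.
interface ConsoleLike {
  log: (...args: unknown[]) => void;
  error: (...args: unknown[]) => void;
}

// Hypothetical guard: send console.log output to stderr so it cannot reach stdout.
// Returns a function that restores the original behavior.
function redirectLogToStderr(target: ConsoleLike = console): () => void {
  const original = target.log;
  target.log = (...args: unknown[]) => target.error(...args);
  return () => {
    target.log = original;
  };
}

// Demonstration against a fake console that records where each line went.
const seen: string[] = [];
const fake: ConsoleLike = {
  log: (...args) => {
    seen.push(`stdout:${args.join(" ")}`);
  },
  error: (...args) => {
    seen.push(`stderr:${args.join(" ")}`);
  },
};
const restore = redirectLogToStderr(fake);
fake.log("loaded config");
restore();
fake.log("after restore");
```

In a real server you would call redirectLogToStderr() with no arguments at the very top of src/server.ts. It only covers console.log; code that writes to process.stdout directly still needs its own fix.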

Run it with npm run inspect and the inspector spawns your server, lists the echo tool, and gives you a form to call it. That is the entire loop.

Adding typed tools

Real tools have real inputs and outputs. The SDK uses Zod schemas for both, and the JSON Schema sent to the model is generated automatically. Good tool design is its own topic - see my tool calling best practices guide for the naming, scoping, and error-contract patterns that keep models from misusing your tools.

Here is a typed search tool over a fake document store. Note the structured output: the handler returns the typed object in the structuredContent field alongside a text rendering, and recent clients hand the model a typed object instead of a string blob.

// src/tools/search.ts
import { z } from "zod";
import type { McpServer } from "@modelcontextprotocol/sdk/server/mcp.js";

const SearchResult = z.object({
  id: z.string(),
  title: z.string(),
  snippet: z.string(),
  score: z.number(),
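});

// Placeholder in-memory store so the example runs; swap in your real search backend.
const DOCS = [
  { id: "1", title: "Getting started", body: "Install the SDK and run the server." },
  { id: "2", title: "Transports", body: "Stdio for local tools, HTTP for shared servers." },
];

const db = {
  async search(query: string, limit: number): Promise<z.infer<typeof SearchResult>[]> {
    const q = query.toLowerCase();
    return DOCS.filter((d) => `${d.title} ${d.body}`.toLowerCase().includes(q))
      .slice(0, limit)
      .map((d) => ({ id: d.id, title: d.title, snippet: d.body.slice(0, 120), score: 1 }));
  },
};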

export function registerSearchTool(server: McpServer) {
  server.registerTool(
    "search_docs",
    {
      title: "Search documents",
      description:
        "Full-text search across the indexed document corpus. Returns up to N results sorted by relevance. Use this whenever the user asks about content that might live in the knowledge base.",
      inputSchema: {
        query: z.string().min(1).describe("The search query."),
        limit: z.number().int().min(1).max(20).default(5),
      },
      outputSchema: {
        results: z.array(SearchResult),
        totalMatched: z.number().int(),
      },
    },
    async ({ query, limit }) => {
      const results = await db.search(query, limit);
      const structured = { results, totalMatched: results.length };
      return {
        content: [{ type: "text", text: JSON.stringify(structured, null, 2) }],
        structuredContent: structured,
      };
    }
  );
}

The description matters more than the name. Clients show it to the model verbatim - if it does not say when to use the tool, the model will not use it. Write descriptions the way you would write a function docstring for a junior engineer: what it does, when to call it, what the inputs mean, what shape the output takes.
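For reference, the input schema that reaches the client for search_docs looks roughly like the object below. This is a hand-written approximation of what the Zod-to-JSON-Schema conversion produces; the exact output (extra keywords, ordering) depends on the SDK and Zod versions. The small check underneath is a hypothetical lint that flags parameters the model gets no description for.

```typescript
// Hand-written approximation of the generated JSON Schema for search_docs.
// Exact keywords vary by SDK and Zod version; treat this as illustrative.
const searchDocsInputSchema = {
  type: "object",
  properties: {
    query: { type: "string", minLength: 1, description: "The search query." },
    limit: { type: "integer", minimum: 1, maximum: 20, default: 5 },
  },
  // limit has a default, so only query is required.
  required: ["query"],
};

// Parameters without a description give the model nothing to go on.
const undescribed = Object.entries(searchDocsInputSchema.properties)
  .filter(([, schema]) => !("description" in schema))
  .map(([name]) => name);

console.error("parameters missing a description:", undescribed);
```

In the search tool above, limit has no .describe() call, so it shows up in this list. Adding one line of description per parameter is usually cheaper than debugging a model that guesses.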

Resources - when to use them vs tools

A resource is a read-only blob the model can pull into context. It is addressed by URI (file://, https://, db://, or any custom scheme you invent). Use resources when the data is static enough to be browseable, when the user might want to pick from a list before the model touches it, or when the content is too large to materialize on every tool call.

Here is a resource handler that exposes files from a docs directory. The list handler returns the directory listing; the read handler returns a single file's contents.

// src/resources/docs.ts
import { readFile, readdir } from "node:fs/promises";
import { resolve, sep } from "node:path";
import { McpServer, ResourceTemplate } from "@modelcontextprotocol/sdk/server/mcp.js";

const DOCS_DIR = resolve(process.env.DOCS_DIR ?? "./docs");

export function registerDocResources(server: McpServer) {
  server.registerResource(
    "doc",
    // The template's list callback answers resources/list with the directory listing.
    new ResourceTemplate("doc:///{path}", {
      list: async () => {
        const files = await readdir(DOCS_DIR);
        return {
          resources: files.map((f) => ({
            uri: `doc:///${encodeURIComponent(f)}`,
            name: f,
            mimeType: "text/markdown",
          })),
        };
      },
    }),
    { title: "Knowledge base document", mimeType: "text/markdown" },
    // The read callback answers resources/read for a single URI.
    async (uri) => {
      const file = resolve(DOCS_DIR, decodeURIComponent(uri.pathname.slice(1)));
      // Refuse anything that resolves outside DOCS_DIR (for example doc:///../secrets).
      if (!file.startsWith(DOCS_DIR + sep)) {
        throw new Error("Path is outside the docs directory");
      }
      const text = await readFile(file, "utf8");
      return {
        contents: [{ uri: uri.href, mimeType: "text/markdown", text }],
      };
    }
  );
}

Rule of thumb: if the model needs to do something, ship a tool. If the user needs to pick something to give the model, ship a resource. Most servers I write expose 80% tools and 20% resources. The exception is documentation servers, where it inverts.

Prompts (templates)

Prompts are parameterized templates the client exposes as slash commands or palette entries. The user picks one, fills in any arguments, and the client sends the rendered prompt to the model. They are underused in 2026 - most server authors skip them - but they are the cheapest way to expose curated workflows to non-technical users.

// src/prompts/changelog.ts
import { z } from "zod";
import type { McpServer } from "@modelcontextprotocol/sdk/server/mcp.js";

export function registerChangelogPrompt(server: McpServer) {
  server.registerPrompt(
    "draft_changelog",
    {
      title: "Draft a changelog from commits",
      description:
        "Reads the git log between two refs and drafts a user-facing changelog grouped by feature, fix, and breaking change.",
      argsSchema: {
        from: z.string().describe("Older git ref (e.g. v1.2.0)"),
        to: z.string().default("HEAD").describe("Newer git ref"),
      },
    },
    async ({ from, to }) => ({
      messages: [
        {
          role: "user",
          content: {
            type: "text",
            text: `Use the git_log tool to read commits between ${from} and ${to}. Group them into Features, Fixes, and Breaking Changes. Drop merge commits. Write in active voice, one bullet per change.`,
          },
        },
      ],
    })
  );
}

In Claude Desktop the user types / and your prompts show up alongside the built-ins. Cursor exposes them through the slash menu in the chat sidebar. Same prompt, two surfaces, no extra code.

From stdio to HTTP/SSE remote transport

Stdio servers run on the user's machine, which is fine for personal tools and disastrous for anything that touches shared state. A Notion server that lives in your editor cannot see the Notion data your teammate just updated unless it goes through a single shared service. Remote transport is how you ship a server once and serve a whole team - or expose it to browser-based clients that cannot spawn subprocesses at all.

The 2026 default is the streamable HTTP transport, which works well with serverless. A minimal Express setup (run npm install express first):

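// src/http-server.ts
import { randomUUID } from "node:crypto";
import express from "express";
import { McpServer } from "@modelcontextprotocol/sdk/server/mcp.js";
import { StreamableHTTPServerTransport } from "@modelcontextprotocol/sdk/server/streamableHttp.js";
import { isInitializeRequest } from "@modelcontextprotocol/sdk/types.js";
import { registerSearchTool } from "./tools/search.js";
import { registerDocResources } from "./resources/docs.js";

const app = express();
app.use(express.json());

// One live transport per session, keyed by the mcp-session-id header.
const sessions = new Map<string, StreamableHTTPServerTransport>();

app.all("/mcp", async (req, res) => {
  const sessionId = req.header("mcp-session-id");
  let transport = sessionId ? sessions.get(sessionId) : undefined;

  if (!transport) {
    // Only an initialize request without a session ID may open a session.
    // Unknown IDs get a 404 so the client starts over; anything else is a 400.
    if (sessionId || !isInitializeRequest(req.body)) {
      res.status(sessionId ? 404 : 400).json({
        jsonrpc: "2.0",
        error: { code: -32000, message: "No valid session; send an initialize request first." },
        id: null,
      });
      return;
    }

    const created = new StreamableHTTPServerTransport({
      sessionIdGenerator: () => randomUUID(),
      onsessioninitialized: (id) => {
        sessions.set(id, created);
      },
      onsessionclosed: (id) => {
        sessions.delete(id);
      },
    });
    transport = created;

    const server = new McpServer({ name: "my-mcp-server", version: "0.1.0" });
    registerSearchTool(server);
    registerDocResources(server);
    await server.connect(transport);
  }

  await transport.handleRequest(req, res, req.body);
});

app.listen(3000, () => console.error("MCP server on http://localhost:3000/mcp"));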

One transport instance per session, identified by the mcp-session-id header the client echoes back after initialization. The first request creates the session; subsequent requests reuse it. The transport handles the SSE upgrade transparently - clients that prefer SSE get a long-lived stream, clients that prefer plain HTTP get request-response, and you write the same code for both.
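One way to make those session rules explicit is a pure decision function, kept separate from Express and the SDK. The sketch below is a slightly stricter take that also rejects requests carrying no valid session. The decideSession name and SessionDecision type are hypothetical; the 400 and 404 status codes follow the streamable HTTP spec's guidance for missing and unknown session IDs.

```typescript
type SessionDecision =
  | { action: "create" }
  | { action: "reuse"; sessionId: string }
  | { action: "reject"; status: 400 | 404 };

// Hypothetical helper: decide what to do with a request before touching a transport.
function decideSession(
  sessionId: string | undefined,
  isInitialize: boolean,
  known: ReadonlySet<string>
): SessionDecision {
  if (sessionId !== undefined) {
    // A known ID reuses its transport; an unknown or expired one tells the client to start over.
    return known.has(sessionId)
      ? { action: "reuse", sessionId }
      : { action: "reject", status: 404 };
  }
  // Without an ID, only an initialize request may open a new session.
  return isInitialize ? { action: "create" } : { action: "reject", status: 400 };
}

const knownSessions = new Set(["abc"]);
const decisions = [
  decideSession(undefined, true, knownSessions), // first contact
  decideSession("abc", false, knownSessions), // follow-up on a live session
  decideSession("gone", false, knownSessions), // expired or unknown session
  decideSession(undefined, false, knownSessions), // tool call with no session at all
];
```

Keeping the decision pure makes it easy to unit test without spinning up an HTTP server, and the route handler shrinks to a switch over the returned action.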

OAuth for remote MCP servers

Any remote server that touches user data needs auth. The MCP spec adopted OAuth 2.1 with PKCE in 2025 and every serious client now supports it. The minimum viable implementation exposes two discovery documents (protected-resource metadata and authorization-server metadata) plus two endpoints: the authorization endpoint and the token endpoint. The client discovers them automatically.

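// src/auth/oauth.ts
import express, { type Express, type NextFunction, type Request, type Response } from "express";
// Hypothetical helpers you implement against your own user, client, and token store.
import { mintAuthCode, exchangeCode, validateToken, isRegisteredRedirect } from "./store.js";

export function mountOAuth(app: Express, opts: { issuer: string }) {
  app.get("/.well-known/oauth-protected-resource", (_req, res) => {
    res.json({
      resource: opts.issuer,
      authorization_servers: [opts.issuer],
      bearer_methods_supported: ["header"],
    });
  });

  app.get("/.well-known/oauth-authorization-server", (_req, res) => {
    res.json({
      issuer: opts.issuer,
      authorization_endpoint: `${opts.issuer}/authorize`,
      token_endpoint: `${opts.issuer}/token`,
      code_challenge_methods_supported: ["S256"],
      grant_types_supported: ["authorization_code", "refresh_token"],
      response_types_supported: ["code"],
    });
  });

  app.get("/authorize", async (req, res) => {
    const { client_id, redirect_uri, code_challenge, state } = req.query;
    // Never redirect to a URI the client has not registered (open-redirect risk).
    if (!(await isRegisteredRedirect(String(client_id), String(redirect_uri)))) {
      res.status(400).json({ error: "invalid_request" });
      return;
    }
    // Render your login UI; on success, mint a code bound to the PKCE challenge.
    const code = await mintAuthCode({
      clientId: String(client_id),
      challenge: String(code_challenge),
    });
    // Build the redirect with URL so code and state are percent-encoded.
    const target = new URL(String(redirect_uri));
    target.searchParams.set("code", code);
    if (state !== undefined) target.searchParams.set("state", String(state));
    res.redirect(target.toString());
  });

  app.post("/token", express.urlencoded({ extended: false }), async (req, res) => {
    const { code, code_verifier, grant_type } = req.body;
    if (grant_type !== "authorization_code") {
      return res.status(400).json({ error: "unsupported_grant_type" });
    }
    try {
      const token = await exchangeCode({ code, verifier: code_verifier });
      res.json({
        access_token: token,
        token_type: "Bearer",
        expires_in: 3600,
      });
    } catch {
      // Unknown code or failed PKCE check.
      res.status(400).json({ error: "invalid_grant" });
    }
  });
}

export function requireBearer(req: Request, res: Response, next: NextFunction) {
  // Point the client at the protected-resource metadata so it can start the OAuth flow.
  const challenge = `Bearer resource_metadata="${process.env.ISSUER}/.well-known/oauth-protected-resource"`;
  const reject = () => {
    res.status(401).set("WWW-Authenticate", challenge).json({ error: "unauthorized" });
  };

  const auth = req.header("authorization");
  if (!auth?.startsWith("Bearer ")) {
    reject();
    return;
  }
  validateToken(auth.slice(7)).then(
    (user) => {
      (req as any).user = user;
      next();
    },
    // Invalid or expired token: answer with a 401 instead of leaving the request hanging.
    () => reject()
  );
}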

Wire mountOAuth(app, ...) before your /mcp route and add requireBearer as middleware. The 401 response with a WWW-Authenticate header is what triggers the client to start the OAuth dance - without it, clients silently fail. For internal-only servers you can skip OAuth entirely and require a static x-api-key header. The OAuth path is for anything installed by end users.
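The PKCE check behind the token exchange is small enough to sketch in full. This is a minimal version assuming the S256 method from RFC 7636, where the challenge is the base64url-encoded SHA-256 of the verifier. The pkceChallenge and verifyPkce names are hypothetical, and where you store the challenge between /authorize and /token is up to you.

```typescript
import { createHash, timingSafeEqual } from "node:crypto";

// S256 challenge as defined by PKCE (RFC 7636): BASE64URL(SHA-256(code_verifier)).
function pkceChallenge(verifier: string): string {
  return createHash("sha256").update(verifier, "ascii").digest("base64url");
}

// Hypothetical check for the /token handler: does the presented verifier
// match the challenge stored when the authorization code was minted?
function verifyPkce(verifier: string, storedChallenge: string): boolean {
  const expected = Buffer.from(pkceChallenge(verifier));
  const actual = Buffer.from(storedChallenge);
  // timingSafeEqual throws on length mismatch, so compare lengths first.
  return expected.length === actual.length && timingSafeEqual(expected, actual);
}

// The client generates the verifier and sends only the challenge to /authorize.
const verifier = "dBjftJeZ4CVP-mB92K27uhbUJU1p1r_wW1gFWFOEjXk";
const challenge = pkceChallenge(verifier);

console.error("verifier accepted:", verifyPkce(verifier, challenge));
console.error("wrong verifier accepted:", verifyPkce("not-the-verifier", challenge));
```

The constant-time comparison is a cautious default; the important part is that the server never issues a token unless the verifier hashes to the challenge it stored with the code.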

Deploying to Vercel

Vercel is the path of least resistance for a remote MCP server in 2026. The streamable HTTP transport maps cleanly to Vercel Functions - one route, long-lived streams as needed, OAuth on standard routes. Two settings matter: maxDuration (raise it; long-lived SSE sessions can sit open for minutes), and the runtime (Node, not Edge - the SDK uses Node streams that Edge does not support cleanly).

// app/mcp/route.ts (Next.js App Router)
// The SDK's StreamableHTTPServerTransport.handleRequest expects Node's request and
// response objects, not a Fetch Request, so this sketch uses Vercel's mcp-handler
// package (npm install mcp-handler) as the adapter. Check its README for the
// options your version supports.
import { createMcpHandler } from "mcp-handler";
import { registerSearchTool } from "@/lib/tools/search";

export const runtime = "nodejs";
export const maxDuration = 300;

const handler = createMcpHandler(
  // Register tools, resources, and prompts on the server the adapter creates.
  (server) => {
    registerSearchTool(server);
  },
  {}, // server options
  { basePath: "" } // assumption: with an empty base path the streamable HTTP endpoint is /mcp
);

export { handler as GET, handler as POST, handler as DELETE };

Two cold-start tips. First, pre-register everything at module scope so the function only constructs the server on the first request per instance - the second request through the same warm instance reuses it. Second, if your tool handlers hit a database, wrap the connection in a singleton so you do not open a fresh pool per invocation. Both tricks cut warm-call latency from 300ms to under 80ms.
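The singleton tip can be sketched with a tiny memoizing helper. The once function and the fake pool below are hypothetical stand-ins; the point is only that the expensive factory runs a single time per warm instance.

```typescript
// Hypothetical helper: run an expensive factory once per warm instance and cache the result.
function once<T>(factory: () => T): () => T {
  let cached: { value: T } | undefined;
  return () => {
    if (!cached) cached = { value: factory() };
    return cached.value;
  };
}

// Stand-in for an expensive resource such as a database pool (illustrative only).
let poolsCreated = 0;
const getPool = once(() => {
  poolsCreated += 1;
  return { id: poolsCreated };
});

// Two tool invocations on the same warm instance share one pool.
const poolA = getPool();
const poolB = getPool();

console.error("pools created:", poolsCreated);
```

If the factory is async, the cached value is the promise itself, so concurrent callers await the same connection attempt instead of racing to open several.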

For session state that needs to survive across Vercel function instances, do not use the in-memory Map from the examples: each instance has its own memory, so a follow-up request can land on an instance that has never seen the session. Either run the transport statelessly (the SDK accepts sessionIdGenerator: undefined, so every request stands alone) or move shared session data to Redis (Upstash is one click on the Vercel marketplace), keyed by session ID. Most simple MCP servers do not need cross-instance sessions - but anything with long-running streamed responses does.

Wiring into Claude Code, Cursor, Claude Desktop

The point of MCP is that the same server works in every client. Here is the config snippet for the three clients that matter most in 2026.

Claude Desktop reads from a JSON file on your machine. On macOS it lives at ~/Library/Application Support/Claude/claude_desktop_config.json. For a stdio server:

{
  "mcpServers": {
    "my-server": {
      "command": "npx",
      "args": ["-y", "my-mcp-server"],
      "env": { "DOCS_DIR": "/Users/me/notes" }
    }
  }
}

For a remote server with OAuth, the config is even shorter - Claude Desktop handles the OAuth dance internally:

{
  "mcpServers": {
    "my-server": {
      "url": "https://mcp.example.com/mcp"
    }
  }
}

Claude Code uses the claude mcp add CLI command or a project-scoped .mcp.json file in your repo root. Project-scoped servers are committed to git and shared with the whole team - every dev who clones the repo gets your tooling automatically when they open the project in Claude Code.

# Add a stdio server
claude mcp add my-server -- npx -y my-mcp-server

# Add a remote server (Claude Code handles the OAuth dance)
claude mcp add my-server --transport http https://mcp.example.com/mcp

# Or commit .mcp.json to your repo
{
  "mcpServers": {
    "my-server": {
      "command": "npx",
      "args": ["-y", "my-mcp-server"]
    }
  }
}

Cursor reads its config from ~/.cursor/mcp.json globally or .cursor/mcp.json per project. Same shape as Claude Desktop, different path. Cursor was the second major client to ship MCP and the support is solid.

{
  "mcpServers": {
    "my-server": {
      "url": "https://mcp.example.com/mcp"
    }
  }
}

Restart each client after editing the config. Claude Desktop is the worst offender here - it caches the config aggressively and a misconfigured server fails silently. Watch the logs (Help → Show Logs) on first connect.

Testing your MCP server

Two layers of testing matter. The first is the official inspector - a small web UI that connects to any MCP server, lists the catalog, and gives you forms to call every tool, read every resource, and render every prompt. Run it against your server with one command:

# Inspector against a local stdio server
npx @modelcontextprotocol/inspector tsx src/server.ts

# Inspector against a remote HTTP server
npx @modelcontextprotocol/inspector --transport http http://localhost:3000/mcp

The second layer is automated. Instantiate the MCP client from the same SDK in your test runner, point it at your server in stdio or HTTP mode, and assert on the catalog and on tool outputs. Contract tests like this catch schema drift before it reaches a real client - far cheaper than discovering it when Claude Desktop refuses to load your server.

// test/contract.test.ts
import { test, expect } from "vitest";
import { Client } from "@modelcontextprotocol/sdk/client/index.js";
import { StdioClientTransport } from "@modelcontextprotocol/sdk/client/stdio.js";

test("server exposes the expected tools", async () => {
  const client = new Client({ name: "test", version: "0.0.1" });
  const transport = new StdioClientTransport({
    command: "tsx",
    args: ["src/server.ts"],
  });
  await client.connect(transport);

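  // Assumes src/server.ts also calls registerSearchTool(server); with only the
  // hello-world server, expect ["echo"] here.
  const { tools } = await client.listTools();
  const names = tools.map((t) => t.name).sort();
  expect(names).toEqual(["echo", "search_docs"]);

  const result = await client.callTool({
    name: "echo",
    arguments: { message: "ping" },
  });
  // content is loosely typed on the client side, so narrow it before asserting.
  const content = result.content as Array<{ type: string; text?: string }>;
  expect(content[0].text).toBe("ping");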

  await client.close();
});

The same pattern works for evaluating tool behavior end-to-end - run your real LLM client against the server in a CI job, capture transcripts, and check that tool calls happen at the expected decision points. I cover the broader pattern in my agentic RAG architecture post - the eval discipline transfers directly to MCP-backed agents.
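Schema drift can also be checked mechanically by diffing the catalog from tools/list against a snapshot taken from the last release. The sketch below is one hypothetical way to do it; the ToolSummary shape keeps only the fields the check needs, and findBreakingChanges is not part of the SDK.

```typescript
// Minimal view of a tool entry as returned by tools/list.
interface ToolSummary {
  name: string;
  inputSchema: { required?: string[] };
}

// Hypothetical drift check: report tools that disappeared and parameters that became required.
function findBreakingChanges(previous: ToolSummary[], current: ToolSummary[]): string[] {
  const problems: string[] = [];
  const currentByName = new Map(current.map((tool) => [tool.name, tool] as const));
  for (const before of previous) {
    const after = currentByName.get(before.name);
    if (!after) {
      problems.push(`tool removed: ${before.name}`);
      continue;
    }
    const wasRequired = new Set(before.inputSchema.required ?? []);
    for (const param of after.inputSchema.required ?? []) {
      if (!wasRequired.has(param)) {
        problems.push(`new required parameter: ${before.name}.${param}`);
      }
    }
  }
  return problems;
}

// Snapshot from the last release versus the catalog of the current build (made-up data).
const previousCatalog: ToolSummary[] = [
  { name: "echo", inputSchema: { required: ["message"] } },
  { name: "search_docs", inputSchema: { required: ["query"] } },
];
const currentCatalog: ToolSummary[] = [
  { name: "search_docs", inputSchema: { required: ["query", "tenant"] } },
];

const problems = findBreakingChanges(previousCatalog, currentCatalog);
console.error(problems);
```

In a contract test you would feed it the tools array from client.listTools() and fail the build when the returned list is not empty. New tools and new optional parameters pass, which matches the additive-only rule for public tool schemas.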

Real example: a Notion MCP server (10 lines)

The smallest useful real server I can write is a Notion search tool. Ten lines of business logic, two of plumbing, and you have a server that lets every editor in your stack search your workspace.

// src/notion-server.ts
import { McpServer } from "@modelcontextprotocol/sdk/server/mcp.js";
import { StdioServerTransport } from "@modelcontextprotocol/sdk/server/stdio.js";
import { Client as Notion } from "@notionhq/client";
import { z } from "zod";

const notion = new Notion({ auth: process.env.NOTION_TOKEN! });
const server = new McpServer({ name: "notion-mcp", version: "0.1.0" });

server.registerTool(
  "notion_search",
  {
    title: "Search Notion",
    description: "Full-text search across the connected Notion workspace.",
    inputSchema: { query: z.string(), pageSize: z.number().int().max(20).default(5) },
  },
  async ({ query, pageSize }) => {
    const r = await notion.search({ query, page_size: pageSize });
    return { content: [{ type: "text", text: JSON.stringify(r.results, null, 2) }] };
  }
);

await server.connect(new StdioServerTransport());

Drop this into your claude_desktop_config.json with your Notion integration token in env and Claude can search your workspace from any chat. Same five-minute install in Cursor, Claude Code, or Zed. That is the leverage MCP unlocks - one server, every client.

Production gotchas

Five issues bite every team on first ship. All of them are easy to fix in advance and hellish to debug once your server is live in a teammate's editor.

Stdout pollution kills stdio servers. Any console.log in your code path writes to the JSON-RPC wire and corrupts the next message. Use console.error for all logging. If a third-party dep logs to stdout, monkey-patch process.stdout.write or redirect via a logger shim.
Tool descriptions are the API. The model only knows what your description tells it. Be explicit about when to call the tool, what each parameter means, what the output looks like, and what error conditions to expect. Bad descriptions cause silent misuse, not loud failures.
Schema versioning is on you. MCP has no built-in versioning for tool schemas. If you rename a parameter, every deployed client breaks until it reconnects. Treat tool names and input schemas as a public API - add new tools or new optional parameters, never break existing ones.
Error messages flow back to the model. If your tool throws, the SDK surfaces the error as a tool result with isError: true. The model reads the error text and decides what to do. Write actionable errors ("rate limit exceeded, retry in 30s"), not stack traces. A good error lets the model recover; a bad one sends it in circles.
Rate limits and timeouts. The host model does not know your downstream APIs have limits. Add internal rate limiting per tool call, surface 429s as structured errors, and set a hard timeout (15 to 30 seconds) on every tool execution. Long-running work belongs in a background job the model can poll, not a synchronous tool call.
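The last two gotchas combine naturally: wrap each downstream call in a hard timeout and convert whatever goes wrong into a short, actionable error result. The helpers below are a hypothetical sketch; only the isError convention comes from the protocol, and callUpstream stands in for your real API call.

```typescript
// Shape of a failed tool result, following the protocol's isError convention.
interface ToolErrorResult {
  isError: true;
  content: Array<{ type: "text"; text: string }>;
}

// Hypothetical helper: turn any thrown value into a short message the model can act on.
function toToolError(err: unknown): ToolErrorResult {
  const message = err instanceof Error ? err.message : String(err);
  return { isError: true, content: [{ type: "text", text: message }] };
}

// Hypothetical helper: reject if the work does not settle within ms milliseconds.
function withTimeout<T>(work: Promise<T>, ms: number): Promise<T> {
  let timer: ReturnType<typeof setTimeout> | undefined;
  const timeout = new Promise<never>((_, reject) => {
    timer = setTimeout(
      () => reject(new Error(`timed out after ${ms}ms, try a narrower request`)),
      ms
    );
  });
  // Always clear the timer so a finished call does not keep the process alive.
  return Promise.race([work, timeout]).finally(() => clearTimeout(timer));
}

// Usage inside a tool handler: either the upstream value or an error result comes back.
async function guarded<T>(
  callUpstream: () => Promise<T>,
  ms = 15_000
): Promise<T | ToolErrorResult> {
  try {
    return await withTimeout(callUpstream(), ms);
  } catch (err) {
    return toToolError(err);
  }
}

const sampleError = toToolError(new Error("rate limit exceeded, retry in 30s"));
console.error(sampleError.content[0].text);
```

Note that the timeout only stops waiting; it does not cancel the underlying request. If your downstream client supports AbortSignal, pass one through so abandoned work is actually stopped.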

For deeper coverage of agent failure modes that map directly to MCP server design, my AI agent design patterns post walks through the reflection, planner-executor, and tool-use patterns that determine how robust your server feels in production. And if you are building tool calls into a Next.js frontend at the same time, the Vercel AI SDK tool calling tutorial covers the client-side patterns that complement the server-side ones here.

If you want a senior engineer who has actually shipped MCP servers in production - typed tools, OAuth, remote deployment, the full client-wiring matrix - my AI agent development practice covers exactly this scope, and AI integration when the server needs to wire into existing internal systems. I work with teams worldwide, and you can also hire an AI developer in Kosovo directly. Same person behind OmniAPI, the universal API gateway that ships its own MCP server out of the box.

Frequently asked questions

What is an MCP server?

An MCP (Model Context Protocol) server is a small program that exposes tools, resources, and prompts to LLM clients like Claude Desktop, Claude Code, or Cursor through a standardized JSON-RPC protocol. Think of it as the USB-C of LLM tooling - write one server, plug it into any compatible client. A typical MCP server runs as a local subprocess over stdio or as a remote HTTP/SSE service, and answers requests like 'list your tools' or 'call this tool with these arguments' so the model can take actions outside its context window.

Why build an MCP server instead of just calling APIs from my agent?

Three reasons. First, portability: the same MCP server works in Claude Desktop, Claude Code, Cursor, Zed, Continue, and any other MCP client without rewriting the integration. Second, discoverability: clients introspect your server at startup and the model sees a typed tool catalog automatically. Third, separation of concerns: your business logic lives in the server with proper auth, logging, and rate limiting, while the model client just consumes a clean tool interface. The cost is one extra hop and a small protocol overhead.

Should I use stdio or remote HTTP/SSE transport?

Use stdio for personal tools and anything that wraps local resources (your filesystem, your database CLI, a desktop app). It is dead simple, has no auth surface, and the client spawns it as a subprocess. Use HTTP/SSE (or the newer streamable HTTP transport) for anything multi-user, anything that should be shared across a team, or anything that runs in a browser-based client. Remote transports give you a real network surface, real auth, and a single source of truth - at the cost of having to handle OAuth, CORS, and session management.

Does MCP only work with Claude?

No. MCP was created by Anthropic but the spec is open and the client list is growing fast in 2026. Claude Desktop, Claude Code, Cursor, Zed, Continue, Cline, Sourcegraph Cody, and Block all ship MCP support. OpenAI's Agents SDK added MCP-client support in 2025. If you write a server today against the public spec, it will work in everything that matters in your editor and chat stack.

How do I add authentication to a remote MCP server?

The spec uses OAuth 2.1 with PKCE for user-attributable auth on remote servers. The minimal pattern is to expose the standard OAuth discovery endpoints (/.well-known/oauth-authorization-server and /.well-known/oauth-protected-resource), implement authorization-code-with-PKCE on /authorize, issue bearer tokens at /token, and validate the Bearer header on every tool call. For internal servers you can shortcut this with a static API key in the Authorization header, but every real client will eventually want OAuth.

What is the difference between a tool and a resource in MCP?

Tools are actions the model invokes - they take typed input, perform side effects, and return results (call an API, write to a database, send a message). Resources are read-only blobs of context the model can pull in - a file, a database row, a URL. Tools show up to the model the way function calls do; resources show up the way attachments do. Rule of thumb: if the model needs to do something, ship a tool. If the model needs to know something, ship a resource.

Can I deploy an MCP server to Vercel?

Yes, and it is the easiest path for a remote MCP server in 2026. The streamable HTTP transport maps cleanly to Vercel Functions on the Node.js runtime (not Edge - the SDK uses Node streams that Edge does not support cleanly). The two gotchas are cold starts (keep the function warm if your server is latency-sensitive) and stateful sessions (SSE wants a long-lived connection - set maxDuration and prefer the streamable HTTP transport for better serverless behavior). The Deploying to Vercel section above covers the route config and a working handler.

How do I test an MCP server before wiring it into a client?

Use the official @modelcontextprotocol/inspector. It is a small GUI that connects to any MCP server (stdio or HTTP) and lets you list tools, call them with form-generated inputs, list resources, and read prompts - all without touching Claude or Cursor. For CI, instantiate the MCP client from the same SDK and call your server from inside a test. That gives you contract tests that catch schema drift before it reaches a real client.