BACK

Building a Postgres MCP server that survives production

Francisc Toth · FEB 18, 2026 · 9 minutes

A Postgres MCP server is easy to write. A Postgres MCP server that doesn't kill your database is hard. Here's the gap, and how to close it.

The naive version

The first version of a Postgres MCP server is about 50 lines:

import { Server } from "@modelcontextprotocol/sdk/server"
import { Client } from "pg"

const client = new Client(process.env.DATABASE_URL) await client.connect()

const server = new Server({ name: "postgres", version: "0.1.0" })

server.tool("query", { description: "Run a SQL query", schema: { query: z.string() } }, async ({ query }) => { const result = await client.query(query) return { rows: result.rows } })

server.listen() ```

This works. You can connect from Claude Desktop and ask questions about your database. It's also a disaster waiting to happen.

The model will write DROP TABLE users if you ask the wrong way. It will run SELECT * FROM events on a billion-row table and crash your process trying to serialize the response. It will hold connections open during slow queries and exhaust your pool.

Here's what changes between the demo and production.

Query safety: read-only by default

The first thing to add is a real query parser, not a regex.

You cannot rely on "starts with SELECT" to block writes. The model is creative. It will craft a CTE that wraps a DELETE inside a SELECT. It will use WITH x AS (DELETE FROM ...) SELECT 1. Regex-based filters lose this game.

Use a proper SQL parser like pg-query-emscripten. Parse the query into an AST. Walk the AST and reject anything that isn't a SELECT at the top level or inside a non-data-modifying CTE.

import { parse } from "pg-query-emscripten"

function isReadOnly(sql: string): boolean { const tree = parse(sql) return tree.stmts.every(stmt => stmt.stmt?.SelectStmt !== undefined ) } ```

The parser is 2MB compiled. Worth it.

Query safety: row limits

Even with read-only enforcement, an unbounded SELECT will hurt you. Two patterns work.

Mandatory LIMIT clause. Rewrite the query to inject a LIMIT if the user didn't provide one. The model writes SELECT * FROM events, the server rewrites to SELECT * FROM events LIMIT 100. Document the limit in the tool description so the model knows it exists.

EXPLAIN before execute. Run EXPLAIN (FORMAT JSON) <query> first. Parse the estimated row count. Reject queries above a threshold (say, 100,000 rows).

The combination is what production servers actually use. LIMIT for the cases where the model wants to scan an unbounded table. EXPLAIN for the cases where a JOIN explodes into a billion rows from what looks like a small query.

Connection pooling

pg.Client is for one-shot scripts. pg.Pool is for production. The difference matters more than you'd think.

import { Pool } from "pg"

const pool = new Pool({ connectionString: process.env.DATABASE_URL, max: 10, idleTimeoutMillis: 30000, connectionTimeoutMillis: 2000 })

server.tool("query", { ... }, async ({ query }) => { const client = await pool.connect() try { const result = await client.query(query) return { rows: result.rows } } finally { client.release() } }) ```

The try/finally matters. Without it, an exception leaks the connection. After a few exceptions you've exhausted the pool and every subsequent request hangs.

Tune max based on your database tier. RDS db.t3.small allows 30 connections total. If you have three server instances, 10 per pool is fine. Going higher creates contention you don't want.

Schema introspection

The model needs to know what tables and columns exist. Without that, it's guessing.

Don't expose a tool that returns the schema on every call. The schema is large and changes rarely. Use a resource instead.

server.resource("schema://tables", {
  name: "Database schema",
  description: "Tables, columns, types, primary keys, foreign keys"
}, async () => {
  const result = await pool.query(`
    SELECT table_name, column_name, data_type
    FROM information_schema.columns
    WHERE table_schema = 'public'
    ORDER BY table_name, ordinal_position
  `)
  return { content: JSON.stringify(result.rows) }
})

The client caches the resource and refreshes when the server signals a change. Tool calls stay focused on actual queries.

Statement timeouts

A query that runs for 30 seconds is a query you don't want. Set a statement timeout at the database level:

const client = await pool.connect()
try {
  await client.query("SET statement_timeout = 10000")
  const result = await client.query(query)
  return { rows: result.rows }
} finally {
  client.release()
}

This forces Postgres to cancel the query if it runs too long. Without it, a runaway query holds the connection until someone manually kills it.

Response size limits

A query returning 100 rows of 10 columns is fine. A query returning 100 rows of TEXT columns containing 100KB each is a 100MB response. The model's context window doesn't fit that.

Cap response size by row count AND by serialized byte size:

const result = await client.query(query)
let totalBytes = 0
const rows = []
for (const row of result.rows) {
  const json = JSON.stringify(row)
  if (totalBytes + json.length > 100000) break
  rows.push(row)
  totalBytes += json.length
}
return { rows, truncated: rows.length < result.rows.length }

The truncated flag lets the model know to refine its query.

Error handling

Don't leak database errors to the model. They contain schema details, server hostnames, and sometimes data.

try {
  const result = await client.query(query)
  return { rows: result.rows }
} catch (err) {
  if (err.code === "42P01") {
    return { error: "Table not found. Check the schema resource." }
  }
  if (err.code === "42703") {
    return { error: "Column not found in the referenced table." }
  }
  return { error: "Query failed. Check your SQL syntax." }
}

Translate Postgres error codes into messages the model can act on. Hide the underlying error. Log it on the server side for debugging.

What we built into Toolcall

The Postgres template ships with all of this: parser-based read-only enforcement, mandatory LIMIT injection, EXPLAIN-based scan blocking, connection pooling tuned to your database tier, schema as a resource, 10-second default statement timeout, 100KB response cap.

You can disable each guard individually if you have a reason. The defaults are what we found to be safe for most production databases.

If you're building your own from scratch, treat this list as the minimum. The naive version works in a demo. The production version is what survives a curious model and a busy weekend.