The core risk in one sentence
Whatever tools your MCP server exposes, a prompt-injected model can be talked into using — so an unsandboxed server with broad scope and shared credentials is a remote-code-execution surface wearing a friendly name.
The seven-step checklist
1. Treat every MCP server as a privileged service
An MCP (Model Context Protocol) server exposes tools (file access, shell, database queries, HTTP) to an LLM. Whatever the server can do, the model can be talked into doing. Inventory each MCP server as you would any service with production access, and write down exactly which tools it exposes.
2. Scope tools to least privilege, and prefer read-only
Don't ship a filesystem server rooted at /, or a database server with a superuser role, just because it was easy. Constrain each tool to the narrowest path, table, or scope it needs. Make tools read-only by default, and require an explicit, separately audited capability for any write or shell tool.
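As a minimal sketch of path scoping, the snippet below confines a read-only file tool to a single root directory. The names `resolve_in_scope`, `read_file`, and `ScopeError` are hypothetical and not part of any MCP SDK; the sketch assumes Python 3.9+ for `Path.is_relative_to`.

```python
from pathlib import Path


class ScopeError(PermissionError):
    """Raised when a requested path falls outside the allowed root."""


def resolve_in_scope(root: str, requested: str) -> Path:
    """Return the resolved path only if it stays inside root."""
    root_path = Path(root).resolve()
    # Resolve symlinks and ".." segments before checking containment,
    # so "../secrets" or an absolute path cannot slip past the check.
    candidate = (root_path / requested).resolve()
    if not candidate.is_relative_to(root_path):
        raise ScopeError(f"{requested!r} escapes {root_path}")
    return candidate


def read_file(root: str, requested: str) -> str:
    """Hypothetical read-only tool; no write counterpart is exposed."""
    return resolve_in_scope(root, requested).read_text(encoding="utf-8")
```

The same idea applies to databases: a role limited to SELECT on the specific tables a tool needs plays the part that the root directory plays here.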
3. Give each MCP server its own credential
Never reuse a human or shared service token. Each MCP server gets its own identity and key so you can revoke or rotate it in isolation, and so its actions are attributable to that server alone in your logs.
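To illustrate why per-server identity matters, here is a small in-memory sketch in which each server holds exactly one token, rotation replaces only that server's token, and any presented token maps back to a single server. `CredentialRegistry` and the server names are hypothetical; a real deployment would more likely rely on a secrets manager or the identity system already in use.

```python
import secrets
from typing import Optional


class CredentialRegistry:
    """Hypothetical in-memory registry: one token per MCP server."""

    def __init__(self) -> None:
        self._token_to_server: dict = {}

    def issue(self, server_id: str) -> str:
        """Issue (or rotate) the token for one server without touching others."""
        self.revoke(server_id)  # drop any previous token for this server
        token = secrets.token_urlsafe(32)
        self._token_to_server[token] = server_id
        return token

    def revoke(self, server_id: str) -> None:
        """Invalidate only this server's token."""
        self._token_to_server = {
            t: s for t, s in self._token_to_server.items() if s != server_id
        }

    def identify(self, token: str) -> Optional[str]:
        """Map a presented token back to exactly one server, for attribution."""
        return self._token_to_server.get(token)
```

Because no token is shared, revoking one server's credential leaves every other server working, and a log line keyed by token identifies one server unambiguously.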
4. Sandbox and pin the server itself
MCP servers are often npm/pip packages run with broad host access. Run them in a sandbox (container, restricted user, no ambient cloud credentials), pin exact versions, and review the package before adoption — a malicious or compromised MCP server is a supply-chain foothold straight into your tool surface.
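Two parts of this step are easy to check mechanically. The sketch below flags requirement lines that are not pinned to an exact version and builds a stripped-down environment so a launched server does not inherit cloud credentials. The function names are hypothetical, the parser is deliberately conservative (it handles only simple `name==version` lines and flags everything else, including environment markers), and it is not a substitute for a container or restricted user.

```python
import re

# Accepts "name==1.2.3" or "name[extra]==1.2.3"; wildcards like "==1.*" are rejected.
PINNED = re.compile(r"^[A-Za-z0-9._-]+(\[[^\]]+\])?==[A-Za-z0-9.!+_-]+$")


def unpinned_requirements(text: str) -> list:
    """Return requirement lines that do not pin an exact version."""
    problems = []
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments
        if not line or line.startswith("-"):  # skip blanks and pip options
            continue
        if not PINNED.match(line):
            problems.append(line)
    return problems


def scrubbed_env(env: dict, allow: set) -> dict:
    """Keep only allowlisted variables so ambient credentials are not inherited."""
    return {k: v for k, v in env.items() if k in allow}
```

The result of `scrubbed_env(dict(os.environ), {"PATH"})` can be passed as the `env` argument of `subprocess.Popen` when starting the server process.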
5. Assume prompt injection and add a human gate for dangerous tools
Untrusted content the model reads (a web page, a file, an email) can carry instructions that try to invoke your tools. For any irreversible or sensitive action — delete, deploy, send, pay — require explicit human confirmation rather than letting the agent act unattended.
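One way to sketch the human gate is a dispatcher that refuses to run a listed tool unless a confirmation callback returns true. The tool names in `DANGEROUS_TOOLS` mirror the examples above, and `call_tool`, `cli_confirm`, and `ApprovalDenied` are hypothetical names, not an MCP API.

```python
from typing import Any, Callable

# Assumed set of tool names treated as irreversible or sensitive.
DANGEROUS_TOOLS = {"delete", "deploy", "send", "pay"}


class ApprovalDenied(Exception):
    """Raised when a human declines a gated tool call."""


def call_tool(
    name: str,
    args: dict,
    tools: dict,
    confirm: Callable[[str, dict], bool],
) -> Any:
    """Run a tool, asking a human first when the tool is on the dangerous list."""
    if name not in tools:
        raise KeyError(f"unknown tool: {name}")
    if name in DANGEROUS_TOOLS and not confirm(name, args):
        raise ApprovalDenied(f"{name} was not approved")
    return tools[name](**args)


def cli_confirm(name: str, args: dict) -> bool:
    """Terminal prompt; anything other than 'yes' counts as a refusal."""
    answer = input(f"Allow {name} with {args}? Type 'yes' to proceed: ")
    return answer.strip().lower() == "yes"
```

The important design choice is that the gate sits in the dispatcher, outside the model's control, so injected text cannot talk its way past the confirmation; it can only trigger the prompt.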
6. Record every tool call, tamper-evidently
Log each invocation: which server, which tool, arguments, result, timestamp. Store the log hash-chained, with each entry including the hash of the entry before it, so the record cannot be quietly rewritten. This is both your incident-response trail and your SOC 2 / HIPAA audit evidence for what the AI actually did.
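A hash chain can be sketched in a few lines: each entry stores the SHA-256 of the previous entry, so changing any earlier record invalidates everything after it. This is an illustrative in-memory version with hypothetical function names; it assumes arguments are JSON-serializable and says nothing about how any particular product stores its logs.

```python
import hashlib
import json
from datetime import datetime, timezone

GENESIS = "0" * 64  # placeholder "previous hash" for the first entry


def _digest(body: dict) -> str:
    # Canonical JSON so the same body always hashes to the same value.
    encoded = json.dumps(body, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return hashlib.sha256(encoded).hexdigest()


def append_call(log: list, server: str, tool: str, arguments: dict, result: str) -> dict:
    """Append one tool call, linking it to the hash of the previous entry."""
    body = {
        "server": server,
        "tool": tool,
        "arguments": arguments,
        "result": result,
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "prev_hash": log[-1]["hash"] if log else GENESIS,
    }
    entry = {**body, "hash": _digest(body)}
    log.append(entry)
    return entry


def verify_chain(log: list) -> bool:
    """Recompute every hash; an edited, reordered, or mid-chain deleted entry fails."""
    prev = GENESIS
    for entry in log:
        body = {k: v for k, v in entry.items() if k != "hash"}
        if body["prev_hash"] != prev or _digest(body) != entry["hash"]:
            return False
        prev = entry["hash"]
    return True
```

A chain on its own cannot detect removal of the most recent entries, so the latest hash should also be kept somewhere the server cannot write, such as a separate system or a periodic export.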
7. Monitor for drift and silence
Alert when an MCP server starts calling a tool it never called before, when call volume spikes, or when a server that should run on a cadence goes silent. Those are the early signals of a compromised, prompt-injected, or misconfigured server.
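The three signals above can be expressed as simple checks against a per-server baseline. The sketch below is illustrative only: `ServerBaseline`, `check_window`, and the default `spike_factor` of 3 are assumptions, and real thresholds would need tuning against observed traffic.

```python
from dataclasses import dataclass, field


@dataclass
class ServerBaseline:
    """Hypothetical per-server baseline learned from a known-good period."""
    known_tools: set = field(default_factory=set)
    typical_calls_per_window: float = 0.0
    max_silence_seconds: float = 3600.0


def check_window(
    baseline: ServerBaseline,
    calls: list,
    last_seen: float,
    now: float,
    spike_factor: float = 3.0,
) -> list:
    """Return alert strings for one monitoring window.

    calls holds the tool names invoked in the window;
    last_seen and now are epoch seconds.
    """
    alerts = []
    # Drift: a tool this server has never called before.
    for tool in sorted(set(calls) - baseline.known_tools):
        alerts.append(f"new tool: {tool}")
    # Spike: call volume well above the usual rate.
    typical = baseline.typical_calls_per_window
    if typical > 0 and len(calls) > spike_factor * typical:
        alerts.append(f"volume spike: {len(calls)} calls")
    # Silence: no activity for longer than the expected cadence allows.
    if now - last_seen > baseline.max_silence_seconds:
        alerts.append(f"silent for {int(now - last_seen)}s")
    return alerts
```

For example, a server whose baseline contains only `read_file` and that suddenly calls a hypothetical `run_shell` tool produces a "new tool" alert even if its call volume looks normal.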
Watch your MCP servers with Agent Watch
KollGuard's Agent Watch treats an MCP server like any other agent you deploy: it records each run as a tamper-evident, hash-chained heartbeat, alerts on health and behavior drift, and maps the activity to SOC 2 and HIPAA controls — so a server that goes silent or starts doing more than it should doesn't slip past you.
Frequently asked
- What is MCP server security?
- MCP server security is the practice of securing Model Context Protocol servers — the components that expose tools (files, shell, databases, HTTP) to an LLM. It covers least-privilege tool scoping, per-server credentials, sandboxing the server, prompt-injection defenses with human gates for dangerous actions, and tamper-evident logging of every tool call.
- Why are MCP servers a security risk?
- An MCP server hands an LLM real capabilities, and the model can be manipulated through prompt injection into misusing them. A server with broad filesystem, shell, or database access — running unsandboxed with shared credentials and no audit log — is a direct path from untrusted input to production damage.
- How do you monitor an MCP server?
- Record every tool call (server, tool, arguments, result) to a tamper-evident log, give each server its own identity, and alert on behavior drift (new tool calls, volume spikes) and on silence (a server that stopped running). KollGuard's Agent Watch does this and maps it to SOC 2 / HIPAA controls.
- Do MCP servers fall under SOC 2 or HIPAA?
- If an MCP server can touch data in scope (PHI, customer data) or holds production credentials, its tool calls are access events under SOC 2 CC6/CC7 and the HIPAA audit controls standard, 45 CFR §164.312(b), so they need to be logged, retained, and reviewable like any other privileged access.
