Read this first
If your prototype is hitting api.openai.com with patient data on a personal API key, you are out of HIPAA compliance — even in dev. Move to Enterprise (and a signed BAA) before your first production deployment. This guide assumes you're on that path.
The seven-step checklist
1. Map your PHI flow before you write a prompt
Draw the data path: where does PHI enter (patient portal, EHR sync, clinician input)? Where does it sit (database, file storage, message queue)? Where does it go (AI model, retrieval index, logging, error monitoring)? Every node is a potential BAA requirement.
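One way to keep that map honest is to store it as data next to the code, so a missing BAA shows up as a failing check rather than a stale diagram. The sketch below is illustrative only: the `Node` fields and every node name are hypothetical, and `third_party` is assumed to cover any outside vendor, cloud hosting included.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    name: str            # hypothetical label for one system in the data path
    stage: str           # "enter", "sit", or "go"
    third_party: bool    # True if an outside vendor operates it
    baa_signed: bool = False


def missing_baas(nodes):
    """Return the names of third-party nodes that have no signed BAA."""
    return [n.name for n in nodes if n.third_party and not n.baa_signed]


# Example flow; replace with the nodes from your own diagram.
PHI_FLOW = [
    Node("patient_portal", "enter", third_party=False),
    Node("postgres_primary", "sit", third_party=False),
    Node("llm_provider", "go", third_party=True, baa_signed=True),
    Node("error_monitoring", "go", third_party=True),
]

print(missing_baas(PHI_FLOW))  # ['error_monitoring']
```

Running a check like this in CI means a newly added logging or monitoring vendor cannot slip into the PHI path unnoticed.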
2. Get on the Enterprise tier of every AI provider that touches PHI
Consumer tiers (ChatGPT Plus, Claude.ai Pro, Gemini Advanced) will NOT sign a BAA. As of 2026, OpenAI Enterprise and Education, Anthropic Enterprise, and Google Cloud Vertex AI (Healthcare API) WILL sign BAAs that include a model-training opt-out. Verify the contract terms yourself; don't assume.
3. Disable model training EXPLICITLY
Even on Enterprise tiers, model-training opt-out is sometimes a separate setting. OpenAI: confirm zero data retention (ZDR) is enabled for your org. Anthropic: confirm 'do not train on Customer Data' clause in your DPA. Save the screenshots; auditors will ask.
4. Log every prompt and response
HIPAA §164.312(b) requires audit controls for ePHI access. AI calls that include PHI are ePHI access events. Log the prompt, the response, the timestamp, the user, and the model version to a tamper-evident store. Retain 6 years. Yes, this is a lot of data. Plan for it.
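"Tamper-evident" can be approximated with a hash chain: each entry stores the hash of the previous one, so editing any earlier record breaks verification. The following is a minimal in-memory sketch, not a production design. The function names and entry layout are assumptions for illustration, and a real deployment would write to encrypted, append-only storage because the log itself contains PHI.

```python
import hashlib
import json
from datetime import datetime, timezone

GENESIS = "0" * 64  # placeholder "previous hash" for the first entry


def _digest(body: dict, prev_hash: str) -> str:
    """Hash the entry body together with the previous entry's hash."""
    payload = json.dumps(body, sort_keys=True).encode("utf-8")
    return hashlib.sha256(prev_hash.encode("ascii") + payload).hexdigest()


def append_entry(log: list, user: str, model: str, prompt: str, response: str) -> dict:
    """Append one AI call to the log, chained to the entry before it."""
    prev_hash = log[-1]["hash"] if log else GENESIS
    body = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "user": user,
        "model": model,
        "prompt": prompt,
        "response": response,
    }
    entry = {**body, "prev_hash": prev_hash, "hash": _digest(body, prev_hash)}
    log.append(entry)
    return entry


def verify_chain(log: list) -> bool:
    """Recompute every hash; return False if any entry was altered or reordered."""
    prev_hash = GENESIS
    for entry in log:
        body = {k: v for k, v in entry.items() if k not in ("prev_hash", "hash")}
        if entry["prev_hash"] != prev_hash or entry["hash"] != _digest(body, prev_hash):
            return False
        prev_hash = entry["hash"]
    return True
```

A chain like this detects edits but does not prevent them, and it cannot detect truncation of the newest entries on its own; periodically anchoring the latest hash in a separate system is one common mitigation.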
5. Never send PHI to a code-completion tool
Copilot, Cursor, and Codeium do NOT sign BAAs as of 2026 (Copilot Enterprise has more controls, but check the terms). If your source code contains hardcoded PHI in test fixtures, assume the tool has seen it. Strip PHI from fixtures before they enter the repository, enforce that with pre-commit hooks, and document the policy.
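A pre-commit check for this can start as a small script that scans staged files for PHI-shaped strings. The sketch below assumes the hook runner passes file paths as arguments. The three patterns are illustrative guesses at common formats, not a complete PHI detector, so expect both false positives and misses.

```python
import re

# Illustrative patterns only; tune them to the identifiers your systems use.
PATTERNS = {
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "mrn": re.compile(r"\bMRN[:\s#]*\d{6,10}\b", re.IGNORECASE),
    "dob": re.compile(r"\bDOB[:\s]*\d{1,2}/\d{1,2}/\d{4}\b", re.IGNORECASE),
}


def scan_text(text: str):
    """Return (line_number, label) for every line that matches a pattern."""
    findings = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for label, pattern in PATTERNS.items():
            if pattern.search(line):
                findings.append((lineno, label))
    return findings


def main(paths) -> int:
    """Scan each file; return 1 (block the commit) if anything looks like PHI."""
    failed = False
    for path in paths:
        with open(path, encoding="utf-8", errors="ignore") as handle:
            for lineno, label in scan_text(handle.read()):
                print(f"{path}:{lineno}: possible PHI ({label})")
                failed = True
    return 1 if failed else 0
```

To use it as a hook, call `sys.exit(main(sys.argv[1:]))` from the script's entry point so a non-zero exit code blocks the commit.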
6. Vendor risk-review the embedding + vector DBs too
RAG workflows often store embeddings of PHI in vector DBs (Pinecone, Weaviate, pgvector). Embeddings can be partially reversible. Treat the vector store as a PHI store: BAA required, encryption at rest, access controls. pgvector inside your existing HIPAA-covered Postgres is the simplest path.
7. Document the model card + risk assessment
For every model you use on PHI, document: provider, version, training-data policy, where PHI is sent, where it's logged, who can read the logs, retention, BAA expiry. This is the §164.308(a)(1) risk analysis for AI. Auditors expect it. KollGuard's evidence package bundles this with your other artifacts.
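The fields listed in step 7 fit naturally into a structured record that can be versioned and exported alongside other evidence. This is a hedged sketch: the `ModelRecord` class, its field names, and all example values are hypothetical, not a required or standard schema.

```python
from dataclasses import asdict, dataclass
from datetime import date


@dataclass
class ModelRecord:
    provider: str
    version: str
    training_data_policy: str
    phi_destinations: list   # where PHI is sent
    log_location: str        # where prompts and responses are logged
    log_readers: list        # roles allowed to read those logs
    retention_years: int
    baa_expiry: date


def baa_days_remaining(record: ModelRecord, today: date) -> int:
    """Days until the BAA lapses; negative means it has already expired."""
    return (record.baa_expiry - today).days


# Placeholder values for illustration only.
record = ModelRecord(
    provider="example-provider",
    version="example-model-2026-01",
    training_data_policy="no training on customer data (per contract)",
    phi_destinations=["inference API", "vector index"],
    log_location="audit-log store",
    log_readers=["security", "compliance"],
    retention_years=6,
    baa_expiry=date(2027, 1, 31),
)

print(baa_days_remaining(record, date(2027, 1, 1)))  # 30
print(asdict(record)["provider"])                     # example-provider
```

Checking `baa_days_remaining` on a schedule gives early warning before a BAA lapses while PHI is still flowing to that provider.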
AI provider BAA matrix (2026)
| Provider | Tier that signs a BAA | Model-training opt-out |
|---|---|---|
| OpenAI | Enterprise + Education (Edu) | Zero Data Retention available, must enable per-org |
| Anthropic | Enterprise (Claude for Business / via Bedrock) | Yes — "do not train on Customer Data" clause |
| Google Cloud | Vertex AI (with GCP Healthcare BAA) | Yes: training is opt-in only |
| AWS Bedrock | Standard (AWS BAA covers it) | Yes — by default |
| Azure OpenAI | Standard (Microsoft BAA covers it) | Yes — by default |
| Consumer tiers (ChatGPT, Claude.ai, Gemini app) | No | N/A — do not use for PHI |
Frequently asked
- Can I build a HIPAA-compliant product on OpenAI's API?
- Yes, on OpenAI Enterprise (or OpenAI Education for academic medical centers) with a signed BAA and zero-data-retention enabled. Consumer ChatGPT Plus will NOT work for PHI. The same logic applies to Anthropic Enterprise and Google Vertex AI Healthcare.
- What about embedding models on Hugging Face / self-hosted?
- If you host the embedding model yourself (e.g., sentence-transformers running on your HIPAA-eligible infrastructure), no BAA needed for the model — the data never leaves your control. This is often the cleanest path for early-stage healthtech.
- Does HIPAA apply to AI scribes / ambient documentation tools?
- If the tool processes a real patient encounter and produces clinical notes, yes: it's PHI in motion. The major scribe vendors (Abridge, and Nuance with DAX Copilot) sign BAAs. Smaller vendors might not. Always verify.
- What if my AI workflow produces a 'medical recommendation'?
- Beyond HIPAA, you may trigger FDA Software as a Medical Device (SaMD) regulations. Get legal counsel early; the line between clinical decision support and SaMD is narrower than you think.
- Is this guide legal advice?
- No. KollGuard provides informational guidance on technical compliance patterns. Engage healthcare counsel for HIPAA legal questions, FDA SaMD scoping, and BAA contract review.
