Guide · 10 min read

HIPAA for AI startups (the actually-practical guide)

Building an AI product that touches patient data? HIPAA applies — and most consumer AI tiers won't help you. Here are the seven concrete steps an early-stage AI healthtech startup actually takes to ship into US health systems in 2026, with the specific provider tiers and contract language that work.

HIPAA · AI · OpenAI · Anthropic · BAA · healthtech · §164.312

Read this first

If your prototype is sending real patient data to api.openai.com on a personal API key, you are out of HIPAA compliance, even in dev. Use synthetic or de-identified data until a BAA is signed, and move to Enterprise (with that signed BAA) before any real PHI reaches the model. This guide assumes you're on that path.

The seven-step checklist

  1. Map your PHI flow before you write a prompt

    Draw the data path: where does PHI enter (patient portal, EHR sync, clinician input)? Where does it sit (database, file storage, message queue)? Where does it go (AI model, retrieval index, logging, error monitoring)? Every node is a potential BAA requirement.

  2. Get on the Enterprise tier of every AI provider that touches PHI

    Consumer tiers (ChatGPT Plus, Claude.ai Pro, Gemini Advanced) will NOT sign a BAA. As of 2026, OpenAI (Enterprise and Edu), Anthropic (Enterprise), and Google Cloud (Vertex AI, under the Google Cloud BAA) WILL sign BAAs with a model-training opt-out. Terms change, so verify the current contract language; don't assume.

  3. Disable model training EXPLICITLY

    Even on Enterprise tiers, the model-training opt-out is sometimes a separate setting or contract clause rather than a default. OpenAI: confirm zero data retention (ZDR) is enabled for your org; ZDR governs retention, so also confirm the no-training terms in your agreement. Anthropic: confirm the 'do not train on Customer Data' clause in your DPA. Save the screenshots; auditors will ask.

  4. Log every prompt and response

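    HIPAA §164.312(b) requires audit controls: mechanisms that record and examine activity in systems that contain or use ePHI. AI calls that include PHI are ePHI access events. Log the prompt, the response, the timestamp, the user, and the model version to a tamper-evident store. These logs contain PHI themselves, so encrypt and access-control the log store like any other PHI store. Retain for 6 years, in line with the documentation retention period in §164.316(b)(2)(i). Yes, this is a lot of data. Plan for it.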

  5. Never send PHI to a code-completion tool

    Copilot, Cursor, and Codeium do NOT sign BAAs as of 2026 (Copilot Enterprise has more controls, but check the terms). If your source code contains hardcoded PHI in test fixtures, assume the tool has already sent it to a model. Keep PHI out of the repository: use synthetic fixtures, configure pre-commit hooks that block PHI-like patterns, and document the policy.

  6. Vendor risk-review the embedding + vector DBs too

    RAG workflows often store embeddings of PHI in vector DBs (Pinecone, Weaviate, pgvector). Embeddings can be partially reversible. Treat the vector store as a PHI store: BAA required, encryption at rest, access controls. pgvector inside your existing HIPAA-covered Postgres is the simplest path.

  7. Document the model card + risk assessment

    For every model you use on PHI, document: provider, version, training-data policy, where PHI is sent, where it's logged, who can read the logs, retention, BAA expiry. This is the AI portion of the risk analysis required by §164.308(a)(1)(ii)(A). Auditors expect it. KollGuard's evidence package bundles this with your other artifacts.
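
One way to make the step 4 log "tamper-evident" is a hash chain, where each entry stores the hash of the entry before it. The sketch below is a minimal in-memory illustration, not a production audit store: the function names (`append_entry`, `verify_chain`) and field names are hypothetical, and a real system would persist entries to encrypted, access-controlled storage.

```python
import hashlib
import json
from datetime import datetime, timezone

GENESIS_HASH = "0" * 64  # placeholder "previous hash" for the first entry


def _entry_hash(body: dict) -> str:
    """SHA-256 over a canonical JSON encoding of the entry body."""
    canonical = json.dumps(body, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def append_entry(log: list, user: str, model_version: str,
                 prompt: str, response: str) -> dict:
    """Append one AI call to the chain, linking it to the previous entry."""
    body = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "user": user,
        "model_version": model_version,
        "prompt": prompt,
        "response": response,
        "prev_hash": log[-1]["hash"] if log else GENESIS_HASH,
    }
    entry = {**body, "hash": _entry_hash(body)}
    log.append(entry)
    return entry


def verify_chain(log: list) -> bool:
    """Return True if every entry matches its hash and links to its predecessor."""
    prev_hash = GENESIS_HASH
    for entry in log:
        body = {k: v for k, v in entry.items() if k != "hash"}
        if body["prev_hash"] != prev_hash or _entry_hash(body) != entry["hash"]:
            return False
        prev_hash = entry["hash"]
    return True
```

Editing or deleting an entry in the middle of the chain makes `verify_chain` return False. Dropping entries from the tail is not detected by the chain alone, so it helps to record the latest hash somewhere the application cannot overwrite.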

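The pre-commit hook from step 5 can be as simple as a pattern scan over staged files. This is a rough sketch under stated assumptions: the three regexes (`ssn`, `mrn`, `dob`) are illustrative guesses at what PHI looks like in fixtures, and `find_phi` and `scan_files` are hypothetical names. Pattern matching will miss free-text PHI such as names, so treat it as a backstop for the synthetic-fixtures policy, not a replacement.

```python
import re

# Illustrative patterns only; tune these to the identifiers your data uses.
PHI_PATTERNS = {
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "mrn": re.compile(r"\bMRN[:#\s]*\d{6,10}\b", re.IGNORECASE),
    "dob": re.compile(
        r"\b(?:DOB|date of birth)[:\s]*\d{1,2}/\d{1,2}/\d{4}\b",
        re.IGNORECASE,
    ),
}


def find_phi(text: str) -> list:
    """Return (line_number, pattern_name) for each line that looks like PHI."""
    hits = []
    for line_number, line in enumerate(text.splitlines(), start=1):
        for name, pattern in PHI_PATTERNS.items():
            if pattern.search(line):
                hits.append((line_number, name))
    return hits


def scan_files(paths: list) -> int:
    """Scan files and return 1 if anything matched, else 0 (an exit code)."""
    failed = False
    for path in paths:
        with open(path, encoding="utf-8", errors="ignore") as handle:
            for line_number, name in find_phi(handle.read()):
                print(f"{path}:{line_number}: possible PHI ({name})")
                failed = True
    return 1 if failed else 0
```

To use it as a hook, a small wrapper script would pass the staged file paths to `scan_files` and exit with its return value, so a non-zero result blocks the commit.
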
AI provider BAA matrix (2026)

| Provider | Tier that signs a BAA | Model-training opt-out |
| --- | --- | --- |
| OpenAI | Enterprise + Education (Edu) | Zero Data Retention available, must enable per-org |
| Anthropic | Enterprise (Claude for Business / via Bedrock) | Yes: "do not train on Customer Data" clause |
| Google | Vertex AI (with GCP Healthcare BAA) | Yes: opt-in to training only |
| AWS Bedrock | Standard (AWS BAA covers it) | Yes, by default |
| Azure OpenAI | Standard (Microsoft BAA covers it) | Yes, by default |
| Consumer tiers (ChatGPT, Claude.ai, Gemini app) | No | N/A; do not use for PHI |
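
The matrix shows which vendors will sign. The PHI flow map from step 1 shows whether you have actually signed with every vendor in the path. A small inventory can check that automatically. The sketch below is illustrative only: `PhiNode`, `baa_gaps`, and the vendor names are hypothetical, and it assumes you record a BAA expiry date for each third-party node, as step 7 suggests.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class PhiNode:
    """One system in the PHI data path (all fields are illustrative)."""
    name: str
    role: str                    # e.g. "model", "vector store", "logging"
    vendor: Optional[str]        # None means self-hosted
    baa_expiry: Optional[date]   # None means no signed BAA on file


def baa_gaps(nodes: list, today: date) -> list:
    """Return names of third-party nodes with a missing or expired BAA."""
    gaps = []
    for node in nodes:
        if node.vendor is None:
            continue  # self-hosted: covered by the hosting provider's BAA
        if node.baa_expiry is None or node.baa_expiry < today:
            gaps.append(node.name)
    return gaps


nodes = [
    PhiNode("ehr-sync-db", "database", "ExampleCloud", date(2027, 1, 31)),
    PhiNode("llm-api", "model", "ExampleAI", None),
    PhiNode("embedder", "embedding model", None, None),
    PhiNode("error-monitor", "logging", "ExampleMonitor", date(2025, 6, 30)),
]
print(baa_gaps(nodes, today=date(2026, 3, 1)))
# prints ['llm-api', 'error-monitor']
```

Running a check like this in CI turns an expired or missing BAA into a failed build rather than an audit finding.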

Frequently asked

Can I build a HIPAA-compliant product on OpenAI's API?
Yes, on OpenAI Enterprise (or OpenAI Edu for academic medical centers) with a signed BAA and zero data retention enabled. Consumer ChatGPT Plus will NOT work for PHI. The same logic applies to Anthropic Enterprise and to Vertex AI under the Google Cloud BAA.
What about embedding models on Hugging Face / self-hosted?
If you host the embedding model yourself (e.g., sentence-transformers running on your HIPAA-eligible infrastructure), no BAA is needed for the model itself, because the data never leaves your control. You still need a BAA with whoever hosts that infrastructure. This is often the cleanest path for early-stage healthtech.
Does HIPAA apply to AI scribes / ambient documentation tools?
If the tool processes a real patient encounter and produces clinical notes, yes: it's PHI in motion. The major scribes (Abridge, Nuance's DAX Copilot) sign BAAs. Smaller vendors might not. Always verify.
What if my AI workflow produces a 'medical recommendation'?
Beyond HIPAA, you may trigger FDA Software as a Medical Device (SaMD) regulations. Get legal counsel early; the line between clinical decision support and SaMD is narrower than you think.
Is this guide legal advice?
No. KollGuard provides informational guidance on technical compliance patterns. Engage healthcare counsel for HIPAA legal questions, FDA SaMD scoping, and BAA contract review.
