The compliance floor
Healthcare AI agent deployments in the US operate under HIPAA, which imposes specific obligations on how Protected Health Information (PHI) is accessed, stored, transmitted, and disclosed. Meeting HIPAA is not optional and not a detail: it's the floor that determines whether the deployment is viable at all.
The practical baseline: signed Business Associate Agreements (BAAs) with every processor in the stack (the model provider, the telephony provider, the hosting provider, any third-party enrichment services), encryption in transit (TLS 1.2+) and at rest (AES-256 or equivalent), role-based access controls, audit logging on every PHI access, and minimum necessary disclosure (the agent sees and returns only the PHI required for the specific interaction).
This floor is achievable. What makes it hard is rigor: a deployment that meets HIPAA on paper but leaks PHI into prompts, logs PHI in plaintext outside the audit system, or lacks proper access controls will fail its first compliance review. The work is in the details, not the headline.
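One way to keep PHI out of plaintext logs while still auditing every access is to record who touched which fields and why, without recording the values. The sketch below assumes a keyed hash (HMAC-SHA256) is an acceptable patient reference for log correlation; that is an assumption to confirm with the compliance team. The names PhiAccessEvent, pseudonymize, audit_event, and to_log_line are hypothetical.

```python
import hashlib
import hmac
import json
from dataclasses import asdict, dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class PhiAccessEvent:
    actor: str          # service or user that touched the record
    patient_ref: str    # keyed hash, never the raw identifier
    resource_type: str  # e.g. "Appointment"
    fields: tuple       # field names only, no values
    purpose: str        # why the access happened
    timestamp: str      # UTC, ISO 8601


def pseudonymize(patient_id: str, key: bytes) -> str:
    """Return a keyed hash so log lines can be correlated without exposing the ID."""
    return hmac.new(key, patient_id.encode("utf-8"), hashlib.sha256).hexdigest()


def audit_event(actor, patient_id, resource_type, fields, purpose, key):
    """Build an audit record that names the accessed fields but carries no PHI values."""
    return PhiAccessEvent(
        actor=actor,
        patient_ref=pseudonymize(patient_id, key),
        resource_type=resource_type,
        fields=tuple(sorted(fields)),
        purpose=purpose,
        timestamp=datetime.now(timezone.utc).isoformat(),
    )


def to_log_line(event: PhiAccessEvent) -> str:
    """Serialize the event as one JSON line for the audit sink."""
    return json.dumps(asdict(event), sort_keys=True)
```

The key would live in a secrets manager rather than in code, and the audit sink itself still needs access controls, since access patterns can be sensitive even without field values.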
EHR integration: the FHIR path
For any agent that needs scheduling, patient record access, insurance verification, or charting, EHR integration is the spine of the deployment. FHIR (Fast Healthcare Interoperability Resources) is the standard interface for this, and the major EHRs (Epic, Cerner/Oracle Health, athenahealth, eClinicalWorks, NextGen) all expose FHIR APIs.
Practical notes: FHIR implementations are not uniform across vendors. Epic's FHIR API, while compliant, has its own quirks in resource modeling and scopes. Cerner, eClinicalWorks, and athenahealth each have their own gaps. As a rule of thumb, the integration work is roughly 40% FHIR and 60% vendor-specific handling, so budget for it.
Where FHIR falls short, we use vendor-specific APIs (proprietary REST, HL7 v2, database-level integration in rare on-premise cases) — but only as a backstop. The less vendor-specific the integration, the more portable the deployment across the customer's evolving EHR landscape.
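To make the FHIR path concrete, here is a minimal sketch of building a standard FHIR search for a patient's booked appointments in a date window. The patient, date, and status search parameters and the ge/lt date prefixes come from the FHIR specification; the base URL, the function name, and the vendor_extra hook for vendor-specific parameters are assumptions for illustration. Authentication (typically SMART on FHIR / OAuth 2.0) is omitted.

```python
from urllib.parse import urlencode


def appointment_search_url(base_url, patient_id, start_date, end_date, vendor_extra=None):
    """Build a FHIR Appointment search URL for one patient and a date window.

    vendor_extra is a hypothetical hook: a list of (name, value) pairs for
    parameters that a specific EHR vendor requires on top of standard FHIR.
    """
    params = [
        ("patient", patient_id),
        ("date", f"ge{start_date}"),  # on or after the start date
        ("date", f"lt{end_date}"),    # strictly before the end date
        ("status", "booked"),
    ]
    params.extend(vendor_extra or [])
    return f"{base_url.rstrip('/')}/Appointment?{urlencode(params)}"


url = appointment_search_url(
    "https://fhir.example.org/r4/", "abc123", "2025-01-01", "2025-02-01"
)
print(url)
```

Keeping vendor-specific parameters behind one explicit hook makes it easy to see how much of the integration is portable FHIR and how much is tied to one EHR.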
What the agent does vs. what the agent refuses
The single most important design decision in healthcare agent deployments is scope. An agent that tries to answer "what should I do about this symptom" is an agent that will either fail its first clinical review or produce an incident the organization has to explain. The answer is to bound scope explicitly in the agent constitution.
What agents handle end-to-end: appointment scheduling, rescheduling, cancellation, reminders, insurance verification (routing the coverage question, not answering it), pre-visit intake form collection, post-visit follow-up (satisfaction, basic care instructions referenced to a clinician), and prescription refill intake (staging for clinician review).
What agents route to humans: any clinical question (symptoms, medication side effects, dosage, diagnosis), insurance coverage disputes, complaints, distress signals, requests from parents or third parties about protected records, and anything outside the documented scope.
The routing is not soft — it's enforced by the policy layer at runtime. Every routed call is logged with reason, context, and target queue.
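A runtime policy check of this kind can be sketched as a default-deny lookup: only intents on an explicit allowlist are handled, and everything else is routed with a logged reason and target queue. The intent labels, queue names, and the decide function are hypothetical, and the sketch assumes an upstream classifier has already assigned the intent.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Decision(Enum):
    HANDLE = "handle"
    ROUTE = "route"


# Hypothetical intent labels mirroring the in-scope list above.
HANDLED_INTENTS = frozenset({
    "schedule", "reschedule", "cancel", "reminder",
    "intake_form", "post_visit_followup", "refill_intake",
})

# Hypothetical mapping from out-of-scope intents to human queues.
ROUTED_INTENTS = {
    "clinical_question": "clinical_triage",
    "coverage_dispute": "billing",
    "complaint": "patient_relations",
    "distress": "urgent",
    "third_party_records_request": "privacy_office",
}
DEFAULT_QUEUE = "front_desk"


@dataclass(frozen=True)
class RoutingResult:
    decision: Decision
    reason: str
    target_queue: Optional[str]


def decide(intent: str) -> RoutingResult:
    """Default-deny: anything not explicitly in scope goes to a human."""
    if intent in HANDLED_INTENTS:
        return RoutingResult(Decision.HANDLE, "in documented scope", None)
    if intent in ROUTED_INTENTS:
        return RoutingResult(Decision.ROUTE, f"policy: {intent}", ROUTED_INTENTS[intent])
    return RoutingResult(Decision.ROUTE, "outside documented scope", DEFAULT_QUEUE)
```

The important property is the final branch: an intent nobody anticipated is routed, not handled.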
Authentication and minimum necessary
Before disclosing any PHI, the agent must authenticate the caller. Patterns that work: knowledge-based authentication (date of birth plus one additional factor), secondary channel confirmation (text verification code), or integration with the patient portal identity provider. The right choice depends on the organization's existing patient identity model.
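As an illustration of the knowledge-based pattern, the sketch below checks date of birth plus one additional factor before any disclosure. The record layout, the allowed second factors, and the verify_caller function are assumptions; a real deployment would also need attempt limits and lockout, which are omitted here.

```python
import hmac

# Hypothetical set of second factors the organization accepts.
ALLOWED_SECOND_FACTORS = frozenset({"postal_code", "phone_last4"})


def _matches(expected: str, supplied: str) -> bool:
    """Compare normalized values in constant time."""
    a = expected.strip().lower().encode("utf-8")
    b = supplied.strip().lower().encode("utf-8")
    return hmac.compare_digest(a, b)


def verify_caller(record: dict, claimed_dob: str, factor_name: str, factor_value: str) -> bool:
    """Return True only if both the date of birth and the second factor match."""
    if factor_name not in ALLOWED_SECOND_FACTORS:
        return False
    expected_dob = record.get("date_of_birth")
    expected_factor = record.get(factor_name)
    if not expected_dob or not expected_factor:
        return False  # missing data on file fails closed
    # Evaluate both checks before combining so a wrong DOB does not short-circuit.
    dob_ok = _matches(expected_dob, claimed_dob)
    factor_ok = _matches(expected_factor, factor_value)
    return dob_ok and factor_ok
```

The function fails closed: an unknown factor name or a missing value on file is treated as a failed authentication, not as a reason to skip the check.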
"Minimum necessary" means the agent discloses only the information required for the specific interaction. An appointment reminder reveals the appointment time and provider; it does not reveal reason for visit. A scheduling call reveals availability for the relevant department; it does not reveal other visits.
This is implemented as both a retrieval-time control (the agent's knowledge base returns only the minimum necessary fields) and a policy-time control (the response filter refuses to include PHI beyond the scope of the question).
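The retrieval-time half of this can be sketched as a per-interaction field allowlist applied before anything reaches the agent. The interaction names, field names, and the minimum_necessary function are hypothetical; the two allowlists simply mirror the reminder and scheduling examples above.

```python
# Hypothetical allowlists: which fields each interaction type may see.
ALLOWED_FIELDS = {
    "appointment_reminder": frozenset({"start_time", "provider_name"}),
    "scheduling": frozenset({"department", "available_slots"}),
}


def minimum_necessary(record: dict, interaction: str) -> dict:
    """Return only the fields allowed for this interaction.

    Unknown interaction types get an empty allowlist, so they see nothing.
    """
    allowed = ALLOWED_FIELDS.get(interaction, frozenset())
    return {name: value for name, value in record.items() if name in allowed}


appointment = {
    "start_time": "2025-03-04T09:30",
    "provider_name": "Dr. Rivera",
    "reason_for_visit": "follow-up",
}
print(minimum_necessary(appointment, "appointment_reminder"))
```

Filtering before retrieval results enter the prompt means the policy-time response filter acts as a second check, not the only one.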
Clinical escalation design
Every healthcare deployment includes at least three escalation paths: clinical escalation (anything that needs a licensed practitioner), administrative escalation (anything that needs a human staff member but not a clinician), and urgent escalation (distress, emergency, safety signal).
Urgent escalation is the most important. The agent is trained to recognize distress signals — mentions of self-harm, severe pain, chest symptoms, breathing difficulty, severe bleeding — and routes to the appropriate resource immediately (often a triage nurse, sometimes direct-to-911 with full disclosure that the caller is being transferred). These thresholds are tuned conservatively and reviewed quarterly.
Clinical escalation covers the gray area between administrative and urgent: questions that need a clinician's judgment but aren't emergencies. Routing here is typically to a triage queue with defined response-time SLAs, and every escalation is logged for review.
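The precedence among the three paths can be sketched as follows: urgent always wins, then clinical, then administrative. The sketch assumes upstream detection has already produced the two boolean signals; the queue names and the escalate function are hypothetical, and how distress is actually detected is outside its scope.

```python
from dataclasses import dataclass
from enum import Enum


class Escalation(Enum):
    URGENT = "urgent"
    CLINICAL = "clinical"
    ADMINISTRATIVE = "administrative"


# Hypothetical queue names for each path.
TARGET_QUEUE = {
    Escalation.URGENT: "triage_nurse",
    Escalation.CLINICAL: "clinical_triage_queue",
    Escalation.ADMINISTRATIVE: "front_desk_queue",
}


@dataclass(frozen=True)
class EscalationRecord:
    path: Escalation
    target_queue: str
    reason: str


def escalate(distress_detected: bool, needs_clinician: bool, reason: str) -> EscalationRecord:
    """Pick the escalation path; a distress signal overrides everything else."""
    if distress_detected:
        path = Escalation.URGENT
    elif needs_clinician:
        path = Escalation.CLINICAL
    else:
        path = Escalation.ADMINISTRATIVE
    return EscalationRecord(path, TARGET_QUEUE[path], reason)
```

Returning a record, not just a queue name, keeps the reason attached to every escalation so it can be logged and reviewed.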
Staged rollout
Healthcare agents do not launch network-wide on day one. A viable rollout: single clinic pilot (2–4 weeks) → single specialty across multiple sites (4–6 weeks) → network-wide (ongoing, with phased expansion). Each stage has a compliance review gate.
The pilot should include: clear success metrics defined up front (response time, no-show rate, staff time saved, CSAT), daily conversation review for the first two weeks, a compliance review at the end of the pilot period, and a go/no-go decision with named decision-makers.
Between stages, we review: conversation logs, escalation logs, policy-block rate (how often the agent refused per policy), and any incidents. Policy amendments and constitution updates happen between stages, not mid-stage.
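As an example of one of these review metrics, policy-block rate can be computed directly from conversation logs. The log shape used here (a policy_blocked flag and a block_reason string per conversation) is an assumption for illustration, as are the function names.

```python
from collections import Counter


def policy_block_rate(conversations: list) -> float:
    """Fraction of conversations in which the agent refused per policy."""
    if not conversations:
        return 0.0
    blocked = sum(1 for c in conversations if c.get("policy_blocked"))
    return blocked / len(conversations)


def block_reasons(conversations: list) -> Counter:
    """Count blocked conversations by reason, to show which policies fire most."""
    return Counter(
        c.get("block_reason", "unspecified")
        for c in conversations
        if c.get("policy_blocked")
    )


stage_logs = [
    {"id": 1, "policy_blocked": False},
    {"id": 2, "policy_blocked": True, "block_reason": "clinical_question"},
    {"id": 3, "policy_blocked": False},
    {"id": 4, "policy_blocked": False},
]
print(policy_block_rate(stage_logs))
print(block_reasons(stage_logs))
```

Comparing the rate and the reason breakdown across stages shows whether a policy amendment changed agent behavior in the intended direction.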