The Anatomy of an AI Doctor
Most things called 'AI Doctors' today are something else. Here is what would actually have to be true for the term to mean what it sounds like. The substrate, the reasoning, the levels of autonomy, the liability, the honest gap.
The phrase AI Doctor does a lot of work. It is used for a chatbot you ask about a rash, for a multi-agent system that just got authorized to renew prescriptions in one US state, for a $12B clinical decision support tool that only physicians can log into, and for the consumer membership product that sends you a quarterly biomarker panel.
These are not the same thing. Treating them as the same thing is how the category gets diluted into a marketing claim, and how patients end up trusting tools that were never meant to carry that trust.
This essay decomposes the actual machinery: the data substrate the AI has to see, the agentic reasoning that runs on top of it, the levels of autonomy a system can hold, and the liability that has to follow each level. By the end you should have a framework you can use to look at any AI health product and locate it on the map: both what it actually does today and what it would take for it to credibly hold the title.
One number to start with, because it sets the calibration for everything that follows. The first FDA-cleared autonomous AI in any field of medicine was approved in 2018. It reads a single image, a retinal photograph, for a single condition: diabetic retinopathy. Eight years later, no general-purpose AI Doctor has equivalent regulatory status. That is the gap between what the term sounds like and what currently exists.
The substrate that quietly arrived
Before talking about models, it is worth noticing that the substrate an AI Doctor needs already exists in the United States, and that almost nobody outside healthcare policy circles can name how it got built.
Three decades of policy made an AI Doctor possible:

- 2009 · HITECH Act: $27B in incentives → 96% hospital EHR adoption
- 2016 · 21st Century Cures: patient API access becomes legal
- 2018 · IDx-DR: first FDA-cleared autonomous AI in medicine
- 2018 · Apple Health Records: FHIR records inside the iPhone
- 2020 · USCDI v1: standardized clinical data shape goes live
- 2024 · TEFCA QHINs: national FHIR-based exchange
- 2024 · Stelo OTC CGM: continuous glucose for non-diabetics
- Today · Substrate complete: bottleneck moves to integration, consent, trust
The HITECH Act in 2009 dropped roughly twenty-seven billion dollars of incentives into electronic health record adoption. In 2008, about ten percent of US hospitals had an EHR. By 2017, the number was ninety-six percent. Office-based physicians went from eighteen percent in 2001 to seventy-eight percent in 2021. The annual hospital adoption rate jumped from 3.2% per year before HITECH to 14.2% per year after. That is the substrate. Every clinical encounter started turning into a structured digital record because the federal government paid hospitals to make it so.
The 21st Century Cures Act in 2016 then made that record legally portable. The information-blocking rule that came out of it gives patients the right to demand all of their electronic health information, structured and unstructured, electronically and at no cost. Penalties run up to one million dollars per violation. The Cures Act is the reason an app on your phone can ask Epic for your records without anyone faxing anything.
USCDI standardized the shape of that data: what fields a patient bundle has to contain, in what format. FHIR is the standard those fields are exchanged in. TEFCA, with its first Qualified Health Information Networks designated in 2023 and 2024, is the national exchange backbone, FHIR-based, piloting through 2025 and 2026. Apple Health Records and Google Health Connect put FHIR-shaped records into the phone people open daily. Stelo, the first over-the-counter continuous glucose monitor, shipped in 2024, putting a real metabolic stream in the hands of people who do not have diabetes.
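To make the plumbing concrete, here is a minimal sketch of the access pattern Cures and SMART-on-FHIR enable: pulling a patient's lab results as FHIR R4 Observation resources. The base URL, token, and function name are hypothetical; a real app first completes the SMART-on-FHIR OAuth flow against the provider's authorization server.

```python
# A minimal sketch, not any vendor's SDK. The endpoint and token are
# hypothetical; real apps obtain the token via the SMART-on-FHIR OAuth flow.
import requests

FHIR_BASE = "https://fhir.example-hospital.org/R4"  # hypothetical FHIR R4 endpoint
TOKEN = "..."  # bearer token from the SMART-on-FHIR authorization flow

def fetch_lab_observations(patient_id: str) -> list[dict]:
    """Return a patient's laboratory results as FHIR Observation resources."""
    resp = requests.get(
        f"{FHIR_BASE}/Observation",
        params={"patient": patient_id, "category": "laboratory", "_count": 100},
        headers={"Authorization": f"Bearer {TOKEN}",
                 "Accept": "application/fhir+json"},
        timeout=30,
    )
    resp.raise_for_status()
    bundle = resp.json()  # FHIR returns a Bundle; each entry wraps one resource
    return [entry["resource"] for entry in bundle.get("entry", [])]
```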
The shorthand version of the timeline is this: the bottleneck for AI in healthcare moved from data, to models, to integration, consent, and trust. The first bottleneck broke in 2009 when HITECH began. The second broke around 2023 when frontier models started clearing medical board exams. The third is what the next decade is actually about, and where most of the work to build something credibly called an AI Doctor will sit.
The data the AI has to see
A serious AI Doctor needs to read a stack of incompatible APIs as if they were one record. The HealthKit feed and the Google Health Connect feed each expose around 150 data types. The CGMs sync into both. The wearables (Whoop, Oura, Garmin, Apple Watch) go through HealthKit or their own SDKs. The labs come from Quest and Labcorp through patient portals or aggregators like Health Gorilla. The pharmacy data goes through Surescripts, which connects roughly ninety-five percent of US pharmacies. The EHR comes through SMART-on-FHIR for Epic and Oracle Cerner; Oracle moved to R4-only in late 2025. The patient-reported data, meaning symptoms and journals and notes, has no FHIR standard at all and gets stored in whatever schema the app decides on.
The integration tax is not building one of these. It is keeping eight live, with separate consent flows, each gated by per-user OAuth, with timestamp drift between wearables, with lab results lagging twenty-four to seventy-two hours, with EHR write-back limited to a handful of FHIR resources per vendor, and with roughly thirty percent of clinically relevant signal still in unstructured notes that none of the structured APIs see.
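What that tax looks like in code, as a sketch: a single normalized event shape that every source adapter has to emit, with timestamps forced to UTC to absorb per-device drift. The type and field names are illustrative assumptions, not a published schema.

```python
# A sketch of the normalization layer. The event shape and adapter contract
# are illustrative assumptions, not a published schema.
from dataclasses import dataclass, field
from datetime import datetime, timezone

@dataclass
class HealthEvent:
    when: datetime    # normalized to UTC to absorb per-device timestamp drift
    source: str       # "healthkit", "health_connect", "fhir_lab", "surescripts", ...
    kind: str         # "heart_rate", "tsh", "rx_fill", "symptom_note", ...
    value: object     # numeric reading, coded lab result, or free text
    raw: dict = field(default_factory=dict)  # original payload, kept for audit

def merge_feeds(*feeds: list[HealthEvent]) -> list[HealthEvent]:
    """Flatten the separate feeds into one longitudinal timeline, oldest first."""
    events = [e for feed in feeds for e in feed]
    for e in events:
        if e.when.tzinfo is None:  # naive timestamps are the norm, not the exception
            e.when = e.when.replace(tzinfo=timezone.utc)
    return sorted(events, key=lambda e: e.when)
```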
The harder question is what the AI is actually allowed to read about a given patient. That is not a technical question. It is a consent question. And the answer changes what the AI can reason about.
Context completeness
What the AI Knows About You
Select what you would actually let an AI Doctor see. The label reflects what kind of reasoning becomes possible at that level of completeness.
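One plausible encoding of the widget's logic follows. The source names and tier boundaries are assumptions for illustration, not the widget's published rubric.

```python
# Illustrative only: the source names and tier boundaries are assumptions,
# not the widget's published rubric.
def reasoning_tier(selected: set[str]) -> str:
    """Map the set of shared data sources to the kind of reasoning it enables."""
    if "ehr" not in selected:
        return "General advice only: no clinical history to reason over."
    if {"labs", "medications"} <= selected:
        if selected & {"wearables", "cgm", "symptoms"}:
            return "Longitudinal reasoning: trends, causes, medication effects over time."
        return "Episodic clinical reasoning: history and labs, but no daily signal."
    return "Partial picture: history without current objective data."

print(reasoning_tier({"ehr", "labs", "medications", "cgm"}))
```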
Patient access rights, the legal scaffolding that lets a patient pull all of this together, exist now because of Cures. The technical scaffolding exists because of FHIR and HealthKit and Health Connect. The cultural scaffolding, meaning granular, durable, revocable, inspectable consent that a patient actually understands, does not yet exist as a default. The two best-funded "carry-it-with-you" personal health records are Apple Health Records and CommonHealth. The personal health record graveyard (Google Health from 2008 to 2011, Microsoft HealthVault from 2007 to 2019) is the cautionary tale: standalone PHRs failed because they had nothing to import. With FHIR and Cures, that no longer holds.
What is still walled off is also worth naming honestly. Specialty clinical notes in psychiatry and oncology are commonly excluded from patient-facing APIs under "psychotherapy notes" and reproductive-health carveouts. Imaging pixels live in DICOMweb, separate from FHIR. Genomics varies wildly between consumer (23andMe-style) and clinical (Invitae, Color, Helix). Social determinants and lifestyle data are shallow in USCDI. Specialty device telemetry from pacemakers, CPAPs, and smart inhalers is mostly OEM-walled. An AI Doctor that pretends to see all of this is bluffing.
Memory and reasoning
The frontier models in 2026 pass medical licensing exams on paper. GPT-5 reaches 95.84% on MedQA. Claude 3.7 Sonnet reaches 92.3%. Gemini 2.5 Pro reaches 94.6%. On MedXpertQA-MM, the multimodal expert-board reasoning benchmark spanning seventeen specialties, GPT-5 has surpassed trained medical professionals by large margins, a published ICML 2025 result. On HealthBench Hard, where 5,000 multi-turn conversations are scored against rubrics written by 262 physicians, totaling 48,562 criteria, OpenAI's gpt-5-thinking improved the score from 31.6% on o3 to 46.2%. Most importantly, the urgent-situation hallucination rate dropped more than fifty times between GPT-4o and gpt-5-thinking. Not a typo. Fifty.
That is the optimistic frame. The honest frame is that "passes the boards" and "can be a doctor" are not the same regime, and that confusing them is the error the field has been making most consistently.
Two regimes
Why “passes the boards” doesn’t mean “can be a doctor”
What the test asks
USMLE-style vignette
A 47-year-old woman presents with three weeks of fatigue. Vital signs are stable. Labs show TSH 8.2, free T4 0.7. Which of the following is the most appropriate next step?
- A. Order a thyroid ultrasound
- B. Begin levothyroxine 50 mcg daily
- C. Repeat TSH in 6 weeks
- D. Refer to endocrinology
Three paragraphs. One right answer. No prior context. No uncertainty.
What reality looks like
The same patient, eight years
- 2018 · First TSH elevated. Started levothyroxine 50 mcg.
- 2020 · PCOS diagnosis. Metformin added.
- 2022 · Sleep apnea found. CPAP started.
- 2024 · Pregnancy. Levothyroxine titrated three times.
- 2025 · Postpartum fatigue. Ferritin 12. Wearable shows HRV declining.
- 2026 · She asks the AI: “Why am I tired again?”
Eight years of context. No single right answer. The diagnosis is in the timeline, not in any single visit.
USMLE-style vignettes test whether a model can pick the right answer from four options after reading three paragraphs about a hypothetical patient. Real clinical reasoning is incomplete data, ambiguous timelines, multiple comorbidities, and probabilistic action over years. A 2025 multi-model evaluation found that 64% to 72% of residual hallucinations from frontier medical models are reasoning failures, not knowledge gaps. The model can recite the textbook. What it cannot do reliably is reason about cause and time: whether the symptom on Monday means something different because of the medication change in March, or because of the lab trend that started last summer.
The same study found something that is worth pausing on. Domain-specific medical fine-tunes hallucinated more than frontier general-purpose models. The general models produced 76.6% hallucination-free responses. The medically-fine-tuned models produced 51.3%. That inverts the conventional wisdom about specializing models for healthcare. The implication is that current medical fine-tunes overfit on knowledge and underfit on reasoning, and that GPT-5 or Claude with good retrieval over the patient's record is a better doctor than a smaller model that was trained on PubMed.
The deeper failure mode is calibration. Autoregressive language models optimize for token likelihood, not epistemic accuracy. Asked a question they have a confident answer to, they give it. Asked a question with a confident wrong answer in their training data, they give that. The model gives a confident wrong answer instead of saying "I don't know." In one survey, 91.8% of clinicians had encountered a medical hallucination, and 84.7% considered them capable of patient harm. Better calibration is the single biggest unlock for clinical safety, and it is the thing the benchmarks are slowest to capture.
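A sketch of what better calibration means operationally: selective answering, where the system abstains and escalates below a confidence bar, with a stricter bar in urgent contexts. The self-consistency proxy and the thresholds are assumptions; production systems might use trained verifiers instead.

```python
# A sketch of selective answering. Self-consistency as the confidence proxy
# and the thresholds are assumptions, not any vendor's published method.
from collections import Counter

def self_consistency(samples: list[str]) -> tuple[str, float]:
    """Agreement across independently sampled answers as a cheap confidence proxy."""
    top, count = Counter(samples).most_common(1)[0]
    return top, count / len(samples)

def answer_or_escalate(samples: list[str], urgent: bool) -> str:
    answer, confidence = self_consistency(samples)
    bar = 0.99 if urgent else 0.90  # urgent contexts demand a stricter bar
    if confidence < bar:
        return "I don't know. Escalating to a clinician."
    return answer
```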
The pattern that does work, the one that turns a frontier model into something resembling a clinician, is agentic. The model uses tools: it queries the patient's own longitudinal record, calls a guideline lookup, runs a drug-interaction checker, orders a lab, and escalates to a human at a defined threshold. It runs verifier loops where a second model checks the plan against guidelines and a third verifies medication safety. Hippocratic AI's Polaris architecture orchestrates more than twenty specialty LLMs with hard escalation rules. Doctronic published a prospective benchmark where their multi-agent system was evaluated against board-certified clinicians on 500 consecutive urgent-care telehealth visits and showed top-1 diagnostic concordance of 81%, treatment-plan concordance of 99.2%, and zero hallucinations across the sample. The model in the middle is not the entire system. The system is what makes it safe.
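The shape of that pattern, as a structural sketch. Every name here is a hypothetical stand-in, not Polaris or Doctronic's published architecture; it is just the loop the paragraph describes: patient-record retrieval, a drafted plan, independent verifiers, bounded revision, and a hard escalation rule.

```python
# A structural sketch of the agentic pattern. All names are hypothetical
# stand-ins, not any vendor's published architecture.
def agentic_consult(question, record, model, verifiers, escalate, max_revisions=2):
    context = record.retrieve(question)        # the patient's own longitudinal data
    plan = model.propose(question, context)    # frontier model drafts a plan

    # Verifier loop: independent checks (guidelines, drug interactions) audit
    # the plan before anything is acted on; revision is bounded, not open-ended.
    issues = [i for verify in verifiers for i in verify(plan)]
    for _ in range(max_revisions):
        if not issues:
            break
        plan = model.revise(plan, issues)
        issues = [i for verify in verifiers for i in verify(plan)]

    # Hard escalation rule: anything unresolved goes to a human, by construction.
    if issues or plan.confidence < 0.9:
        return escalate(plan, issues)
    return plan
```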
The five levels of autonomy
When most people ask "is this an AI Doctor?" they are asking the wrong question. The right question is which level. The published taxonomies converge on a similar idea: that autonomy in clinical AI is a spectrum, and that products sit at very different points on it. The AMA's CPT Appendix S lays out one such taxonomy. So does the Lancet Digital Health "Approaching Autonomy" spectrum, the ADAM framework, and the regulatory precedent set by IDx-DR.
Synthesizing those into a single legible scale gives the framework that follows. The widget is the centerpiece of this essay. Drag the slider. Watch what shifts.
Levels of autonomy
The five levels of an AI Doctor
Drag the slider to see what shifts at each level — who decides, who's liable, what the regulator demands, and what currently exists.
Augmentative (the Level 2 card, shown as an example)
- Who decides · A clinician acts on AI-suggested analysis.
- Who is liable · The clinician, under FSMB 2024 guidance. The AI is a clinical decision support tool.
- Regulatory pathway · Cures Act §3060 carve-out if criteria met; 510(k) otherwise.
- Current examples · OpenEvidence · Glass Health · ambient scribes · K Health · Hippocratic AI
- In practice · Glass Health drafts a differential diagnosis. A physician reviews and acts on it.
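For readers who think in types, the card schema behind the slider can be encoded as a small data structure, populated here with the Level 2 card above. The field names are mine; the content is the card's own.

```python
# The card schema behind the slider, populated with the Level 2 card above.
# Field names are illustrative; the content is taken from the card itself.
from dataclasses import dataclass

@dataclass
class AutonomyLevel:
    name: str
    who_decides: str
    who_is_liable: str
    regulatory_pathway: str
    current_examples: list[str]

LEVEL_2 = AutonomyLevel(
    name="Augmentative",
    who_decides="A clinician acts on AI-suggested analysis.",
    who_is_liable="Clinician, under FSMB 2024 guidance; the AI is a CDS tool.",
    regulatory_pathway="Cures Act §3060 carve-out if criteria met; 510(k) otherwise.",
    current_examples=["OpenEvidence", "Glass Health", "ambient scribes",
                      "K Health", "Hippocratic AI"],
)
```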
A few observations are worth making explicit, because they are the points the field tends to gloss over.
LumineticsCore (the system formerly known as IDx-DR) is the only Level 4 autonomous AI in medicine that has an FDA clearance and has held that clearance long enough to accumulate real-world outcomes data. It is a single-task system. It reads a single retinal image and outputs a single clinical recommendation. It has held that status since 2018. Eight years of clearance, one disease. That is the right calibration for "are we close to a general-purpose autonomous AI Doctor": there is one example, it is narrow, it is old, and it has not been replicated for any other condition.
In December 2025, Utah quietly authorized Doctronic to renew prescriptions autonomously. This is the regulatory event most people in the AI-and-healthcare conversation have not noticed. It is the first time a US state has agreed that an AI can independently make a clinical decision (a script renewal) that previously required a licensed human. It is one task in one state. It is also a signal that the regulatory ground is moving, even if it is moving quietly.
Level 5, an AI Doctor that initiates management as the default action and where a clinician must take initiative to contest it, has no analogous regulatory precedent. There is no FDA category for it. No state has authorized it. The Federation of State Medical Boards advised in April 2024 that clinicians, not AI vendors, hold liability when AI tools influence care. That guidance assumes a human in the loop. Level 5 dissolves the human in the loop. The doctrine has no answer for what happens then.
Most current products live at Levels 1 and 2. Anyone claiming Level 5 is selling.
What it would take to get to Level 5
If the gap between Level 2 and Level 5 is mostly a list of unsolved problems, it is worth being concrete about what is on that list. The transitions between levels are not a single threshold. They are stacks of gating criteria across four categories (technical, regulatory, evidentiary, cultural) that have to be cleared in roughly that order.
What gates each level
Climbing the autonomy ladder
Click a transition between adjacent levels to see what actually gates the jump — by category, with an honest read on what's solved and what isn't.
- Technical · Partially solved. Frontier models pass MedQA at 95%+; calibration on the long tail and longitudinal reasoning still leak, with about 70% of residual hallucinations being causal or temporal failures.
- Regulatory · Partially solved. Clears the Cures Act §3060 CDS carve-out for many tasks; anything that diagnoses or prescribes still needs FDA 510(k) or De Novo per indication.
- Evidentiary · Partially solved. A few prospective benchmarks against board-certified clinicians exist (Doctronic urgent care, Hippocratic RWE-LLM); most products still ship on retrospective accuracy.
- Cultural · Largely solved. Patients accept AI suggestions when a clinician signs off; the pattern is familiar, closer to a second opinion than a replacement.
Two of these deserve a closer look, because they tend to be the ones the technical community underestimates and the regulatory community knows by heart.
The first is causal and temporal reasoning at expert level. Current frontier models can pass a board exam and still leak two-thirds of their hallucinations through reasoning failures rather than knowledge failures. Chronic care is, by definition, causal and temporal. A symptom on Tuesday is one data point. A symptom on Tuesday that started after a medication change six weeks ago, that follows a lab trend that started a year before that, in a patient with two comorbidities and a family history of one of them, is a chain of inference, not a fact lookup. The benchmarks that exist today do not measure this well. The benchmarks that would measure it well do not exist yet. Until they do, claims about Level 5 capability are claims about something that has not been measured.
The second is the liability resolution. Today, the legal system implicitly assumes a licensed human is the responsible agent in a medical decision. The "physician-in-the-loop" pattern is the de facto liability shield: if a physician meaningfully reviewed the AI output and signed off, traditional malpractice doctrine applies and the AI is treated like a textbook or a piece of imaging equipment. Level 5 has no physician in the loop. The shield has nothing to protect. Three paths exist in the academic literature. Strict product liability for the vendor. Statutory creation of an "AI medical practitioner" entity. A captured-cost insurance pool where every autonomous interaction pays a small per-decision premium into a no-fault compensation fund. None of them is a doctrine. They are sketches.
State Corporate Practice of Medicine doctrines in California, Texas, and New York actively forbid non-physician entities from practicing medicine. A general-purpose autonomous AI Doctor is, by definition, a non-physician entity. Either preemptive federal action or fifty separate state-level legislative changes would be required before such a system could legally operate. The WHO's January 2024 guidance on large multi-modal models in health states a default principle that "humans should remain in control of medical decisions." Public consent for anything else has not been measured because nothing has yet existed to vote on.
The honest summary is that Level 5 is not gated by model capability alone. It is gated by an interlocking set of regulatory, legal, and cultural agreements that do not currently exist. Any company saying they will ship Level 5 by a specific date is making a prediction not about their engineering, but about the speed of doctrine.
Liability — who pays when the AI is wrong
There has not yet been a major US malpractice case that centers on AI as cause. The legal architecture is being built before the first big case lands. That is rare in malpractice doctrine, where the law usually moves only after harm.
Liability across the autonomy spectrum
The physician shield, at four points
- L2 · Physician decides. Liable: the physician; the AI is a clinical decision support tool. Shield holds.
- L3 · AI suggests, physician approves. Liable: shared; the vendor for the model, the physician for the acceptance. Shield thinning.
- L4 · AI acts, physician may override. Liable: the vendor primarily; physician oversight is statutory. Shield dissolving.
- L5 · AI acts, no physician in loop. Liable: unsettled; malpractice doctrine has no party to apply against. Shield gone.
The April 2024 Federation of State Medical Boards guidance is the cleanest current statement of position. It tells state medical boards to hold clinicians, not AI vendors, liable for AI-influenced errors, on the reasoning that the AI is a tool and the doctor is the practitioner. That position holds at Levels 0 through 2 with very little friction. It thins at Level 3, where the AI generates a plan and the physician approves it. It dissolves at Level 4, where the AI acts and the physician's role is to override. At Level 5, where there is no physician in the loop, the doctrine has no party to apply against.
The "physician-in-the-loop" shield has a contrarian read worth airing, because it is the kind of thing that becomes obvious in hindsight only. If physicians are protected when they review AI output but not when an AI acts alone, the system pushes toward rubber-stamp review. Physicians click "approve" to retain the shield, without doing the work that would justify the protection. The careful middle path looks safer than full autonomy, but it can produce its own failure mode: review theater that lets bad model outputs through under the cover of human attestation. Forward Health failed in part by removing humans entirely. The middle path may be how the bad outcomes show up next.
The insurance landscape is moving slowly but predictably. The Doctors Company, one of the largest US medical malpractice carriers, has no AI exclusion as of late 2025 and has stated it would defend a physician whose AI played a role. European insurers have begun adding sublimits and higher deductibles when tools are unvalidated or used off-label, but US-side AI-specific exclusions remain uncommon. State legislatures are starting to act ahead of the courts: 2025 saw a wave of state laws expanding liability and insurer obligations for AI-influenced clinical decisions, and OCR's January 2026 NPRM on the HIPAA Security Rule was the first major update in two decades.
The honest take is that this is an unsolved doctrinal question, and that it gates progress more than the technical work does. A system that could safely run at Level 5 tomorrow would still not be allowed to, because the answer to "who pays when it is wrong" is not yet doctrine in any state.
Why most "AI Doctors" today aren't (and which are closer)
The respectful but honest version of where the field actually sits.
The landscape, honest
Who clears what
Filter and sort to see how each product currently maps to the eight criteria. Most clear one or two. The check-marks tell a different story than the marketing.
(The eight criteria scored in the interactive matrix: longitudinal record · patient RAG · acts · chronic care · compliant · responsibility · eval published · boundaries.)

| Product | Level | Notes |
|---|---|---|
| LumineticsCore (IDx-DR) | L4 (single condition) | First FDA-cleared autonomous AI in medicine (2018); diabetic retinopathy only |
| Doctronic | L3 (L4 in one task) | Multi-agent autonomous; Utah authorized Rx renewal Dec 2025 |
| Glass Health | L2 | Clinical reasoning for clinicians (Deep Reasoning, ambient scribe) |
| OpenEvidence | L2 | Retrieval-grounded CDS for physicians ($12B as of Jan 2026) |
| K Health | L2 | Symptom triage + virtual primary care; AI does triage, humans decide |
| Hippocratic AI | L2 | B2B patient-facing voice; constellation of >20 specialty LLMs |
| Superpower | L1–2 | $199/yr biomarker membership + concierge clinicians + AI advisor |
| Function / Lifeforce / Fountain Life | L1 | Longevity platforms with an AI dashboard layer |
| ChatGPT / Claude / Gemini (direct) | L0–1 | Consumer chat used for health questions |
| Forward Health | — | $650M raised; CarePods at $150/mo; closed November 2024 |
| Babylon Health | — | Once a $4.2B IPO valuation; bankrupt August 2023; the 'AI' was reportedly a decision tree |
Highlighted rows (LumineticsCore and Doctronic) are the closest current credible attempts at the term AI Doctor. Greyed rows (Forward Health and Babylon) are public failure cases worth learning from.
A few observations about the matrix that the prose can carry better than the table.
ChatGPT, Claude, and Gemini used directly for health questions are the regime most people are referring to when they talk casually about "AI doctors." Roughly one in six US adults uses them monthly for health advice, with twenty-four percent of under-30s doing so. Studies show task-dependent accuracy from twenty to ninety-five percent. The consumer tiers are not HIPAA-covered, do not sign business associate agreements, do not retain longitudinal patient records in any clinical sense, do not prescribe, do not assume liability, and do not implement consent architecture. They are search engines in conversation form. Useful as research aides. Not AI Doctors.
Superpower, Function, Lifeforce, Fountain Life, and the other longevity platforms are concierge medicine with an AI dashboard layer. The clinicians are real and meaningful. The biomarker baseline is real and useful. The AI is a coach over a service. Humans make every meaningful clinical decision. The label at the top of the autonomy ladder is somewhere in Level 1 to Level 2. The product is good at what it does. It is also not what the term AI Doctor implies.
OpenEvidence is the most strategically interesting Level 2 product in the field. It is retrieval-grounded clinical decision support over peer-reviewed literature with explicit citations. As of January 2026 it is registered with about forty percent of US physicians (roughly 430,000) and serves about 8.5 million consultations per month. It closed a Series D at a twelve-billion-dollar valuation. Notably, it is monetized through advertising to physicians and is free at the point of use. By design, it never sees a patient. It does not act. It is brilliant within its scope. It is also not an AI Doctor by any of the eight criteria, because it was never trying to be.
Doctronic is, currently, the closest credible attempt at the term in the United States. The published prospective benchmark is the kind of evidence that a system seriously aiming at the term has to produce: eighty-one percent top-1 diagnostic concordance, ninety-nine point two percent treatment-plan concordance, zero hallucinations in the sample, multi-agent autonomy, real prescribing authorization in one state. Their scope is urgent care. They have not demonstrated chronic complex management over years. They are operating in fifty states clinically and have one state's autonomous-prescribing blessing for one task. The label is something like Level 3 in scope, Level 4 for one task in one state. Closer than anyone else. Not yet what the term promises.
Hippocratic AI, K Health, and Glass Health are all serious Level 2 products. Hippocratic's safety architecture (twenty-plus specialty LLMs with hard escalation rules, validation against 6,234 clinicians via the RWE-LLM framework) is the most carefully thought-through patient-facing voice system in the field. K Health has run real virtual primary care with AI-first triage at scale. Glass Health has reasoned carefully about the clinician decision-support interaction. None of them owns a longitudinal patient relationship in the way the term AI Doctor implies. They are excellent at Level 2.
LumineticsCore is the existence proof of Level 4 in any field of medicine. It does not feel like an "AI Doctor" because it does one thing for one condition with one input modality. That is the regime in which Level 4 has actually shipped. It is also the regime that, every year for eight years, has not been replicated for any other condition.
Forward Health and Babylon are the two failure cases that set the credibility bar for the category. Forward raised six hundred and fifty million dollars and shipped CarePods at one hundred and fifty dollars a month. It closed in November 2024. The lessons were not subtle. Consumers expect insurance to pay. Healthcare hardware is brutal. The UX of "remove humans entirely" did not survive contact with users getting stuck inside the kiosks. Babylon Health was once a four-point-two-billion-dollar IPO. It went bankrupt in August 2023 with a roughly nine-hundred-million-dollar deficit, after a former employee said publicly that the "AI" was closer to an Excel decision tree than to a model. The Lancet found no evidence the system outperformed clinicians and explicitly raised the possibility it performed worse. Public failures at this scale set the bar for what credibility looks like in the category. Anyone calling themselves an AI Doctor now has to clear a higher evidentiary bar than they would have had to clear in 2019, because the public has been shown what a four-billion-dollar bluff looks like.
The synthesis the matrix points to is straightforward. To credibly hold the title AI Doctor, a system has to clear roughly eight criteria: own the longitudinal record, reason over the patient's own data, act autonomously within a defined safety basin, manage chronic complex disease rather than acute episodes, sit inside a HIPAA and CPOM and FDA-compliant entity, carry clinical responsibility transparently, publish prospective evaluation against board-certified clinicians on real-world cases, and have an explicit set of things it will not do. Most products today clear one or two of those. A few clear three or four. None clears all eight.
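For completeness, the eight criteria as a checklist you can score any product against. The scoring helper is illustrative; the essay defines the criteria, not a scorer.

```python
# The eight criteria from the synthesis above, as a scoreable checklist.
# The scoring helper is illustrative; the essay defines criteria, not a scorer.
CRITERIA = (
    "owns the longitudinal record",
    "reasons over the patient's own data",
    "acts autonomously within a defined safety basin",
    "manages chronic complex disease, not just acute episodes",
    "sits inside a HIPAA / CPOM / FDA-compliant entity",
    "carries clinical responsibility transparently",
    "publishes prospective evaluation against board-certified clinicians",
    "has an explicit set of things it will not do",
)

def locate(product: str, clears: set[str]) -> str:
    """Place a product on the map by counting which criteria it clears."""
    n = sum(1 for criterion in CRITERIA if criterion in clears)
    verdict = "credibly an AI Doctor" if n == 8 else "not yet an AI Doctor"
    return f"{product}: clears {n}/8 criteria ({verdict})"
```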
Where CONY fits
CONY is at Level 2 today. We are building toward Level 5.
Today, CONY ingests every health stream a patient can legally surface (medical records, labs, wearables, continuous glucose, medications, the symptoms you journal) and reasons over the lot as a single longitudinal record. It produces personalized health plans that adapt as your data changes, surfaces patterns between appointments, and prepares clinical-grade summaries for the doctors still doing the meaningful clinical work. We do not diagnose. We do everything else.
Most medical AI in the field today is trained on population data and reasons about you as if you were the average patient. CONY is built on the opposite thesis: chronic care is N-of-1, the average patient is a statistical fiction, and the right model for you is the one that knows your trajectory.
That is where we are. Where we are going is the AI Doctor the term has always sounded like: personal, longitudinal, autonomous, capable of a continuity of care that no episodic system can ever match. And we are building it.
If you want to see what Level 2 looks like in practice, try CONY between your appointments. The category will mean something within the next decade. We are building to be the people who define what it means.