Jan 1, 2026

The Fallacy of Data-First AI: Why LLMs Are Failing the Operating Room

We don’t need more data; we need persistent clinical reasoning.


I recently sat through a demo where a team promised to "revolutionize" surgical workflows by using Large Language Models to summarize patient charts. They showed me how quickly it could digest 500 pages of history. It was impressive as a parlor trick, but as a surgical tool, it was useless.

In the OR, we don’t have a data problem; we have a logic problem.

The current obsession with "data-first" AI assumes that if we just feed enough information into a model, the right clinical decisions will magically emerge. But data is just an ingredient. It isn't the recipe, and it certainly isn't the chef. In surgery, we don't just need a summary of what happened; we need a persistent understanding of what should happen next based on a specific clinical logic that spans from the first consultation to six months post-op.

Why Generic AI Creates Friction

The friction we feel with current Generative AI in healthcare stems from its probabilistic nature. LLMs are designed to predict the next likely word in a sentence. Surgery, however, is deterministic. When I’m performing a complex joint reconstruction, I’m not looking for the "most likely" next step drawn from billions of parameters trained on internet text. I am executing a specific, logical sequence based on the patient’s unique anatomy and the physiological goals of the procedure.

Generic AI lacks persistence. It treats every prompt as a new conversation. In a clinical setting, this is a fatal flaw. A surgeon’s reasoning is a continuous thread. If an AI doesn't understand the specific "state" of the patient—where they are in their recovery arc, their specific comorbidities, and the precise goals of their surgical plan—it’s just a high-speed distraction. It’s noise when we need signal.
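
To make "state" concrete, here is a minimal sketch of what a persistent patient record might carry between interactions. The field names (weeks_post_op, surgical_goals, and so on) are illustrative assumptions for the example, not a real EHR or FHIR schema.

```python
# Minimal sketch of a persistent patient "state" record. Field names are
# illustrative assumptions only, not a real EHR or FHIR schema.
from dataclasses import dataclass, field


@dataclass
class PatientState:
    patient_id: str
    procedure: str                      # e.g. "total knee arthroplasty"
    weeks_post_op: int                  # where the patient sits in the recovery arc
    comorbidities: list[str] = field(default_factory=list)
    surgical_goals: dict[str, float] = field(default_factory=dict)  # e.g. {"knee_flexion_deg": 120}
    events: list[str] = field(default_factory=list)                 # complications, interventions, notes

    def record_event(self, event: str) -> None:
        """Extend the continuous clinical thread instead of starting a new conversation."""
        self.events.append(event)
```

The point is not the data structure itself but the continuity: every new reading is interpreted against this thread rather than against a blank prompt.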

The Case for Cognitive Middleware

We need to move away from the idea of AI as a standalone "brain" and start thinking of it as Cognitive Middleware.

Think of this as an architectural layer that sits between the raw data and the clinician. Unlike a standard LLM, cognitive middleware is grounded in hard clinical logic. It doesn't just "read" data; it reasons through it.

  • State Awareness: It maintains a persistent record of the patient’s journey, understanding that a lab result today means something different because of a complication that occurred three weeks ago.
  • Clinical Constraints: It operates within the guardrails of proven surgical protocols. It doesn't hallucinate because its "logic" is tethered to medical reality, not just word associations.
  • Active Reasoning: Instead of waiting for a prompt, it identifies gaps in the care logic—notifying the team when a patient's recovery trajectory deviates from the expected path.
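
As a rough illustration of how these three properties could fit together, here is a minimal middleware sketch in Python. The milestone values, tolerance numbers, and function names are invented for the example; they stand in for real surgical protocols and are not clinical guidance.

```python
# Hypothetical sketch of the middleware's active-reasoning step: compare a
# measurement against the expected recovery arc, with the interpretation
# shifted by earlier events in the patient's record. All thresholds are
# illustrative assumptions, not clinical guidance.

# Expected knee flexion (degrees) at protocol milestones, by week post-op.
EXPECTED_FLEXION_BY_WEEK = {2: 90, 6: 110, 12: 120}


def expected_flexion(weeks_post_op: int) -> float:
    """Look up the most recent milestone at or before this week."""
    reached = [w for w in EXPECTED_FLEXION_BY_WEEK if w <= weeks_post_op]
    return float(EXPECTED_FLEXION_BY_WEEK[max(reached)]) if reached else 0.0


def trajectory_alerts(weeks_post_op: int, measured_flexion: float,
                      prior_events: list[str]) -> list[str]:
    """Flag a deviation from the expected path, read in light of prior events."""
    target = expected_flexion(weeks_post_op)

    # State awareness: the same reading is judged differently when a
    # complication sits earlier in the clinical thread.
    tolerance = 20.0 if any("infection" in e.lower() for e in prior_events) else 10.0

    if measured_flexion < target - tolerance:
        return [f"Flexion {measured_flexion:.0f} deg at week {weeks_post_op} is "
                f"below the expected ~{target:.0f} deg trajectory."]
    return []


# Example: week 6, 85 degrees of flexion, prior superficial infection noted.
print(trajectory_alerts(6, 85, ["superficial infection at week 3"]))
```

Notice that nothing here waits for a prompt: the check runs against the persistent state whenever new data arrives, and it stays inside explicit guardrails rather than word associations.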

Data Isn’t Infrastructure

We have to stop treating data as if it were the infrastructure for recovery. You can have a mountain of data and still have a patient who can't walk three months after surgery. Infrastructure is the logic that connects the data points to a functional outcome.

When we built our clinical models at Arthur Health, the goal wasn't to see how much data we could collect. It was to see how precisely we could map the "Clinical Logic" of a specific procedure. We focused on the architecture of the decision-making process.
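
One way to picture what "mapping the clinical logic" of a procedure might look like is an explicit decision map: ordered phases, each gated by conditions that must hold before the plan advances. The sketch below is a simplified illustration with invented phase names and criteria, not our actual models or any published protocol.

```python
# Hypothetical sketch of "clinical logic" as an explicit decision map for one
# procedure: ordered phases, each gated by criteria that must hold before the
# plan advances. Names and criteria are illustrative, not a real protocol.

CLINICAL_LOGIC = [
    ("pre-op",         ["imaging reviewed", "anesthesia clearance"]),
    ("intra-op",       ["implant sizing confirmed", "alignment verified"]),
    ("post-op week 2", ["wound check done", "flexion >= 90 deg"]),
    ("post-op week 6", ["flexion >= 110 deg", "independent ambulation"]),
]


def next_open_phase(completed: set[str]) -> str | None:
    """Return the first phase whose gating criteria are not yet all met."""
    for phase, criteria in CLINICAL_LOGIC:
        if not all(c in completed for c in criteria):
            return phase
    return None  # every gate satisfied; the mapped logic is complete


# Example: imaging is done but anesthesia clearance is not, so the logic
# still points at pre-op even though a later criterion happens to be met.
print(next_open_phase({"imaging reviewed", "flexion >= 90 deg"}))  # -> "pre-op"
```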

The future of AI in surgery isn't about bigger models or more parameters. It’s about building systems that actually understand the work we do. We need tools that reason alongside us, maintaining the clinical thread when the environment gets loud and the data gets messy.