On Becoming a Medical Device

TORTUS is now a Class IIa medical device for ambient voice technology, the first UKCA mark of its kind. Over a series of blogs and webinars we are going to open up the journey and the technology behind that, and you can follow the whole thing here.

It starts, as these things should, with the why. Why is clinical risk with ambient voice worth the trouble of managing at all, and why certify as a medical device in the first place?

A lesson from Vienna

In 1847, in the maternity wards of the Vienna General Hospital, a Hungarian obstetrician called Ignaz Semmelweis worked out that the doctors' ward was killing mothers at several times the rate of the midwives' ward next door. The doctors, unlike the midwives, came to the deliveries straight from the autopsy room. Semmelweis made them wash their hands in chlorinated lime, and the deaths collapsed almost overnight. For this he was mocked, ignored, and eventually committed to an asylum, where he died. It took decades, and Pasteur and Lister after him, before the profession accepted what he had already proven: that the invisible thing, carried on the hands of the person trying to help, is often the thing that does the harm.

It is worth sitting with that, because we are living through the same kind of moment with artificial intelligence in medicine. Something invisible and enormously powerful is being carried into the consultation by the people trying to help, and the question of what standard we hold it to is, for now, mostly unanswered. From the day TORTUS started four years ago, the single largest expense has not been engineering or sales. It has been compliance. Not because anyone forced it, but because in medicine the cost of getting it wrong is not a refund or a churned account. It is a life. You cannot rush the thing that touches that.

A medical note is not a meeting summary

To see why this needs regulating at all, it helps to be clear about what is actually being generated.

When an AI takes notes in a business meeting, those notes are a fair-enough record to glance at later. Nobody reads them line by line in a courtroom. A medical note is a different object entirely. It is a medico-legal defence. It is the account a clinician writes, for the next clinician and ultimately for a coroner, of what was found, what was decided, and why. An omission in that note matters. A hallucinated finding matters. A suggested diagnosis, sitting quietly in the record, changes how the next person who reads it thinks.

And these tools genuinely change the record, mostly for the better. The anecdotal data on human performance is striking: clinicians omit something like 10% of the important clinical facts from their own notes. The incidental blood pressure. The "oh, by the way, this has been hurting too", offered with a hand already on the door handle: the doorknob complaint, as it used to be called. It is usually right there in the consultation. Tired, biased, human brains just lose it between the ear and the keyboard. An attentive extra pair of ears catches it, and that is a real improvement to care. But anything powerful enough to improve the record that much is, by the same token, powerful enough to harm it. There is no upside here without a downside to own.

A simpler way to think about it

Set the regulations aside for a moment. There is a much simpler, clinician's way of thinking about this. If the patient can come to harm when the system goes wrong, it is a medical device. That is the whole test.

A tool like this obviously meets it. Its entire purpose is to influence care: to give clinicians time, and to capture what was actually said. By exactly the same logic, when it errs it can derail the clinician in front of the patient, and everyone downstream who trusts the note. Something built to assist the diagnostic pathway cannot also claim to have no bearing on the diagnosis.

That word, assist, is the one that matters. The regulated device is not the microphone. It is the whole of TORTUS, the tool that transcribes the consultation and generates the summary, note, letter and coding a clinician relies on. It is Class IIa, rather than the Class I of a simple scribe, for one reason: it assists the diagnostic pathway. A tool that only transcribes is Class I. The jump from one to the other is the clinical decision support.

And the limits matter just as much, because they are written into the certificate. TORTUS does not diagnose, does not recommend treatment, and does not replace clinical judgement. Every output requires review and approval by the responsible clinician before it reaches the record. The human is not a rubber stamp at the end. The human is the point.

Class I proves nothing. Class IIa shares the liability

Which brings us to the gulf nobody talks about: the distance between a Class I device and a Class IIa one. It is enormous, and almost nobody outside the regulatory world understands it.

A Class I device is, in itself, no assurance at all. No independent audit. The hospital never sees the quality management system, because none is required. It is entirely possible to register a vacuum cleaner as a Class I medical device. Some tools in the market carry exactly that level of assurance while sounding far grander.

Class IIa is a different universe. Every claim has to be written down, a real quality management system maintained, and all of it submitted to an independent approved body to audit. For TORTUS that meant roughly £1.5 million, eighteen months, and around 5,000 pages of evidence. One of our product managers put it better than any figure can: "Class IIa was more actual work than my master's thesis." Every claim, whether the accuracy of the transcript, of the codes, or of the decision support, has to have something underneath it that an auditor can pull on. (For the record: the approved body certifies, the MHRA registers. Different jobs, and only one of them audits your homework.)

The decisive difference is shared liability. With a Class IIa device, the approved body carries the liability alongside the manufacturer. If the device is found wanting, the assessment that let it onto the market is on them too. They have put their name on the line beside yours. That is what allows a clinician to be told, hand on heart: this is a safe system, these are the limits of its safety, this is its intended use.

Which takes us back to Vienna. Consider a sterile bag of saline and some salt water mixed up in a kitchen. Both are, near enough, 0.9% sodium chloride, isotonic saline, the same fluid that hangs on a drip stand in nearly every ward in the country and goes into patients all day long. On paper they are the same solution. Yet almost anyone would recoil at the idea of cooking it up at home, or buying it in a non-standard pack, and running it into a vein. We recoil because we understand, deep in our bones, the microbial risk: that it is not the salt and water that harm the patient, but the invisible contamination of an unsterile process. The standard, the sterility, the chain of assurance: that is the entire point. It is Semmelweis again. And we are going to have to accept, faster than the profession came to believe him, that an AI sitting this deep in a clinician's thinking deserves the same.

What comes next

The clearest sign of where this is heading is the road every serious company in the field is walking: deeper into clinical decision support, and one day, carefully, more of the workflow around the decision itself. Software that reaches further into the diagnostic pathway is, unarguably, higher-risk and more rigorously regulated. Getting it wrong, even rarely, is intolerable.

So you cannot have one without the other. You cannot put a natural-language AI into every conversation, in every room, and have it be clinically unqualified to be there. It is the difference between a layperson sitting in on the consultation and a junior colleague taking part in it, someone qualified, and privileged, to be in the room at all. That privilege is what certification confers. It is the foundation for everything that comes after. Which is why it would be short-sighted to choose a scribe today on the assumption that a scribe is all this will ever be.

For TORTUS, Class IIa is done. Not a box ticked, but the ground that everything else is built on.

The slow road was the point. So was the name. A tortoise goes slowly, takes its time, and when it arrives, it stays. As fast as we can, as slow as we need. It took years to get here. But in the moment that matters in medicine, the moment a clinician leans on this in front of a frightened patient, slow is exactly the quality you want underneath you.

Dr Dom Pimenta

You can read more about TORTUS here. Next in the series: how we actually did it, the trials, the evidence, and eighteen months inside a certification submission. Follow the whole series at Years in the making.