We’ve come to the end of our first proper sprint this week using Scrum and by all accounts it went reasonably well. Productivity and work done were certainly way up. The other great thing about Scrum is that it acted as a forcing function across everything that we do: a ‘Definition of Done’, or in other words, an observable metric we were trying to hit.
Start-ups are ‘exercises in de-risking uncertainty’ (Paul Graham). Building a computer-controlling agent AI for healthcare is a complex task, riddled with uncertainty in multiple dimensions. Probably my biggest mistake to date was not having a systematic way to chunk up that uncertainty and ‘de-risk’ it methodically. Meanwhile, we have quietly been building quite a busy research team, with some early work now accepted for presentation at autumn conferences and several larger pieces of research progressing towards publication.
We have four key values at TORTUS: Patient First, Be Kind, Kick Ass, and Stay Curious. That last one is about applying the scientific method to everything that we do, from marketing to machine learning, and these last two weeks I’ve realised how incredibly important that is. So this blog is dedicated to my favourite word in the English language and the embodiment of the TORTUS approach to everything in health-tech: Science.
“Let’s science the sh*t out of this”
The Martian
- ‘Science’ is a verb
When we talk about ‘science’ or ‘scientists’ we too often think of a monolithic establishment, a body of work that we name ‘Science’ that then ‘speaks’ on a topic. We’ve all read the phrases ‘scientists say’ or ‘science says’ in the media. However, science isn’t a noun at all; it’s a verb, it’s a process. The scientific process is quite simple: make an observation (e.g. there’s a shiny object on the floor over there), make a hypothesis (e.g. I think that’s a coin), and then devise an experiment to prove or disprove that hypothesis (e.g. go and pick it up), updating your knowledge about the world as a result (e.g. some bottle caps are gold and resemble coins). This is a very simple process and, like all simple things, it can be compounded to make highly complex things (e.g. nuclear fission), but fundamentally the core unit is always the same.
- We are all scientists
This process isn’t unique to labs and industrial scientists – it’s the process by which all humans reason about and discover the world. We are born with it – if you have children or nieces/nephews you’ve already seen this process when you introduce a new toy, or food, or (hilariously) a new pet. Observation (e.g. there’s a new big furry thing), hypothesis (e.g. it’s going to eat me), test (e.g. I’m going to stroke it and see if it eats me), knowledge (e.g. it didn’t eat me and it’s a cat/my new best friend). We don’t lose this as we get older, but we maybe let other things (experience/‘common sense’/general life) get in the way. But if you unpick any highly successful company, especially those performing at the highest level in their vertical, you will find a scientific methodology. In product development terms an A/B test is a great example of this, and the same idea flows through the marketing and sales departments as well. We are all scientists, and if we are paying attention, we are all constantly learning as well.
- Everything is Science
If we are all scientists, then everything is science. The team spent a lot of this sprint just figuring out how we can evaluate what we are doing: what does ‘done’ look like? What does ‘better’ look like? In healthcare this is particularly important as we have to a) rigorously test and constantly improve every system used clinically and b) pass those evaluations to clinicians, who will assume liability (or not) for the system’s performance with informed consent. So we individually observed every process in our current AI tech stack (navigation, speech-to-text AI, AI generation of clinical letters), made hypotheses about its performance and then devised frameworks to conduct repeatable experiments on each process. This allows us not only to see exactly where we are, but to rapidly evaluate and compare every new AI technology as it arrives.
- Evaluating AI is hard.
In AI, evaluating the thing requires 10x more energy than building the thing*. Take, for example, a very simple system we currently use: speech-to-text transcription. The “job” is to capture the spoken words in a consultation and write them down. Speech-to-text already has an established performance metric, the word error rate (WER): of all the words spoken, what proportion does the AI mistranscribe, omit or wrongly insert? But how do you evaluate the performance of a clinical speech-to-text model? The job in the clinical setting is not to transcribe every word spoken but to capture the clinical information that informs a diagnosis or treatment plan. Most of the words in a consultation are therefore irrelevant to the actual job, so WER alone is not a sufficient test; we need to understand clinical accuracy. We therefore had to create a new metric to assess this function: clinically significant word error rate. We then had to curate a standardised dataset, check the dataset transcripts by hand to ensure we have the real ground truth, and set up our own infrastructure to repeatably test and evaluate the outputs. We will need to devise bespoke frameworks for every element and every tool we give our agent over time.
- Let’s Collaborate
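To make the word error rate from the previous section concrete before moving on, here is a minimal Python sketch. Standard WER is the word-level edit distance (substitutions, deletions and insertions) divided by the number of words in the reference transcript. The post does not spell out how clinically significant word error rate is computed, so `clinical_wer` below is only one hypothetical reading (score only the words found in an assumed list of clinical terms). The function names, the term list and the example sentences are all illustrative assumptions, not the TORTUS implementation.

```python
def tokenize(text):
    """Lower-case and split on whitespace (a deliberately naive tokenizer)."""
    return text.lower().split()


def edit_distance(ref, hyp):
    """Word-level Levenshtein distance between two token lists."""
    prev = list(range(len(hyp) + 1))  # distance from empty reference
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # distance to empty hypothesis: i deletions
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion (word missing from hypothesis)
                curr[j - 1] + 1,     # insertion (extra word in hypothesis)
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1]


def wer(reference, hypothesis):
    """Standard word error rate: edit distance / reference length."""
    ref, hyp = tokenize(reference), tokenize(hypothesis)
    if not ref:
        raise ValueError("reference transcript is empty")
    return edit_distance(ref, hyp) / len(ref)


def clinical_wer(reference, hypothesis, clinical_terms):
    """HYPOTHETICAL variant: WER restricted to an assumed clinical vocabulary."""
    ref = [w for w in tokenize(reference) if w in clinical_terms]
    hyp = [w for w in tokenize(hypothesis) if w in clinical_terms]
    if not ref:
        raise ValueError("reference contains no clinical terms")
    return edit_distance(ref, hyp) / len(ref)


# Made-up example: the model drops "the" and mangles the drug name.
reference = "so the patient takes metformin twice daily for diabetes"
hypothesis = "so patient takes met forming twice daily for diabetes"
terms = {"metformin", "twice", "daily", "diabetes"}  # assumed term list

print(f"WER: {wer(reference, hypothesis):.2f}")
print(f"Clinical WER: {clinical_wer(reference, hypothesis, terms):.2f}")
```

In this toy case the overall WER is 3 errors over 9 reference words, while the restricted score is 1 error over 4 clinical terms. The dropped “the” is harmless and the lost drug name is not, which is the distinction plain WER cannot express. A real metric would need a principled way to decide which words are clinically significant, which is where the hand-checked ground truth comes in.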
While we have a clear commercial need to do these things, fundamentally, as pioneers in this space, we have a responsibility to produce repeatable methodology for the industry as a whole, and to learn from others doing similar things, through publication, i.e. the scientific method. I was discussing the future of AI in general with a fellow AI founder, who was quite pessimistic about the future AI might create. I was of the opposite opinion: we aren’t passive observers on this journey, particularly those of us actually working on the hardcore applications of this technology. Where AI goes in the next decade is up to us, the builders. The steps and frameworks we establish now will determine what the future looks like, which is why this is so vital.
- First Do No Harm
Lastly, in medicine specifically, we already have an established framework learnt from six decades of mistakes: clinical trials. Introducing highly powerful and highly dangerous medicines into the real world has given us a system of rules for evidence generation and regulation that we already understand. AI is no different, and we must apply the same scientific processes and the same ethical and intellectual rigour to AI as we do to drug development.
First: Do No Harm. We need to measure in great depth how we can implement AI safely, and how we can study its direct and indirect cognitive effects on every interaction, whether clinician-AI or patient-AI. Second: Beneficence. We must seek and demonstrate actual utility and benefit in everything we do, not just tech for the sake of tech. Third: Autonomy. Introducing AI cannot override or remove a patient’s ability to choose – I hope in fact that AI leverage will greatly increase the autonomy of patients in healthcare, but again we will need to really measure this. Lastly: Justice. AI cannot exacerbate the existing inequity we have in healthcare, both inside individual systems and globally. Again, my hope is that AI will unlock the opposite, removing language barriers and the uneven distribution of expertise and resources, but time will tell.
I once got lost while skiing in a total white out. You couldn’t see more than a foot ahead in any direction, and all you could see was white. I could just about see my own skis. All we could do was follow the slope of the mountain and hope that we would find ourselves at the bottom. Running a start-up is a lot like this – although nowadays it’s more of a ship in deep fog rather than an individual skier. Increasingly we are finding each objective goal we set and meet is a lighthouse, and the scientific method is our compass. Although our particular ship is trying to go somewhere no one has ever gone before, I am increasingly confident, if we stick to the science, we will get there.