Unlike free text and images, which form the basis of large generative AI systems such as Stable Diffusion and ChatGPT, health data is normally not easy to collect from the Internet. We talk to Daniel Beck about what matters when collecting data for health AI.

In Germany there is currently much discussion about the use of (anonymized) health data for research – most recently, CCC spokeswoman Constanze Kurz voiced criticism. How dependent are AI applications on access to such data, and are there geographical differences in how health data is used?

A central point in the discussion is the collection of all data in one place and the risks associated with it. From a technical perspective, there would certainly have been alternatives that could have met the same requirements. Health data is needed for medical research as well as for training, evaluating, and deploying AI applications in medicine – regardless of how the data is collected and pooled.

It is therefore important to find a way to balance the confidentiality of patient data, the availability of data for AI applications, and technical access to it. Depending on how these factors are weighted, the solution will look different. In Europe, data protection plays a very important role; in China, the availability of data is weighted more heavily. In the USA, a uniform interface for accessing health data has been defined by law.
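
The interview does not name the US interface, but it presumably refers to the HL7 FHIR REST API that certified health IT systems must expose under the 21st Century Cures Act. As a minimal sketch, assuming a hypothetical FHIR endpoint and patient ID, reading a patient record could look like this:

```python
# Hedged sketch: the interview does not name the US interface; this
# assumes it refers to the HL7 FHIR REST API. The base URL and the
# patient ID "123" are placeholders, not a real endpoint.
import requests

FHIR_BASE = "https://example-ehr.example.com/fhir"  # hypothetical endpoint

resp = requests.get(
    f"{FHIR_BASE}/Patient/123",                     # standard FHIR "read"
    headers={"Accept": "application/fhir+json"},
)
resp.raise_for_status()
patient = resp.json()
print(patient["resourceType"])  # "Patient" for a well-formed resource
```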

From the current example of ChatGPT we know that AI programs can give good answers on general topics. When it comes to precise questions or specialist topics, however, the program is often wrong or invents answers. How can such inaccuracy be avoided in medical AI?

The approach behind very large language models is to train the system on as many texts as possible. In this way, answers can be synthesized that are linguistically convincing and also somehow fit the question in terms of content. That in itself is a remarkable achievement, but ChatGPT does not understand the training texts or the content of the questions the way a human does. Put simply, such a system returns the word sequence that is most probable given the average of its training texts. The actual meaning, which does not follow from the texts alone, remains hidden from the system.
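
To make the "most probable word sequence" idea concrete, here is a deliberately tiny sketch: a toy bigram model that continues a prompt with whatever word most often followed the previous word in its training text. The corpus and function names are invented for illustration; real language models operate on tokens with neural networks, but the underlying principle of probability-driven continuation is the same.

```python
# Toy bigram model: pick the likeliest continuation based purely on
# frequencies in the training text. Corpus and names are hypothetical.
from collections import Counter, defaultdict

corpus = "the patient has a fever . the patient has a cough .".split()

# Count how often each word follows each other word.
follows = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    follows[prev][nxt] += 1

def most_probable_continuation(word, length=4):
    """Greedily emit the statistically likeliest word sequence."""
    out = [word]
    for _ in range(length):
        candidates = follows.get(out[-1])
        if not candidates:
            break
        out.append(candidates.most_common(1)[0][0])  # argmax over counts
    return " ".join(out)

print(most_probable_continuation("the"))  # e.g. "the patient has a fever"
```

The model "knows" nothing about fevers or coughs; it only reproduces frequent word transitions, which is why its output can sound plausible while being wrong.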

Medical AI applications often address specialized areas of medicine. But here, too, important indicators for a decision can be overlooked, for example because those features occur only rarely. To prevent such errors, medical AI systems require elaborate evaluation methods backed by extensive data sets.
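
One reason this evaluation effort is necessary: aggregate metrics can mask failures on exactly those rarely occurring features. A minimal sketch with invented figures, assuming a binary screening task, shows how a model that never detects a rare condition can still report high overall accuracy:

```python
# Hedged sketch: why overall accuracy can hide failures on rare findings.
# All labels below are invented; 1 = rare condition present.
y_true = [0] * 95 + [1] * 5   # rare condition: 5% prevalence
y_pred = [0] * 100            # a model that always says "healthy"

accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
recall_rare = sum(t == p == 1 for t, p in zip(y_true, y_pred)) / sum(y_true)

print(f"accuracy: {accuracy:.2f}")                 # 0.95 -- looks good
print(f"recall on rare class: {recall_rare:.2f}")  # 0.00 -- clinically useless
```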


(Image: Cognotekt)

Daniel Beck is a computer scientist, software developer, and partner at the consulting company Cognotekt. His expertise lies primarily in turning raw data into machine-readable data sets from which data-driven conclusions can be drawn. He advises bio- and health-tech companies on their data strategy and on implementing AI.

A team of researchers has just exposed an AI for diagnosing COVID-19 from coughing sounds as a dud. How can you make sure that programs arrive at the correct diagnosis and do not infer certain diseases from the wrong characteristics?

This is a classic example of a bias in the training data set leading to features being ranked as particularly relevant even though they are only of subordinate importance or have no causal connection at all to what is being predicted. Unfortunately, this can happen very quickly and shows once again how important it is to generate representative data sets – that is, data sets whose distribution resembles the distribution found in the real world. That is not easy; it is very time-consuming and therefore expensive.
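
As a minimal illustration of what "similar in distribution to the real world" means, the sketch below compares the class balance of an invented training set against an assumed real-world prevalence; all figures and labels are hypothetical:

```python
# Hedged sketch: a quick plausibility check that a training set matches
# the real-world distribution for a single categorical attribute.
from collections import Counter

training_labels = ["covid"] * 400 + ["healthy"] * 600      # 40% positive
real_world_prevalence = {"covid": 0.05, "healthy": 0.95}   # assumed reference

counts = Counter(training_labels)
total = sum(counts.values())
for label, expected in real_world_prevalence.items():
    observed = counts[label] / total
    print(f"{label:8s} train: {observed:.2f}  real world: {expected:.2f}")
# A 40% positive rate against 5% real prevalence signals a skewed sample
# that can make irrelevant recording artifacts look predictive.
```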

If real-world performance does not match what the evaluation data led you to expect, such problems are identified quickly – as happened with the cough-detection app. More problematic are the cases in which the distortions go unnoticed, for example because they lead to incorrect decisions in only a small number of cases. With stochastic AI methods, it is therefore important to check which characteristics of the input data the model rates as particularly relevant. In a medical setting, you then have to verify whether this assessment agrees with professional judgment. Ultimately, stochastic methods can only be used in medicine if doctors are aware that the output can always be wrong. An alternative is the use of deterministic methods.
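
One established way to carry out the relevance check described here is permutation importance. Below is a minimal sketch using scikit-learn on an invented data set; the feature names and the spurious "artifact" column are hypothetical. A feature the model ranks as highly relevant must then be checked for medical plausibility by a domain expert:

```python
# Hedged sketch: checking which input features a trained model treats as
# relevant, via permutation importance. Data and names are invented.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance

rng = np.random.default_rng(0)
n = 1000
# Hypothetical features: one causal signal, one recording artifact
# (e.g. which clinic captured the sample), one pure noise column.
signal   = rng.normal(size=n)
artifact = rng.integers(0, 2, size=n).astype(float)
noise    = rng.normal(size=n)
X = np.column_stack([signal, artifact, noise])
# The label depends only on the signal.
y = (signal + 0.1 * rng.normal(size=n) > 0).astype(int)

model = RandomForestClassifier(random_state=0).fit(X, y)
result = permutation_importance(model, X, y, n_repeats=10, random_state=0)

for name, imp in zip(["signal", "artifact", "noise"], result.importances_mean):
    print(f"{name:8s} importance: {imp:.3f}")
# If "artifact" ranked highest here, that would be the red flag the
# interview describes: a relevant-looking feature with no causal link.
```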

Mr Beck, thank you very much for your replies.

The relevance of handling patient data carefully is also shown by a current research project in which AI is intended to improve drug safety with the help of billing data from health insurance companies. In another interview, Daniel Beck explains for which areas of application in medicine AI makes no sense, where it is particularly suitable, and what the liability situation is in the event of incorrect diagnoses.

In the “Three Questions and Answers” series, iX wants to get to the heart of today’s IT challenges – whether it is the perspective of the user in front of the PC, the view of a manager, or the everyday life of an administrator. Do you have suggestions from your daily practice or that of your users? On which topics, and from whom, would you like to read tips in a nutshell? Then please write to us or leave a comment in the forum.

