A new AI model can predict how diseases develop across a person’s life

07-Oct-2025

Healthcare depends on knowing how illnesses unfold over time. Doctors use patient histories to predict future risks and make decisions. A new artificial intelligence (AI) model called Delphi-2M takes this idea further. It learns how diseases develop, connect, and compete across an entire lifetime.

Created by researchers from the European Bioinformatics Institute, the German Cancer Research Centre, and several universities, Delphi-2M uses the same transformer technology that powers GPT-based chatbots. Instead of predicting the next word, it predicts the next disease. This AI can read millions of medical histories and learn the language of human health.

Training on real human lives

Delphi-2M was trained using data from more than 400,000 participants in the UK Biobank and tested on 1.9 million people from Denmark. It analyzed more than a thousand diseases defined in the international ICD-10 system.

Each person’s record was transformed into a timeline of health events, such as “asthma at age 10” or “diabetes at age 45.” The model also included lifestyle data like smoking, alcohol intake, and body mass index. These details helped the AI understand how habits shape disease risk.

The researchers treated every diagnosis like a “token,” similar to words in a sentence. The AI learned to predict which “word” — or disease — comes next and when it might appear.
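To make the "token" idea concrete, here is a minimal Python sketch of how a health timeline might be turned into a sequence of tokens paired with ages. The `HealthEvent` class, the helper names, and the example record are illustrative assumptions, not the researchers' actual code or data format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HealthEvent:
    age_years: float  # age at which the event was recorded
    code: str         # e.g. an ICD-10 category or a lifestyle label


def build_vocabulary(records):
    """Map every distinct event code to an integer token id."""
    codes = sorted({event.code for record in records for event in record})
    return {code: idx for idx, code in enumerate(codes)}


def encode(record, vocab):
    """Turn one person's record into (token_id, age) pairs ordered by age."""
    ordered = sorted(record, key=lambda e: e.age_years)
    return [(vocab[e.code], e.age_years) for e in ordered]


# Made-up person: J45 is the ICD-10 category for asthma, E11 for type 2 diabetes.
person = [HealthEvent(45.0, "E11"), HealthEvent(10.0, "J45")]
vocab = build_vocabulary([person])
tokens = encode(person, vocab)
print(tokens)  # [(1, 10.0), (0, 45.0)]
```

Each diagnosis becomes an integer, just as each word does in a language model, and the age travels alongside it so that timing is not lost.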

A transformer that thinks in time

Delphi-2M extends the GPT-2 transformer to handle time, not just text. In language, words follow each other in sequence. In health, diseases appear at different ages and sometimes together. To handle this, the researchers replaced GPT’s position encoding with age encoding.

This allows the model to track a person’s age continuously instead of by position in the sequence. Delphi-2M also predicts how long it will take before the next diagnosis, using an exponential model of waiting times. These additions help the AI mirror how health changes in real life: not only what happens next, but when.
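One plausible way to encode a continuous age is to borrow the sine and cosine features of the original transformer and feed them the age in years instead of the token position. The sketch below assumes that approach; the function name, the `dim` default, and the `max_period` value are illustrative choices, not details taken from the study.

```python
import math


def age_encoding(age_years, dim=8, max_period=100.0):
    """Sinusoidal features of a continuous age (dim must be even)."""
    if dim % 2 != 0:
        raise ValueError("dim must be even")
    features = []
    for i in range(dim // 2):
        # Frequencies are spread geometrically, as in standard positional encoding.
        freq = 1.0 / (max_period ** (2 * i / dim))
        features.append(math.sin(age_years * freq))
        features.append(math.cos(age_years * freq))
    return features


enc_45 = age_encoding(45.0)
enc_45_5 = age_encoding(45.5)  # half a year later gives a slightly different vector
```

Because the input is a real number rather than an integer index, two events six months apart receive distinct encodings, and two events recorded at the same age receive the same one.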

With about two million parameters, Delphi-2M is far smaller than most large language models but is purpose-built for medical prediction.

Predicting more than a thousand diseases

When tested, Delphi-2M predicted over a thousand diseases with impressive accuracy. It achieved an average area under the curve (AUC) score of 0.76, meaning that in roughly three out of four comparisons it ranked a person who went on to develop a disease above a person who did not. For 97 percent of diagnoses, its performance was better than chance.
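For readers unfamiliar with AUC, the short Python sketch below computes it directly from its ranking definition. The scores and labels are invented toy values, not results from the study.

```python
def auc(scores, labels):
    """Share of (case, non-case) pairs where the case scores higher; ties count half."""
    cases = [s for s, y in zip(scores, labels) if y == 1]
    non_cases = [s for s, y in zip(scores, labels) if y == 0]
    if not cases or not non_cases:
        raise ValueError("need at least one case and one non-case")
    wins = 0.0
    for c in cases:
        for n in non_cases:
            if c > n:
                wins += 1.0
            elif c == n:
                wins += 0.5
    return wins / (len(cases) * len(non_cases))


# Toy example: predicted risks for five people, two of whom developed the disease.
scores = [0.9, 0.4, 0.35, 0.8, 0.1]
labels = [1, 0, 1, 0, 0]
toy_auc = auc(scores, labels)
print(round(toy_auc, 3))  # 0.667
```

A score of 0.5 corresponds to random guessing and 1.0 to a perfect ranking, which is why 0.76 counts as useful but far from flawless.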

The AI predicted many diseases as accurately as specialized clinical models — and sometimes better. For example, it matched established cardiovascular and dementia risk tools and outperformed them in forecasting death.

Even ten years after the last recorded diagnosis, the model still made relevant predictions, showing it could look far into a patient’s future.

Sampling the future of health

One remarkable feature of Delphi-2M is its generative ability. It can simulate a person’s health future, step by step, for up to 20 years. Researchers tested this by giving the model each person’s medical record until age 60, then letting it “imagine” what happened next.

The generated timelines closely matched real health outcomes. The AI correctly predicted around 17 percent of future disease events in the first year, a share that declined slightly over the following two decades but stayed above chance. It could also estimate how smoking, alcohol, or body weight change disease risks later in life.
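The step-by-step simulation can be pictured as a loop of competing clocks: each possible event gets an exponential waiting time, the earliest one "happens", and the process repeats with the updated history. The Python sketch below illustrates that idea under strong simplifying assumptions. The `toy_rates` function and its yearly rates are made up for illustration; in the real model, the rates would come from the trained network.

```python
import random


def sample_next_event(rates, rng):
    """Draw one exponential waiting time per event and return the earliest."""
    best_code, best_wait = None, float("inf")
    for code, rate in rates.items():
        wait = rng.expovariate(rate)  # rate is expressed in events per year
        if wait < best_wait:
            best_code, best_wait = code, wait
    return best_code, best_wait


def simulate(history, start_age, rate_fn, horizon_years=20.0, seed=0):
    """Extend a list of (code, age) events until the horizon or death."""
    rng = random.Random(seed)
    timeline = list(history)
    age, end_age = start_age, start_age + horizon_years
    while True:
        rates = rate_fn(timeline, age)
        if not rates:
            break
        code, wait = sample_next_event(rates, rng)
        age += wait
        if age > end_age:
            break
        timeline.append((code, age))
        if code == "death":
            break
    return timeline


def toy_rates(timeline, age):
    """Hypothetical stand-in for the model: yearly rates that depend on history."""
    seen = {code for code, _ in timeline}
    rates = {"E11": 0.02, "I10": 0.05, "death": 0.01}
    if "E11" in seen:
        rates["death"] *= 2.0  # invented effect, for illustration only
    return {code: rate for code, rate in rates.items() if code not in seen}


future = simulate([("J45", 10.0)], start_age=60.0, rate_fn=toy_rates)
```

Running the loop many times with different seeds produces a spread of possible futures, and risk estimates can be read off as the fraction of runs in which an event occurs.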

By generating synthetic medical data, Delphi-2M can also help protect privacy. A version trained entirely on AI-created data performed nearly as well as the real one, suggesting that synthetic health records can be useful for research without exposing individuals.

Revealing the hidden logic of disease

Using explainable AI tools like SHAP (Shapley Additive Explanations), the team explored how Delphi-2M “thinks.” The AI built an internal map of disease relationships. In this map, illnesses that often occur together, like diabetes and nerve damage, appear close to one another.

For example, the model learned that digestive diseases raise the risk of pancreatic cancer about 19-fold, and that pancreatic cancer increases the risk of death nearly 10,000-fold. These patterns mirror established medical knowledge, showing the AI can rediscover real biological links.

The analysis also revealed time-based effects. Some diseases, like cancer, have long-term impacts on survival, while others, like infection, affect mortality for only a short period.
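The attribution idea behind SHAP can be illustrated with a toy calculation. A Shapley value averages how much a feature changes the prediction across every possible combination of the other features. The Python sketch below computes these values exactly for an invented three-feature risk score. It does not use the SHAP library, which relies on faster approximations, and the `toy_log_risk` function and its numbers are assumptions made purely for illustration.

```python
from itertools import combinations
from math import factorial


def shapley_values(features, value_fn):
    """Exact Shapley values by enumerating every subset of the other features."""
    n = len(features)
    result = {}
    for f in features:
        others = [g for g in features if g != f]
        total = 0.0
        for k in range(n):
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                present = set(subset)
                gain = value_fn(present | {f}) - value_fn(present)
                total += weight * gain
        result[f] = total
    return result


def toy_log_risk(present):
    """Invented score: two risk factors plus an interaction between them."""
    score = 0.0
    if "diabetes" in present:
        score += 1.0
    if "smoking" in present:
        score += 0.5
    if "diabetes" in present and "smoking" in present:
        score += 0.4
    return score


phi = shapley_values(["diabetes", "smoking", "asthma"], toy_log_risk)
```

In this toy case the interaction term is split evenly, so diabetes is credited with about 1.2, smoking with about 0.7, and asthma with 0. The three values add up to the full prediction, which is the property that makes such attributions easy to interpret.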

Testing across countries and biases

To check whether the model’s insights generalize, researchers applied Delphi-2M to Danish national health records. Without retraining, it achieved an AUC of 0.67, lower than the 0.76 average reported above but still clearly better than chance. The model recognized similar patterns in both populations, suggesting it had captured aspects of disease progression that hold across countries.

However, the study also found biases in the UK Biobank data. The cohort included mostly white, healthier, and more educated participants, missing many early deaths and serious diseases. Delphi-2M sometimes reflected these biases, predicting lower risks for under-represented groups.

The authors emphasized that such AI systems should assist, not replace, medical judgment. They warned against assuming the model’s predictions are causal rather than statistical.

What Delphi-2M means for healthcare

Delphi-2M opens a new path for predictive and preventive medicine. It can identify people likely to develop multiple conditions years before symptoms appear. Health systems could use this to plan screening, allocate resources, and prevent future disease burdens.

Because the model can simulate entire populations, it can also support policy planning. Governments could forecast how ageing or lifestyle changes affect national health needs decades ahead.

The AI’s design also makes it easy to expand. Future versions could integrate genetic data, wearable sensors, blood tests, and imaging, creating a truly multi-dimensional picture of human health.

A glimpse of medicine’s AI future

The study highlights how transformer models, first built for language, can now model life itself. Delphi-2M treats the sequence of diseases as sentences written in the body’s language. By reading millions of “health stories,” it learns how human biology evolves.

While the technology remains young, its potential is vast. With further safeguards, models like Delphi-2M could power medical assistants that advise doctors, design public-health strategies, or train future AI systems without exposing private data.

As Professor Ewan Birney and colleagues note, this is just the beginning. The next generation of AI will not only talk — it will learn the natural history of human disease.

The study is published in the journal Nature. It was led by Ewan Birney of the European Bioinformatics Institute (EMBL-EBI).