
AI Models Forecast Viral Mutations, Enhancing Pandemic Preparedness
- Written by Kiara Fabbri, Former Tech News Writer
- Fact-Checked by Justyn Newman, Former Lead Cybersecurity Editor
AI is helping scientists predict how viruses evolve, potentially improving pandemic preparedness and aiding the development of vaccines and antiviral treatments.
In a Rush? Here are the Quick Facts!
- AI is helping predict viral evolution, improving vaccine and treatment development.
- RNA viruses like SARS-CoV-2 and influenza constantly mutate, evading immune detection.
- AI tools forecast short-term mutations, but long-term viral changes remain unpredictable.
While predicting viral evolution is still in its infancy, researchers are using AI to forecast how RNA viruses such as SARS-CoV-2 and influenza will mutate, as detailed in a new report by Nature.
Currently, AI tools can predict which single mutations are likely to succeed and which viral variants might dominate in the short term. However, Nature says that predicting long-term changes or complex combinations of mutations remains a challenge.
AI’s role in this field has been bolstered by advanced protein-structure prediction models like AlphaFold, ESM-2, and ESMFold, which analyze how mutations affect viral proteins. Nature says these tools are revolutionizing the ability to simulate viral evolution, helping scientists understand how viruses like SARS-CoV-2 adapt over time.
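To make that concrete, below is a minimal sketch of the general technique: using a protein language model such as ESM-2 to score how plausible a candidate mutation looks in its sequence context. The checkpoint name is a real public ESM-2 model, but the toy sequence, the chosen position, and the scoring function are illustrative assumptions, not the pipeline behind CoVFit or any tool named in the Nature report.

```python
import torch
from transformers import AutoTokenizer, EsmForMaskedLM

MODEL = "facebook/esm2_t6_8M_UR50D"  # smallest public ESM-2 checkpoint
tokenizer = AutoTokenizer.from_pretrained(MODEL)
model = EsmForMaskedLM.from_pretrained(MODEL).eval()

def mutation_score(sequence: str, position: int, new_residue: str) -> float:
    """Log-likelihood ratio of the mutant vs. the original residue at
    `position` (0-based), with that position masked. Higher means the
    model finds the mutation more plausible in context."""
    original = sequence[position]
    inputs = tokenizer(sequence, return_tensors="pt")
    # Residue i sits at token i + 1 because ESM prepends a <cls> token.
    inputs["input_ids"][0, position + 1] = tokenizer.mask_token_id
    with torch.no_grad():
        logits = model(**inputs).logits
    log_probs = torch.log_softmax(logits[0, position + 1], dim=-1)
    new_id = tokenizer.convert_tokens_to_ids(new_residue)
    old_id = tokenizer.convert_tokens_to_ids(original)
    return (log_probs[new_id] - log_probs[old_id]).item()

# Toy protein fragment (illustrative only)
seq = "MFVFLVLLPLVSSQCVNLTTRTQLPPAYTNSFTRGVYYPDKVFRSSVLHS"
print(mutation_score(seq, 10, "K"))  # score swapping residue 10 for lysine
```

Scores like this log-likelihood ratio are one common zero-shot proxy for a mutation's fitness; real forecasting systems combine such signals with surveillance sequencing and experimental data.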
The availability of massive amounts of genetic data is crucial for AI models to predict viral evolution. With nearly 17 million sequenced SARS-CoV-2 genomes, AI models have a wealth of data to train on, allowing researchers to simulate potential future variants, says Nature.
For example, the CoVFit model, developed by Jumpei Ito’s team at the University of Tokyo, has been instrumental in predicting which SARS-CoV-2 variants are likely to spread and dominate in the population, as reported by Nature.
In addition to tracking known viruses, AI is also helping scientists uncover new ones. A study published in October revealed that researchers used AI to identify 70,500 new RNA viruses, many of which thrive in extreme environments such as salt lakes and hydrothermal vents.
The study applied metagenomics, which allows scientists to analyze genetic material from diverse ecosystems without growing individual viruses in the lab.
Despite progress, Nature says that challenges remain in accurately predicting sudden viral leaps, such as the emergence of the Omicron variant, which arrived carrying more than 50 mutations at once.
For these AI models to become even more accurate, they need more than five years of data on viral evolution, says Nature. Combining surveillance sequencing with experimental data will improve predictions and help researchers stay ahead of evolving viral threats.

AI Faces Data Crisis: Musk Warns Of Exhausted Human Knowledge
- Written by Kiara Fabbri, Former Tech News Writer
- Fact-Checked by Justyn Newman, Former Lead Cybersecurity Editor
Artificial intelligence companies have depleted available human knowledge for training their models, Elon Musk revealed during a livestreamed interview, as reported by The Guardian.
In a Rush? Here are the Quick Facts!
- Elon Musk says AI firms have exhausted human knowledge for model training.
- Musk suggests “synthetic data” is essential for advancing AI systems.
- AI hallucinations complicate using synthetic data, risking errors in generated content.
The billionaire suggested that firms must increasingly rely on “synthetic” data—content generated by AI itself—to develop new systems, a method already gaining traction. “The cumulative sum of human knowledge has been exhausted in AI training. That happened basically last year,” Musk said, as reported by The Guardian.
This poses a significant challenge for AI models like GPT-4, which rely on massive datasets sourced from the internet to identify patterns and predict text outputs.
Musk, who founded xAI in 2023, highlighted synthetic data as the primary solution for advancing AI. However, he cautioned about the risks associated with the practice, particularly AI “hallucinations,” where models generate inaccurate or nonsensical information, as reported by The Guardian.
The Guardian notes that leading tech companies, including Meta and Microsoft, have adopted synthetic data for their AI models, such as Llama and Phi-4. Google and OpenAI have also incorporated this approach.
For example, Gartner estimates that 60% of the data used for AI and analytics projects in 2024 was synthetically generated, as reported by TechCrunch.
Additionally, training on synthetic data offers significant cost savings. TechCrunch notes that AI startup Writer claims its Palmyra X 004 model, developed using almost entirely synthetic sources, cost just $700,000 to create.
In comparison, estimates suggest a similar-sized model from OpenAI would cost around $4.6 million to develop, said TechCrunch. However, while synthetic data enables continued model refinement, experts warn of potential drawbacks.
The Guardian reported that Andrew Duncan, director of foundational AI at the Alan Turing Institute, noted that reliance on synthetic data risks “model collapse,” where outputs lose quality over time.
“When you start to feed a model synthetic stuff you start to get diminishing returns,” Duncan said, adding that biases and reduced creativity could also arise.
The growing prevalence of AI-generated content online poses another concern. Duncan warned that such material might inadvertently enter training datasets, further compounding the challenges, as reported by The Guardian.
Duncan referenced a study published in 2022 that predicted high-quality text data for AI training could be depleted by 2026 if current trends persist. The researchers also projected that low-quality language data might run out between 2030 and 2050, while low-quality image data could be exhausted between 2030 and 2060.
Furthermore, a more recent study published in July warns that AI models risk degradation as AI-generated data increasingly saturates the internet. Researchers found that models trained on AI-generated outputs produce nonsensical results over time, a phenomenon termed “model collapse.”
This degradation could slow AI advancements, emphasizing the need for high-quality, diverse, and human-generated data sources.
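The dynamic is easy to reproduce in miniature. The sketch below is a toy illustration in the spirit of examples from the model-collapse literature, not code from the cited study: each generation fits a simple Gaussian "model" to samples drawn from the previous generation's fit, and rare, tail-end values gradually disappear.

```python
import numpy as np

rng = np.random.default_rng(0)
real = rng.normal(0.0, 1.0, size=200)  # generation 0: "human" data, N(0, 1)
mu, sigma = real.mean(), real.std()

for gen in range(1, 31):
    # Each new "model" is trained only on samples from the previous one.
    synthetic = rng.normal(mu, sigma, size=200)
    mu, sigma = synthetic.mean(), synthetic.std()
    if gen % 5 == 0:
        print(f"generation {gen:2d}: mean={mu:+.3f}, std={sigma:.3f}")

# Over repeated generations the fitted spread tends to drift and shrink:
# the distribution's tails, i.e. rare and diverse content, vanish first.
```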
Watch Stagwell’s CEO Mark Penn interview Elon Musk at CES! https://t.co/BO3Z7bbHOZ — Live (@Live) January 9, 2025