Is AI Triggering a Crisis in Biomedical Research? Top Scientist Nearly Fooled by AI-Fabricated Citations — An Exclusive Interview with Lancet Author Maxim Topaz - National Business Daily

NBD

In May 2026, a correspondence published in The Lancet on AI-generated fake citations attracted widespread attention in China's medical research community.

Based on a screening of approximately 2.5 million biomedical papers indexed in PubMed Central, the study found that the rate of fabricated references in biomedical literature has increased more than twelvefold over the past few years. In 2023, roughly four fabricated references appeared per 10,000 papers; by early 2026, that figure had surged to 56.9 per 10,000 papers.

Ironically, the study's lead author, Maxim Topaz, is an Associate Professor at Columbia University School of Nursing, a researcher on AI in healthcare, and among the top 2% of the world's most-cited scientists. Yet even this leading AI expert once nearly cited an AI-generated fake paper while preparing a manuscript.

What can researchers do to address this growing challenge?

National Business Daily (NBD) recently conducted an exclusive interview with Maxim Topaz. The following is an edited transcript.

Maxim Topaz. Photo provided by the interviewee.

Fake Citations Are Spreading Across Biomedical Literature: 98.4% of Problematic Papers Remain Uncorrected or Unretracted

NBD: What prompted you to investigate fabricated citations in biomedical literature?

Maxim Topaz: It all started with a close call of my own. While preparing a commentary manuscript for journal submission, I used an AI chatbot to polish the language. Since I work in AI research myself, I was fully aware of hallucination issues and carefully verified every reference to ensure its accuracy.

Even after multiple rounds of revisions and fact-checking, the journal editor questioned one citation. It turned out that the AI tool had quietly inserted a completely fabricated reference—something I had failed to detect during my own review.

That experience was a wake-up call. What concerned me wasn't simply making a mistake; it was realizing that if someone who works with AI every day could be fooled, then ordinary researchers would be even more vulnerable.

That realization inspired this study. Before our work, no one had systematically measured how often fabricated references actually made their way into peer-reviewed, published biomedical literature. References form the foundation of scientific research. Once references lose credibility, the integrity of the scientific record is at risk. Our study was designed to fill that critical knowledge gap.

NBD: As someone affiliated with both Columbia University's School of Nursing and its Data Science Institute, how did your interdisciplinary background help build this automated citation verification system? What was the biggest technical challenge?

Maxim Topaz: Expertise in both clinical medicine and data science was indispensable.

Clinical knowledge enabled us to understand which citation problems have real-world consequences and to distinguish between genuine citation errors and deliberate fabrication based on discipline-specific citation patterns. Data science, on the other hand, made large-scale automated verification possible—far beyond what manual checking could achieve.

The greatest challenge was minimizing false positives. We examined more than 97 million references. Even an extremely low error rate would generate an enormous number of incorrect alerts.
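
A back-of-the-envelope sketch shows why false positives dominate at this scale. The 97 million figure comes from the interview; the false-positive rates are hypothetical values chosen only for illustration, and the calculation assumes nearly all references are genuine.

```python
# Illustrative arithmetic only: the false-positive rates below are
# hypothetical and are not figures reported by the study.
REFERENCES_CHECKED = 97_000_000  # "more than 97 million references" (from the interview)


def expected_false_alerts(n_references: int, false_positive_rate: float) -> int:
    """Expected number of genuine references wrongly flagged as fabricated.

    Assumes nearly all references are genuine, so the count of genuine
    references is approximated by the total number checked.
    """
    return round(n_references * false_positive_rate)


for rate in (0.01, 0.001, 0.0001):
    alerts = expected_false_alerts(REFERENCES_CHECKED, rate)
    print(f"false-positive rate {rate:.2%}: about {alerts:,} incorrect alerts")
```

Even at the lowest hypothetical rate shown, the number of incorrect alerts runs into the thousands, which is why a second layer of review matters.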

Our core task was to accurately differentiate intentional fabrication from unintentional typographical errors or legitimate citation variations, such as abbreviated titles.

To address this, we designed a multi-layer verification pipeline that included large language models for initial screening, followed by independent human reviewers who validated the results. Ultimately, the system achieved an accuracy rate of 91%.

Building a trustworthy verification system capable of handling such massive datasets was by far the most difficult part of the project.
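
As a rough illustration of the triage problem described above, the following sketch sorts cited titles into matches, possible variants, and candidates for review using plain string similarity. It is not the study's pipeline, which used large language models for initial screening followed by human review. The index contents, function names, and thresholds are all assumptions made for this example.

```python
from difflib import SequenceMatcher

# Hypothetical stand-in for a bibliographic index. A real system would query
# a database of indexed titles; these titles are invented for illustration.
KNOWN_TITLES = [
    "Deep learning for sepsis prediction in intensive care units",
    "Nurse staffing levels and patient outcomes in acute care hospitals",
]


def normalize(title: str) -> str:
    """Lowercase and collapse whitespace so trivial differences do not matter."""
    return " ".join(title.lower().split())


def best_similarity(cited_title: str, index: list[str]) -> float:
    """Highest similarity ratio (0.0 to 1.0) between the cited title and any indexed title."""
    cited = normalize(cited_title)
    return max(
        (SequenceMatcher(None, cited, normalize(known)).ratio() for known in index),
        default=0.0,
    )


def triage(cited_title: str, index: list[str],
           match_at: float = 0.9, review_at: float = 0.6) -> str:
    """Sort a cited title into one of three buckets.

    The thresholds are arbitrary illustrative values, not tuned on real data.
    """
    score = best_similarity(cited_title, index)
    if score >= match_at:
        return "matched"
    if score >= review_at:
        return "possible variant: human review"
    return "no match: candidate fabrication, human review"


# A one-character typo still counts as a match.
print(triage("Deep learning for sepsis predicton in intensive care units", KNOWN_TITLES))
```

Exact titles and small typos score close to 1.0 and are treated as matches, while titles with no close counterpart go to a human reviewer rather than being declared fabricated automatically.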

NBD: Why did your team choose such a large-scale analysis—covering roughly 2.5 million papers and 125 million references? How different were your findings from the research community's previous understanding?

Maxim Topaz: Because citation fabrication is relatively rare at the individual paper level, isolated case studies cannot produce reliable conclusions.

We analyzed 2,471,758 open-access biomedical articles and over 125 million references. Only at this scale could we accurately estimate the overall prevalence of fabricated citations and, more importantly, identify long-term trends.

The gap between perception and reality turned out to be enormous.

Previously, most researchers regarded fabricated citations as isolated incidents caused either by unethical authors or careless mistakes. Our findings suggest otherwise. Fabricated citations now appear across virtually every category of biomedical literature.

Since 2023, the incidence of fabricated citations has increased more than twelvefold. At the time of our analysis, 98.4% of papers containing fabricated references had neither been corrected nor retracted.

In short, both the scale of the problem and the lack of remediation far exceeded what the scientific community had anticipated.

Quarterly incidence of fabricated references per 10,000 papers in PubMed Central, January 2023–February 2026

Source: "Fabricated Citations: A Screening Analysis of 2.5 Million Biomedical Papers"

Review Articles Have Become Ground Zero: The Consequences Could Reach Clinical Practice and Health Policy

NBD: Why did the incidence of fabricated citations begin rising so sharply around mid-2024? Do you think AI, paper mills, or weaknesses in peer review are primarily responsible?

Maxim Topaz: The timing is highly suggestive.

Large language models became widely available in late 2022 and throughout 2023. Biomedical papers typically require 100 to 200 days from submission to publication. Consequently, manuscripts written with AI assistance began appearing in major biomedical databases around mid-2024, precisely when we observed the sharp increase.

That said, our study identifies the phenomenon rather than its causes.

Paper mills, changes in journal indexing policies, and weaknesses in editorial review have all likely contributed. These factors reinforce one another: AI and paper mills make it easy to generate fabricated citations, while insufficient editorial verification allows them to enter the published literature.

So it would be inaccurate to attribute the problem to any single factor.

Objectively speaking, AI has dramatically lowered the barrier to fabricating convincing references, while current peer-review systems were never designed to detect this type of fabrication.

NBD: How do AI-generated fake citations differ from traditional citation errors, and why are they more concerning?

Maxim Topaz: The key difference lies in the nature of the error.

Traditional citation problems were usually accidental, such as incorrect page numbers or misquoted conclusions, and the cited paper itself existed.

Today's AI-generated citations often refer to papers that never existed.

These fabricated references are remarkably convincing. They follow proper citation formats, include the names of real and well-known researchers, match the manuscript's subject matter, and even assign plausible publication dates.

As a result, they can easily pass preliminary checks and often escape detection during conventional peer review.

The broader concern is that references serve as the evidence supporting scientific claims.

The problem has evolved from "incorrect evidence" to "nonexistent evidence." That represents not merely a decline in citation quality but a fundamental breakdown in the scientific evidence chain.

NBD: What was the most shocking case your team encountered?

Maxim Topaz: One particularly striking example involved a paper published in an open-access oncology journal in 2025.

Of the 30 references we checked in that paper, 18 were fabricated.

These fake citations matched the paper's specialized surgical topic remarkably well, listed real experts as authors, and assigned publication dates between 2023 and 2024.

Another alarming pattern emerged within a single journal, where eleven papers published over one year repeatedly listed the same two authors and collectively contained fifteen fabricated citations spanning multiple unrelated cutting-edge research areas.

I'm more concerned about this kind of systematic fabrication than isolated problematic papers.

What's even more troubling is that these papers remain publicly accessible, continue to be cited by subsequent studies, and carry no warning labels, corrections, or expressions of concern.

NBD: Your study found that review articles have a 57% higher rate of fabricated citations than other paper types. Since reviews underpin clinical guidelines, why are they especially vulnerable?

Maxim Topaz: Several factors contribute.

First, review articles contain much longer reference lists, making fake citations easier to hide.

Second, writing reviews requires researchers to synthesize vast amounts of literature, precisely the task for which many authors rely on AI tools. Unfortunately, this is also where fabricated citations are most likely to emerge.

Most importantly, review articles sit at the very top of the evidence hierarchy.

Systematic reviews build upon narrative reviews, while clinical practice guidelines rely heavily on systematic reviews.

Our data show 16.7 fabricated citations per 10,000 papers among review articles, compared with 10.6 per 10,000 among other publication types, a rate that is 57% higher.
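
The 57% figure follows directly from the two rates quoted above, as this small check shows. The function name is an illustrative choice, not something taken from the study.

```python
def percent_higher(rate: float, baseline: float) -> float:
    """Relative difference of `rate` over `baseline`, expressed in percent."""
    return (rate - baseline) / baseline * 100


review_rate = 16.7  # fabricated citations per 10,000 review articles (from the interview)
other_rate = 10.6   # fabricated citations per 10,000 other papers (from the interview)

print(f"{percent_higher(review_rate, other_rate):.1f}% higher")  # prints "57.5% higher"
```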

The danger extends far beyond the numbers themselves. Fabricated citations in reviews propagate throughout the evidence hierarchy and can ultimately influence the evidence that clinicians and policymakers depend upon.

China's biomedical research has become deeply integrated into the global scientific ecosystem.

Source: Illustration by NBD based on publicly available information

Without Timely Intervention, the Scientific Literature Could Become Irreversibly Contaminated

NBD: How might fabricated citations affect clinical decision-making and patient safety? Has the medical community underestimated these risks?

Maxim Topaz: Fabricated citations can undermine the entire evidence chain.

Clinical guidelines are based on systematic reviews, and there is already evidence that some papers produced by paper mills have been incorporated into reviews used for guideline development.

If the studies underpinning those guidelines contain numerous fabricated citations, then the scientific foundation supporting treatment recommendations becomes significantly weaker.

To be clear, our study did not track patient outcomes, so we cannot quantify direct harm to patients, nor would we claim to do so.

However, we believe there is a structural vulnerability within today's scientific evidence system, and the medical community has underestimated that risk.

Previous studies have shown that roughly one-quarter of references in medical papers contain some form of citation error. This alone suggests that reference verification has never been a routine component of peer review.

If journals struggle to detect ordinary citation mistakes, identifying sophisticated AI-generated fabricated citations becomes even more challenging.

NBD: Your paper proposes four recommendations for the research community. Which do you believe is both the most urgent and the hardest to implement?

Maxim Topaz: The most urgent recommendation is for publishers to integrate automated citation verification into the editorial workflow before peer review begins.

The technology already exists.

The primary obstacles are institutional rather than technical. Publishers must invest resources and modify long-established editorial processes, which makes implementation challenging despite its feasibility.

The most difficult task, however, is cleaning up the existing scientific literature.

Screening millions of published papers and issuing corrections would be enormously expensive, and currently no single organization is willing—or empowered—to take responsibility. There is also limited incentive within academia to revisit papers that have already been published.

In short, the immediate priority is preventing new problematic papers from entering the literature through mandatory pre-publication citation verification.

Cleaning up the existing literature will be a much greater challenge.

NBD: As one of the first researchers to systematically expose this emerging crisis, what concerns you most over the next three to five years? What immediate action would you urge the global research community, publishers, and regulators to take?

Maxim Topaz: My greatest concern is the emergence of a self-reinforcing cycle.

Once a paper containing fabricated citations is published, it can be cited by future papers and even used to train the next generation of AI models. This allows fabricated information to spread and amplify itself over time.

If the problem is not addressed quickly, contamination of the scientific literature may outpace our ability to clean it up.

My message to researchers, publishers, and regulators is straightforward:

Make automated citation verification a mandatory step before peer review. The problem is not AI itself. The real danger arises when AI-generated content enters the permanent scientific record without adequate verification. We should not ban AI tools. Instead, we must integrate robust verification into the research workflow. AI is not the threat. Unchecked AI-generated content is the real threat.

*This English version is a translation of the original Chinese interview. In the event of any discrepancy, the Chinese version shall prevail.

Editor: Gao Han