When scientists sequence tumor DNA, they typically find small amounts of genetic code from bacteria, viruses and fungi – microorganisms that, if actually present in tumor tissue, could influence how tumors grow, evade immunity or respond to treatment. But do microorganisms truly reside in tumors, or do the samples become contaminated before sequencing occurs? Independent analyses of the same genomic data have reached wildly different conclusions. Now, researchers at Rutgers Cancer Institute, New Jersey’s only National Cancer Institute-designated Comprehensive Cancer Center, have developed a computational tool that settles the controversy by distinguishing genuine microbial signals from artifacts. Their findings are published in Cancer Cell.

“There are microbes all over the environment, on our skin and in our breath,” said Subhajyoti De, a member of the Genomic Instability and Cancer Genetics Program at Rutgers Cancer Institute and the senior author of the study. “There could be DNA particles floating in the air. How do you know what you’re finding came from the tissue you were interested in, or was something introduced along the way?”

The tool, called PRISM (Precise Identification of Species of the Microbiome), is designed to address those sources of contamination. It first runs a rapid screen to get an initial overview of the microbial sequences in a sample. It then applies more stringent steps that remove lingering human sequences and align the remaining sequences, over their full length, to microbial reference databases. Finally, it uses a machine-learning model trained to predict whether each detected microbe is truly present or a contaminant.