UVA Health researchers are developing a new tool to advance genomics and disease research

UVA Health researchers have developed an important new tool that will help scientists distinguish signal from noise while investigating the genetic causes of cancer and other diseases. In addition to advancing research and potentially accelerating new treatments, the new tool could help improve cancer diagnosis by making it easier for doctors to identify cancer cells.

The new tool, developed by UVA’s Chongzhi Zang, PhD, and his team and collaborators, is a mathematical model that will help ensure the integrity of “big data” about the building blocks of our chromosomes, genetic material called chromatin. Chromatin – a combination of DNA and protein – plays an important role in controlling the activity of our genes. When chromatin goes wrong, it can turn a healthy cell cancerous or contribute to other diseases.

Scientists can now study chromatin in single cells using a cutting-edge technology called ‘single-cell ATAC-seq’, but this generates an enormous amount of data, including lots of noise and bias. Zang’s new tool cuts through that, saving scientists from misleading clues and wasted effort.

In the best of times, large-scale genome research on single cells is like “hunting for a needle in a haystack,” says Zang. But his new tool will make it a lot easier by clearing away a bunch of bad hay.

“With the traditional way of analyzing data, you may see some patterns that look like real signals of a specific chromatin state, but are false due to the bias of the experimental technology itself. Such fake signals can confuse scientists. We have developed a model to better capture and filter out such false signals so that the real needle we are looking for can more easily stick out of the hay.”

Chongzhi Zang, PhD, Computational Biologist at the UVA Center for Public Health Genomics and UVA Health Cancer Center

About the genomics tool

Zang’s new tool adapts a model from number theory and cryptology called “simplex encoding.” He and his colleagues used it to encode DNA sequences into mathematical forms, ultimately transforming the complex genome sequence into a much simpler mathematical form. The resulting shapes can then be compared to detect distortions and noise in the sequence data that are not easy to find using traditional approaches.

“The complexity of DNA sequences increases exponentially as they get longer. They’re difficult to model because a typical data set contains millions of sequences from thousands of cells,” said Shengen Shawn Hu, PhD, a researcher in Zang’s lab and lead author of the paper. “But the simplex encoding model can provide an accurate estimate of sequence distortions because of its beautiful mathematical property.”
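
For readers who want a concrete picture, the idea of a simplex encoding can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the SELMA code: the vertex assigned to each base, the function names, and the choice to simply concatenate one vertex per base are all hypothetical. Each of the four bases is placed at a corner of a regular tetrahedron (a simplex in three dimensions), so a sequence of k bases becomes 3k numbers, even though the number of possible sequences grows as 4 to the power k.

```python
import numpy as np

# Corners of a regular tetrahedron, one per base.
# Which base gets which corner is an assumption made for illustration.
SIMPLEX = {
    "A": (1, -1, -1),
    "C": (-1, 1, -1),
    "G": (-1, -1, 1),
    "T": (1, 1, 1),
}


def encode_kmer(kmer):
    """Return a vector of length 3*k: one tetrahedron corner per base."""
    return np.array(
        [x for base in kmer.upper() for x in SIMPLEX[base]], dtype=float
    )


def squared_distance(kmer_a, kmer_b):
    """Squared Euclidean distance between two encoded k-mers of equal length."""
    diff = encode_kmer(kmer_a) - encode_kmer(kmer_b)
    return float(diff @ diff)


if __name__ == "__main__":
    print(encode_kmer("ACGT"))
    print(squared_distance("ACGT", "ACGA"))
```

Because all four corners are equally far apart, no base is treated as "closer" to another, and the squared distance between two encoded sequences of the same length is 8 times the number of positions at which they differ. How SELMA builds its bias estimate on top of such an encoding is described in the paper and the software repository.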

Tests of the tool showed that it was significantly better at analyzing complex single-cell data to characterize different cell types. This is important for both basic biological research and disease diagnosis, where doctors need to detect tiny numbers of disease cells in much larger samples ranging from tens of thousands to millions of cells.

“The distortions were not easy to find because they were intertwined with real signals and hidden in the large amounts of data. It might not be a big deal if people just picked the strongest signals from a large number of cells,” said Zang, who recently co-led several other single-cell genomics studies investigating coronary artery disease and gut development. “But if you look at single-cell data, there’s no longer any low-hanging fruit. Signals are always weak at the individual cell level, and the effects of noise and distortion can be catastrophic. Bias correction is often ignored but can be crucial in single-cell data analysis.”

To make their new tool widely available, the researchers developed free, open-source software and put it online. The software can be found at https://github.com/zang-lab/SELMA and at https://doi.org/10.5281/zenodo.7048767.

“We hope this tool can benefit the biomedical research community in the study of chromatin biology and genomics, and eventually aid in disease research,” said Zang. “It’s always exciting to see how our colleagues are using the tools we’ve developed to make important scientific discoveries in their own research.”

Results published

The researchers published their results in the journal Nature Communications. (The article is open access, meaning it is free to read.) The team consisted of Shengen Shawn Hu, Lin Liu, Qi Li, Wenjing Ma, Michael J. Guertin, Clifford A. Meyer, Ke Deng, Tingting Zhang, and Chongzhi Zang.

Zang is part of UVA’s Public Health Sciences, Biochemistry and Molecular Genetics, and Biomedical Engineering departments. The Department of Biomedical Engineering is a collaboration of the UVA School of Medicine and School of Engineering.

Work was supported by the National Institutes of Health grants R35GM133712, K22CA204439, and R35GM128635; the National Science Foundation, Grant NSF-796 2048991; the University of Pittsburgh Center for Research Computing; UVA Cancer Center; and the National Cancer Institute of the NIH, Cancer Center Support Grant P30 CA44579.


University of Virginia Health System

Journal reference:

Hu, S. S., et al. (2022) Intrinsic bias estimation for improved analysis of bulk and single-cell chromatin accessibility profiles using SELMA. Nature Communications. doi.org/10.1038/s41467-022-33194-z.
