Systematic Pan-Cancer Analysis of Somatic Allele Frequency
Imbalanced expression of somatic alleles in cancer can suggest functional and selective features, and can therefore indicate the possible driving potential of the underlying genetic variants. To explore the correlation between the allele frequency of somatic variants and the total expression of the genes harboring them, we used a unique data set of matched tumor and normal RNA and DNA sequencing data comprising 5523 distinct single nucleotide variants in 381 individuals across 10 cancer types, obtained from The Cancer Genome Atlas (TCGA). We analyzed the purity-adjusted allele frequency in the context of variant and gene functional features, and linked it with changes in total gene expression. We documented higher allele frequency of somatic variants in cancer-implicated genes (Cancer Gene Census, CGC). Furthermore, somatic alleles bearing premature terminating variants (PTVs), when positioned in CGC genes, appeared to be less frequently degraded via nonsense-mediated mRNA decay, indicating that the tumor transcriptome may favor truncated proteins. Genes with multiple high-allele-frequency PTVs included key cancer genes such as ARID1, TP53 and NSD1. Altogether, our analysis suggests that high allele frequency of tumor somatic variants can indicate driving functionality and can serve to identify potential cancer-implicated genes.
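To make the notion of a purity-adjusted allele frequency concrete, the following is a minimal sketch in Python. The abstract does not state the adjustment formula, so the one used here (dividing the observed variant allele fraction by tumor purity and capping at 1) is an assumption for illustration only, not the authors' method. The function names and the read counts in the usage note are hypothetical.

```python
def allele_frequency(alt_reads: int, ref_reads: int) -> float:
    """Fraction of reads at a position that carry the variant allele."""
    total = alt_reads + ref_reads
    if total == 0:
        raise ValueError("no reads cover this position")
    return alt_reads / total


def purity_adjusted_af(af: float, purity: float) -> float:
    """Rescale an observed allele frequency by tumor purity.

    ASSUMPTION: a simple af / purity correction, capped at 1.0.
    This ignores copy number and is not taken from the source text.
    """
    if not 0.0 < purity <= 1.0:
        raise ValueError("purity must be in the interval (0, 1]")
    return min(af / purity, 1.0)
```

For example, a variant supported by 30 of 100 reads has an observed allele frequency of 0.3. Under the assumed correction, a sample with 60% tumor purity would give an adjusted value of about 0.5.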