Identifying cancer-causing mutations is difficult. The recurrence of a mutation across patients remains one of the most reliable indicators of driver status. However, some mutations are more likely to occur than others for various reasons. Sequencing analyses have revealed that cancer driver genes operate across complex pathways and networks, with mutations often arising in a mutually exclusive pattern. Genes with low-frequency mutations are understudied as cancer-related genes, especially in the context of networks. Here we propose a machine learning method to study the functionality of mutually exclusive genes in networks derived from mutation associations, gene-gene interactions, and graph clustering. These networks indicate critical biological components in essential pathways, especially components mutated at low frequency. Studying the network, rather than the impact of a single gene alone, significantly increases the statistical power of clinical analysis. The proposed method identified important driver genes with different mutation frequencies. We studied the functions of the candidate driver genes and the pathways in which they participate. By including lower-frequency genes, we recognized less-studied cancer-related pathways. We also propose a novel clustering method to specify driver modules. We evaluated each driver module against several criteria, including biological process terms and the number of simultaneous mutations in each cancer. Materials and implementations are available at: https://github.com/MahnazHabibi/MutationAnalysis.

Author summary

It can be challenging to find mutations that cause cancer. One of the most trustworthy characteristics for identifying cancer-causing mutations is the recurrence of a mutation across patients. However, some uncommon, low-frequency mutations should also be explored as cancer-related mutations, particularly in the setting of networks.
In this study, we propose a new approach to discover prospective driver genes and to investigate the functionality of mutually exclusive genes in networks formed from mutation associations and gene-gene interactions. These networks identify critical biological elements in vital pathways, notably in those that are mutated infrequently. In the first step, we defined six informative topological features for each gene, treating each gene as a network node, and computed a score for each gene from these features. We then put forward the high-scoring genes with significant connections to cancer as potential targets for further research. In the second step, we constructed a network from the relationships between the high-scoring genes in order to find cancer-related modules. To build this network, we used the physical, biological, and functional interactions among the high-scoring candidate driver genes identified in the first step.
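The two steps described above can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration: the toy edge list and gene labels are hypothetical, only three generic topological features (degree, local clustering coefficient, mean neighbor degree) stand in for the six features of the study, the score is a plain sum of min-max normalized features, and connected components stand in for the proposed clustering method.

```python
from itertools import combinations

# Hypothetical toy network; labels and edges are illustrative, not real data.
EDGES = [
    ("A", "B"), ("A", "C"), ("B", "C"), ("A", "D"), ("C", "D"),
    ("E", "F"), ("E", "G"), ("F", "G"), ("D", "H"), ("G", "I"),
]

def build_adjacency(edges):
    """Build an undirected adjacency map: gene -> set of neighboring genes."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj

def node_features(adj):
    """Step 1a: compute illustrative topological features for every gene (node)."""
    feats = {}
    for gene, nbrs in adj.items():
        k = len(nbrs)
        # Number of edges that exist among the neighbors of this gene.
        links = sum(1 for a, b in combinations(sorted(nbrs), 2) if b in adj[a])
        possible = k * (k - 1) / 2
        feats[gene] = {
            "degree": float(k),
            "clustering": links / possible if possible else 0.0,
            "mean_neighbor_degree": sum(len(adj[n]) for n in nbrs) / k if k else 0.0,
        }
    return feats

def score_genes(feats):
    """Step 1b: sum of min-max normalized features (an assumed scoring rule)."""
    scores = {gene: 0.0 for gene in feats}
    for name in ("degree", "clustering", "mean_neighbor_degree"):
        values = [f[name] for f in feats.values()]
        lo, hi = min(values), max(values)
        for gene, f in feats.items():
            scores[gene] += (f[name] - lo) / (hi - lo) if hi > lo else 0.0
    return scores

def top_genes(scores, k):
    """Return the k highest-scoring genes; ties are broken alphabetically."""
    return sorted(scores, key=lambda g: (-scores[g], g))[:k]

def find_modules(adj, selected):
    """Step 2: connected components of the subnetwork induced by the selected genes."""
    selected = set(selected)
    seen, modules = set(), []
    for start in sorted(selected):
        if start in seen:
            continue
        seen.add(start)
        stack, component = [start], []
        while stack:
            gene = stack.pop()
            component.append(gene)
            for nbr in adj.get(gene, ()):
                if nbr in selected and nbr not in seen:
                    seen.add(nbr)
                    stack.append(nbr)
        modules.append(sorted(component))
    return modules

adj = build_adjacency(EDGES)
scores = score_genes(node_features(adj))
candidates = top_genes(scores, 5)
modules = find_modules(adj, candidates)
print(candidates, modules)
```

Separating scoring from module detection mirrors the two-step structure of the approach: the second step only ever sees the genes that passed the first, so a different feature set or clustering rule can be swapped in without touching the other step.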