Mitra Basu
$200,000
University of Minnesota-Twin Cities
Minnesota
Computer and Information Science and Engineering (CISE)
Pathogens such as the Severe Acute Respiratory Syndrome Coronavirus 2 (SARS-CoV-2) affect different people differently. Whether an individual mounts a strong response or not depends, at least in part, on their genes. Specific genes code for the proteins on the surface of cells that present viral protein fragments to the immune system. Killer T cells recognize these fragments and kill the infected cells. The immune response to SARS-CoV-2 hinges on whether the viral protein fragments bind into a groove in these cell-surface proteins -- like a key into a lock. The molecular biology is well understood. Whether a protein fragment binds or not is a question of 3D structure and simple atomic force calculations. The full set of proteins associated with SARS-CoV-2 was published in March, so the requisite data is available. This project will predict, through purely computational means, whether such binding happens for all viral protein fragments, for all common variants of the cell-surface proteins -- so for all keys into all types of locks. This computational ability will be transformative for a scientific understanding of the pandemic. If successful, the same computational infrastructure could be deployed in the future for other pandemics -- those caused by viruses or by bacteria. It could also be transformative in characterizing the human immune system, in general, and its response to pathogens.
In technical terms, the goal of the project is to predict, through computational means, which peptides derived from SARS-CoV-2 will bind to each allelic variant of the MHC-I molecule commonly found in the U.S. population. Human leukocyte antigen (HLA) typing can be performed to establish the allelic variants of MHC-I molecules of individuals. With population-wide typing, the tools developed by this project will predict which individuals in a population are most likely to mount a strong antiviral immune response to the virus, given their MHC-I alleles.
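To make the "all keys into all types of locks" search space concrete, the following minimal Python sketch enumerates candidate peptide-allele pairs. It is an illustration only, not the project's software: the function names, the toy protein sequence, and the 8-to-11-residue length window (a range commonly used for MHC-I peptides) are assumptions, and no binding prediction is performed.

```python
from itertools import product


def enumerate_peptides(protein, min_len=8, max_len=11):
    """Return every distinct contiguous fragment of length min_len..max_len.

    The 8-11 residue window is an assumed default, not a project parameter.
    """
    seen = set()
    peptides = []
    for length in range(min_len, max_len + 1):
        # An empty range results when the protein is shorter than `length`.
        for start in range(len(protein) - length + 1):
            fragment = protein[start:start + length]
            if fragment not in seen:
                seen.add(fragment)
                peptides.append(fragment)
    return peptides


def candidate_pairs(proteins, alleles, min_len=8, max_len=11):
    """Pair every peptide from every protein with every MHC-I allele.

    `proteins` maps a protein name to its amino-acid sequence.
    Returns a list of (protein_name, peptide, allele) tuples.
    """
    pairs = []
    for name, sequence in proteins.items():
        peptides = enumerate_peptides(sequence, min_len, max_len)
        for peptide, allele in product(peptides, alleles):
            pairs.append((name, peptide, allele))
    return pairs


# Toy input: a made-up 12-residue sequence and two example allele labels.
toy_proteins = {"toy_protein": "ACDEFGHIKLMN"}
toy_alleles = ["HLA-A*02:01", "HLA-B*07:02"]
pairs = candidate_pairs(toy_proteins, toy_alleles)
```

Each resulting tuple is one binding question to be answered by a predictor, whether machine-learned or simulation-based. The number of such questions grows with the product of the peptide count and the allele count.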
Immunopeptidome profiling of all common allelic variants of MHC-I molecules will be performed, first using machine-learning algorithms. Next, immunopeptidome profiling will be performed using custom-developed atomic-level simulation software, deployed on graphics processing units. The project will provide a public implementation of the tool set. The results of the research will be promptly disseminated on a website hosted by the University of Minnesota. The front-end will exploit modern software infrastructure for data analytics and visualization. The back-end will consist of a MySQL database, directly linked to the computational engine, running on a distributed platform. This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.