Hector Munoz-Avila
$373,667
Li Xiong
Emory University
Georgia
Computer and Information Science and Engineering (CISE)
Artificial intelligence (AI) provides powerful techniques for understanding and predicting complex systems, such as modeling the spread of infectious diseases. Despite this, these predictive capabilities are rarely adopted by public health decision-makers to support policy interventions. One obstacle to adoption is that AI methods are known to amplify bias in the data they are trained on. This is especially problematic for infectious disease models, which leverage large and inherently biased spatiotemporal data. These biases may propagate through the modeling pipeline to decision-making, resulting in inequitable and ineffective policy interventions. This project investigates how the AI disease modeling pipeline can carry bias from data into predictions, and derives solutions that mitigate this bias, through three aims: 1) creating an AI system to predict the spread of emerging infectious diseases in space and time; 2) simulating a population from which data often used as input for AI systems can be collected with controlled bias; and 3) exploring links between bias in the collected data and bias in the resulting AI model, and deriving solutions for its mitigation. The project will enable AI-driven infectious disease models and predictions that support fair and equitable decision-making and interventions. It will enrich education and training related to ethical AI practices and will support professional development opportunities for early-career researchers and for graduate, undergraduate, and high school students in the United States and Australia. In Aim 1, the team of researchers will use a self-supervised contrastive learning approach that uses mobility prediction as a pretext task to learn representations of spatial regions. These representations can then be used to predict infectious disease spread given only very limited infectious disease ground-truth data.
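The contrastive idea behind Aim 1 can be illustrated with a minimal sketch. The InfoNCE-style loss below is a standard contrastive objective, not the project's actual model; the region count, embedding size, and the choice of positive pair (a region with heavy mobility flow to the anchor) are illustrative assumptions.

```python
import numpy as np

def info_nce_loss(embeddings, anchor, positive, temperature=0.1):
    """Contrastive (InfoNCE-style) loss over region embeddings.

    The anchor region is pulled toward its positive (here: a region it
    exchanges many trips with) and pushed away from all other regions.
    """
    # Normalize so dot products are cosine similarities.
    z = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    sims = z @ z[anchor] / temperature
    mask = np.arange(len(z)) != anchor          # exclude self-similarity
    logsumexp = np.log(np.exp(sims[mask]).sum())
    return logsumexp - sims[positive]

# Three toy regions embedded in 2-d; mobility data says regions 0 and 1
# form a positive pair (heavy commuter flow between them).
far = np.array([[1.0, 0.0], [0.0, 1.0], [0.6, 0.8]])   # 0 and 1 dissimilar
near = np.array([[1.0, 0.0], [1.0, 0.0], [0.6, 0.8]])  # 0 and 1 aligned
```

Minimizing this loss moves the representation from `far` toward `near`, so regions linked by mobility end up close in embedding space even before any disease ground truth is seen.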
The investigators hypothesize that such a model is susceptible to data bias. Thus, in Aim 2, the team of researchers will leverage a large-scale agent-based simulation that serves as a sandbox world of which they have perfect knowledge, from which data can be collected, and into which various types of bias can be injected. For Aim 3, the team of researchers will investigate how different types of simulated data bias lead to biased AI predictions by leveraging different metrics of fairness in AI and studying how these fairness measures can be incorporated into the AI optimization procedure to mitigate bias. By understanding, measuring, and mitigating bias inherent to traditional AI solutions, the project will enable accurate, scalable, and rapid predictions to support fair and equitable decision-making for pandemic prevention. This is a joint project between researchers in the United States and Australia funded through the Collaboration Opportunities in Responsible and Equitable AI program of the U.S. NSF and the Australian Commonwealth Scientific and Industrial Research Organization (CSIRO). This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
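The Aim 2 sandbox idea can be sketched as follows: because the simulated population's ground truth is fully known, injecting a controlled observation bias makes the resulting estimation error directly measurable. All quantities below (population size, group infection rates, testing rates) are illustrative assumptions, not parameters from the project.

```python
import numpy as np

rng = np.random.default_rng(1)

# Sandbox population of 10,000 agents in two demographic groups.
n = 10_000
group = rng.integers(0, 2, size=n)
true_rate = np.where(group == 0, 0.05, 0.15)   # known ground-truth risk
infected = rng.random(n) < true_rate

# Inject observation bias: group 1 is tested far less often, mimicking
# unevenly collected surveillance or mobility data.
p_observed = np.where(group == 0, 0.8, 0.2)
observed = rng.random(n) < p_observed

true_prevalence = infected.mean()              # exact in the sandbox
naive_estimate = infected[observed].mean()     # what a model would see
```

Because group 1 carries more infections but is undersampled, the naive estimate understates true prevalence; only a sandbox with perfect knowledge makes this gap quantifiable.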
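One common way to fold a fairness measure into the optimization procedure, as Aim 3 describes, is to add a group-fairness penalty to the training loss. The error-gap surrogate below is a generic example, not necessarily one of the project's chosen metrics; `lam` and the toy data are assumptions for illustration.

```python
import numpy as np

def fairness_regularized_loss(y_true, y_pred, group, lam=1.0):
    """Prediction loss (MSE) plus a penalty on the gap in mean squared
    error between two population groups; `lam` trades raw accuracy
    against group fairness during optimization."""
    err = (y_true - y_pred) ** 2
    gap = abs(err[group == 0].mean() - err[group == 1].mean())
    return err.mean() + lam * gap

# Two predictors with identical overall MSE (= 1.0): one spreads its
# error evenly across groups, the other concentrates it on group 1.
y = np.zeros(4)
g = np.array([0, 0, 1, 1])
even = np.array([1.0, -1.0, 1.0, -1.0])        # per-group MSE: 1.0 / 1.0
skewed = np.array([0.0, 0.0, 2.0, 0.0])        # per-group MSE: 0.0 / 2.0
```

Although both predictors tie on raw MSE, the regularized loss penalizes the one whose errors fall disproportionately on a single group, steering the optimizer toward equitable predictions.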