Amarda Shehu
$599,173
University of Washington
Washington
Computer and Information Science and Engineering (CISE)
CAREER: Realizing the Potential of Behavioral Data Science for Population Health
Detailed behavioral data from phones, watches, fitness trackers and health apps offer an unparalleled opportunity to quantify and act upon previously unmeasurable behavioral changes in order to improve mental health and accelerate responses to emerging diseases. This is possible because these conditions often manifest themselves through behavioral and physiological changes (e.g., reduced activity, increased heart rate, depressed mood). Currently, such conditions exact a massive toll: mental health conditions represent 19% of all years of life lost to disability and premature mortality, and viral infections, such as COVID-19 and influenza, rose to the third leading cause of death in the US in 2020. Despite the significant potential of increasingly available data, broad and tangible impacts have yet to be realized, in part due to the unique challenges of integrating and modeling a broad range of behavioral and health data. This project seeks to address these challenges by developing and sharing computational tools that will enable researchers, clinicians and practitioners to improve mental health treatment and more rapidly respond to emerging diseases. The project will also provide an integrated research and educational program by: (1) increasing high-school students' exposure to computer science with a focus on health and well-being applications; (2) creating and disseminating materials for high-school teachers to use in their classrooms; (3) preparing undergraduate students to address health and well-being challenges through an interdisciplinary data science class; and (4) broadly disseminating the results of this work through public open-source software and workshops for researchers and practitioners.
The goal of this project is to develop a unified representation learning framework that addresses the unique challenges of modeling fine-grained behavioral data.
Specifically, the learned compressed representations must: (1) be highly predictive in spite of the challenges of integrating heterogeneous data sources from sensors, devices, app use, and demographic and health information (e.g., highly seasonal time series, discrete events, and static features); (2) effectively generalize to new users, populations, and outcomes outside the training data and source domain; (3) be robust to commonly missing data; and (4) protect private identifying information when considering how to share data or models with others. Additionally, due to recruiting and participation costs, most behavioral health applications represent small data problems, making it particularly challenging to learn effective predictive models from individual datasets alone. To address these challenges, this project will develop and integrate new methods for representation learning, self-supervision, transfer learning, robustness to missing data, and the protection of identifying information. The research team will demonstrate and evaluate the performance of the representation learning framework across a diverse set of health applications, including behavioral monitoring of influenza and COVID-19 symptoms and personalizing sleep and mental health interventions. With these advancements, the project seeks to enable rapid model customization, significantly reduce the expertise and effort required to build new behavioral health research and applications, and help scientists and health professionals answer fundamental research questions about the impact of behavioral health conditions and the design of personalized interventions.
This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.