EAGER: Building a Provable Differentially Private Real-time Data-blind ML Algorithm: A case study on Enhancing STEM Student Engagement in Online Learning

Program Manager:

Raj Acharya

Active Dates:

June 15, 2023 -- May 31, 2024

Awarded Amount:

$150,000

Investigator(s):

Ashis Kumer Biswas

Awardee Organization:

University of Colorado at Denver
Colorado

Directorate:

Computer and Information Science and Engineering (CISE)

Abstract:

The COVID-19 pandemic may be over, but remote and hybrid course delivery formats are still in use, and universities appreciate their potential to attract more diverse groups of students than purely on-campus classes. This flexible education format, delivered on platforms like Zoom, is here to stay. To deliver better learning experiences, educators need to gauge students' engagement in courses, but assessing engagement online while lecturing is challenging. Machine learning technology can help educators during lectures by estimating classroom engagement dynamics so that proper interventions can be made in real time. However, data-driven machine learning (ML) technology puts its users at risk of privacy loss. This holds even for distributed machine learning, in which programs hosted on individual students' personal workstations learn patterns of their users and report those patterns back to a global learner that merges the findings into a global ML model. Although no private data leaves the local workstations, the individual patterns distributed across the network can leak private data. This project will build innovative privacy-aware student-engagement detection technology. The main novelty of this project will be its capacity to learn in real time from various types of student engagement data without directly accessing the data. In platformized online education, the project will add privacy guarantees for users, allowing underrepresented STEM students to safely interact with educators and peers to facilitate the community of inquiry model of learning. The project aims to design a distributed machine learning paradigm that introduces three hierarchical categories of learner nodes. These nodes will be supported by a novel gradient-sharing algorithm, agnostic to neural network architecture, that will make any coordinated attempt to reconstruct original data from the partial gradients shared between nodes provably intractable.
The hierarchical organization of the framework makes it effective at providing a level of obfuscation in partial gradients coming from a partially observable model architecture. The research methodology will be motivated by concepts of differential privacy in gradient-sharing algorithms. The project will introduce new concepts regarding how to select the gradient components to distribute and how to optimize learnable parameters without incurring any additional computational overhead in building a global model, compared to state-of-the-art gradient-based defense algorithms. The project will be driven by two research thrusts: (1) design of a provably privacy-aware distributed machine learning framework, and (2) application of the novel framework to estimating student engagement in platformized online STEM education at the University of Colorado Denver. The research effort will solve an open problem in distributed machine learning from a black-box perspective, where both full gradients and model architecture are unknown. It therefore has the potential to be adopted in other areas where privacy-aware ML is a requirement. The project outcomes will provide immediate benefits to (1) undergraduate STEM students, by improving student retention and overall learning experiences, and (2) online STEM instructors, who will be able to gauge student engagement in real time within an equitable, privacy-aware, and inclusive learning environment. This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
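
The abstract does not specify the project's gradient-sharing algorithm, so the following is only an illustrative sketch of the general idea it builds on: a local node shares a partial, noise-perturbed gradient, and a global learner merges what it receives. It uses the standard clip-and-add-Gaussian-noise recipe from differentially private training together with a data-independent random choice of which components to share. All function names and parameters (clip_gradient, privatize, aggregate, max_norm, noise_multiplier, k) are hypothetical, and the sketch makes no claim about the privacy budget actually achieved or about the three-level node hierarchy described above.

```python
import math
import random


def clip_gradient(grad, max_norm):
    """Rescale grad so that its L2 norm does not exceed max_norm."""
    norm = math.sqrt(sum(g * g for g in grad))
    if norm <= max_norm:
        return list(grad)
    scale = max_norm / norm
    return [g * scale for g in grad]


def privatize(grad, max_norm, noise_multiplier, k, rng):
    """Return a partial, noisy view of a local gradient.

    Assumed recipe (not taken from the abstract): clip the gradient,
    pick k component indices uniformly at random (independently of the
    data), add Gaussian noise to those components, and zero the rest.
    """
    clipped = clip_gradient(grad, max_norm)
    sigma = noise_multiplier * max_norm
    shared = set(rng.sample(range(len(grad)), k))
    return [
        clipped[i] + rng.gauss(0.0, sigma) if i in shared else 0.0
        for i in range(len(grad))
    ]


def aggregate(updates):
    """Merge updates from several nodes by component-wise averaging."""
    n = len(updates)
    return [sum(column) / n for column in zip(*updates)]


if __name__ == "__main__":
    rng = random.Random(42)
    # Hypothetical local gradients from three student workstations.
    local_grads = [
        [0.5, -1.2, 3.0, 0.1],
        [0.4, -0.9, 2.5, 0.0],
        [0.6, -1.5, 2.8, 0.3],
    ]
    shared = [
        privatize(g, max_norm=1.0, noise_multiplier=0.5, k=2, rng=rng)
        for g in local_grads
    ]
    print(aggregate(shared))
```

Choosing the shared indices at random, rather than by gradient magnitude, keeps the selection itself from depending on private data. A magnitude-based selection would need its own privacy analysis.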