James Fowler
$300,000
University of Pittsburgh
Pennsylvania
Computer and Information Science and Engineering (CISE)
Detecting abrupt changes in the underlying statistical characteristics of online data streams is an important problem encountered in many science and engineering applications. Examples include anomaly detection in video streams, line-outage detection in power grids, detection of the onset of a pandemic, and detection of cyber-attacks. While traditional techniques assume that the probability distributions both before and after the change are known or can be estimated, this assumption is unrealistic in most scenarios of practical interest. As an alternative to such traditional change-detection approaches, this project considers the use of deep neural networks to perform change detection. However, rather than attempting to learn probability distributions directly, the project leverages the recently demonstrated ability of deep neural networks to learn the "score" (i.e., the gradient of the logarithm of the probability density) from the data and aims to develop score-based algorithms for change detection. These scores can be learned for a large class of high-dimensional data models using modern tools of artificial intelligence, rendering the developed algorithms applicable to a broad class of change-detection problems. Fundamental mathematical theory will be developed in the project to establish the efficacy and efficiency of the proposed methods, and the developed algorithms will be validated on several publicly available machine-learning and anomaly-detection datasets.
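To make the notion of a score concrete, the following minimal sketch checks numerically that the gradient of the log-density is unaffected by the normalizing constant, which is why a score can be obtained even when the density is known only up to normalization. The one-dimensional Gaussian model and all function names here are illustrative assumptions, not part of the project description.

```python
import math


def log_density_unnormalized(x, mu=0.0, sigma=1.0):
    """Gaussian log-density with the normalizing constant dropped (assumed toy model)."""
    return -((x - mu) ** 2) / (2.0 * sigma ** 2)


def log_density_normalized(x, mu=0.0, sigma=1.0):
    """Same model with the constant -log(sigma * sqrt(2*pi)) included."""
    return log_density_unnormalized(x, mu, sigma) - math.log(sigma * math.sqrt(2.0 * math.pi))


def numerical_score(log_density, x, h=1e-5):
    """Central finite-difference approximation of d/dx log p(x)."""
    return (log_density(x + h) - log_density(x - h)) / (2.0 * h)


def gaussian_score(x, mu=0.0, sigma=1.0):
    """Closed-form score of the Gaussian: -(x - mu) / sigma^2."""
    return -(x - mu) / sigma ** 2


if __name__ == "__main__":
    x = 1.5
    # Both numerical scores should agree with the closed form,
    # because the additive constant vanishes under differentiation.
    print(numerical_score(log_density_unnormalized, x))
    print(numerical_score(log_density_normalized, x))
    print(gaussian_score(x))
```

In higher dimensions the same property holds for the full gradient vector; a neural network trained by score matching would take the place of the closed-form expression used here.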
Broader-impact aspects of the project include providing algorithms to the wider community for solving change- and anomaly-detection problems across multiple, disparate fields, as well as activities centered on integrating research into graduate coursework and providing opportunities for underrepresented students to participate in the project.

The algorithms developed in the project will be based on the score of the data. This score can be derived explicitly for known unnormalized models or learned via score matching with an artificial neural network. The developed algorithms will be optimized to detect changes with the minimum possible delay while avoiding false alarms. The project is divided into four technical thrusts. The first thrust will develop the fundamental theory of score-based quickest change detection for independent and identically distributed single-stream data under Bayesian, generalized Bayesian, and minimax problem formulations. While the performance of classical change-detection methods depends on the Kullback-Leibler divergence between the distributions before and after the change, it will be established that the performance of the score-based methods depends on the Fisher divergence between those distributions. The second thrust will develop robust methods for detecting changes under modeling uncertainty, using the Fisher divergence between the elements of the uncertainty classes. The third thrust will define the notion of scores for dependent data sequences and obtain optimal algorithms for detecting changes, with the scores in this case being based on the gradient of the logarithm of the conditional densities. The fourth and final thrust will develop algorithms for distributed change detection in which multiple agents may have only partial knowledge of the distributions and may communicate only with their neighbors in a geographical area.
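As one illustration of the first thrust, the sketch below runs a CUSUM-style recursion whose increments are differences of Hyvärinen scores, a construction that appears in the score-based change-detection literature. The abstract itself does not specify the detection statistic, so this choice is an assumption. The function names (`hyvarinen_score_gaussian`, `score_cusum`), the `scale` multiplier, and the threshold values are hypothetical, and closed-form Gaussian scores stand in for scores that would in practice be learned by a neural network via score matching.

```python
import random


def hyvarinen_score_gaussian(x, mu, sigma):
    """Hyvarinen score 0.5 * (d/dx log p)^2 + d^2/dx^2 log p for a 1-D Gaussian.

    Only derivatives of log p appear, so no normalizing constant is needed.
    """
    grad = -(x - mu) / sigma ** 2   # score: first derivative of log p
    laplacian = -1.0 / sigma ** 2   # second derivative of log p
    return 0.5 * grad ** 2 + laplacian


def score_cusum(stream, pre, post, threshold, scale=1.0):
    """Return the 1-based index of the first alarm, or None if no alarm is raised.

    pre and post are (mu, sigma) pairs for the assumed pre- and post-change models.
    """
    stat = 0.0
    for n, x in enumerate(stream, start=1):
        # Positive on average after the change, negative on average before it.
        increment = scale * (
            hyvarinen_score_gaussian(x, *pre) - hyvarinen_score_gaussian(x, *post)
        )
        stat = max(stat + increment, 0.0)  # reflect at zero, as in CUSUM
        if stat >= threshold:
            return n
    return None


if __name__ == "__main__":
    rng = random.Random(0)
    # Hypothetical stream: mean shifts from 0 to 1 after sample 200.
    data = [rng.gauss(0.0, 1.0) for _ in range(200)]
    data += [rng.gauss(1.0, 1.0) for _ in range(200)]
    print(score_cusum(data, pre=(0.0, 1.0), post=(1.0, 1.0), threshold=8.0))
```

For this unit-variance mean shift from 0 to 1, the increment reduces to x - 0.5, so its mean is -0.5 before the change and +0.5 after it. Under the convention that includes a factor of one half, 0.5 is also the Fisher divergence between the two Gaussians, which is consistent with the role the text assigns to that quantity. Raising the threshold lowers the false-alarm rate at the cost of a longer detection delay.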
This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.