This presentation will describe how data analytics and real-time monitoring can be used to ensure that boards and systems operate as intended. In the first part of the talk, the speaker will focus on the resilience problem for complex boards. There is a significant gap today between working silicon and a working board, reflected in board-level failures that cannot be reproduced at the component level. The speaker will describe how machine learning, statistical techniques, and information-theoretic analysis can be used to close the gap between working silicon and a working system. Next, the speaker will describe how time-series analysis can be used to detect anomalies in complex core router systems. The effectiveness of proactive fault tolerance depends on whether anomalies can be accurately detected before a failure occurs. However, traditional anomaly detection techniques fail to detect “outliers” when the monitored data involves temporal measurements and its constituent features exhibit significantly different statistical characteristics. The speaker will describe a feature-categorization-based hybrid method that overcomes the difficulty of detecting anomalies in features with different statistical characteristics, along with a correlation analyzer that removes irrelevant and redundant features. A comprehensive set of experimental results will be presented for data collected during 30 days of field operation from over 20 core routers deployed by customers of a major telecom company.
Krishnendu Chakrabarty received the B.Tech. degree from the Indian Institute of Technology, Kharagpur, in 1990, and the M.S.E. and Ph.D. degrees from the University of Michigan, Ann Arbor, in 1992 and 1995, respectively. He is now the William H. Younger Distinguished Professor of Engineering in the Department of Electrical and Computer Engineering and Professor of Computer Science at Duke University. He also serves as Director of Graduate Studies for Electrical and Computer Engineering. He will assume duties as Chair of the Department of Electrical and Computer Engineering on September 1, 2017. Prof. Chakrabarty is a recipient of the National Science Foundation CAREER award, the Office of Naval Research Young Investigator award, the Humboldt Research Award from the Alexander von Humboldt Foundation, Germany, the IEEE Transactions on CAD Donald O. Pederson Best Paper Award (2015), the ACM Transactions on Design Automation of Electronic Systems Best Paper Award (2016), and over a dozen best paper awards at major conferences. He is also a recipient of the IEEE Computer Society Technical Achievement Award (2015), the IEEE Circuits and Systems Society Charles A. Desoer Technical Achievement Award (2017), and the Distinguished Alumnus Award from the Indian Institute of Technology, Kharagpur (2014). He is a Research Ambassador of the University of Bremen (Germany) and a Hans Fischer Senior Fellow (named after Nobel Laureate Prof. Hans Fischer) at the Institute for Advanced Study, Technical University of Munich, Germany. Prof. Chakrabarty’s current research projects include: testing and design-for-testability of integrated circuits and systems; digital microfluidics, biochips, and cyberphysical systems; data analytics for fault diagnosis, failure prediction, anomaly detection, and hardware security; and smart manufacturing. He is a Fellow of ACM, a Fellow of IEEE, and a Golden Core Member of the IEEE Computer Society.
He was a 2009 Invitational Fellow of the Japan Society for the Promotion of Science (JSPS). He is a recipient of the 2008 Duke University Graduate School Dean’s Award for excellence in mentoring, and the 2010 Capers and Marion McDonald Award for Excellence in Mentoring and Advising, Pratt School of Engineering, Duke University. He has served as a Distinguished Visitor of the IEEE Computer Society (2005-2007, 2010-2012), a Distinguished Lecturer of the IEEE Circuits and Systems Society (2006-2007, 2012-2013), and an ACM Distinguished Speaker (2008-2016). Prof. Chakrabarty served as the Editor-in-Chief of IEEE Design & Test of Computers during 2010-2012 and of the ACM Journal on Emerging Technologies in Computing Systems during 2010-2015. He currently serves as the Editor-in-Chief of IEEE Transactions on VLSI Systems. He is also an Associate Editor of IEEE Transactions on Biomedical Circuits and Systems, IEEE Transactions on Multiscale Computing Systems, and ACM Transactions on Design Automation of Electronic Systems.