FChain: toward black-box online fault localization for cloud systems
Presentation of the paper "FChain: toward black-box online fault localization for cloud systems" by Hiep Nguyen, Zhiming Shen, Yongmin Tan and Xiaohui Gu, presented at ICDCS 2013
fluctuations that are distinct from the normal fluctuation patterns.
The abnormal system metric changes often start at the faulty components and then propagate to other, non-faulty components via inter-component interactions.
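The propagation idea above can be illustrated with a minimal sketch. It assumes that the onset time of each component's abnormal change has already been detected by some other means; the function names, the `tolerance` parameter and the sample data are hypothetical and are not taken from the FChain implementation. The sketch only orders components by onset time and flags the earliest ones as likely fault sources.

```python
from typing import Dict, List, Optional


def rank_by_onset(onsets: Dict[str, Optional[float]]) -> List[str]:
    """Return the components that show an abnormal change, earliest onset first.

    `onsets` maps a component name to the time (in seconds) at which its
    abnormal metric change started, or None if no abnormal change was seen.
    """
    abnormal = {name: t for name, t in onsets.items() if t is not None}
    # Ties are broken by name so the ordering is deterministic.
    return sorted(abnormal, key=lambda name: (abnormal[name], name))


def suspected_sources(onsets: Dict[str, Optional[float]],
                      tolerance: float = 2.0) -> List[str]:
    """Flag components whose onset lies within `tolerance` seconds of the earliest.

    Assumption for illustration: later onsets are treated as propagated
    effects rather than independent faults.
    """
    ranked = rank_by_onset(onsets)
    if not ranked:
        return []
    first = onsets[ranked[0]]
    return [name for name in ranked if onsets[name] - first <= tolerance]


# Hypothetical onset times for a three-tier application plus a healthy cache.
example = {"web": 112.0, "app": 105.0, "db": 98.0, "cache": None}
print(rank_by_onset(example))      # ['db', 'app', 'web']
print(suspected_sources(example))  # ['db']
```

Here the database shows the earliest abnormal change, so it is flagged as the likely source, while the later changes in the application and web tiers are treated as propagation. Onset order alone cannot separate a propagated change from an independent second fault, so this is only a first approximation.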
recall
Slave modules demand < 1% CPU and ~3 MB RAM per host
High parallelism, high scalability
Still room for improvement (e.g. an adaptive look-back window)