Online traffic-aware fault detection for networks-on-chip
Liu, J; Harkin, J; Li, Y; Maguire, L
Authors
J Liu
J Harkin
Y Li
L Maguire
Abstract
A key requirement for modern Networks-on-Chip (NoC) is the ability to detect and diagnose faults and failures. This paper addresses the challenge of fault diagnosis using online testing, where interruption to the runtime operation (performance) of the system under diagnosis is minimised. A novel Monitor Module (MM) is proposed to detect NoC interconnect faults while minimising intrusion on regular NoC traffic throughput by (1) using a channel tester which only examines NoC channels when they are idle; and (2) using a testing interval parameter based on the Binary Exponential Backoff algorithm to dynamically balance the level of testing when recovering from temporary faults. The paper presents results showing the minimal impact on NoC throughput for a range of testing conditions and also highlights the minimal area overhead of the MM (11.56%) compared with an adaptive NoC router implemented on FPGA hardware. Simulation results demonstrate non-intrusion on NoC runtime traffic throughput when channels are fault free, and also show how throughput loss is minimised when faults are identified.
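The two mechanisms named in the abstract, idle-only testing and a backoff-based testing interval, can be illustrated with a small software model. This is only a sketch under stated assumptions, not the paper's hardware design: the class `ChannelTester`, its fields, the cycle-based timing, the cap of 64 cycles, and the choice to double the interval while a fault persists and reset it once the channel tests clean are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class ChannelTester:
    """Hypothetical model of an idle-only channel tester with backoff."""

    base_interval: int = 1   # assumed minimum wait (cycles) after a test
    max_interval: int = 64   # assumed cap on the backoff
    interval: int = 1        # current wait applied after each test
    countdown: int = 0       # cycles left before another test is allowed
    tests_run: int = 0       # number of tests performed so far

    def step(self, channel_idle: bool, fault_present: bool) -> bool:
        """Advance one cycle; return True if a test ran in this cycle."""
        if self.countdown > 0:
            self.countdown -= 1
            return False
        if not channel_idle:
            # Never test a busy channel, so regular traffic is not disturbed.
            return False
        self.tests_run += 1
        if fault_present:
            # Assumed policy: fault still present, so double the wait.
            self.interval = min(self.interval * 2, self.max_interval)
        else:
            # Assumed policy: channel tested clean, so reset the wait.
            self.interval = self.base_interval
        self.countdown = self.interval
        return True


tester = ChannelTester()
test_cycles = [c for c in range(18) if tester.step(True, True)]
print(test_cycles)  # [0, 3, 8, 17]
```

With a persistent fault on an always-idle channel, the waits between tests in this model grow as 2, 4, 8 cycles, so progressively less channel time is spent retesting a link that has not yet recovered.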
Citation
Liu, J., Harkin, J., Li, Y., & Maguire, L. (2014). Online traffic-aware fault detection for networks-on-chip. Journal of Parallel and Distributed Computing, 74(1), 1984-1993. https://doi.org/10.1016/j.jpdc.2013.09.001
Journal Article Type | Article |
---|---|
Acceptance Date | Sep 5, 2013 |
Online Publication Date | Sep 16, 2013 |
Publication Date | Sep 1, 2014 |
Deposit Date | Jul 31, 2015 |
Journal | Journal of Parallel and Distributed Computing |
Publisher | Elsevier |
Peer Reviewed | Peer Reviewed |
Volume | 74 |
Issue | 1 |
Pages | 1984-1993 |
DOI | https://doi.org/10.1016/j.jpdc.2013.09.001 |
Publisher URL | http://dx.doi.org/10.1016/j.jpdc.2013.09.001 |