Electronic International Standard Serial Number (EISSN)
Reed–Solomon erasure codes (RS-ECs) are widely used in packet communication and storage systems to recover erasures. When the RS-EC decoder is implemented on a field-programmable gate array (FPGA) in a space platform, it will suffer single-event upsets (SEUs) that can cause failures. In this article, the reliability of an RS-EC decoder implemented on an FPGA when there are errors in the user memory is first studied. Then, a fault detection and location scheme is proposed based on partial reencoding for the faults in the user memory of the RS-EC decoder. Furthermore, check bits are added in the generator matrix to improve the fault location performance. The theoretical analysis shows that the scheme could detect most faults with small missing and false detection probability. Experimental results on a case study show that more than 90% of the faults on user memory could be tolerated by the decoder, and all the other faults can be detected by the fault detection scheme when the number of erasures is smaller than the correction capability of the code. Although false alarms exist (with probability smaller than 4%), they can be used to avoid fault accumulation. Finally, the fault location scheme could accurately locate all the faults. The theoretical estimates are very close to the experiment results, which verifies the correctness of the analysis done.