This paper proposes an optimized decoder structure for the layered decoding algorithm of low-density parity-check (LDPC) codes, implemented on a graphics processing unit (GPU) based on Nvidia's Fermi architecture. Exploiting the parallelism of the GPU architecture, the decoder applies dual parallelism, across frames and within each layer, to make full use of the streaming multiprocessor hardware resources, which effectively alleviates the limited parallelism of the layered decoding algorithm. In addition, the parity-check matrix is stored in compressed form in on-chip constant memory, and accesses to the decoding iteration messages in off-chip global memory are coalesced; together these optimizations effectively reduce memory access latency and improve decoding throughput. Test results show that multi-frame parallel processing and memory access optimization improve the throughput of the GPU-based LDPC decoder by a factor of 14.9 to 34.8.
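The ideas summarized in the abstract can be illustrated with a small CPU-side sketch. It is not the paper's implementation: the abstract does not give the check-node update rule, the code structure, or the memory layout, so the sketch assumes a min-sum update, a toy quasi-cyclic code (the hypothetical `BASE` shift table and circulant size `Z`), and a frame-interleaved message layout. The shift table stands in for the compressed parity-check matrix, and the two inner loops (rows within a layer, frames) are the independent work items that a GPU version would map onto threads.

```python
# Toy quasi-cyclic LDPC code (hypothetical, for illustration only).
# Only one circulant shift per block is stored; -1 would mark an all-zero
# block. A table this small is the kind of data that fits in constant memory.
Z = 4  # circulant size (assumed)
BASE = [
    [0, 1, 2, 3],  # layer 0
    [0, 2, 1, 3],  # layer 1
]


def decode_layered_min_sum(channel_llr, num_frames, max_iters=5):
    """Layered min-sum decoding of several frames at once.

    channel_llr is a flat list in frame-interleaved order: the LLR of
    variable node v in frame f sits at index v * num_frames + f, so that
    consecutive frames are adjacent in memory (the access pattern that
    coalesces when one GPU thread handles one frame).
    Returns hard decisions in the same layout (0 for LLR >= 0, else 1).
    """
    n_layers, n_blocks = len(BASE), len(BASE[0])
    llr = list(channel_llr)
    # One check-to-variable message per edge and frame, initially zero.
    r_msg = [0.0] * (n_layers * n_blocks * Z * num_frames)

    for _ in range(max_iters):
        for layer in range(n_layers):        # layers run sequentially
            for row in range(Z):             # rows of a layer are independent
                for f in range(num_frames):  # frames are independent
                    # Expand the compressed matrix into this row's edges.
                    edges = []
                    for j, shift in enumerate(BASE[layer]):
                        if shift < 0:
                            continue
                        v = j * Z + (row + shift) % Z
                        m = (layer * n_blocks + j) * Z + row
                        edges.append((v * num_frames + f, m * num_frames + f))

                    # Remove this check's previous contribution.
                    q = [llr[vi] - r_msg[mi] for vi, mi in edges]

                    # Min-sum: smallest and second-smallest magnitude, sign parity.
                    mags = [abs(x) for x in q]
                    min1 = min(mags)
                    pos = mags.index(min1)
                    min2 = min(mags[:pos] + mags[pos + 1:])
                    sign_all = -1.0 if sum(x < 0 for x in q) % 2 else 1.0

                    # Write back new messages and updated posteriors.
                    for k, (vi, mi) in enumerate(edges):
                        mag = min2 if k == pos else min1
                        sign = sign_all * (-1.0 if q[k] < 0 else 1.0)
                        r_msg[mi] = sign * mag
                        llr[vi] = q[k] + r_msg[mi]

    return [0 if x >= 0 else 1 for x in llr]


if __name__ == "__main__":
    frames = 3
    n = len(BASE[0]) * Z
    llrs = [4.0] * (n * frames)              # all-zero codeword, positive LLR = bit 0
    for f in range(frames):
        llrs[((5 * f + 1) % n) * frames + f] = -1.0  # one weak error per frame
    print(decode_layered_min_sum(llrs, frames))
```

In this sketch every variable node appears in exactly one row of each layer, which is why the rows of a layer can be updated in any order or simultaneously, while the layers themselves must still be visited one after another. That sequential dependency is the parallelism limit the abstract refers to, and processing several frames at once is one way to recover the lost work per step.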