Low precision processing for high order stencil computations

Modern scientific workloads have demonstrated the inefficiency of using high precision formats. Moving to a lower bit format or even to a different number system can provide tremendous gains in terms of performance and energy efficiency. In this article, we explore the applicability of different number formats and exhaustively search for the appropriate bit width for 3D complex stencil kernels, which is one of the most widely used scientific kernels. Further, we demonstrate the achievable performance of these kernels on the state-of-the-art hardware that includes CPU and FPGA, which is the only hardware supporting arbitrary fixed-point precision. Thus, this work fills the gap between current hardware capabilities and future systems for stencil-based scientific applications.