An Energy-Efficient FPGA-Based Transformer Accelerator for AIoT Consumer Electronics Using Dual-Side Diagonal Sparsity

Published in IEEE Transactions on Consumer Electronics, 2026

Abstract: Consumer electronics, such as smartphones, wearables, and smart cameras, increasingly demand real-time and energy-efficient on-device intelligence enabled by Artificial Intelligence (AI) and the Internet of Things (IoT) to support perception and control under strict latency and power constraints. Transformer-based models have been increasingly adopted in sensing and perception tasks due to their ability to capture contextual relationships via self-attention. However, their high computational and memory costs limit their practicality on resource- and power-constrained Artificial Intelligence of Things (AIoT) devices. To address these challenges, we propose a dual-side diagonal sparsity mechanism that partitions the Q and K matrices into 2×2 blocks and applies diagonal pruning within each block, reducing computation and memory overhead while preserving transposition invariance and hardware-friendly regularity. We design the Sparse Processor Architecture for Transformer Acceleration (SPATA), a Field-Programmable Gate Array (FPGA)-based framework for energy-efficient on-device Transformer inference on AIoT devices, supporting both sparse–sparse and dense–dense matrix multiplications in multi-head self-attention (MHSA). SPATA integrates a Dual-Mode Matrix Unit (DMMU) along with a diagonal prune–compress module and a vector unit supporting fused addition and rectified linear unit (ReLU) operations. Experimental results on a Xilinx Kintex-7 FPGA show that SPATA achieves 168.48 giga operations per second (GOP/s) throughput, 0.05 ms latency, and 82.18 GOP/J energy efficiency on MHSA workloads, delivering up to a 40× speedup over Central Processing Unit (CPU) and Graphics Processing Unit (GPU) baselines and outperforming prior FPGA-based Transformer accelerators.
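
The diagonal pruning idea in the abstract can be illustrated with a minimal Python sketch. It is not the SPATA implementation. It assumes a fixed pattern in which every 2×2 block keeps only its main-diagonal (or only its anti-diagonal) entries; the abstract does not say how the diagonal is chosen. The function names `diagonal_prune_2x2`, `compress`, and `transpose` are hypothetical and exist only for this example. Under that assumption, the sketch shows why the pattern is transposition invariant (pruning a transposed matrix gives the transpose of the pruned matrix) and how the surviving entries pack into half the storage.

```python
def diagonal_prune_2x2(matrix, keep="main"):
    """Zero every entry that is off the chosen diagonal of its 2x2 block.

    Assumption for illustration: a fixed pattern for all blocks, either
    the main diagonal ("main") or the anti-diagonal ("anti").
    """
    rows, cols = len(matrix), len(matrix[0])
    if rows % 2 or cols % 2:
        raise ValueError("both dimensions must be even to tile with 2x2 blocks")
    pruned = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            # Inside a 2x2 block, main-diagonal entries have equal row/column parity.
            on_main = (i % 2) == (j % 2)
            if on_main == (keep == "main"):
                pruned[i][j] = matrix[i][j]
    return pruned


def compress(pruned, keep="main"):
    """Pack the one surviving entry per block row, halving each row's width."""
    offset = 0 if keep == "main" else 1
    return [
        [row[j + ((i + offset) % 2)] for j in range(0, len(row), 2)]
        for i, row in enumerate(pruned)
    ]


def transpose(matrix):
    """Return the transpose of a list-of-lists matrix."""
    return [list(col) for col in zip(*matrix)]


# A toy 4x4 stand-in for Q (or K); the values are arbitrary.
Q = [
    [1, 2, 3, 4],
    [5, 6, 7, 8],
    [9, 10, 11, 12],
    [13, 14, 15, 16],
]

Q_pruned = diagonal_prune_2x2(Q)
Q_packed = compress(Q_pruned)

# Transposition invariance: prune-then-transpose equals transpose-then-prune.
invariant = transpose(Q_pruned) == diagonal_prune_2x2(transpose(Q))

print(Q_pruned)   # [[1, 0, 3, 0], [0, 6, 0, 8], [9, 0, 11, 0], [0, 14, 0, 16]]
print(Q_packed)   # [[1, 3], [6, 8], [9, 11], [14, 16]]
print(invariant)  # True
```

Because a diagonal of a 2×2 block maps onto itself under transposition, the same mask serves both K and its transpose in this sketch, which is one plausible reading of the "transposition invariance" and "hardware-friendly regularity" the abstract describes.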

Recommended citation: S. Liu et al., "An Energy-Efficient FPGA-Based Transformer Accelerator for AIoT Consumer Electronics Using Dual-Side Diagonal Sparsity," IEEE Trans. Consum. Electron., pp. 1-1, Apr. 2026.