Selected Publications
Journal Papers
- [TACO-24] Zhihua Fan, Wenming Li, Zhen Wang, Yu Yang, Xiaochun Ye, Dongrui Fan, Ninghui Sun, Xuejun An. Improving Utilization of Dataflow Unit for Multi-Batch Processing. ACM Transactions on Architecture and Code Optimization (TACO). Volume 21 Issue 1, Article No. 17, pp 1–26. 2024. (CCF-A)
- [TPDS-23] Zhihua Fan, Wenming Li, Zhen Wang, Tianyu Liu, Haibin Wu, Yanhuan Liu, Meng Wu, Xinxin Wu, Xiaochun Ye, Dongrui Fan, Ninghui Sun, Xuejun An. Accelerating Convolutional Neural Networks by Exploiting the Sparsity of Output Activation. In IEEE Transactions on Parallel and Distributed Systems (TPDS), vol. 34, no. 12, pp. 3253-3265, 2023. (CCF-A)
- [EuroPar-23] Zhihua Fan, Wenming Li, Shengzhong Tang, Xuejun An, Xiaochun Ye, Dongrui Fan. Improving Utilization of Dataflow Architectures Through Software and Hardware Co-Design. In Euro-Par 2023: Parallel Processing. Lecture Notes in Computer Science, vol 14100. Springer 2023. (CCF-B)
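- [JCRD-23] Zhihua Fan, Xinxin Wu, Wenming Li, Huawei Cao, Xuejun An, Xiaochun Ye, Dongrui Fan. Dataflow Architecture Optimization for Low-Precision Neural Networks (in Chinese). Journal of Computer Research and Development (计算机研究与发展), 2023, 60(1): 43-58. (CCF-T1)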
- [HPCC-22] Zhihua Fan, Wenming Li, Tianyu Liu, Shengzhong Tang, Zhen Wang, Xuejun An, Xiaochun Ye, Dongrui Fan. A Loop Optimization Method for Dataflow Architecture. In 2022 IEEE 24th Int Conf on High Performance Computing & Communications (HPCC), 2022, pp. 202-211 (CCF-C)
- [NPC-22] Zhihua Fan, Wenming Li, Shengzhong Tang, Tianyu Liu, Xuejun An, Xiaochun Ye, Dongrui Fan. A Routing-Aware Mapping Method for Dataflow Architectures. In Network and Parallel Computing. Lecture Notes in Computer Science, vol 13615. Springer. (CCF-C)
- [TPDS-25] Zhihua Fan, Wenming Li, Zhen Wang, Tianyu Liu, Haibin Wu, Yanhuan Liu, Meng Wu, Xinxin Wu, Xiaochun Ye, Dongrui Fan, Ninghui Sun, Xuejun An. Accelerating Convolutional Neural Networks by Exploiting the Sparsity of Output Activation. In IEEE Transactions on Parallel and Distributed Systems (TPDS), vol. 34, no. 12, pp. 3253-3265, 2023. (CCF-A, corresponding author)
- [TCAD-25] Zhihua Fan, Wenming Li, Zhen Wang, Tianyu Liu, Haibin Wu, Yanhuan Liu, Meng Wu, Xinxin Wu, Xiaochun Ye, Dongrui Fan, Ninghui Sun, Xuejun An. Accelerating Convolutional Neural Networks by Exploiting the Sparsity of Output Activation. In IEEE Transactions on Parallel and Distributed Systems (TPDS), vol. 34, no. 12, pp. 3253-3265, 2023. (CCF-A, corresponding author)
- [EuroPar-25] Zhihua Fan, Wenming Li, Zhen Wang, Tianyu Liu, Haibin Wu, Yanhuan Liu, Meng Wu, Xinxin Wu, Xiaochun Ye, Dongrui Fan, Ninghui Sun, Xuejun An. Accelerating Convolutional Neural Networks by Exploiting the Sparsity of Output Activation. In IEEE Transactions on Parallel and Distributed Systems (TPDS), vol. 34, no. 12, pp. 3253-3265, 2023. (CCF-B, corresponding author)
- [APPT-25] Zhihua Fan, Wenming Li, Zhen Wang, Tianyu Liu, Haibin Wu, Yanhuan Liu, Meng Wu, Xinxin Wu, Xiaochun Ye, Dongrui Fan, Ninghui Sun, Xuejun An. Accelerating Convolutional Neural Networks by Exploiting the Sparsity of Output Activation. In IEEE Transactions on Parallel and Distributed Systems (TPDS), vol. 34, no. 12, pp. 3253-3265, 2023. (CCF flagship conference, corresponding author)
- [TACO-25] Shantian Qin, Zhihua Fan, Wenming Li, Zhen Wang, Xuejun An, Xiaochun Ye, and Dongrui Fan. PANDA: Adaptive Prefetching and Decentralized Scheduling for Dataflow Architectures. ACM Transactions on Architecture and Code Optimization (TACO). Accepted. (CCF-A, corresponding author)
- [JSA-25] Zhiyuan Zhang, Zhihua Fan, Wenming Li, Yuhang Qiu, Zhen Wang, Xiaochun Ye, Dongrui Fan, Xuejun An. Accelerating tensor multiplication by exploring hybrid product with hardware and software co-design. Journal of Systems Architecture (JSA), Volume 159, 2025. (CCF-B, corresponding author)
- [DATE-25] Yuhang Qiu, Wenming Li, Tianyu Liu, Zhen Wang, Zhiyuan Zhang, Zhihua Fan, Xiaochun Ye, Dongrui Fan, Zhimin Tang. Accelerating Authenticated Block Ciphers via RISC-V Custom Cryptography Instructions. In Design, Automation & Test in Europe Conference & Exhibition (DATE). Accepted. (CCF-B, corresponding author)
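- [CJC-25] Yudong Mu, Wenming Li, Zhihua Fan, Meng Wu, Haibin Wu, Xuejun An, Xiaochun Ye, Dongrui Fan. Research on Dataflow Architecture Optimization for YOLO Neural Networks (in Chinese). Chinese Journal of Computers (计算机学报), 2025, 48(1): 82-99. (CCF-T1, corresponding author)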
- [JCST-25] Zhihua Fan, Wenming Li, Zhen Wang, Tianyu Liu, Haibin Wu, Yanhuan Liu, Meng Wu, Xinxin Wu, Xiaochun Ye, Dongrui Fan, Ninghui Sun, Xuejun An. Accelerating Convolutional Neural Networks by Exploiting the Sparsity of Output Activation. In IEEE Transactions on Parallel and Distributed Systems (TPDS), vol. 34, no. 12, pp. 3253-3265, 2023. (CCF-B)
- [ISCAS-25] Shantian Qin, Ziqing Qiang, Zhihua Fan, Wenming Li, Xuejun An, Xiaochun Ye, and Dongrui Fan. StreamDCIM: A Tile-based Streaming Digital CIM Accelerator with Mixed-stationary Cross-forwarding Dataflow for Multimodal Transformer. Accepted. (CCF-C, corresponding author)
- [ICCD-23-1] Tianyu Liu, Wenming Li, Zhihua Fan. DFGC: DFG-aware NoC Control based on Time Stamp Prediction for Dataflow Architecture. In IEEE 41st International Conference on Computer Design (ICCD), 2023, pp. 432-439. (CCF-B, corresponding author)
- [ICCD-23-2] Haibin Wu, Wenming Li, Zhihua Fan, Zhen Wang, Tianyu Liu, Junying Huang, Shengzhong Tang, Yanhuan Liu, Kunming Zhang, Xiaochun Ye, Dongrui Fan. Alleviating Transfer Latency in DataFlow Accelerator for DSP Applications. In IEEE 41st International Conference on Computer Design (ICCD), 2023, pp. 440-443. (CCF-B, corresponding author)
- [DATE-22] Xinxin Wu, Zhihua Fan, Tianyu Liu, Wenming Li, Xiaochun Ye, Dongrui Fan. LRP: Predictive output activation based on SVD approach for CNNs acceleration. In 2022 Design, Automation & Test in Europe Conference & Exhibition (DATE), 2022, pp. 831-836. (CCF-B)
Conference Papers