I am an Assistant Professor at the Institute of Computing Technology (ICT), Chinese Academy of Sciences. I received my Ph.D. in Computer Architecture from ICT, Chinese Academy of Sciences in 2024, under the supervision of Prof. Xuejun An. I earned my B.S. degree in Computer Science and Technology from the College of Computer Science and Technology, Jilin University, in 2018.
My research focuses on developing high-efficiency processor architectures by exploring innovative computing paradigms. My current interests include dataflow architecture, reconfigurable architecture, and open-source instruction set architectures such as RISC-V, with applications in high-throughput computing and domain-specific acceleration.
📝 Publications
(* indicates the corresponding author)
- [ISCA-26] MLX: Multi-Layer Execution for Structured LLM Workload Acceleration on Spatial Architectures. Haibin Wu, Wenming Li, Zhihua Fan, Zirui Ma, Yuqun Liu, Tengfei Xia, Yanhuan Liu, Kunming Zhang, Xiaochun Ye, Dongrui Fan, Jian Weng. (CCF-A)
- [DAC-26] UniNL: Unifying Fragmented Non-Linear Operators for Efficient Edge LLM Inference. Zhengxuan Hu, Zhihua Fan*, Shantian Qin, Yudong Mu, Wenming Li and Xiaochun Ye. In Design Automation Conference (Just Accepted). (CCF-A, Corresponding Author)
- [DAC-26] AHASD: Asynchronous Heterogeneous Architecture for LLM Adaptive Drafting Speculative Decoding on Mobile Devices. Zirui Ma, Zhihua Fan*, Wenxing Li, Haibin Wu, Fulin Zhang, Wenming Li and Xiaochun Ye. In Design Automation Conference (Just Accepted). (CCF-A, Corresponding Author)
- [ASPLOS-26] BitRed: Taming Non-Uniform Bit-Level Sparsity with a Programmable RISC-V ISA for DNN Acceleration. Yanhuan Liu, Wenming Li, Kunming Zhang, Yuqun Liu, Siao Wen, Lexin Wang, Tianyu Liu, Haibin Wu, Zhihua Fan, Xiaochun Ye, Dongrui Fan, Xuejun An. In International Conference on Architectural Support for Programming Languages and Operating Systems, 2026, 239-254. (CCF-A)
- [TCAD-26] A RISC-V Extended Infrastructure for Edge FHE Through Software and Hardware Co-Design. Zhihua Fan, Jing Xue, Wenming Li, Xuejun An, Xiaochun Ye. In IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems (Just Accepted). (CCF-A)
- [JSA-26] A Real-Time Edge SAR Imaging Acceleration Architecture Utilizing Multi-Level Dataflow Parallelism. Yinshen Wang, Zhengxuan Hu, Ping Zhang, Zhihua Fan*, Wenming Li, Xiaochun Ye and Xuejun An. In Journal of Systems Architecture, Volume 170, 2026. (CCF-B, Corresponding Author)
- [FCS-26] Striking the Mantissa: How Few Bits are Enough for Accurate DNN Inference? Zhiyuan Zhang, Ping Zhang, Zhihua Fan*, Wenming Li, Xiaochun Ye and Xuejun An. In Frontiers of Computer Science (Just Accepted). (CCF-B, Corresponding Author)
- [DATE-26] A2RT: Efficient Ray Tracing Accelerator with Approximate-Accurate Computing and Quantization. Zhiyuan Zhang, Zhihua Fan*, Wenming Li, Yudong Mu, Yuhang Qiu, Zhen Wang, Xiaochun Ye and Xuejun An. In Design, Automation & Test in Europe Conference & Exhibition (Just Accepted). (CCF-B, Corresponding Author)
- [DATE-26] RISC-V ISA Extensions for Vectorized Unstructured Sparse SpMM in LLM Inference. Tengfei Xia, Zhihua Fan*, Jing Xue, Shantian Qin, Xiaochun Ye and Wenming Li. In Design, Automation & Test in Europe Conference & Exhibition (Just Accepted). (CCF-B, Corresponding Author)
- [TACO-25] Compressing and Accelerating Sparse CNNs Using Sign-Reserved Toeplitz Filters and Input Activation Density-aware Dataflow. Zhen Wang, Tianyu Liu, Zhihua Fan*, Wenming Li, Yuhang Qiu, Zhiyuan Zhang, Xuejun An, Xiaochun Ye, and Dongrui Fan. In ACM Transactions on Architecture and Code Optimization, Volume 22, Issue 4, Article 148. (CCF-A, Corresponding Author)
- [TACO-25] DFGAS: Exploring the Balance of HW-SW Scheduling through the DFG-Aware Scheme. Tianyu Liu, Zhihua Fan*, Wenming Li, Zhen Wang, Yuhang Qiu, Shengzhong Tang, Haibin Wu, Yanhuan Liu, Xiaochun Ye, and Dongrui Fan. In ACM Transactions on Architecture and Code Optimization, Volume 22, Issue 4, Article 147. (CCF-A, Corresponding Author)
- [TACO-25, HiPEAC-26] GenCNN: A Partition-Aware Multi-Objective Mapping Framework for CNN Accelerator Based on Genetic Algorithm. Yudong Mu, Zhihua Fan*, Wenming Li, Zhiyuan Zhang, Xuejun An, Dongrui Fan, and Xiaochun Ye. In ACM Transactions on Architecture and Code Optimization, Volume 22, Issue 3, Article No.105, Pages 1-26. (CCF-A, Corresponding Author)
- [TPDS-25] DFU-E: A Dataflow Architecture for Edge DSP and AI Applications. Wenming Li, Zhihua Fan*, Tianyu Liu, Zhen Wang, Haibin Wu, Meng Wu, Kunming Zhang, Yanhuan Liu, Ninghui Sun, Xiaochun Ye, and Dongrui Fan. In IEEE Transactions on Parallel and Distributed Systems, Volume 36, Issue 6, pp 1100-1114. (CCF-A, Corresponding Author)
- [TCAD-25] A RISC-V Extended Infrastructure for CNNs Through Pipelined Computing and Data Dependence Optimization. Teng Luo, Tengfei Xia, Jiayuan Chen, Zhihua Fan*, Wenming Li, Yudong Mu, Xuejun An, Xiaochun Ye, and Dongrui Fan. In IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, vol. 44, no. 11, pp. 4141-4154. (CCF-A, Corresponding Author)
- [TACO-25] PANDA: Adaptive Prefetching and Decentralized Scheduling for Dataflow Architectures. Shantian Qin, Zhihua Fan*, Wenming Li, Zhen Wang, Xuejun An, Xiaochun Ye, and Dongrui Fan. In ACM Transactions on Architecture and Code Optimization, Volume 22, Issue 2, Article No.62, Pages 1-27. (CCF-A, Corresponding Author)
- [EuroPar-25] FDHA: Fusion-Driven Heterogeneous Accelerator for Efficient Diffusion Model Inference. Yudong Mu, Zhihua Fan*, Xiaoxia Yao, Wenming Li, Zhiyuan Zhang, Honglie Wang, Xuejun An, and Xiaochun Ye. In Euro-Par 2025: Parallel Processing. Lecture Notes in Computer Science, vol 15901. (CCF-B, Corresponding Author)
- [JSA-25] Accelerating tensor multiplication by exploring hybrid product with hardware and software co-design. Zhiyuan Zhang, Zhihua Fan*, Wenming Li, Yuhang Qiu, Zhen Wang, Xiaochun Ye, Dongrui Fan, Xuejun An. In Journal of Systems Architecture, Volume 159. (CCF-B, Corresponding Author)
- [DATE-25] Accelerating Authenticated Block Ciphers via RISC-V Custom Cryptography Instructions. Yuhang Qiu, Wenming Li, Tianyu Liu, Zhen Wang, Zhiyuan Zhang, Zhihua Fan*, Xiaochun Ye, Dongrui Fan, Zhimin Tang. In Design, Automation & Test in Europe Conference & Exhibition, pp. 1-7. (CCF-B, Corresponding Author)
- [ISCAS-25] StreamDCIM: A Tile-based Streaming Digital CIM Accelerator with Mixed-stationary Cross-forwarding Dataflow for Multimodal Transformer. Shantian Qin, Ziqing Qiang, Zhihua Fan*, Wenming Li, Xuejun An, Xiaochun Ye, and Dongrui Fan. In IEEE International Symposium on Circuits and Systems, pp. 1-5. (CCF-B, Corresponding Author)
- [JCST-25] HARLD: A RISC-V Based Tightly Coupled Heterogeneous Computing Architecture for LDPC Decoding. Bing Wang, Zirui Ma, Haibin Wu, Fulin Zhang, Yue Wang, Zhihua Fan, Wenming Li, Xiaochun Ye, and Dongrui Fan. In Journal of Computer Science and Technology (Just Accepted). (CCF-B)
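- [JCRD-25] Research on Accelerating NTT Butterfly Computation Based on Dataflow Architecture (基于数据流架构的NTT蝶式计算加速研究, in Chinese). Hongbo Shi, Zhihua Fan*, Wenming Li, Zhiyuan Zhang, Yudong Mu, Xiaochun Ye, Xuejun An. In Journal of Computer Research and Development (计算机研究与发展), 2025, 62(6): 1547-1561. (CCF-T1, Corresponding Author)
- [CJC-25, HPC China-23] Research on Dataflow Architecture Optimization for YOLO Neural Networks (面向 YOLO 神经网络的数据流架构优化研究, in Chinese). Yudong Mu, Wenming Li, Zhihua Fan*, Meng Wu, Haibin Wu, Xuejun An, Xiaochun Ye, Dongrui Fan. In Chinese Journal of Computers (计算机学报), 2025, 48(1): 82-99. (CCF-T1, Corresponding Author)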
- [APPT-25] NFMap: Node Fusion Optimization for Efficient CGRA Mapping with Reinforcement Learning. Yudong Mu, Siyi Li, Zhihua Fan*, Wenming Li, Xuejun An and Xiaochun Ye. In International Symposium on Advanced Parallel Processing Technology (Just Accepted). (CCF-C, Corresponding Author)
- [HPCC-25] TSCNN: Compressing and Accelerating Sparse CNNs Using Sign-Reserved Toeplitz Filters. Zhen Wang, Tianyu Liu, Zhihua Fan*, Yuhang Qiu, Zhiyuan Zhang, Wenming Li, Xiaochun Ye, Dongrui Fan. In IEEE International Conference on High Performance Computing and Communications (Just Accepted). (CCF-C, Corresponding Author)
- [NPC-25] LightCacheRL: A Lightweight Reinforcement Learning Framework for Unified Cache Management. Kunming Zhang, Zhihua Fan, Yingchun Fu, Yanhuan Liu, Lexin Wang, Yuqun Liu, Haibin Wu, Wenming L. (Just Accepted). (CCF-C)
- [TACO-24] Improving Utilization of Dataflow Unit for Multi-Batch Processing. Zhihua Fan, Wenming Li, Zhen Wang, Yu Yang, Xiaochun Ye, Dongrui Fan, Ninghui Sun, Xuejun An. In ACM Transactions on Architecture and Code Optimization, Volume 21, Issue 1, Article No. 17, pp. 1–26, 2024. (CCF-A)
- [CF-24] LeakageFreeSpec: Applying the Wiping Approach to Defend Against Transient Execution Attacks. Fahong Yu, Zhimin Tang, Xiaobo Li, Zhihua Fan. In ACM International Conference on Computing Frontiers, pp 276–284. 2024. (CCF-C)
- [HPCC-24] OBSD: On-The-Fly Block-Wise Sparse Distillation Accelerating SpGEMMs in DNN Applications. Yanhuan Liu, Wenming Li, Kunming Zhang, Zhihua Fan, Haibin Wu, Lexin Wang, Tianyu Liu, Yuhang Qiu, Zhen Wang, Xiaochun Ye, Dongrui Fan, Xuejun An. In IEEE International Conference on High Performance Computing and Communications, pp. 224-231. (CCF-C)
- [TPDS-23] Accelerating Convolutional Neural Networks by Exploiting the Sparsity of Output Activation. Zhihua Fan, Wenming Li, Zhen Wang, Tianyu Liu, Haibin Wu, Yanhuan Liu, Meng Wu, Xinxin Wu, Xiaochun Ye, Dongrui Fan, Ninghui Sun, Xuejun An. In IEEE Transactions on Parallel and Distributed Systems, vol. 34, no. 12, pp. 3253-3265, 2023. (CCF-A)
- [EuroPar-23] Improving Utilization of Dataflow Architectures Through Software and Hardware Co-Design. Zhihua Fan, Wenming Li, Shengzhong Tang, Xuejun An, Xiaochun Ye, Dongrui Fan. In Euro-Par 2023: Parallel Processing. Lecture Notes in Computer Science, vol 14100. Springer 2023. (CCF-B)
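- [JCRD-23] Dataflow Architecture Optimization for Low-Precision Neural Networks (面向低精度神经网络的数据流体系结构优化, in Chinese). Zhihua Fan, Xinxin Wu, Wenming Li, Huawei Cao, Xuejun An, Xiaochun Ye, Dongrui Fan. In Journal of Computer Research and Development (计算机研究与发展), 2023, 60(1): 43-58. (CCF-T1)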
- [ICCD-23] DFGC: DFG-aware NoC Control based on Time Stamp Prediction for Dataflow Architecture. Tianyu Liu, Wenming Li, Zhihua Fan*. In IEEE 41st International Conference on Computer Design, 2023, pp. 432-439. (CCF-B, Corresponding Author)
- [ICCD-23] Alleviating Transfer Latency in DataFlow Accelerator for DSP Applications. Haibin Wu, Wenming Li, Zhihua Fan*, Zhen Wang, Tianyu Liu, Junying Huang, Shengzhong Tang, Yanhuan Liu, Kunming Zhang, Xiaochun Ye, Dongrui Fan. In IEEE 41st International Conference on Computer Design, 2023, pp. 440-443. (CCF-B, Corresponding Author)
- [HPCC-22] A Loop Optimization Method for Dataflow Architecture. Zhihua Fan, Wenming Li, Tianyu Liu, Shengzhong Tang, Zhen Wang, Xuejun An, Xiaochun Ye, Dongrui Fan. In 2022 IEEE 24th Int Conf on High Performance Computing & Communications, 2022, pp. 202-211 (CCF-C)
- [NPC-22] A Routing-Aware Mapping Method for Dataflow Architectures. Zhihua Fan, Wenming Li, Shengzhong Tang, Tianyu Liu, Xuejun An, Xiaochun Ye, Dongrui Fan. In Network and Parallel Computing. Lecture Notes in Computer Science, vol 13615. Springer. (CCF-C)
- [DATE-22] LRP: Predictive output activation based on SVD approach for CNNs acceleration. Xinxin Wu, Zhihua Fan, Tianyu Liu, Wenming Li, Xiaochun Ye, Dongrui Fan. In 2022 Design, Automation & Test in Europe Conference & Exhibition, 2022, pp. 831-836. (CCF-B)
- [CAL-22] Characterization and Implementation of Radar System Applications on a Reconfigurable Dataflow Architecture. Yinshen Wang, Wenming Li, Tianyu Liu, Liangjiang Zhou, Bingnan Wang, Zhihua Fan, Xiaochun Ye, Dongrui Fan, Chibiao Ding. In IEEE Computer Architecture Letters, 2022, 21(2): 121-124.
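- [FDC-21] Research Advances and Overview of Dataflow Computing (数据流计算研究进展与概述, in Chinese). Zhihua Fan, Wenming Li, Xiaochun Ye, Dongrui Fan. In Frontiers of Data and Computing (数据与计算发展前沿), 2021, 3(5): 65-81. (CCF-T3, Invited Paper)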
🌟 Honors
- Chinese Academy of Sciences President’s Scholarship (中国科学院院长奖)
- Beijing Outstanding Graduate Award (北京市优秀毕业生)
- ICT Director’s Scholarship (计算所所长奖)
- Outstanding Paper Nomination Award at CCF HPC China (2023)
👨‍🏫 Services
- Program Committee Member for HPC China 2025
- Program Committee Member for ISPA 2025
- Journal Reviewer: ACM TACO, CCF THPC, JSA, FGCS, etc.
🎓 Education
- 2018.09 - 2024.06, Ph.D. in Computer Architecture, Institute of Computing Technology, Chinese Academy of Sciences
- 2014.09 - 2018.06, B.S. in Computer Science and Technology, Jilin University
🖥️ Research Funding
- National Natural Science Foundation of China, Research on Dataflow Graph Optimization and Scheduling Based on Artificial Intelligence, 2026.01–2028.12
👥 Supervisory Faculty
- Prof. Xiaochun Ye, Ph.D. Supervisor: yexiaochun [at] ict.ac.cn
- Prof. Xuejun An, Ph.D. Supervisor: axj [at] ict.ac.cn
- Prof. Wenming Li, Ph.D. Supervisor: liwenming [at] ict.ac.cn
✉️ Contact
- Email: fanzhihua [at] ict.ac.cn
- Address: Building 1, Yard 33, Wensong Road, Haidian District, Beijing, China
