• Journal of Semiconductors
  • Vol. 45, Issue 4, 040204 (2024)
Bohan Yang1,3, Jia Chen1,2, and Fengbin Tu1,2,*
Author Affiliations
  • 1Department of Electronic and Computer Engineering, The Hong Kong University of Science and Technology, Hong Kong, China
  • 2AI Chip Center for Emerging Smart Systems, The Hong Kong University of Science and Technology, Hong Kong, China
  • 3School of the Gifted Young, University of Science and Technology of China, Hefei 230026, China
    DOI: 10.1088/1674-4926/45/4/040204
    Bohan Yang, Jia Chen, Fengbin Tu. Towards efficient generative AI and beyond-AI computing: New trends on ISSCC 2024 machine learning accelerators[J]. Journal of Semiconductors, 2024, 45(4): 040204
    References

    [1] R Bommasani, D A Hudson, E Adeli et al. On the opportunities and risks of foundation models. arXiv preprint, 1(2021).

    [2] J Achiam, S Adler, S Agarwal et al. GPT-4 technical report. arXiv preprint, 1(2023).

    [3] A Ramesh, M Pavlov, G Goh et al. Zero-shot text-to-image generation. International Conference on Machine Learning (ICML), 1, 1(2021).

    [4] C Mu, J P Zheng, C X Chen. Beyond convolutional neural networks computing: New trends on ISSCC 2023 machine learning chips. J Semicond, 44, 050203(2023).

    [5] J Alben. Computing in the era of generative AI, 26(2024).

    [6] A Smith, E Chapman, C Patel et al. AMD Instinct™ MI300 series modular chiplet package–HPC and AI accelerator for exa-class systems, 490(2024).

    [7] R Q Guo, L Wang, X F Chen et al. A 28nm 74.34TFLOPS/W BF16 heterogeneous CIM-based accelerator exploiting denoising-similarity for diffusion models, 362(2024).

    [8] S Kim, S Kim, W Jo et al. C-transformer: A 2.6-18.1μJ/token homogeneous DNN-transformer/spiking-transformer processor with big-little network and implicit weight generation for large language models, 368(2024).

    [10] H Fujiwara, H Mori, W C Zhao et al. A 3nm, 32.5TOPS/W, 55.0TOPS/mm2 and 3.78Mb/mm2 fully-digital compute-in-memory macro supporting INT12 × INT12 with a parallel-MAC architecture and foundry 6T-SRAM bit cell, 572(2024).

    [11] Y F He, S P Fan, X Li et al. A 28nm 2.4Mb/mm2 6.9-16.3TOPS/mm2 eDRAM-LUT-based digital-computing-in-memory macro with in-memory encoding and refreshing, 578(2024).

    [12] A Guo, X Chen, F Y Dong et al. A 22nm 64kb lightning-like hybrid computing-in-memory macro with a compressed adder tree and analog-storage quantizers for transformer and CNNs, 570(2024).

    [13] L F Wang, W Z Li, Z D Zhou et al. A flash-SRAM-ADC-fused plastic computing-in-memory macro for learning in neural networks in a standard 14nm FinFET process, 582(2024).

    [14] F B Tu, Y Q Wang, Z H Wu et al. A 28nm 29.2TFLOPS/W BF16 and 36.5TOPS/W INT8 reconfigurable digital CIM processor with unified FP/INT pipeline and bitwise in-memory Booth multiplication for cloud deep learning acceleration, 1(2022).

    [15] Y Wang, X L Yang, Y B Qin et al. A 28nm 83.23TFLOPS/W POSIT-based compute-in-memory macro for high-accuracy AI applications, 566(2024).

    [16] T H Wen, H H Hsu, W S Khwa et al. A 22nm 16Mb floating-point ReRAM compute-in-memory macro with 31.2TFLOPS/W for AI edge devices, 1, 1(2024).

    [17] M E Shih, S W Hsieh, P Y Tsai et al. NVE: A 3nm 23.2TOPS/W 12b-digital-CIM-based neural engine for high-resolution visual-quality enhancement on smart devices, 360(2024).

    [18] Y P Wang, M T Yang, C P Lo et al. Vecim: A 289.13GOPS/W RISC-V vector co-processor with compute-in-memory vector register file for efficient high-performance computing, 492(2024).

    [21] G Park, S Song, H Y Sang et al. Space-mate: A 303.5mW real-time sparse mixture-of-experts-based NeRF-SLAM processor for mobile spatial computing, 374(2024).

    [22] J Ryu, H Kwon, W Park et al. NeuGPU: A 18.5mJ/iter neural-graphics processing unit for instant-modeling and real-time rendering with segmented-hashing architecture, 372(2024).

    [23] K Nose, T Fujii, K Togawa et al. A 23.9TOPS/W @ 0.8V, 130TOPS AI accelerator with 16 × performance-accelerable pruning in 14nm heterogeneous embedded MPU for real-time robot applications, 364(2024).

    [24] Y C Chu, Y C Lin, Y C Lo et al. A fully integrated annealing processor for large-scale autonomous navigation optimization, 488(2024).

    [25] J H Song, Z H Wu, X Y Tang et al. A variation-tolerant In-eDRAM continuous-time Ising machine featuring 15-level coefficients and leaked negative-feedback annealing, 490(2024).

    [26] J Bae, C Shim, B Kim. E-chimera: A scalable SRAM-based Ising macro with enhanced-chimera topology for solving combinatorial optimization problems within memory, 286(2024).

    [27] J Bae, J Koo, C Shim et al. LISA: A 576 × 4 all-in-one replica-spins continuous-time latch-based Ising computer using massively-parallel random-number generations and replica equalizations, 284(2024).

    [28] C Shim, J Bae, B Kim. VIP-Sat: A Boolean satisfiability solver featuring 5 × 12 variable in-memory processing elements with 98% solvability for 50-variable 218-clause 3-SAT problems, 486(2024).

    [29] Y H Ju, G Q Xu, J Gu. A 28nm physics computing unit supporting emerging physics-informed neural network and finite element method for real-time scientific computing on edge devices, 366(2024).
