# Dongyun Kam, Ph.D.

77, Cheongam-ro, Nam-gu, Pohang-si, Gyeongsangbuk-do, 37673 | rkaehddbs@postech.ac.kr | Personal website | Google scholar

## **RESEARCH INTERESTS**

Designing hardware-friendly algorithms and efficient hardware architectures for a variety of applications

- Applications: Large-scale DNN inference, wireless communication systems, signal processing units
- Keywords: VLSI architectures, digital circuits, algorithm-hardware co-optimizations, ASIC

## **EDUCATION**

| Degree | Period |
|--------|--------|
| <ul> <li>Ph.D. in Electrical Engineering</li> <li>Pohang University of Science and Technology (POSTECH), Republic of Korea</li> <li>Advisor: Prof. Youngjoo Lee</li> </ul> | Sep. 2020 - Aug. 2024 |
| <ul> <li>M.S. in Electrical Engineering</li> <li>Pohang University of Science and Technology (POSTECH), Republic of Korea</li> <li>Advisor: Prof. Youngjoo Lee</li> </ul>                                                                | Sep. 2018 - Aug. 2020 |
| <ul> <li>B.S. in Electrical Engineering</li> <li>Pohang University of Science and Technology (POSTECH), Republic of Korea</li> </ul>                                                                                                     | Mar. 2014 - Aug. 2018 |

## **WORK EXPERIENCE**

| Position | Period |
|----------|--------|
| <ul> <li>Postdoctoral Researcher</li> <li>POSTECH Institute of Artificial Intelligence, Republic of Korea</li> <li>Accelerating state space model (SSM)-based LLMs with bit-serial computations</li> </ul> | Aug. 2024 - Present |
| <ul> <li>Visiting Researcher</li> <li>University of Michigan, USA</li> <li>Designing energy-efficient inference accelerators for large-scale DNN models</li> <li>Research collaboration with Prof. Zhengya Zhang</li> </ul> | Mar. 2023 - Sep. 2023 |
| <ul> <li>Research Assistant</li> <li>POSTECH, Republic of Korea</li> <li>Forward Error Correction (FEC) decoder design for 5G/6G communication systems</li> <li>Hardware verification and ASIC implementation with EDA tools</li> </ul> | Mar. 2018 - Aug. 2024 |
| <ul> <li>Internship Program</li> <li>Alticast, Republic of Korea</li> <li>Speech recognition-based educational software for a set-top box</li> </ul> | Jan. 2017 - June 2017 |

## **SKILLS**

- General coding: C/C++/Python/MATLAB/Verilog/CUDA
- DNN frameworks: PyTorch, Hugging Face, LM-Eval, TensorRT-LLM
- EDA tools: Synopsys DC, ICC, ICC2, STA, VCS, Cadence Virtuoso
- FPGA tools: Quartus (Intel FPGA), Vivado (Xilinx FPGA)

## **HONORS AND AWARDS**

| Award | Date |
|-------|------|
| <ul> <li>Best Paper Award, Samsung-POSTECH Research Center</li> <li>Paper: "A 21.9 ns, 15.7 Gbps/mm<sup>2</sup> (128, 15) BOSS FEC decoder for 5G/6G URLLC applications"</li> </ul> | Aug. 2024 |
| Postdoctoral Fellowship, granted by POSTECH                                                                                                                                                        | Aug. 2024 |
| POSTECHIAN Fellowship, granted by POSTECH EE                                                                                                                                                       | May 2024  |
| <ul><li>Encouragement Award, POSTECH BK21 Four</li><li>POSTECH EE Achievement Competition</li></ul>                                                                                                | Jan. 2024 |
| <ul> <li>International Research Scholarship, granted by SNU, Korea Institute for Advancement of Technology (KIAT)</li> <li>Human Resource Development Program for Industrial Innovation</li> <li>Supporting the collaboration with Prof. Zhengya Zhang at the University of Michigan</li> </ul> | Dec. 2022 |
| <ul> <li>IEEE SSCS Seoul Chapter Award (Best Design Award), International SoC Design Conference (ISOCC)</li> <li>Design: "Low-Latency SCL polar decoder using overlapped pruning operations"</li> </ul> | Oct. 2020 |
| <ul> <li>Special Award, 21st Korea Semiconductor Design Contest</li> <li>Design: "Noise resilient CNN accelerator with network stacking"</li> </ul>                                                                  | Oct. 2020             |
| <ul> <li>Best Paper Award, Summer Annual Conference of IEIE</li> <li>Domestic Conference Paper: "Complexity Analysis of OSD Algorithm for Short Error Correction Codes"</li> </ul>                                   | Aug. 2020             |
| <ul> <li>Samsung Humantech Encouragement Paper Award, Samsung Electronics</li> <li>Paper: "Massive MIMO systems with low-resolution ADCs: Baseband energy consumption vs. Symbol detection performance"</li> </ul> | Feb. 2019 |
| Cum Laude, POSTECH EE                                                                                                                                                                                                | Aug. 2018             |

## **PUBLICATIONS**

#### **Journal Papers**

- [1] J. Kim+, S. Han+, **Dongyun Kam**, B. Y. Kong, and Y. Lee, "A Design Framework for Cost-Efficient Sorters With Arbitrary Input/Output Constraints," *IEEE Transactions on Circuits and Systems I: Regular Papers (TCAS-I)*, Dec. 2024. (+ denotes equal contribution)
- [2] D. Park, **Dongyun Kam**, S. Yun, J. Choe, and Y. Lee, "Hard-decision SCL polar decoder with weighted pruning operation for storage application," *IEEE Transactions on Circuits and Systems II: Express Briefs (TCAS-II)*, Sep. 2024.
- [3] **Dongyun Kam**, B. Y. Kong and Y. Lee, "Ultra-Low-Latency SCL Polar Decoder Architecture Using Overlapped Pruning Operations," *IEEE Transactions on Circuits and Systems I: Regular Papers (TCAS-I)*, Mar. 2023.
- [4] C. Kim, **Dongyun Kam**, S. Kim, G. Park, and Y. Lee, "Simplified ordered statistic decoding for short-length linear block codes," *IEEE Communications Letters (CL)*, Aug. 2022.
- [5] S. Hong, **Dongyun Kam**, S. Yun, J. Choe, N. Lee, and Y. Lee, "Low-complexity and low-latency SVC decoding architecture using modified MAP-SP algorithm," *IEEE Transactions on Circuits and Systems I: Regular Papers (TCAS-I)*, Apr. 2022.
- [6] **Dongyun Kam**, H. Yoo, Y. Lee, "Ultra-low-latency successive cancellation polar decoding architecture using tree-level parallelism," *IEEE Transactions on Very Large Scale Integration (VLSI) Systems (TVLSI)*, June 2021.
- [7] S. Hwang, S. Moon, **Dongyun Kam**, I. Oh, Y. Lee, "High-throughput and low-latency digital baseband architecture for energy-efficient wireless VR systems," *MDPI Electronics*, July 2019.
- [8] S. Moon, I. Kim, **Dongyun Kam**, D. Jee, J. Choi, Y. Lee, "Massive MIMO systems with low-resolution ADCs: Baseband energy consumption vs. Symbol detection performance," *IEEE Access*, Jan. 2019. (Samsung HumanTech Paper Award)

#### **Conference Papers**

- [1] **Dongyun Kam**, M. Yun, S. Yoo, S. Hong, Z. Zhang, and Y. Lee, "Panacea: Novel DNN accelerator using accuracy-preserving asymmetric quantization and energy-saving bit-slice sparsity," Accepted by *IEEE International Symposium on High-Performance Computer Architecture (HPCA)*, Mar. 2025. (Samsung HumanTech Paper Award)
- [2] S. Han, J. Kim, **Dongyun Kam**, B. Y. Kong, M. Kim, Y. Kim, Y. Lee, "Constrained sorter design using zero-one principle," *IEEE International Symposium on Circuits and Systems (ISCAS)*, May 2024.
- [3] **Dongyun Kam**, S. Yun, J. Choe, Z. Zhang, N. Lee, Y. Lee, "A 21.9 ns, 15.7 Gbps/mm<sup>2</sup> (128, 15) BOSS FEC decoder for 5G/6G URLLC applications," *IEEE International Solid-State Circuits Conference (ISSCC)*, Feb. 2024.
- [4] J. G. Min, **Dongyun Kam**, Y. Byun, G. Park, Y. Lee, "Energy-efficient RISC-V-based vector processor for cache-aware structurally-pruned transformers," *IEEE/ACM International Symposium on Low Power Electronics and Design (ISLPED)*, Aug. 2023.
- [5] M. Kang, R. Hwang, J. Lee, **Dongyun Kam**, Y. Lee, M. Rhu, "GROW: A Row-Stationary Sparse-Dense GEMM Accelerator for Memory-Efficient Graph Convolutional Neural Network," *IEEE International Symposium on High-Performance Computer Architecture (HPCA)*, Mar. 2023.
- [6] **Dongyun Kam**, B. Y. Kong, and Y. Lee, "A 1.1μs 1.56Gb/s/mm<sup>2</sup> Cost Efficient Large-List SCL Polar Decoder Using Fully-Reusable LLR Buffers in 28nm CMOS Technology," *IEEE Symposium on VLSI Technology and Circuits (VLSI)*, June 2022.
- [7] **Dongyun Kam+**, J. G. Min+, J. Yoon, S. Kim, S. Kang, and Y. Lee, "Design and evaluation frameworks for advanced RISC-based ternary processor," *IEEE/ACM Design, Automation and Test in Europe (DATE)*, Mar. 2022. (+ denotes equal contribution)
- [8] C. Kim, D. Rim, J. Choe, **Dongyun Kam**, G. Park, S. Kim, and Y. Lee, "FPGA-based ordered statistic decoding architecture for B5G/6G URLLC IIOT networks," *IEEE Asian Solid-State Circuits Conference (A-SSCC)*, Nov. 2021.
- [9] **Dongyun Kam**, B. Kong, Y. Lee, "Low-latency polar decoder using overlapped SCL processing," *IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)*, June 2021.
- [10] S. Yun, **Dongyun Kam**, J. Choi, B. Kong, Y. Lee, "Ultra-low-latency LDPC decoding architecture using reweighted offset min-sum algorithm," *IEEE International Symposium on Circuits and Systems (ISCAS)*, Oct. 2020.
- [11] **Dongyun Kam**, Y. Lee, "Ultra-low-latency parallel SC polar decoding architecture for 5G wireless communications," *IEEE International Symposium on Circuits and Systems (ISCAS)*, May 2019. (**IEEE CASS Student Travel Grant Award**)

## **International Patents**

- [1] Jemin Lee, Youngjoo Lee, and **Dongyun Kam**, "Multi-bit partial sum network device for parallel SC decoder," PCT/KR2019/017108.
- [2] Jemin Lee, Youngjoo Lee, and **Dongyun Kam**, "Polar codes decoding device and method thereof," PCT/KR2019/015834.

## **PROJECTS**

| Project | Period |
|---------|--------|
| <ul> <li>NRC: Advanced Channel Coding and Channel Estimation for Wireless Communication Evolution</li> <li>Institute for Information &amp; communication Technology Planning &amp; evaluation (IITP)</li> <li>Developing α-TECC: <u>Al</u>l-in-one <u>P</u>aradigm C<u>ha</u>nging <u>T</u>echnologies in <u>E</u>rror <u>C</u>ontrol <u>C</u>oding</li> </ul> | Apr. 2024 - Aug. 2024 |
| <ul> <li>AI-ISP SW/HW Co-Optimization</li> <li>Samsung Advanced Institute of Technology (SAIT), Samsung Electronics</li> <li>Developing baseline ISP hardware modules and AI-ISP inference simulators</li> </ul>                                                                                                                      | Nov. 2022 - Aug. 2024 |
| <ul> <li>InSeCT: Intelligent Secure Underwater Communication Technology</li> <li>Korea Research Institute for defense Technology planning and advancement (KRIT)</li> <li>Developing efficient FEC decoder architectures for BOSS codes</li> </ul>                                                                                    | Dec. 2022 - Aug. 2024 |
| <ul> <li>Algorithm-Hardware Co-Optimization Methods for Energy-Efficient 6G Baseband Systems</li> <li>National Research Foundation of Korea (NRF)</li> <li>Developing short-length FEC decoders for 5G/6G URLLC scenarios</li> </ul>                                                                                                  | Sep. 2022 - Aug. 2024 |
| <ul> <li>Low-cost &amp; Low-latency polar decoder designs for B5G URLLC scenarios</li> <li>National Research Foundation of Korea (NRF)</li> <li>Optimizing node-pruning methods of SCL decoding and implementing polar decoders at the ASIC level</li> </ul> | June 2019 - Feb. 2022 |
| <ul> <li>Low-cost &amp; Low-power ECC/signal processing HW IP development</li> <li>Samsung Electronics</li> <li>Developing ECC decoder architectures for emerging memory devices</li> </ul>                                                                                                                                           | July 2018 - Sep. 2023 |

## **TEACHING SERVICE**

| Course | Semester |
|--------|----------|
| <b>EECE276, Electronics &amp; Electrical Eng. Lab I</b><br>Teaching Assistant for the lecture on micro-controller applications                                                                                                                                                                                                        | Fall 2022             |
| EECE199, Freshmen Research Participation<br>Teaching Assistant                                                                                                                                                                                                                                                                        | Fall 2022             |
| <b>EECE550, Advanced Computer Design</b><br>Teaching Assistant for RISC-V design assignments                                                                                                                                                                                                                                          | Spring 2021           |
| EECE695E, VLSI Signal Processing<br>Teaching Assistant                                                                                                                                                                                                                                                                                | Fall 2020             |
| EECE273, Digital System Design<br>Teaching Assistant                                                                                                                                                                                                                                                                                  | Fall 2018             |

## **CHIP GALLERY**

#### Samsung 28nm CMOS Technology (2024)

Panacea: DNN accelerator using accuracy-preserving asymmetric quantization and energy-saving bit-slice sparsity (HPCA '25)

[Die photo labels: Global buffers + PPU; AQS-GEMM core (PEA #8 ~ #15); On-chip memory; AQS-GEMM core (PEA #0 ~ #7); Controller + DMA]

| Technology             | 28 nm                |
|------------------------|----------------------|
| # of 4bx4b multipliers | 3072                 |
| Overall area           | 2.11 $\mathrm{mm}^2$ |
| Core gate area         | 1.31 $\mathrm{mm}^2$ |
| Supply voltage         | 1.0 V                |
| Operating Frequency    | 250 MHz              |
| Throughput             | 1.268 TOPS           |
| Energy efficiency      | 12.5 TOPS/W          |

#### Samsung 28nm CMOS Technology (2023)

Cost-efficient BOSS FEC decoder for URLLC scenarios (ISSCC '24)

[Die photo labels: 0.61 mm width; Control + Interface; In/out buffers; FWHT-based MVM calculators; SCU x4; 4 IMMTs; Lane #2; Lane #3]

| Technology          | 28 nm                     |
|---------------------|---------------------------|
| Target code         | (128, 15) BOSS code       |
| Decoding algorithm  | Two-stage MAP             |
| Core area           | 0.37 $\mathrm{mm}^2$      |
| Supply voltage      | 0.95 V                    |
| Operating Frequency | 590 MHz                   |
| Power               | 33.3 mW                   |
| Coded Throughput    | 5.84 Gb/s                 |
| Area efficiency     | 15.78 Gb/s/$\mathrm{mm}^2$ |
| Energy efficiency   | 5.7 pJ/bit                |

#### Samsung 28nm CMOS Technology (2021)

Cost-efficient large-list SCL polar decoder using fully-reusable LLR buffers (VLSI '22)

