FuriosaAI RNGD#

FuriosaAI’s second-generation Neural Processing Unit (NPU), RNGD, is a chip designed for deep learning inference, supporting high-performance Large Language Models (LLM), Multi-Modal LLM, Vision models, and other deep learning models.

RNGD is based on the Tensor Contraction Processor (TCP) architecture, is fabricated on TSMC's 5nm process node, and operates at 1.0 GHz. It delivers 512 TOPS of INT8 and 1024 TOPS of INT4 peak performance. RNGD is equipped with two HBM3 modules providing 1.5 TB/s of memory bandwidth, and connects to the host over PCIe Gen5 x16. For multi-tenant environments such as Kubernetes, a single RNGD chip can be partitioned into 2, 4, or 8 individual NPUs, each fully isolated with its own cores and memory bandwidth. RNGD also supports Single Root I/O Virtualization (SR-IOV) for virtualizing these multi-instance NPUs.
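As a rough illustration of how these figures interact, the roofline "ridge point" (the arithmetic intensity at which a workload shifts from memory-bound to compute-bound) follows directly from the peak INT8 throughput and the HBM3 bandwidth. The sketch below is a back-of-the-envelope calculation using the numbers above, not a vendor-provided performance model:

```python
# Roofline ridge point for RNGD, from the spec figures above.
# Workloads whose arithmetic intensity (ops per byte moved from
# HBM) falls below this value are memory-bandwidth-bound; those
# above it are compute-bound.
int8_ops_per_s = 512e12    # 512 TOPS (INT8 peak)
hbm_bytes_per_s = 1.5e12   # 1.5 TB/s (HBM3 bandwidth)

ridge_ops_per_byte = int8_ops_per_s / hbm_bytes_per_s
print(f"ridge point ~ {ridge_ops_per_byte:.0f} INT8 ops/byte")
```

This works out to roughly 341 INT8 ops per byte, which is one way to see why large-batch LLM inference (high arithmetic intensity) and small-batch decoding (low intensity) stress the chip differently.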

Please refer to the following to learn more about the TCP architecture and RNGD:

RNGD Hardware Specification#

Architecture: Tensor Contraction Processor

Process Node: TSMC 5nm

Frequency: 1.0 GHz

Peak Performance (BF16): 256 TFLOPS

Peak Performance (FP8): 512 TFLOPS

Peak Performance (INT8): 512 TOPS

Peak Performance (INT4): 1024 TOPS

Memory Bandwidth: 1.5 TB/s (HBM3)

Memory Capacity: 48 GB (HBM3)

On-Chip SRAM: 256 MB

Interconnect Interface: PCIe Gen5 x16

Thermal Solution: Passive

Thermal Design Power (TDP): 150 W

Power Connector: 12VHPWR

Form Factor: PCIe dual-slot, full-height, 3/4 length

Multi-Instance Support: Up to 8 instances

Virtualization Support: Yes

SR-IOV: 8 Virtual Functions

ECC Memory Support: Yes

Secure Boot with Root of Trust: Yes
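When planning multi-instance deployments, the per-instance share of HBM capacity and bandwidth can be estimated from the figures above. The sketch below assumes the partitioning divides capacity and bandwidth evenly across instances, which is an assumption on our part; the specification states only that each instance is isolated with its own cores and memory bandwidth:

```python
# Hypothetical per-instance resource share, assuming an even split
# of the 48 GB HBM3 capacity and 1.5 TB/s bandwidth across the
# supported 2-, 4-, and 8-way partitions (assumption, not spec).
HBM_CAPACITY_GB = 48
HBM_BANDWIDTH_GBS = 1500

for n_instances in (2, 4, 8):
    capacity = HBM_CAPACITY_GB / n_instances
    bandwidth = HBM_BANDWIDTH_GBS / n_instances
    print(f"{n_instances} instances: {capacity:g} GB, {bandwidth:g} GB/s each")
```

Under this even-split assumption, an 8-way partition would give each instance about 6 GB of HBM and 187.5 GB/s of bandwidth, which bounds the model sizes each tenant can serve.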