Release Notes - 0.9.0
Furiosa SDK 0.9.0 is a major release that includes many performance enhancements, additional functions, and bug fixes. In particular, the 0.9.0 release includes significant improvements to the quantization tools.
| Package Name | Version | 
|---|---|
| NPU Driver | 1.7.0 | 
| NPU Firmware Tools | 1.4.0 | 
| NPU Firmware Image | 1.7.0 | 
| HAL (Hardware Abstraction Layer) | 0.11.0 | 
| Furiosa Compiler | 0.9.0 | 
| Python SDK (furiosa-runtime, furiosa-server, furiosa-serving, furiosa-quantizer, ..) | 0.9.0 | 
| NPU Management CLI (furiosactl) | 0.11.0 | 
| NPU Device Plugin | 0.10.1 | 
| NPU Feature Discovery | 0.2.0 | 
Installing the latest SDK
If you are using the APT repository, you can upgrade all installed packages as follows:
apt-get update && apt-get upgrade
You can find more details about APT repository setup at Driver, Firmware, and Runtime Installation. If you wish to upgrade specific packages only, run the following:
apt-get update && \
apt-get install -y furiosa-driver-pdma furiosa-libhal-warboy furiosa-libnux furiosactl
You can upgrade firmware as follows:
apt-get update && \
apt-get install -y furiosa-firmware-tools furiosa-firmware-image
You can upgrade the Python packages as follows:
pip install --upgrade pip setuptools wheel
pip install --upgrade furiosa-sdk
Warning
If you install or upgrade furiosa-sdk without first updating pip to the latest version, you may encounter the following errors.
ERROR: Could not find a version that satisfies the requirement furiosa-quantizer-impl==0.9.* (from furiosa-quantizer==0.9.*->furiosa-sdk) (from versions: none)
ERROR: No matching distribution found for furiosa-quantizer-impl==0.9.* (from furiosa-quantizer==0.9.*->furiosa-sdk)
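After upgrading, you may want to confirm which version of furiosa-sdk is actually installed in the current Python environment. The following is a minimal sketch using the standard library's importlib.metadata. The helper name installed_version is hypothetical and not part of the SDK, and the check for the 0.9 series is only an example.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(distribution: str):
    """Return the installed version string, or None if it is not installed."""
    try:
        return version(distribution)
    except PackageNotFoundError:
        return None


# "furiosa-sdk" is the distribution name used in the pip command above.
sdk_version = installed_version("furiosa-sdk")
if sdk_version is None:
    print("furiosa-sdk is not installed in this environment")
elif sdk_version.startswith("0.9."):
    print(f"furiosa-sdk {sdk_version} (0.9 series) is installed")
else:
    print(f"furiosa-sdk {sdk_version} is installed; expected the 0.9 series")
```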
Major changes
Quantization tool
The quantization tool is a library that converts a pre-trained model to a quantized model. You can find more details at Model Quantization. The 0.9.0 release includes API improvements and new calibration methods, which may lead to better accuracy.
- Added new quantization-related APIs that are more flexible and solid (furiosa.quantizer, furiosa.optimizer).
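from furiosa.optimizer import optimize_model
from furiosa.quantizer import CalibrationMethod, Calibrator, quantize
import tqdm

# source_onnx_model and calibration_dataloader are assumed to be defined by the caller.
optimized_onnx_model = optimize_model(source_onnx_model)
calibrator = Calibrator(optimized_onnx_model, CalibrationMethod.MIN_MAX_ASYM)
for calibration_data, _ in tqdm.tqdm(calibration_dataloader, desc="Calibration", unit="images", mininterval=0.5):
    calibrator.collect_data([[calibration_data.numpy()]])
ranges = calibrator.compute_range()
quantized_graph = quantize(optimized_onnx_model, ranges)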
- Added an option to decide whether to perform quantization at the beginning of the model. The without_quantize option has been removed from the compiler options; instead, this behavior can be specified via the with_quantize argument of the quantize function.
- The normalized_pixel_outputs argument of the quantize function can be set to convert the model output to uint8 instead of dequantizing it to fp32. A tensor whose elements lie in the range (0., 1.) can be optimized by converting it to pixel data in uint8.
- Provides more calibration methods. 
| Calibration Method | Asymmetric | QuasiSymmetric | 
|---|---|---|
| Min-Max | MIN_MAX_ASYM | MIN_MAX_SYM | 
| Entropy | ENTROPY_ASYM | ENTROPY_SYM | 
| Percentile | PERCENTILE_ASYM | PERCENTILE_SYM | 
| Mean squared error | MSE_ASYM | MSE_SYM | 
| Signal-to-quantization-noise ratio | SQNR_ASYM | SQNR_SYM | 
To verify the effectiveness of the new calibration methods, we measured the accuracy of 10 popular models with them. Of these, 8 models showed better accuracy than with the existing calibration methods. For example, the accuracy of EfficientNet-B0 increased by 57.452 percentage points: with the min-max calibration method, EfficientNet-B0 had an accuracy of 16.104%, whereas with the percentile calibration method its accuracy was 73.556%. The details of the experiment results can be found at Quantization Accuracy.
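One common reason a clipping-based method such as percentile can do better than min-max is sensitivity to outliers: a single extreme activation stretches the min-max range, so typical values get very coarse quantization steps. The sketch below illustrates that effect on synthetic data. It is not the furiosa.quantizer implementation, and it makes no claim about why EfficientNet-B0 specifically improved. All function names and the 99th-percentile setting are assumptions chosen for illustration.

```python
# Illustrative only: this is NOT how furiosa.quantizer computes ranges.

def min_max_range(values):
    """Range that covers every observed value (min-max style)."""
    return min(values), max(values)


def percentile_range(values, percentile=99.0):
    """Clip an equal share of samples from each tail (percentile style)."""
    ordered = sorted(values)
    tail = (100.0 - percentile) / 2.0 / 100.0
    lo_idx = int(tail * (len(ordered) - 1))
    hi_idx = len(ordered) - 1 - lo_idx
    return ordered[lo_idx], ordered[hi_idx]


def quantize_dequantize(x, lo, hi, levels=256):
    """Asymmetric uniform quantization to `levels` steps, then back to float."""
    scale = (hi - lo) / (levels - 1)
    clipped = min(max(x, lo), hi)
    q = round((clipped - lo) / scale)
    return lo + q * scale


def mean_abs_error(values, lo, hi):
    """Average round-trip error of `values` under the range (lo, hi)."""
    return sum(abs(v - quantize_dequantize(v, lo, hi)) for v in values) / len(values)


# Synthetic activations: 1000 values spread over [-1, 1] plus one large outlier.
typical = [(i / 499.5) - 1.0 for i in range(1000)]
activations = typical + [1000.0]

mm = min_max_range(activations)
pc = percentile_range(activations, percentile=99.0)

# Measure the error on the typical values only.
mm_err = mean_abs_error(typical, *mm)
pc_err = mean_abs_error(typical, *pc)
print(f"min-max range {mm}: mean abs error {mm_err:.4f}")
print(f"percentile range {pc}: mean abs error {pc_err:.4f}")
```

In this toy setup the percentile range ignores the outlier, so the 256 quantization levels are spent on the values that actually occur. Real models differ, which is why the release measures accuracy per model and per method.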
For more information on installing and using the new quantizer, you can refer to the following examples.
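As a rough illustration of what the normalized_pixel_outputs option described above is aiming at, the sketch below maps normalized values in the range 0 to 1 onto uint8 pixel values in plain Python. The x * 255 scaling with rounding is an assumption for illustration; the release notes do not state the exact mapping the quantizer uses, and to_uint8_pixel is a hypothetical helper, not an SDK function.

```python
def to_uint8_pixel(x: float) -> int:
    """Map a normalized value in [0, 1] to 0..255 (assumed x * 255 scaling)."""
    clipped = min(max(x, 0.0), 1.0)  # guard against values slightly out of range
    return int(round(clipped * 255))


normalized_output = [0.0, 0.25, 0.5, 1.0]
pixels = [to_uint8_pixel(v) for v in normalized_output]
print(pixels)
```

Producing uint8 directly avoids a dequantize-to-fp32 step followed by a separate conversion back to pixel data on the host.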
Compiler
- Added acceleration support for operators Lower, Unlower 
- Added acceleration support for operator Dequantize 
- Added support for executing binaries that are larger than the hardware's instruction memory 
- Improved scheduler and memory allocator to eliminate unnecessary I/O 
- Various compilation optimizations for better execution performance 
furiosa-toolkit
The furiosactl command-line tool included in the furiosa-toolkit 0.11.0 release includes the following major improvements.
The newly added furiosactl top command shows per-device NPU utilization over time. In the example below, --interval 200 produces samples roughly 200 ms apart.
$ furiosactl top --interval 200
NOTE: furiosa top is under development. Usage and output formats may change.
Please enter Ctrl+C to stop.
Datetime                        PID       Device        NPU(%)   Comp(%)   I/O(%)   Command
2023-03-21T09:45:56.699483936Z  152616    npu1pe0-1      19.06    100.00     0.00   ./npu_runtime_test -n 10000 results/ResNet-CTC_kor1_200_nightly3_128dpes_8batches.enf
2023-03-21T09:45:56.906443888Z  152616    npu1pe0-1      51.09     93.05     6.95   ./npu_runtime_test -n 10000 results/ResNet-CTC_kor1_200_nightly3_128dpes_8batches.enf
2023-03-21T09:45:57.110489333Z  152616    npu1pe0-1      46.40     97.98     2.02   ./npu_runtime_test -n 10000 results/ResNet-CTC_kor1_200_nightly3_128dpes_8batches.enf
2023-03-21T09:45:57.316060982Z  152616    npu1pe0-1      51.43    100.00     0.00   ./npu_runtime_test -n 10000 results/ResNet-CTC_kor1_200_nightly3_128dpes_8batches.enf
2023-03-21T09:45:57.521140588Z  152616    npu1pe0-1      54.28     94.10     5.90   ./npu_runtime_test -n 10000 results/ResNet-CTC_kor1_200_nightly3_128dpes_8batches.enf
2023-03-21T09:45:57.725910558Z  152616    npu1pe0-1      48.93     98.93     1.07   ./npu_runtime_test -n 10000 results/ResNet-CTC_kor1_200_nightly3_128dpes_8batches.enf
2023-03-21T09:45:57.935041998Z  152616    npu1pe0-1      47.91    100.00     0.00   ./npu_runtime_test -n 10000 results/ResNet-CTC_kor1_200_nightly3_128dpes_8batches.enf
2023-03-21T09:45:58.13929122Z   152616    npu1pe0-1      49.06     94.94     5.06   ./npu_runtime_test -n 10000 results/ResNet-CTC_kor1_200_nightly3_128dpes_8batches.enf
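If you want to post-process this output, for example to average utilization over a run, the rows can be parsed with a few lines of Python. The sketch below assumes the column layout shown above (Datetime, PID, Device, NPU(%), Comp(%), I/O(%), Command). Since the tool notes that the output format may change, treat it as a starting point only. TopSample and parse_top_line are hypothetical names, not part of furiosa-toolkit.

```python
from dataclasses import dataclass


@dataclass
class TopSample:
    timestamp: str
    pid: int
    device: str
    npu_pct: float
    comp_pct: float
    io_pct: float
    command: str


def parse_top_line(line: str) -> TopSample:
    """Parse one data row; the command may contain spaces, so split at most 6 times."""
    fields = line.split(None, 6)
    if len(fields) != 7:
        raise ValueError(f"unexpected row: {line!r}")
    ts, pid, device, npu, comp, io, command = fields
    return TopSample(ts, int(pid), device, float(npu), float(comp), float(io), command.strip())


# Two rows copied from the sample output above.
sample_rows = [
    "2023-03-21T09:45:56.699483936Z  152616    npu1pe0-1      19.06    100.00     0.00   ./npu_runtime_test -n 10000 results/ResNet-CTC_kor1_200_nightly3_128dpes_8batches.enf",
    "2023-03-21T09:45:56.906443888Z  152616    npu1pe0-1      51.09     93.05     6.95   ./npu_runtime_test -n 10000 results/ResNet-CTC_kor1_200_nightly3_128dpes_8batches.enf",
]
samples = [parse_top_line(row) for row in sample_rows]
mean_npu = sum(s.npu_pct for s in samples) / len(samples)
print(f"{samples[0].device}: mean NPU utilization {mean_npu:.2f}% over {len(samples)} samples")
```

The header line, the NOTE line, and the Ctrl+C hint would need to be skipped before calling parse_top_line on live output.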
The furiosactl info command has been improved to display concise information about each device. As before, you can pass the --full option if you want to see more information about a device.
$ furiosactl info
+------+--------+----------------+-------+--------+--------------+
| NPU  | Name   | Firmware       | Temp. | Power  | PCI-BDF      |
+------+--------+----------------+-------+--------+--------------+
| npu1 | warboy | 1.6.0, 3c10fd3 |  54°C | 0.99 W | 0000:44:00.0 |
+------+--------+----------------+-------+--------+--------------+
$ furiosactl info --full
+------+--------+--------------------------------------+-------------------+----------------+-------+--------+--------------+---------+
| NPU  | Name   | UUID                                 | S/N               | Firmware       | Temp. | Power  | PCI-BDF      | PCI-DEV |
+------+--------+--------------------------------------+-------------------+----------------+-------+--------+--------------+---------+
| npu1 | warboy | 00000000-0000-0000-0000-000000000000 | WBYB0000000000000 | 1.6.0, 3c10fd3 |  54°C | 0.99 W | 0000:44:00.0 | 511:0   |
+------+--------+--------------------------------------+-------------------+----------------+-------+--------+--------------+---------+
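For scripting, the bordered table printed by furiosactl info can be turned into a list of dictionaries keyed by column name. This is a minimal sketch that assumes the table layout shown above, with rows starting with "|" and border lines starting with "+". parse_ascii_table is a hypothetical helper, not part of furiosa-toolkit.

```python
def parse_ascii_table(text: str):
    """Convert a '+---+' bordered table into a list of {column: value} dicts."""
    header = None
    rows = []
    for line in text.strip().splitlines():
        line = line.strip()
        if not line.startswith("|"):
            continue  # skip border lines such as +------+
        cells = [cell.strip() for cell in line.strip("|").split("|")]
        if header is None:
            header = cells  # the first '|' row holds the column names
        else:
            rows.append(dict(zip(header, cells)))
    return rows


# Table copied from the `furiosactl info` sample output above.
info_output = """
+------+--------+----------------+-------+--------+--------------+
| NPU  | Name   | Firmware       | Temp. | Power  | PCI-BDF      |
+------+--------+----------------+-------+--------+--------------+
| npu1 | warboy | 1.6.0, 3c10fd3 |  54°C | 0.99 W | 0000:44:00.0 |
+------+--------+----------------+-------+--------+--------------+
"""
devices = parse_ascii_table(info_output)
for dev in devices:
    print(dev["NPU"], dev["Name"], dev["Firmware"], dev["PCI-BDF"])
```

The same function works on the --full output, because it takes the column names from the header row rather than hard-coding them.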
More information about installing and using furiosactl can be found in furiosa-toolkit.