Changes between Version 16 and Version 17 of venice/npu


Timestamp: 08/19/2024 09:32:27 PM
Author: Tim Harvey
Comment: added section for upstream/mainline NPU support

[[Image(https://trac.gateworks.com/raw-attachment/wiki/venice/npu/gw74xx_npu_benchmark_new.png)]]

== NXP Yocto BSP
The easiest way to get started with the NPU is to use an image from the NXP BSP. This image contains the necessary libraries and kernel to interface the NPU with !TensorFlow without much configuration. You can either [[https://www.nxp.com/docs/en/user-guide/IMX_YOCTO_PROJECT_USERS_GUIDE.pdf | follow the guide to build their image]] or [[https://www.nxp.com/design/design-center/software/embedded-software/i-mx-software/embedded-linux-for-i-mx-applications-processors:IMXLINUX | download a pre-built one]] (recommended).

**NOTE**: In the scripts below, we disable PCIe as a temporary fix to prevent the NXP 6.6.3_1.0.0 kernel from hanging on boot. The hang is caused by a missing patch needed to work around a PCIe switch quirk on the IMX8MP; the specific patch from our kernel can be found [[https://github.com/Gateworks/linux-venice/commit/cf983e4a04eecb5be93af7b53cb10805ee448998|here]].

=== 1. Download the Gateworks Venice Rescue Image to removable media
Find which device corresponds to your flash drive/SD card (for example, {{{/dev/sdc}}}).
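One way to identify it is to list the block devices on the host PC (for example with {{{lsblk}}}) before and after inserting the media and see which device appears:
{{{#!bash
# list whole disks with size/model/transport; the removable media usually stands out by its size and 'usb'/'mmc' transport
lsblk -d -o NAME,SIZE,MODEL,TRAN
}}}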
     
== Upstream / Mainline NPU support
The IMX8MP has a !VeriSilicon NPU which comes from Vivante. Vivante supports it via their proprietary driver (/dev/galcore), but that is not an open-source driver, so it makes sense to want to support the NPU with its open-source equivalent, the etnaviv driver. Support for this has been added to the Linux 6.10 kernel.

The userspace library used for the GPU is Mesa, so it makes sense to support the NPU there as well; for !TensorFlow Lite, hardware acceleration is done through 'delegate' libraries. While support for the !VeriSilicon NPU made it into Mesa 24.1.0 via the teflon delegate, it does not yet support the IMX8MP, which has a slightly newer version of the !VeriSilicon NPU.

Therefore, to support the NPU you need to build a custom fork of Mesa that is being worked on by [https://blog.tomeuvizoso.net/ Tomeu Vizoso], with the work sponsored by [https://ideasonboard.com/ Ideas on Board].

On a Gateworks Venice board with an IMX8MP and an Ubuntu 22.04 (jammy) root filesystem:
- update kernel to v6.10 with NPU support
{{{#!bash
# update kernel
cd /tmp && wget https://dev.gateworks.com/venice/kernel/linux-venice-6.10.6.tar.xz && tar -C / -xvf linux*.tar.xz --keep-directory-symlink && reboot
}}}
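After the board reboots you can optionally confirm that the new kernel is running and that the etnaviv driver claimed the NPU (a quick sanity check; exact output will vary):
{{{#!bash
# show the running kernel version (should report 6.10.x)
uname -r
# look for the etnaviv driver probing the GPU/NPU cores
dmesg | grep -i etnaviv
# the NPU is exposed via a DRM render node (e.g. /dev/dri/renderD128, used by teflon below)
ls -l /dev/dri/
}}}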
- create a non-root user (always a good idea but not required for mesa)
{{{#!bash
# create user
USER=gateworks
useradd -s /bin/bash $USER
mkdir /home/$USER; chown $USER /home/$USER
# allow sudo for this user
echo "$USER ALL=(ALL) NOPASSWD: ALL" >> /etc/sudoers
# add render group (needed for some examples)
usermod -aG render $USER
# login as this user
su -l $USER
}}}
- install python virtualenv, which lets us avoid conflicts between package libraries
{{{#!bash
sudo apt update && sudo apt upgrade -y && sudo apt install -y python3-pip python3-venv
# create the "tflite1-env" virtualenv
python3 -m venv tflite1-env
# activate the venv (repeat for every new shell)
source tflite1-env/bin/activate
}}}
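With the virtualenv active, python3 and pip3 resolve to the copies inside tflite1-env, which is a quick way to confirm the environment is active:
{{{#!bash
# both should point into ~/tflite1-env while the venv is active
which python3 pip3
}}}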
- build mesa (fork with updated teflon support for imx8mp)
{{{#!bash
# install build deps
sudo apt install -y build-essential git cmake
sudo apt-get -y build-dep mesa
# get repo
git clone https://gitlab.freedesktop.org/tomeu/mesa.git -b etnaviv-imx8mp mesa
cd mesa
# mesa requires meson >= 1.1.0 (newer than jammy's), pycparser >= 2.20, and mako
pip3 install meson pycparser mako
~/tflite1-env/bin/meson setup build -Dgallium-drivers=etnaviv -Dvulkan-drivers= -Dteflon=true
~/tflite1-env/bin/meson compile -C build # 20 mins or so on imx8mp
ldd build/src/gallium/targets/teflon/libteflon.so
}}}
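Note that nothing is installed system-wide here; the delegate is used straight out of the build tree, and its path is what gets passed to !TensorFlow Lite with -e in the examples below:
{{{#!bash
# the freshly built delegate library; this is the path passed via -e below
ls -lh ~/mesa/build/src/gallium/targets/teflon/libteflon.so
}}}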
- install tensorflow lite runtime
{{{#!bash
# install tensorflow lite runtime
pip3 install tflite_runtime
# clone tensorflow for some examples and assets used below
cd
git clone https://github.com/tensorflow/tensorflow.git
}}}
- run teflon image classification test
{{{#!bash
# note this test requires python pillow, python numpy<2.0 and write access to /dev/dri/renderD128
pip3 install "numpy<2.0" pillow tflite_runtime # the test needs pillow; tflite_runtime needs numpy<2
groups # make sure you're in the render group (or have write access to /dev/dri/renderD128)
# without teflon (175.335ms on imx8mp)
python3 ~/mesa/src/gallium/frontends/teflon/tests/classification.py \
 -i ~/tensorflow/tensorflow/lite/examples/label_image/testdata/grace_hopper.bmp \
 -m ~/mesa/src/gallium/targets/teflon/tests/mobilenet_v1_1.0_224_quant.tflite \
 -l ~/mesa/src/gallium/frontends/teflon/tests/labels_mobilenet_quant_v1_224.txt
# with teflon (7.651ms on imx8mp)
python3 ~/mesa/src/gallium/frontends/teflon/tests/classification.py \
 -i ~/tensorflow/tensorflow/lite/examples/label_image/testdata/grace_hopper.bmp \
 -m ~/mesa/src/gallium/targets/teflon/tests/mobilenet_v1_1.0_224_quant.tflite \
 -l ~/mesa/src/gallium/frontends/teflon/tests/labels_mobilenet_quant_v1_224.txt \
 -e ~/mesa/build/src/gallium/targets/teflon/libteflon.so
}}}
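Under the hood classification.py simply loads libteflon.so as a !TensorFlow Lite ''external delegate'' and hands it to the interpreter. A minimal sketch of that same pattern is below (paths taken from the commands above; the zero-filled input is only a placeholder, so the printed class index is meaningless):
{{{#!bash
python3 - <<'PY'
# minimal sketch of the external-delegate pattern used by classification.py
import numpy as np
import tflite_runtime.interpreter as tflite
from os.path import expanduser

# load the teflon delegate built earlier and attach it to the interpreter
delegate = tflite.load_delegate(expanduser('~/mesa/build/src/gallium/targets/teflon/libteflon.so'))
interpreter = tflite.Interpreter(
    model_path=expanduser('~/mesa/src/gallium/targets/teflon/tests/mobilenet_v1_1.0_224_quant.tflite'),
    experimental_delegates=[delegate])
interpreter.allocate_tensors()

# placeholder input: a zero-filled 224x224 RGB tensor (uint8 for this quantized model)
inp = interpreter.get_input_details()[0]
interpreter.set_tensor(inp['index'], np.zeros(inp['shape'], dtype=inp['dtype']))
interpreter.invoke()

out = interpreter.get_output_details()[0]
print('top class index:', int(np.argmax(interpreter.get_tensor(out['index']))))
PY
}}}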
- run tensorflow label_image example:
{{{#!bash
# we need to patch the label_image example to use tflite_runtime
cd ~/tensorflow
cat <<EOF |
diff --git a/tensorflow/lite/examples/python/label_image.py b/tensorflow/lite/examples/python/label_image.py
index d26454f921f..08c65962bf1 100644
--- a/tensorflow/lite/examples/python/label_image.py
+++ b/tensorflow/lite/examples/python/label_image.py
@@ -19,7 +19,7 @@ import time
 
 import numpy as np
 from PIL import Image
-import tensorflow as tf
+import tflite_runtime.interpreter as tflite
 
 
 def load_labels(filename):
@@ -85,7 +85,7 @@ if __name__ == '__main__':
         tflite.load_delegate(args.ext_delegate, ext_delegate_options)
     ]
 
-  interpreter = tf.lite.Interpreter(
+  interpreter = tflite.Interpreter(
       model_path=args.model_file,
       experimental_delegates=ext_delegate,
       num_threads=args.num_threads)

EOF
patch -p1
# without acceleration (175.993ms on imx8mp)
python3 ~/tensorflow/tensorflow/lite/examples/python/label_image.py \
  -m ~/mesa/src/gallium/targets/teflon/tests/mobilenet_v1_1.0_224_quant.tflite \
  -l ~/mesa/src/gallium/frontends/teflon/tests/labels_mobilenet_quant_v1_224.txt \
  -i ~/tensorflow/tensorflow/lite/examples/label_image/testdata/grace_hopper.bmp
# with acceleration (14.138ms on imx8mp)
python3 ~/tensorflow/tensorflow/lite/examples/python/label_image.py \
 -m ~/mesa/src/gallium/targets/teflon/tests/mobilenet_v1_1.0_224_quant.tflite \
 -l ~/mesa/src/gallium/frontends/teflon/tests/labels_mobilenet_quant_v1_224.txt \
 -i ~/tensorflow/tensorflow/lite/examples/label_image/testdata/grace_hopper.bmp \
 -e ~/mesa/build/src/gallium/targets/teflon/libteflon.so
}}}
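The CPU-only run above uses the interpreter's default threading. Since the patched label_image.py passes args.num_threads through to the interpreter, you can experiment with the thread count for a different CPU baseline (assuming the --num_threads option is present in your tensorflow checkout):
{{{#!bash
# CPU-only run with 4 threads for comparison against the NPU numbers
python3 ~/tensorflow/tensorflow/lite/examples/python/label_image.py \
 -m ~/mesa/src/gallium/targets/teflon/tests/mobilenet_v1_1.0_224_quant.tflite \
 -l ~/mesa/src/gallium/frontends/teflon/tests/labels_mobilenet_quant_v1_224.txt \
 -i ~/tensorflow/tensorflow/lite/examples/label_image/testdata/grace_hopper.bmp \
 --num_threads 4
}}}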