Changes between Version 16 and Version 17 of venice/npu


Timestamp: 08/19/2024 09:32:27 PM
Author: Tim Harvey
Comment: added section for upstream/mainline NPU support

[[Image(https://trac.gateworks.com/raw-attachment/wiki/venice/npu/gw74xx_npu_benchmark_new.png)]]

== NXP Yocto BSP
The easiest way to get started with the NPU is to use an image from the NXP BSP. This image contains the necessary libraries and kernel to interface the NPU with !TensorFlow without much configuration. You can either [[https://www.nxp.com/docs/en/user-guide/IMX_YOCTO_PROJECT_USERS_GUIDE.pdf | follow the guide to build their image]] or [[https://www.nxp.com/design/design-center/software/embedded-software/i-mx-software/embedded-linux-for-i-mx-applications-processors:IMXLINUX | download a pre-built one]] (recommended).

**NOTE**: In the scripts below, we disable PCIe as a temporary fix to prevent the NXP 6.6.3_1.0.0 kernel from hanging on boot. The hang is caused by a missing patch needed to work around a PCIe switch quirk on the IMX8MP; the specific patch from our kernel can be found [[https://github.com/Gateworks/linux-venice/commit/cf983e4a04eecb5be93af7b53cb10805ee448998|here]].

=== 1. Download the Gateworks Venice Rescue Image to removable media
Find which device corresponds to your flash drive/SD card (for example, {{{/dev/sdc}}}).
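One way to identify it is to list the block devices on the host PC (for example with {{{lsblk}}}) before and after inserting the media and see which device appears:
{{{#!bash
# list whole disks with size/model/transport; the removable media usually stands out by its size and 'usb'/'mmc' transport
lsblk -d -o NAME,SIZE,MODEL,TRAN
}}}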
     
== Upstream / Mainline NPU support
The IMX8MP has a !VeriSilicon NPU which comes from Vivante. Vivante supports it via their proprietary driver (/dev/galcore), but that is not an open-source driver, so it makes sense to want to support the NPU with its open-source equivalent, the etnaviv driver. Support for this has been added to the Linux 6.10 kernel.

The userspace library used for the GPU is Mesa, so it makes sense to support the NPU there as well; for !TensorFlow Lite, hardware acceleration is done through 'delegate' libraries. While support for the !VeriSilicon NPU made it into Mesa 24.1.0 via the teflon delegate, it does not yet support the IMX8MP, which has a slightly newer version of the !VeriSilicon NPU.

Therefore, to support the NPU you need to build a custom fork of Mesa that is being worked on by [https://blog.tomeuvizoso.net/ Tomeu Vizoso], with the work sponsored by [https://ideasonboard.com/ Ideas on Board].

On a Gateworks Venice board with an IMX8MP and an Ubuntu 22.04 (jammy) root filesystem:
- update kernel to v6.10 with NPU support
{{{#!bash
# update kernel
cd /tmp && wget https://dev.gateworks.com/venice/kernel/linux-venice-6.10.6.tar.xz && tar -C / -xvf linux*.tar.xz --keep-directory-symlink && reboot
}}}
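After the board reboots you can optionally confirm that the new kernel is running and that the etnaviv driver claimed the NPU (a quick sanity check; exact output will vary):
{{{#!bash
# show the running kernel version (should report 6.10.x)
uname -r
# look for the etnaviv driver probing the GPU/NPU cores
dmesg | grep -i etnaviv
# the NPU is exposed via a DRM render node (e.g. /dev/dri/renderD128, used by teflon below)
ls -l /dev/dri/
}}}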
- create a non-root user (always a good idea but not required for mesa)
{{{#!bash
# create user
USER=gateworks
useradd -s /bin/bash $USER
mkdir /home/$USER; chown $USER /home/$USER
# allow sudo for this user
echo "$USER ALL=(ALL) NOPASSWD: ALL" >> /etc/sudoers
# add render group (needed for some examples)
usermod -aG render $USER
# login as this user
su -l $USER
}}}
- install python virtualenv, which lets us avoid conflicts between package libraries
{{{#!bash
sudo apt update && sudo apt upgrade -y && sudo apt install -y python3-pip python3-venv
# create the "tflite1-env" virtualenv
python3 -m venv tflite1-env
# activate the venv (repeat for every new shell)
source tflite1-env/bin/activate
}}}
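With the virtualenv active, python3 and pip3 resolve to the copies inside tflite1-env, which is a quick way to confirm the environment is active:
{{{#!bash
# both should point into ~/tflite1-env while the venv is active
which python3 pip3
}}}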
- build mesa (fork with updated teflon support for imx8mp)
{{{#!bash
# install build deps
sudo apt install -y build-essential git cmake
sudo apt-get -y build-dep mesa
# get repo
git clone https://gitlab.freedesktop.org/tomeu/mesa.git -b etnaviv-imx8mp mesa
cd mesa
# mesa requires meson >= 1.1.0 (newer than jammy's), pycparser >= 2.20, and mako
pip3 install meson pycparser mako
~/tflite1-env/bin/meson setup build -Dgallium-drivers=etnaviv -Dvulkan-drivers= -Dteflon=true
~/tflite1-env/bin/meson compile -C build # 20 mins or so on imx8mp
ldd build/src/gallium/targets/teflon/libteflon.so
}}}
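Note that nothing is installed system-wide here; the delegate is used straight out of the build tree, and its path is what gets passed to !TensorFlow Lite with -e in the examples below:
{{{#!bash
# the freshly built delegate library; this is the path passed via -e below
ls -lh ~/mesa/build/src/gallium/targets/teflon/libteflon.so
}}}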
- install tensorflow lite runtime
{{{#!bash
# install tensorflow lite runtime
pip3 install tflite_runtime
# clone tensorflow for some examples and assets used below
cd
git clone https://github.com/tensorflow/tensorflow.git
}}}
- run teflon image classification test
{{{#!bash
# note this test requires python pillow, python numpy<2.0 and write access to /dev/dri/renderD128
pip3 install "numpy<2.0" pillow tflite_runtime # the test needs pillow; tflite_runtime needs numpy<2
groups # make sure you're in the render group (or have write access to /dev/dri/renderD128)
# without teflon (175.335ms on imx8mp)
python3 ~/mesa/src/gallium/frontends/teflon/tests/classification.py \
 -i ~/tensorflow/tensorflow/lite/examples/label_image/testdata/grace_hopper.bmp \
 -m ~/mesa/src/gallium/targets/teflon/tests/mobilenet_v1_1.0_224_quant.tflite \
 -l ~/mesa/src/gallium/frontends/teflon/tests/labels_mobilenet_quant_v1_224.txt
# with teflon (7.651ms on imx8mp)
python3 ~/mesa/src/gallium/frontends/teflon/tests/classification.py \
 -i ~/tensorflow/tensorflow/lite/examples/label_image/testdata/grace_hopper.bmp \
 -m ~/mesa/src/gallium/targets/teflon/tests/mobilenet_v1_1.0_224_quant.tflite \
 -l ~/mesa/src/gallium/frontends/teflon/tests/labels_mobilenet_quant_v1_224.txt \
 -e ~/mesa/build/src/gallium/targets/teflon/libteflon.so
}}}
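Under the hood classification.py simply loads libteflon.so as a !TensorFlow Lite ''external delegate'' and hands it to the interpreter. A minimal sketch of that same pattern is below (paths taken from the commands above; the zero-filled input is only a placeholder, so the printed class index is meaningless):
{{{#!bash
python3 - <<'PY'
# minimal sketch of the external-delegate pattern used by classification.py
import numpy as np
import tflite_runtime.interpreter as tflite
from os.path import expanduser

# load the teflon delegate built earlier and attach it to the interpreter
delegate = tflite.load_delegate(expanduser('~/mesa/build/src/gallium/targets/teflon/libteflon.so'))
interpreter = tflite.Interpreter(
    model_path=expanduser('~/mesa/src/gallium/targets/teflon/tests/mobilenet_v1_1.0_224_quant.tflite'),
    experimental_delegates=[delegate])
interpreter.allocate_tensors()

# placeholder input: a zero-filled 224x224 RGB tensor (uint8 for this quantized model)
inp = interpreter.get_input_details()[0]
interpreter.set_tensor(inp['index'], np.zeros(inp['shape'], dtype=inp['dtype']))
interpreter.invoke()

out = interpreter.get_output_details()[0]
print('top class index:', int(np.argmax(interpreter.get_tensor(out['index']))))
PY
}}}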
- run tensorflow label_image example:
{{{#!bash
# we need to patch the label_image example to use tflite_runtime
cd ~/tensorflow
cat <<EOF |
diff --git a/tensorflow/lite/examples/python/label_image.py b/tensorflow/lite/examples/python/label_image.py
index d26454f921f..08c65962bf1 100644
--- a/tensorflow/lite/examples/python/label_image.py
+++ b/tensorflow/lite/examples/python/label_image.py
@@ -19,7 +19,7 @@ import time
 
 import numpy as np
 from PIL import Image
-import tensorflow as tf
+import tflite_runtime.interpreter as tflite
 
 
 def load_labels(filename):
@@ -85,7 +85,7 @@ if __name__ == '__main__':
         tflite.load_delegate(args.ext_delegate, ext_delegate_options)
     ]
 
-  interpreter = tf.lite.Interpreter(
+  interpreter = tflite.Interpreter(
       model_path=args.model_file,
       experimental_delegates=ext_delegate,
       num_threads=args.num_threads)

EOF
patch -p1
# without acceleration (175.993ms on imx8mp)
python3 ~/tensorflow/tensorflow/lite/examples/python/label_image.py \
  -m ~/mesa/src/gallium/targets/teflon/tests/mobilenet_v1_1.0_224_quant.tflite \
  -l ~/mesa/src/gallium/frontends/teflon/tests/labels_mobilenet_quant_v1_224.txt \
  -i ~/tensorflow/tensorflow/lite/examples/label_image/testdata/grace_hopper.bmp
# with acceleration (14.138ms on imx8mp)
python3 ~/tensorflow/tensorflow/lite/examples/python/label_image.py \
 -m ~/mesa/src/gallium/targets/teflon/tests/mobilenet_v1_1.0_224_quant.tflite \
 -l ~/mesa/src/gallium/frontends/teflon/tests/labels_mobilenet_quant_v1_224.txt \
 -i ~/tensorflow/tensorflow/lite/examples/label_image/testdata/grace_hopper.bmp \
 -e ~/mesa/build/src/gallium/targets/teflon/libteflon.so
}}}
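The CPU-only run above uses the interpreter's default threading. Since the patched label_image.py passes args.num_threads through to the interpreter, you can experiment with the thread count for a different CPU baseline (assuming the --num_threads option is present in your tensorflow checkout):
{{{#!bash
# CPU-only run with 4 threads for comparison against the NPU numbers
python3 ~/tensorflow/tensorflow/lite/examples/python/label_image.py \
 -m ~/mesa/src/gallium/targets/teflon/tests/mobilenet_v1_1.0_224_quant.tflite \
 -l ~/mesa/src/gallium/frontends/teflon/tests/labels_mobilenet_quant_v1_224.txt \
 -i ~/tensorflow/tensorflow/lite/examples/label_image/testdata/grace_hopper.bmp \
 --num_threads 4
}}}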