Context Navigation

← Previous Change
Wiki History
Next Change →

Changes between Version 24 and Version 25 of venice/npu

Timestamp:: 02/26/2026 10:45:59 PM (19 hours ago)
Author:: Ryan Erbstoesser
Comment:: adjust wiki, note for mesa and yocto

Legend:

: Unmodified
: Added
: Removed
: Modified

venice/npu

-              v24
+              v25
 [[Image(https://trac.gateworks.com/raw-attachment/wiki/venice/npu/gw74xx_npu_benchmark_new.png)]]
-== NXP Yocto BSP
-The easiest way to get started with the NPU is to use a image from the NXP BSP. This image contains the necessary libraries and kernel to interface the NPU with !TensorFlow without much configuration. You can either [[https://www.nxp.com/docs/en/user-guide/IMX_YOCTO_PROJECT_USERS_GUIDE.pdf | follow the guide to build their image]] or [[https://www.nxp.com/design/design-center/software/embedded-software/i-mx-software/embedded-linux-for-i-mx-applications-processors:IMXLINUX | download a pre-built one]] (recommended).
-This guide assumes you have:
-- A Gateworks board with a i.MX8M Plus procesor.
-- A NXP account, which is necessary to download their image and models.
-- A >= 16GB flash drive, SD card, or other removable block storage to install a Rescue Image, NXP Image, and updated device trees (DTBs) onto the board.
-The steps are as generalized as possible to not depend on the boards available RAM to load an image, or the low speeds of JTAG uploading, as the .wic from NXP is >8GB. We will use a ramdisk to boot a "rescue image" fully in RAM, then use dd to write from the removable multimedia (flash drive) to the onboard eMMC (/dev/mmcblk2).
-**NOTE**: In the scripts below, we disable PCIe as a temporary fix to prevent the NXP 6.6.3_1.0.0 kernel from hanging on boot. This is caused by a missing patch necessary to work around a PCIe switch quirk when used on the IMX8MP, which can be found specifically [[https://github.com/Gateworks/linux-venice/commit/cf983e4a04eecb5be93af7b53cb10805ee448998|here]] from our kernel.
-=== 1. Download the Gateworks Venice Rescue Image to removable multimedia.
-Find which device your flash drive/SD card is. For example, {{{/dev/sdc}}}
-Then, use dd to flash this image onto your device byte-for-byte.
-{{{#!bash
-DEVICE=<your device, no trailing />
-wget https://dev.gateworks.com/buildroot/venice/minimal/rescue.img.gz
-zcat rescue.img.gz | sudo dd of=${DEVICE} bs=1M oflag=sync
-}}}
-In this guide, we will have the NXP image and our rescue image on the same drive, so we will resize the partition and file system to fit both. If you're using separate devices, this is not necessary.
-{{{#!bash
-sudo parted ${DEVICE} "resizepart 1 -0" # resize the partition to fit the size of the drive
-sudo resize2fs ${DEVICE}1 # resize ext fs on device partition 1
-}}}
-=== 2. Download the NXP BSP evaluation kit image to removable multimedia.
-On your host machine, install the Linux 6.6.3_1.0.0 image for the i.MX 8M Plus EVK [[https://www.nxp.com/design/design-center/software/embedded-software/i-mx-software/embedded-linux-for-i-mx-applications-processors:IMXLINUX | here]].
-In this download you will find imx-image-full-imx8mpevk.wic, which is a Yocto-generated image with all of the ML libraries.
-Copy this image to our device.
-{{{#!bash
-sudo mount ${DEVICE}1 /mnt
-sudo cp imx-image-full-imx8mpevk.wic /mnt/
-}}}
-=== 3. Patch & Build patch Venice DTBs from the Kernel source.
-Due to small inconsistencies between the NXP and Gateworks devicetrees for bleeding-edge peripherals, a patch is required until mainline compatibility is reached. The below script gets the patches from the attachments at the bottom of this page.
-{{{#!bash
-git clone https://github.com/nxp-imx/linux-imx -b lf-6.6.y
-cd linux-imx
-wget https://trac.gateworks.com/raw-attachment/wiki/venice/npu/0001-arm64-dts-imx8mp-venice-fix-USB_OC-pinmux.patch
-wget https://trac.gateworks.com/raw-attachment/wiki/venice/npu/0002-arm64-dts-imx8mm-venice-gw700x-remove-ddrc.patch
-wget https://trac.gateworks.com/raw-attachment/wiki/venice/npu/0003-arm64-dts-freescale-add-Gateworks-venice-board-dtbs.patch
-wget https://trac.gateworks.com/raw-attachment/wiki/venice/npu/0004-arm64-dts-imx8mp-venice-gw74xx-enable-gpu-nodes.patch
-patch -p1 < 0001-arm64-dts-imx8mp-venice-fix-USB_OC-pinmux.patch
-patch -p1 < 0002-arm64-dts-imx8mm-venice-gw700x-remove-ddrc.patch
-patch -p1 < 0003-arm64-dts-freescale-add-Gateworks-venice-board-dtbs.patch
-patch -p1 < 0004-arm64-dts-imx8mp-venice-gw74xx-enable-gpu-nodes.patch
-ARCH=arm64 make imx_v8_defconfig
-ARCH=arm64 make dtbs
-}}}
-Copy these patched dtbs to a directory on your flash such as {{{/nxp/}}}, as to not overwrite the ones necessary for booting into the rescue image.
-{{{#!bash
-sudo mkdir /mnt/nxp
-sudo cp arch/arm64/boot/dts/freescale/*venice*.dtb /mnt/nxp/
-}}}
-Now, the contents of the device should include:
-- Rescue image
-- Rescue image dtbs
-- rootfs.cpio.xz and boot.scr for booting Rescue Image
-- nxp/ with new, updated dtbs
-=== 4. Boot Rescue Image ramdisk on board
-Connect the removable multimedia, in our case a USB stick, to the board before powering. Many boards have built-in SD readers, which would change the device commands slightly.
-Connect serial console via JTAG and power on the board.
-Enter the U-Boot console by stopping autoboot.
-Sanity check: is the USB device properly detected?
-{{{#!bash
-usb start
-part list usb 0
-}}}
-This command should have an expected output like below
-{{{#!bash
-Partition Map for USB device 0  --   Partition Type: DOS
-Part    Start Sector    Num Sectors     UUID            Type
-     2048            204800          6fd772a2-01     83 Boot
-}}}
-Override the boot_targets variable temporarily to ensure booting into the Rescue Image, then boot into it. If you are not using usb0, run {{{print boot_targets}}} to see a list.
-{{{#!bash
-setenv boot_targets usb0
-run bootcmd
-}}}
-If all functions normally, you should be met with a login; login with root and you will enter the shell.
-=== Flash NXP .wic and patched DTBs onto eMMC
-You are now booted into the ramdisk rescue image. The next steps are to flash the .wic onto the emmc.
-For venice boards the emmc that we are imaging is {{{/dev/mmcblk2}}} and with only one removable storage device your rescue image with be {{{/dev/sda}}}.
-Image the emmc as followes:
-{{{#!bash
-mkdir /mnt/src
-mkdir /mnt/dst
-mount /dev/sda1 /mnt/src
-dd if=/mnt/src/imx-image-full-imx8mpevk.wic of=/dev/mmcblk2 bs=16M oflag=sync # this will take a couple of minutes
-mount /dev/mmcblk2p1 /mnt/dst
-cp /mnt/src/*.dtb /mnt/dst/
-cp /mnt/src/nxp/*.dtb /mnt/dst/
-}}}
-This flashes the prebuilt .wic image (both partitions, the kernel and fs) to our eMMC, then also brings over the old and new device trees. Next, we will create the boot script. If the below doesn't copy right, the file can be created/edited in a text editor like vi; just remove the EOF line.
-{{{#!bash
-cat <<\EOF > boot.scr.txt
-setenv bootargs 'root=/dev/mmcblk2p2'
-load mmc 2:1 $kernel_addr_r Image
-setenv fdt_addr
-setenv fdt_list $fdt_file $fdt_file1 $fdt_file2 $fdt_file3 $fdt_file4 $fdt_file5
-setenv load_fdt 'echo Loading $fdt...; load ${devtype} ${devnum}:${distro_bootpart} ${fdt_addr_r} ${prefix}${fdt} && setenv fdt_addr ${fdt_addr_r}'
-for fdt in ${fdt_list}; do if test -e ${devtype} ${devnum}:${distro_bootpart} ${prefix}${fdt}; then run load_fdt; fi; done
-if test -z "$fdt_addr"; then echo "Warning: Using bootloader DTB"; setenv fdt_addr $fdtcontroladdr; fi
-#Disables PCI; patch is needed, otherwise kernel hangs: See note at start of wiki page.
-fdt addr $fdt_addr_r && fdt resize && fdt set /soc@0/pcie@33800000 status disabled
-booti $kernel_addr_r - $fdt_addr_r
-EOF
-}}}
-'Compile' the boot script txt and flash it onto the MMC
-{{{#!bash
-mkimage -A arm64 -T script -C none -d boot.scr.txt /mnt/dst/boot.scr
-umount /mnt/dst
-umount /mnt/src
-}}}
-=== Boot into the NXP image.
-Power cycle the board, and note the kernel which it boots into. The board should automatically boot into the eMMC image we just flashed, meaning the removable multimedia need not be connected.
-If there is an error, look at the logs and the boot scripts in U-Boot.
-At this point, all features regarding the Kernel and below are properly enabled. If you have an application that uses !TensorFlow, it will run on the NPU or GPU using {{{/usr/lib/libvx_delegate.so}}}. Follow the [[https://www.nxp.com/docs/en/user-guide/IMX-MACHINE-LEARNING-UG.pdf | NXP Machine Learning User's Guide]] for more information.
-=== Image Classification Example
-As per the [[https://www.nxp.com/docs/en/user-guide/IMX-MACHINE-LEARNING-UG.pdf | NXP Machine Learning User's Guide]], we will test a simple image labeling script on both the CPU and NPU.
-{{{#!bash
-$ cd /usr/bin/tensorflow-lite-2.15.0/examples
-$ python3 label_image.py # without NPU acceleration
-$ python3 label_image.py -e /usr/lib/libvx_delegate.so # with NPU accerlation via the libvx_delegate external TensorFlow delegate
-}}}
-Result from either label_image script:
-{{{#!bash
-.878431: military uniform
-.027451: Windsor tie
-.011765: mortarboard
-.011765: bulletproof vest
-.007843: sax
-}}}
-Without the NPU: {{{Image Classification time: 170.5 ms}}}
-With the NPU: {{{Image Classification: 3.2 ms}}}
-Without considering the warmup times, this is a >**98% speedup**! For every CPU frame, the NPU can process 53.
-This data is derived from {{{classifications/sec = 1/(image classification time)}}}
-[[Image(https://trac.gateworks.com/raw-attachment/wiki/venice/npu/gw74xx_npu_benchmark_new.png)]]
-=== GStreamer Example for Detection
-Section 8.2 of the Machine Learning Users Guide details this process, such as how to download the necessary models. After following the download steps, the {{{home/root/nxp-nnstreamer-examples/}}} directory on your board should have a {{{downloads}}} directory with {{{models}}} and {{{media}}} directories. If not, you need to run the update script on your host to compile the models and scp them to the board.
-On your host, execute the following command to have GStreamer take in video over UDP.
-{{{ gst-launch-1.0 udpsrc port=5000 ! application/x-rtp,payload=96 ! rtpjpegdepay ! jpegdec ! autovideosink }}}
-[[Image(https://trac.gateworks.com/raw-attachment/wiki/venice/npu/hostpipeline.svg, width=1200)]]
-\\Host GStreamer pipeline (SVG)
-On your board, execute the following to send a stream over UDP to the host port 5000. This script was derived from Section 8.1 of the Machine Learning Users Guide. The GStreamer command takes in a video input and overlays both bounding boxes and labels on it using !TensorFlow and NXP filters.
-{{{#!bash
-CAMERA= <your camera device, such as /dev/video2>
-HOST_IP= <desktop ip addr>
-gst-launch-1.0 v4l2src name=cam_src device=${CAMERA} num-buffers=-1
-! video/x-raw,width=640,height=480,framerate=30/1
-! tee name=t t.
-! queue name=thread-nn max-size-buffers=2 leaky=2 ! imxvideoconvert_g2d
-! video/x-raw,width=300,height=300,format=RGBA ! videoconvert
-! video/x-raw,format=RGB ! tensor_converter
-! tensor_filter framework=tensorflow-lite model=/home/root/nxp-nnstreamer-examples/detection/../downloads/models/detection/ssdlite_mobilenet_v2_coco_quant_uint8_float32_no_postprocess.tflite custom=Delegate:External,ExtDelegateLib:libvx_delegate.so
-! tensor_decoder mode=bounding_boxes option1=mobilenet-ssd option2=/home/root/nxp-nnstreamer-examples/detection/../downloads/models/detection/coco_labels_list.txt option3=/home/root/nxp-nnstreamer-examples/detection/../downloads/models/detection/box_priors.txt option4=640:480 option5=300:300
-! videoconvert ! queue
-! mix. t.
-! queue name=thread-img max-size-buffers=2 leaky=2 ! videoconvert
-! mix. imxcompositor_g2d name=mix latency=30000000 min-upstream-latency=30000000 sink_0::zorder=2 sink_1::zorder=1
-! videoconvert ! jpegenc ! rtpjpegpay ! udpsink host=${HOST_IP} port=5000
-}}}
-[[Image(https://trac.gateworks.com/raw-attachment/wiki/venice/npu/pipeline.svg, width=1200)]]
-\\GW74xx AI-detection GStreamer pipeline (SVG)
-If everything works properly, you should instantly see your video input streamed to your desktop host. After a few seconds of warming up, the bounding boxes from the [[https://nnstreamer.github.io/gst/nnstreamer/README.html | TensorFlow Detection filter]] will be overlaid on the video. The stream properties can be changed for different resolutions and framerates; see [[https://trac.gateworks.com/wiki/Yocto/gstreamer/streaming | gstreamer/streaming]]. NOTE: This example is object detection, which differs from the image classification that we got benchmark data from in the previous section.
-[[Image(https://trac.gateworks.com/raw-attachment/wiki/venice/npu/imx8mp_border.png)]]
 …
 == Other AI Chipsets and Solutions
+Please note other AI accelerators can also be added via expansion slots described [wiki:TPU here]
+Please note other AI accelerators can also be added via expansion slots described [wiki:TPU here]
+== NXP Yocto BSP
+Another option to explore the NPU is to use an image from the NXP BSP.
+'''NOTE''' Gateworks typically uses a mainline kernel and Ubuntu, etc. If using the NXP image, it is a completely different eco-system that will require a lot of learning, etc. It may be handy to explore the NPU, but for a larger project, the mainline MESA option above may be better.
+This NXP image contains the necessary libraries and kernel to interface the NPU with !TensorFlow without much configuration. You can either [[https://www.nxp.com/docs/en/user-guide/IMX_YOCTO_PROJECT_USERS_GUIDE.pdf | follow the guide to build their image]] or [[https://www.nxp.com/design/design-center/software/embedded-software/i-mx-software/embedded-linux-for-i-mx-applications-processors:IMXLINUX | download a pre-built one]] (recommended).
+This guide assumes you have:
+- A Gateworks board with a i.MX8M Plus procesor.
+- A NXP account, which is necessary to download their image and models.
+- A >= 16GB flash drive, SD card, or other removable block storage to install a Rescue Image, NXP Image, and updated device trees (DTBs) onto the board.
+The steps are as generalized as possible to not depend on the boards available RAM to load an image, or the low speeds of JTAG uploading, as the .wic from NXP is >8GB. We will use a ramdisk to boot a "rescue image" fully in RAM, then use dd to write from the removable multimedia (flash drive) to the onboard eMMC (/dev/mmcblk2).
+**NOTE**: In the scripts below, we disable PCIe as a temporary fix to prevent the NXP 6.6.3_1.0.0 kernel from hanging on boot. This is caused by a missing patch necessary to work around a PCIe switch quirk when used on the IMX8MP, which can be found specifically [[https://github.com/Gateworks/linux-venice/commit/cf983e4a04eecb5be93af7b53cb10805ee448998|here]] from our kernel.
+=== 1. Download the Gateworks Venice Rescue Image to removable multimedia.
+Find which device your flash drive/SD card is. For example, {{{/dev/sdc}}}
+Then, use dd to flash this image onto your device byte-for-byte.
+{{{#!bash
+DEVICE=<your device, no trailing />
+wget https://dev.gateworks.com/buildroot/venice/minimal/rescue.img.gz
+zcat rescue.img.gz | sudo dd of=${DEVICE} bs=1M oflag=sync
+}}}
+In this guide, we will have the NXP image and our rescue image on the same drive, so we will resize the partition and file system to fit both. If you're using separate devices, this is not necessary.
+{{{#!bash
+sudo parted ${DEVICE} "resizepart 1 -0" # resize the partition to fit the size of the drive
+sudo resize2fs ${DEVICE}1 # resize ext fs on device partition 1
+}}}
+=== 2. Download the NXP BSP evaluation kit image to removable multimedia.
+On your host machine, install the Linux 6.6.3_1.0.0 image for the i.MX 8M Plus EVK [[https://www.nxp.com/design/design-center/software/embedded-software/i-mx-software/embedded-linux-for-i-mx-applications-processors:IMXLINUX | here]].
+In this download you will find imx-image-full-imx8mpevk.wic, which is a Yocto-generated image with all of the ML libraries.
+Copy this image to our device.
+{{{#!bash
+sudo mount ${DEVICE}1 /mnt
+sudo cp imx-image-full-imx8mpevk.wic /mnt/
+}}}
+=== 3. Patch & Build patch Venice DTBs from the Kernel source.
+Due to small inconsistencies between the NXP and Gateworks devicetrees for bleeding-edge peripherals, a patch is required until mainline compatibility is reached. The below script gets the patches from the attachments at the bottom of this page.
+{{{#!bash
+git clone https://github.com/nxp-imx/linux-imx -b lf-6.6.y
+cd linux-imx
+wget https://trac.gateworks.com/raw-attachment/wiki/venice/npu/0001-arm64-dts-imx8mp-venice-fix-USB_OC-pinmux.patch
+wget https://trac.gateworks.com/raw-attachment/wiki/venice/npu/0002-arm64-dts-imx8mm-venice-gw700x-remove-ddrc.patch
+wget https://trac.gateworks.com/raw-attachment/wiki/venice/npu/0003-arm64-dts-freescale-add-Gateworks-venice-board-dtbs.patch
+wget https://trac.gateworks.com/raw-attachment/wiki/venice/npu/0004-arm64-dts-imx8mp-venice-gw74xx-enable-gpu-nodes.patch
+patch -p1 < 0001-arm64-dts-imx8mp-venice-fix-USB_OC-pinmux.patch
+patch -p1 < 0002-arm64-dts-imx8mm-venice-gw700x-remove-ddrc.patch
+patch -p1 < 0003-arm64-dts-freescale-add-Gateworks-venice-board-dtbs.patch
+patch -p1 < 0004-arm64-dts-imx8mp-venice-gw74xx-enable-gpu-nodes.patch
+ARCH=arm64 make imx_v8_defconfig
+ARCH=arm64 make dtbs
+}}}
+Copy these patched dtbs to a directory on your flash such as {{{/nxp/}}}, as to not overwrite the ones necessary for booting into the rescue image.
+{{{#!bash
+sudo mkdir /mnt/nxp
+sudo cp arch/arm64/boot/dts/freescale/*venice*.dtb /mnt/nxp/
+}}}
+Now, the contents of the device should include:
+- Rescue image
+- Rescue image dtbs
+- rootfs.cpio.xz and boot.scr for booting Rescue Image
+- nxp/ with new, updated dtbs
+=== 4. Boot Rescue Image ramdisk on board
+Connect the removable multimedia, in our case a USB stick, to the board before powering. Many boards have built-in SD readers, which would change the device commands slightly.
+Connect serial console via JTAG and power on the board.
+Enter the U-Boot console by stopping autoboot.
+Sanity check: is the USB device properly detected?
+{{{#!bash
+usb start
+part list usb 0
+}}}
+This command should have an expected output like below
+{{{#!bash
+Partition Map for USB device 0  --   Partition Type: DOS
+Part    Start Sector    Num Sectors     UUID            Type
+     2048            204800          6fd772a2-01     83 Boot
+}}}
+Override the boot_targets variable temporarily to ensure booting into the Rescue Image, then boot into it. If you are not using usb0, run {{{print boot_targets}}} to see a list.
+{{{#!bash
+setenv boot_targets usb0
+run bootcmd
+}}}
+If all functions normally, you should be met with a login; login with root and you will enter the shell.
+=== Flash NXP .wic and patched DTBs onto eMMC
+You are now booted into the ramdisk rescue image. The next steps are to flash the .wic onto the emmc.
+For venice boards the emmc that we are imaging is {{{/dev/mmcblk2}}} and with only one removable storage device your rescue image with be {{{/dev/sda}}}.
+Image the emmc as followes:
+{{{#!bash
+mkdir /mnt/src
+mkdir /mnt/dst
+mount /dev/sda1 /mnt/src
+dd if=/mnt/src/imx-image-full-imx8mpevk.wic of=/dev/mmcblk2 bs=16M oflag=sync # this will take a couple of minutes
+mount /dev/mmcblk2p1 /mnt/dst
+cp /mnt/src/*.dtb /mnt/dst/
+cp /mnt/src/nxp/*.dtb /mnt/dst/
+}}}
+This flashes the prebuilt .wic image (both partitions, the kernel and fs) to our eMMC, then also brings over the old and new device trees. Next, we will create the boot script. If the below doesn't copy right, the file can be created/edited in a text editor like vi; just remove the EOF line.
+{{{#!bash
+cat <<\EOF > boot.scr.txt
+setenv bootargs 'root=/dev/mmcblk2p2'
+load mmc 2:1 $kernel_addr_r Image
+setenv fdt_addr
+setenv fdt_list $fdt_file $fdt_file1 $fdt_file2 $fdt_file3 $fdt_file4 $fdt_file5
+setenv load_fdt 'echo Loading $fdt...; load ${devtype} ${devnum}:${distro_bootpart} ${fdt_addr_r} ${prefix}${fdt} && setenv fdt_addr ${fdt_addr_r}'
+for fdt in ${fdt_list}; do if test -e ${devtype} ${devnum}:${distro_bootpart} ${prefix}${fdt}; then run load_fdt; fi; done
+if test -z "$fdt_addr"; then echo "Warning: Using bootloader DTB"; setenv fdt_addr $fdtcontroladdr; fi
+#Disables PCI; patch is needed, otherwise kernel hangs: See note at start of wiki page.
+fdt addr $fdt_addr_r && fdt resize && fdt set /soc@0/pcie@33800000 status disabled
+booti $kernel_addr_r - $fdt_addr_r
+EOF
+}}}
+'Compile' the boot script txt and flash it onto the MMC
+{{{#!bash
+mkimage -A arm64 -T script -C none -d boot.scr.txt /mnt/dst/boot.scr
+umount /mnt/dst
+umount /mnt/src
+}}}
+=== Boot into the NXP image.
+Power cycle the board, and note the kernel which it boots into. The board should automatically boot into the eMMC image we just flashed, meaning the removable multimedia need not be connected.
+If there is an error, look at the logs and the boot scripts in U-Boot.
+At this point, all features regarding the Kernel and below are properly enabled. If you have an application that uses !TensorFlow, it will run on the NPU or GPU using {{{/usr/lib/libvx_delegate.so}}}. Follow the [[https://www.nxp.com/docs/en/user-guide/IMX-MACHINE-LEARNING-UG.pdf | NXP Machine Learning User's Guide]] for more information.
+=== Image Classification Example
+As per the [[https://www.nxp.com/docs/en/user-guide/IMX-MACHINE-LEARNING-UG.pdf | NXP Machine Learning User's Guide]], we will test a simple image labeling script on both the CPU and NPU.
+{{{#!bash
+$ cd /usr/bin/tensorflow-lite-2.15.0/examples
+$ python3 label_image.py # without NPU acceleration
+$ python3 label_image.py -e /usr/lib/libvx_delegate.so # with NPU accerlation via the libvx_delegate external TensorFlow delegate
+}}}
+Result from either label_image script:
+{{{#!bash
+.878431: military uniform
+.027451: Windsor tie
+.011765: mortarboard
+.011765: bulletproof vest
+.007843: sax
+}}}
+Without the NPU: {{{Image Classification time: 170.5 ms}}}
+With the NPU: {{{Image Classification: 3.2 ms}}}
+Without considering the warmup times, this is a >**98% speedup**! For every CPU frame, the NPU can process 53.
+This data is derived from {{{classifications/sec = 1/(image classification time)}}}
+[[Image(https://trac.gateworks.com/raw-attachment/wiki/venice/npu/gw74xx_npu_benchmark_new.png)]]
+=== GStreamer Example for Detection
+Section 8.2 of the Machine Learning Users Guide details this process, such as how to download the necessary models. After following the download steps, the {{{home/root/nxp-nnstreamer-examples/}}} directory on your board should have a {{{downloads}}} directory with {{{models}}} and {{{media}}} directories. If not, you need to run the update script on your host to compile the models and scp them to the board.
+On your host, execute the following command to have GStreamer take in video over UDP.
+{{{ gst-launch-1.0 udpsrc port=5000 ! application/x-rtp,payload=96 ! rtpjpegdepay ! jpegdec ! autovideosink }}}
+[[Image(https://trac.gateworks.com/raw-attachment/wiki/venice/npu/hostpipeline.svg, width=1200)]]
+\\Host GStreamer pipeline (SVG)
+On your board, execute the following to send a stream over UDP to the host port 5000. This script was derived from Section 8.1 of the Machine Learning Users Guide. The GStreamer command takes in a video input and overlays both bounding boxes and labels on it using !TensorFlow and NXP filters.
+{{{#!bash
+CAMERA= <your camera device, such as /dev/video2>
+HOST_IP= <desktop ip addr>
+gst-launch-1.0 v4l2src name=cam_src device=${CAMERA} num-buffers=-1
+! video/x-raw,width=640,height=480,framerate=30/1
+! tee name=t t.
+! queue name=thread-nn max-size-buffers=2 leaky=2 ! imxvideoconvert_g2d
+! video/x-raw,width=300,height=300,format=RGBA ! videoconvert
+! video/x-raw,format=RGB ! tensor_converter
+! tensor_filter framework=tensorflow-lite model=/home/root/nxp-nnstreamer-examples/detection/../downloads/models/detection/ssdlite_mobilenet_v2_coco_quant_uint8_float32_no_postprocess.tflite custom=Delegate:External,ExtDelegateLib:libvx_delegate.so
+! tensor_decoder mode=bounding_boxes option1=mobilenet-ssd option2=/home/root/nxp-nnstreamer-examples/detection/../downloads/models/detection/coco_labels_list.txt option3=/home/root/nxp-nnstreamer-examples/detection/../downloads/models/detection/box_priors.txt option4=640:480 option5=300:300
+! videoconvert ! queue
+! mix. t.
+! queue name=thread-img max-size-buffers=2 leaky=2 ! videoconvert
+! mix. imxcompositor_g2d name=mix latency=30000000 min-upstream-latency=30000000 sink_0::zorder=2 sink_1::zorder=1
+! videoconvert ! jpegenc ! rtpjpegpay ! udpsink host=${HOST_IP} port=5000
+}}}
+[[Image(https://trac.gateworks.com/raw-attachment/wiki/venice/npu/pipeline.svg, width=1200)]]
+\\GW74xx AI-detection GStreamer pipeline (SVG)
+If everything works properly, you should instantly see your video input streamed to your desktop host. After a few seconds of warming up, the bounding boxes from the [[https://nnstreamer.github.io/gst/nnstreamer/README.html | TensorFlow Detection filter]] will be overlaid on the video. The stream properties can be changed for different resolutions and framerates; see [[https://trac.gateworks.com/wiki/Yocto/gstreamer/streaming | gstreamer/streaming]]. NOTE: This example is object detection, which differs from the image classification that we got benchmark data from in the previous section.
+[[Image(https://trac.gateworks.com/raw-attachment/wiki/venice/npu/imx8mp_border.png)]]