Changes between Version 5 and Version 6 of venice/npu


Timestamp:
07/26/2024 07:02:22 PM
Author:
Blake Stewart
  • venice/npu

    v5 v6  
    55 * Any GW71xx, GW72xx and GW73xx using a GW702x SOM module will use the i.MX8M Plus processor
    66
    7 The NPU operates at up to 2.25 TOPS.
    8 
    9 [[Image(https://i.imgur.com/Jw1JTHp.png)]]
    10 
    11 The easiest way to get started with the NPU is to use an image from the NXP BSP. This image contains the necessary libraries and kernel to interface with the NPU without much configuration. You can either [[https://www.nxp.com/docs/en/user-guide/IMX_YOCTO_PROJECT_USERS_GUIDE.pdf | follow the guide to build their image]] or [[https://www.nxp.com/design/design-center/software/embedded-software/i-mx-software/embedded-linux-for-i-mx-applications-processors:IMXLINUX | download a pre-built one]] (recommended).
     7The NPU operates at up to 2.25 TOPS. This makes NPU-equipped Gateworks boards well suited to edge AI applications out of the box.
     8
     9[[Image(https://trac.gateworks.com/raw-attachment/wiki/venice/npu/gw74xx_npu_benchmark.png)]]
     10
     11The easiest way to get started with the NPU is to use an image from the NXP BSP. This image contains the necessary libraries and kernel to interface the NPU with TensorFlow without much configuration. You can either [[https://www.nxp.com/docs/en/user-guide/IMX_YOCTO_PROJECT_USERS_GUIDE.pdf | follow the guide to build their image]] or [[https://www.nxp.com/design/design-center/software/embedded-software/i-mx-software/embedded-linux-for-i-mx-applications-processors:IMXLINUX | download a pre-built one]] (recommended).
    1212
    1313This guide assumes you have:
     
    1616- A >= 16GB flash drive, SD card, or other removable block storage to install a Rescue Image, NXP Image, and updated device trees (DTBs) onto the board.
    1717
     18The steps are kept as general as possible so that they depend neither on the board's available RAM to load an image nor on the slow speed of JTAG uploads, since the .wic from NXP is >8GB. We will use a ramdisk to boot a "rescue image" entirely in RAM, then use dd to write the image from the removable media (flash drive) to the onboard eMMC (/dev/mmcblk2).
     19
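The final copy step described above can be sketched as a small shell helper. This is only a minimal sketch under stated assumptions: the function name `write_image` is hypothetical, and it assumes the NXP image is already available as a raw .wic file on the mounted flash drive.

```shell
# Hypothetical helper: copy a raw image file onto a target device with dd.
#   $1 = source image (assumed: the NXP .wic on the mounted flash drive)
#   $2 = destination (e.g. /dev/mmcblk2, the onboard eMMC named above)
write_image() {
    # Refuse to start if the source image cannot be read.
    [ -r "$1" ] || { echo "cannot read $1" >&2; return 1; }
    # bs=4M copies in large blocks; conv=fsync flushes the written data
    # to the target before dd exits.
    dd if="$1" of="$2" bs=4M conv=fsync
}
```

From the rescue image this might be invoked as `write_image <path-to-image>.wic /dev/mmcblk2`, where the image path depends on where the flash drive is mounted. dd overwrites the target without asking for confirmation, so check the device name before running it.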
    1820== Getting Started with the NPU
    1921=== 1. Download the Gateworks Venice Rescue Image to removable media.
     
    4547
    4648=== 3. Patch & Build Venice DTBs from the Kernel source.
    47 Due to small inconsistencies between the NXP and Gateworks devicetrees for bleeding-edge peripherals, a patch is required until mainline compatibility is reached.
     49Due to small inconsistencies between the NXP and Gateworks devicetrees for bleeding-edge peripherals, a patch is required until mainline compatibility is reached. The script below downloads the patches from the attachments at the bottom of this page.
    4850
    4951{{{
    5052git clone https://github.com/nxp-imx/linux-imx -b lf-6.6.y
    5153cd linux-imx
    52 wget <patches>
     54wget https://trac.gateworks.com/raw-attachment/wiki/venice/npu/0001-arm64-dts-imx8mp-venice-fix-USB_OC-pinmux.patch
     55wget https://trac.gateworks.com/raw-attachment/wiki/venice/npu/0002-arm64-dts-imx8mm-venice-gw700x-remove-ddrc.patch
     56wget https://trac.gateworks.com/raw-attachment/wiki/venice/npu/0003-arm64-dts-freescale-add-Gateworks-venice-board-dtbs.patch
     57wget https://trac.gateworks.com/raw-attachment/wiki/venice/npu/0004-arm64-dts-imx8mp-venice-gw74xx-enable-gpu-nodes.patch
    5358patch -p1 < 0001-arm64-dts-imx8mp-venice-fix-USB_OC-pinmux.patch
    5459patch -p1 < 0002-arm64-dts-imx8mm-venice-gw700x-remove-ddrc.patch
     
    177182Excluding warmup times, this is a >**98% reduction** in inference time! For every frame the CPU processes, the NPU can process 53.
    178183
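The two figures above agree if the percentage is read as the reduction in per-frame inference time, which is an interpretation rather than something the benchmark output states directly. A throughput ratio of r frames per CPU frame corresponds to a time reduction of (1 - 1/r) * 100 percent. The helper name `time_reduction_pct` below is hypothetical.

```shell
# Convert a throughput ratio (NPU frames per CPU frame) into the
# equivalent percentage reduction in per-frame time: (1 - 1/ratio) * 100.
time_reduction_pct() {
    awk -v r="$1" 'BEGIN { printf "%.1f\n", (1 - 1 / r) * 100 }'
}

time_reduction_pct 53   # prints 98.1
```

With the ratio of 53 quoted above, the result is just over 98%.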
    179 [[Image(https://i.imgur.com/Jw1JTHp.png)]]
     184[[Image(https://trac.gateworks.com/raw-attachment/wiki/venice/npu/gw74xx_npu_benchmark.png)]]
    180185
    181186=== GStreamer Example
     
    195200If everything works properly, you should instantly see your video input streamed to your desktop host. After a few seconds of warming up, the bounding boxes from the [[https://nnstreamer.github.io/gst/nnstreamer/README.html | TensorFlow filter]] will be overlaid on the video. The stream properties can be changed for different resolutions and framerates; see [[https://trac.gateworks.com/wiki/Yocto/gstreamer/streaming | gstreamer/streaming]].
    196201
    197 [[Image(https://i.imgur.com/7KK4Wo8.png)]]
     202[[Image(https://trac.gateworks.com/raw-attachment/wiki/venice/npu/imx8mp_border.png)]]
    198203
    199204