wiki:ventana/PCIe

Version 7 (modified by Ryan Erbstoesser, 3 months ago)


See also:

Ventana PCI/PCIe Support

PCIe Pinout

The PCIe pins can be found in the User Manual.

Note that Gateworks prefers to adhere to the industry standard for pin usage.

Resource Limits

The i.MX6 CPU has an internal address translation unit (iATU) that connects the i.MX6 PCI host controller to the memory bus. This iATU window size imposes a resource limit which can ultimately limit the number of PCI devices you can have on the bus. The iATU window is 16MB which can technically be broken up in a variety of ways but later kernels (v4.x) by default use it as:

  • 512KB config space
  • 64KB io space
  • 15MB mem space available for devices

These ranges are defined in the device tree file imx6qdl.dtsi under the pcie node. The last entry on each line dictates the size of the range.

pcie: pcie@0x01000000 {
           compatible = "fsl,imx6q-pcie", "snps,dw-pcie";
           ...
           ranges = <0x81000000 0 0          0x01f80000 0 0x00010000   /* downstream I/O (64KB) */
                     0x82000000 0 0x01000000 0x01000000 0 0x00f00000>; /* non-prefetchable memory (15MB) */
           ...
}
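For reference, the sizes encoded in the ranges property above can be decoded with a few lines of Python (a standalone sketch, not part of the BSP):

```python
# Decode the size fields from the imx6qdl.dtsi pcie 'ranges' property.
# Each entry ends with a 64-bit size split across two 32-bit cells; the
# high cell is 0 here, so the low cell is the region size in bytes.
io_size = 0x00010000    # downstream I/O size cell
mem_size = 0x00f00000   # non-prefetchable memory size cell

print("I/O window:", io_size // 1024, "KB")            # 64 KB
print("Mem window:", mem_size // (1024 * 1024), "MB")  # 15 MB
```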

PCI devices can request one or more I/O regions and one or more mem regions. However, when devices sit behind a PCI bridge (as they do on the GW52xx, GW53xx, and GW54xx), each resource request must pass through the bridge, which imposes a 1MB granularity on mem regions. On the GW52xx, GW53xx, and GW54xx, each PCIe socket is behind a bridge and thus has this 1MB granularity. The upstream port of the PCIe switch takes a mem resource itself, which leaves 14 more 1MB windows available.

The outcome is complex and is likely best explained with a series of examples of what is possible. Remember that results may vary depending on BSP, kernel version, and specific radio models. The following examples use various hardware combinations of:

  • Baseboards:
    • GW54xx - 2 mem windows used by baseboard (1 for PCIe switch, 1 for eth1 GigE)
    • GW53xx - 2 mem windows used by baseboard (1 for PCIe switch, 1 for eth1 GigE)
    • GW52xx - 1 mem windows used by baseboard (1 for PCIe switch)
    • GW51xx - 0 mem windows used by baseboard
  • Expansion Mezzanines:
    • GW16081 PCIe expansion mezz - 1 mem window used by PCIe bridge upstream port
    • GW16082 PCI expansion mezz - 1 mem window used by PCIe to PCI bridge upstream port
  • Various WiFi Radios:
    • WLE300 802.11n 3x3 MIMO radio - 2 mem windows required
    • SR71e 802.11n 2x2 MIMO radio - 1 mem window required
    • DNMA H-5 802.11abg radio - 1 mem window required
    • WLE900 802.11ac 3x3 MIMO radio - 3 mem windows (one 2MB region and one 1MB region; the 2MB region consumes two 1MB windows)
      • Note that the ath10k driver/firmware may also request coherent memory from the kernel's atomic coherent pool, and may require you to increase that pool via the 'coherent_pool' kernel command-line parameter if you encounter allocation errors when using multiple radios:
      •  setenv extra 'coherent_pool=4M'
        
      • Depending on the card(s) and mode(s) you're using this value can change, so 4M is a very safe bet (the default is currently 256K). To verify that the kernel got the new setting, run cat /proc/cmdline and you should see coherent_pool=4M there.
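The command-line check described above can also be scripted; this illustrative Python sketch parses a sample command line (on a live board you would read the contents of /proc/cmdline instead of the hard-coded string):

```python
# Check whether coherent_pool made it onto the kernel command line.
# Sample string for illustration; read /proc/cmdline on a live board.
cmdline = "console=ttymxc1,115200 root=/dev/mmcblk0p1 coherent_pool=4M"

# Build a dict of key=value kernel parameters.
opts = dict(kv.split("=", 1) for kv in cmdline.split() if "=" in kv)
print("coherent_pool:", opts.get("coherent_pool", "not set (256K default)"))
```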

Disclaimer: Correctness of configurations cannot be easily predicted, and you should always verify compatibility yourself.

Other configurations are possible; for example, spreading PCIe devices across a couple of GW16081 mezzanines leaves room for many cellular radios (which use USB, not PCIe). The basic rules can be summarized as follows:

  • Most Atheros radios seem to require 1 (e.g. SR71e, Option GTM671WFS), but some (e.g. the WLE300 with ath9k) require 2 or more
  • Each PCIe switch requires 1 (e.g. the GW54xx/GW53xx/GW52xx have one on-board; add another if you have a GW16081 mezz)
  • A 2nd onboard eth1 GigE requires 1
  • The PCIe-to-PCI bridge (GW16082 mezz) requires 1, but has the unique property that everything behind it fits into that 1 resource regardless of radio
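As an illustration only, the rules above can be tallied programmatically. The configuration below is a hypothetical example; per the disclaimer, actual compatibility must still be verified on hardware:

```python
# Tally 1MB mem windows against the 15 available on i.MX6 4.x kernels.
# Window counts per device come from the rules listed in this section.
TOTAL_WINDOWS = 15

config = {
    "GW54xx on-board PCIe switch": 1,
    "GW54xx eth1 GigE": 1,
    "GW16081 mezz PCIe switch": 1,
    "WLE300 radio": 2,
    "SR71e radio": 1,
}

used = sum(config.values())
print("windows used:", used, "of", TOTAL_WINDOWS)
assert used <= TOTAL_WINDOWS, "configuration exceeds available mem windows"
```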

Notes:

  • The above examples refer to the PCIe host controller driver used in the OpenWrt (3.10+) kernel. Ubuntu with a 4.13.0 kernel was also used for testing. Beware that the 3.0.35 kernel used for our Yocto and Android BSPs reserves a 14MB mem resource window, which leaves one less region and affects the examples above.

Identifying Allocation Failures

Failures to allocate PCI resources manifest in a few different ways depending on BSP, kernel version, and model number. A failure to map devices behind the PCI switch is the most typical failure case and occurs as follows:

  • Newer kernels (v4.x) attempt to map all requested devices behind a pci switch at once, and if there are insufficient resources then none of these devices will enumerate. This failure is easily recognizable by the kernel prints of the form:
    [    0.366474] pci 0000:01:00.0: BAR 8: failed to assign [mem size 0x01300000]
    [    0.366501] pci 0000:01:00.0: BAR 0: failed to assign [mem size 0x00020000]
    [    0.366527] pci 0000:01:00.1: BAR 0: failed to assign [mem size 0x00020000]
    [    0.366574] pci 0000:02:01.0: BAR 8: failed to assign [mem size 0x00300000]
    [    0.366599] pci 0000:02:04.0: BAR 8: failed to assign [mem size 0x00300000]
    [    0.366624] pci 0000:02:05.0: BAR 8: failed to assign [mem size 0x00300000]
    [    0.366649] pci 0000:02:06.0: BAR 8: failed to assign [mem size 0x00300000]
    [    0.366674] pci 0000:02:07.0: BAR 8: failed to assign [mem size 0x00300000]
    [    0.366699] pci 0000:02:08.0: BAR 8: failed to assign [mem size 0x00100000]
    [    0.366723] pci 0000:02:09.0: BAR 8: failed to assign [mem size 0x00300000]
    
  • Previous kernels map devices inconsistently while resources last. In some cases the PCI devices may even appear to be mapped correctly judging by the kernel prints and the output of lspci; however, attempting to use them shows that only a fraction of the devices actually function. The best way to identify this case is to check your system log for PCI driver prints. Using the ath10k_pci driver as an example:
    [   14.518701] ath10k_pci: probe of 0000:09:00.0 failed with error -5
    
    This will be printed for each failed device.
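A quick way to scan a captured kernel log for both failure signatures is a small script like the following (an illustrative Python sketch; exact message formats can vary by kernel version):

```python
import re

# Scan kernel log text for the two failure signatures described above:
# BAR assignment failures (newer kernels) and driver probe failures.
log = """\
[    0.366474] pci 0000:01:00.0: BAR 8: failed to assign [mem size 0x01300000]
[   14.518701] ath10k_pci: probe of 0000:09:00.0 failed with error -5
"""

bar_fail = re.findall(r"pci (\S+): BAR \d+: failed to assign", log)
probe_fail = re.findall(r"probe of (\S+) failed with error", log)
print("BAR assignment failures:", bar_fail)
print("driver probe failures:", probe_fail)
```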

Memory Calculation Example

To determine how many memory resources a WiFi or other card uses, the simplest method is to use lspci together with dmesg.

For example, from the lspci output below we can see the card we want is at address 07:00.0 (Atheros AR93xx):

root@OpenWrt:/# lspci
00:00.0 PCI bridge: Device 16c3:abcd (rev 01)
01:00.0 PCI bridge: PLX Technology, Inc. PEX 8609 8-lane, 8-Port PCI Express Gen 2 (5.0 GT/s) Switch with DMA (rev ba)
01:00.1 System peripheral: PLX Technology, Inc. PEX 8609 8-lane, 8-Port PCI Express Gen 2 (5.0 GT/s) Switch with DMA (rev ba)
02:01.0 PCI bridge: PLX Technology, Inc. PEX 8609 8-lane, 8-Port PCI Express Gen 2 (5.0 GT/s) Switch with DMA (rev ba)
02:04.0 PCI bridge: PLX Technology, Inc. PEX 8609 8-lane, 8-Port PCI Express Gen 2 (5.0 GT/s) Switch with DMA (rev ba)
02:05.0 PCI bridge: PLX Technology, Inc. PEX 8609 8-lane, 8-Port PCI Express Gen 2 (5.0 GT/s) Switch with DMA (rev ba)
02:06.0 PCI bridge: PLX Technology, Inc. PEX 8609 8-lane, 8-Port PCI Express Gen 2 (5.0 GT/s) Switch with DMA (rev ba)
02:07.0 PCI bridge: PLX Technology, Inc. PEX 8609 8-lane, 8-Port PCI Express Gen 2 (5.0 GT/s) Switch with DMA (rev ba)
02:08.0 PCI bridge: PLX Technology, Inc. PEX 8609 8-lane, 8-Port PCI Express Gen 2 (5.0 GT/s) Switch with DMA (rev ba)
02:09.0 PCI bridge: PLX Technology, Inc. PEX 8609 8-lane, 8-Port PCI Express Gen 2 (5.0 GT/s) Switch with DMA (rev ba)
07:00.0 Network controller: Qualcomm Atheros AR93xx Wireless Network Adapter (rev 01)
08:00.0 Ethernet controller: Marvell Technology Group Ltd. 88E8057 PCI-E Gigabit Ethernet Controller

Now run the dmesg command and grep for that device, 07:00.0. Note the two memory regions below, reg 10 and reg 30.

root@OpenWrt:/# dmesg | grep 07:00.0
[    0.678291] pci 0000:07:00.0: [168c:0030] type 00 class 0x028000
[    0.678395] pci 0000:07:00.0: reg 10: [mem 0x00000000-0x0001ffff 64bit]
[    0.678590] pci 0000:07:00.0: reg 30: [mem 0x00000000-0x0000ffff pref]
[    0.678725] pci 0000:07:00.0: supports D1
[    0.678739] pci 0000:07:00.0: PME# supported from D0 D1 D3hot
[    0.682167] pci 0000:07:00.0: BAR 0: assigned [mem 0x01100000-0x0111ffff 64bit]
[    0.682222] pci 0000:07:00.0: BAR 6: assigned [mem 0x01400000-0x0140ffff pref]
[    8.326434] PCI: enabling device 0000:07:00.0 (0140 -> 0142)

Calculate the memory usage by looking at the lines such as reg 10 and reg 30.

Look at the mem portion: the size of the region is the end address minus the start address plus one.

For example:

  • 0x00000000-0x001fffff: size 0x00200000 = 2097152 bytes = 2048 KB = 2 MB
  • 0x00000000-0x0001ffff: size 0x00020000 = 131072 bytes = 128 KB, rounded up to 1 MB (because there is a 1MB granularity in the memory allocation, the minimum that can be allocated is 1MB even if the actual size is only 128KB)
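The calculation above can be sketched in a few lines of Python (illustrative only), using the reg 10 range from the dmesg output:

```python
# Size of a BAR region is (end - start + 1); behind a bridge, mem
# regions are rounded up to a 1MB granularity.
MB = 1024 * 1024

def region_size(start, end):
    return end - start + 1

def bridge_window(size):
    # Round up to the next 1MB boundary (minimum window behind a bridge).
    return max(MB, (size + MB - 1) // MB * MB)

# reg 10: [mem 0x00000000-0x0001ffff] -> 128 KB actual, 1 MB window
size = region_size(0x00000000, 0x0001ffff)
print(size // 1024, "KB actual ->", bridge_window(size) // MB, "MB window")
```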

PCIe Switch

While the IMX6 has only a single PCIe host controller, many Ventana models have a PLX PCIe switch which allows the board to support more than 1 PCIe endpoint device.

A PCIe switch operates like a PCI bridge such that it will create additional subordinate busses.

PCIe bus layout can vary board-to-board depending on which PLX switch is used (we use a variety of 4 port, 6 port, and 8 port devices).

The lspci command will list the details of the devices enumerated on the bus. Note that PCI enumeration occurs at kernel init time as ARM Linux does not support PCI hotplug.

The typical power usage of the PLX switch is approximately 1.35W, up to a maximum of approximately 2.6W (at 85% traffic). The switch supports PCI Express Active State Power Management (ASPM) and will also automatically power down unused SerDes lanes to reduce power when possible.

Some customers choose to place a heatsink on the PLX PCIe switch chip depending on their operating environment, system load, and enclosure.

PCI Throughput

The IMX6 PCIe host supports up to PCIe Gen2 (5.0Gbit/s); however, this requires an external clock generator which is not present on all board revisions and was added as an enhancement. Therefore, as a conservative measure, we limit the link to PCIe Gen1 (2.5Gbit/s) in our Board Support Packages (BSPs) via a software patch. A future software update (bootloader and kernel) will remove this limit for board revisions that have an external clock generator.

See Ventana errata HW14 for details on what board revisions have Gen2 capability.

To detect whether Gen1 or Gen2 is being used, do not use the lspci command: it only shows the capability, not the actual negotiated link speed. Instead, use the dmesg command with a grep, as below, which shows that Gen2 is disabled and the link is up at Gen1:

root@ventana:~# dmesg | grep Link
[    0.271427] imx6q-pcie 1ffc000.pcie: Link: Gen2 disabled
[    0.271444] imx6q-pcie 1ffc000.pcie: Link up, Gen=1
[   11.502799] libphy: 2188000.ethernet:00 - Link is Up - 1000/Full

The bus speed represents a theoretical maximum throughput and does not account for host processing speed or bus contention from multiple masters.

Message Signaled Interrupts (MSI)

MSI replaces traditional out-of-band interrupt assertion with an in-band messaging construct. This was introduced in PCI 2.2 and is used by PCI Express. MSI-X was introduced in PCI 3.0 and permits a device to allocate up to 2048 interrupts.

While MSI is used for PCIe at a hardware level, an additional layer of support can be provided to the kernel and drivers that expands the 'legacy' PCI interrupts INTA/B/C/D into virtual software interrupts. For example, a GW54xx with 6 PCIe expansion sockets must share the 4 legacy PCI interrupts among its sockets. If MSI were used, the sockets would still fire a single hardware interrupt, but it would be cascaded into unique software interrupts which could theoretically be split across CPU cores, allowing better CPU core separation via smp-affinity. Note, however, that the IMX6 PCIe host controller driver does not implement virtual MSI interrupts in a way that allows them to be steered to different CPUs, and adding code to allow this would add overhead burdening the single-CPU case.

Because MSI interrupts cannot be steered to different CPUs in the hardirq context, there is no performance benefit to MSI, and we ship the Gateworks Ventana kernels with MSI disabled. Additionally, we have encountered devices/drivers from time to time that do not work properly with MSI interrupts enabled.

It should also be noted that currently enabling MSI in IMX6 kernels breaks legacy PCI interrupt support, meaning any card/driver that doesn't support MSI will not work. For this reason we do not currently support MSI on Ventana.

PCIe Reset

Reset signals are routed to the Mini-PCIe slots.

Some boards share the same reset signal across all slots, while others have individually controlled reset signals for each slot. Please contact Gateworks support via email for details on a specific board model.

PCIe reset signals are typically controlled by the kernel and software; however, they can at times also be controlled via a GPIO, with more details here

PCIe In U-Boot

To avoid PCI and PCIe related issues in Linux, the default bootloader behavior is to keep all PCI disabled until Linux boots. In order to use any PCI or PCIe devices while in the bootloader, the pcidisable environment variable must be cleared:

setenv pcidisable; saveenv; reset

Once cleared and after board reset, pci devices should enumerate as expected.

You can further interact with the pci subsystem via the pci command.
