Changes between Version 1 and Version 2 of laguna/errata


Timestamp:
06/01/2018 09:03:32 AM (3 months ago)
Author:
Cale Collins
Comment:

Removed HTML, fixed Wiki syntax.

  • laguna/errata

    v1 v2  
    1 {{{#!html
    2           <div id="wikipage" class="trac-content"><p>
    3 </p><div class="wiki-toc">
    4 <ol>
    5   <li>
    6     <a href="#LagunaBoardErrata">Laguna Board Errata</a>
    7     <ol>
    8       <li>
    9         <a href="#LAGUNA_ERR_1:PowerSupplyLimits">LAGUNA_ERR_1: Power Supply Limits</a>
    10       </li>
    11       <li>
    12         <a href="#LAGUNA_ERR_2:PCIeSpecwithregardstoPCIeclockandPERST">LAGUNA_ERR_2: PCIe Spec with regards to PCIe clock and PERST#</a>
    13       </li>
    14       <li>
    15         <a href="#LAGUNA_ERR_3:PCIBridgecancomeupinexternalarbitermode">LAGUNA_ERR_3: PCI Bridge can come up in external arbiter mode</a>
    16       </li>
    17       <li>
    18         <a href="#LAGUNA_ERR_4:GSCLock">LAGUNA_ERR_4: GSC Lock</a>
    19       </li>
    20       <li>
    21         <a href="#LAGUNA_ERR_6:PCIeBusClockSignal">LAGUNA_ERR_6: PCIe Bus Clock Signal</a>
    22       </li>
    23       <li>
    24         <a href="#LAGUNA_ERR_7:Tamperswitchnon-functional">LAGUNA_ERR_7: Tamper switch non-functional</a>
    25       </li>
    26     </ol>
    27   </li>
    28 </ol>
    29 </div><p>
    30 </p>
    31 <h1 id="LagunaBoardErrata">Laguna Board Errata</h1>
    32 <p>
     1= Laguna Board Errata
     2[[PageOutline]]
     3
    334The errata below affect only certain models and certain revisions.
    34 </p>
    35 <p>
    36 Please contact support@… with any questions.
    37 </p>
    38 <h2 id="LAGUNA_ERR_1:PowerSupplyLimits">LAGUNA_ERR_1: Power Supply Limits</h2>
    39 <p>
     5
     6Please contact !support@gateworks.com with any questions.
     7== LAGUNA_ERR_1: Power Supply Limits
     8
    409Issue:
    41 </p>
    42 <ul><li>The CPU VCore supply may not provide enough power during high loads at higher ambient temperatures (typically over 70C) resulting in a CPU hang.
    43 </li></ul><p>
     10* The CPU VCore supply may not provide enough power during high loads at higher ambient temperatures (typically over 70C) resulting in a CPU hang.
     11
    4412Affected Product:
    45 </p>
    46 <ul><li>GW2380-A,B,B.1
    47 </li><li>GW2380-SP232-A,B
    48 </li><li>GW2380-SP242-A,B
    49 </li><li>GW2382-A,B
    50 </li></ul><p>
     13
     14* GW2380-A,B,B.1
     15* GW2380-SP232-A,B
     16* GW2380-SP242-A,B
     17* GW2382-A,B
     18
    5119Resolution:
    52 </p>
    53 <ul><li>The following product versions and onward were revised with a BOM change resolving the issue:
    54 <ul><li>GW2380-B.2
    55 </li><li>GW2380-SP232-B.1
    56 </li><li>GW2380-SP242-B.1
    57 </li><li>GW2382-B.1
    58 </li></ul></li><li>If update required on affected product contact sales@… to inquire about our RMA process
    59 </li></ul><h2 id="LAGUNA_ERR_2:PCIeSpecwithregardstoPCIeclockandPERST">LAGUNA_ERR_2: PCIe Spec with regards to PCIe clock and PERST#</h2>
    60 <p>
     20
     21* The following product versions and onward were revised with a BOM change resolving the issue:
     22 * GW2380-B.2
     23 * GW2380-SP232-B.1
     24 * GW2380-SP242-B.1
     25 * GW2382-B.1
     26* If update required on affected product contact !sales@gateworks.com to inquire about our RMA process
     27== LAGUNA_ERR_2: PCIe Spec with regards to PCIe clock and PERST#
     28
    6129Issue:
    62 </p>
    63 <ul><li>A PCIe spec non-conformity exists where PERST# is not properly asserted before the PCI Clock is stable. On product with miniPCI support this can occasionally cause (typically at high temperatures) the TI XIO2001 PCI bridge to not link and thus disallows any PCI device access. On product with MiniPCIe support (no bridge) this can cause various PCIe device issues typically seen on soft resets.
    64 </li></ul><p>
     30* A PCIe spec non-conformity exists where PERST# is not properly asserted before the PCI Clock is stable. On product with miniPCI support this can occasionally cause (typically at high temperatures) the TI XIO2001 PCI bridge to not link and thus disallows any PCI device access. On product with MiniPCIe support (no bridge) this can cause various PCIe device issues typically seen on soft resets.
     31
    6532Affected Product:
    66 </p>
    67 <ul><li>GW2388 with PCB 02210082 revision 00-02
    68 </li><li>GW2387 with PCB 02210086 revision 00
    69 </li><li>GW2380 with PCB 02210087 revision 00-01
    70 </li></ul><p>
     33
     34* GW2388 with PCB 02210082 revision 00-02
     35* GW2387 with PCB 02210086 revision 00
     36* GW2380 with PCB 02210087 revision 00-01
     37
    7138Resolution:
    72 </p>
    73 <ul><li>The following products were updated to connect PERST# to an ARM GPIO. This requires a bootloader update (present in svn <a class="changeset" href="/changeset/368" title="fixed: assert PERST# until PCI clock is stable
    74   - this addresses some ...">r368</a> onward) to pulse and leave de-asserted the GPIO:
    75 <ul><li>GW2388-D (PCB 02210082-04)
    76 </li><li>GW2387-B (PCB 02210086-01)
    77 </li><li>GW2380-C (PCB 02210087-03)
    78 </li></ul></li><li>For affected products with a TI XIO2001 PCI bridge: GW2388, GW2387:
    79 <ul><li>If PCI bridge link not found, warm reset the board (software workaround)
    80 </li><li>a pullup resistor can be added to one of the PCI clock signals to remove glitches from the clock
    81 </li></ul></li><li>For affected products without a PCI bridge: GW2380
    82 <ul><li>if a soft reboot is necessary, use instead the Gateworks System Controller to hard reset the board by telling it to put the board to sleep for 1 or more seconds
    83 </li><li>reboot via hard reset with GSC if link is not detected
    84 </li></ul></li></ul><h2 id="LAGUNA_ERR_3:PCIBridgecancomeupinexternalarbitermode">LAGUNA_ERR_3: PCI Bridge can come up in external arbiter mode</h2>
    85 <p>
     39* The following products were updated to connect PERST# to an ARM GPIO. This requires a bootloader update to pulse and leave de-asserted the GPIO:
     40 * GW2388-D (PCB 02210082-04)
     41 * GW2387-B (PCB 02210086-01)
     42 * GW2380-C (PCB 02210087-03)
     43* For affected products with a TI XIO2001 PCI bridge: GW2388, GW2387:
     44 * If PCI bridge link not found, warm reset the board (software workaround)
     45 * a pullup resistor can be added to one of the PCI clock signals to remove glitches from the clock
     46* For affected products without a PCI bridge: GW2380
     47 * if a soft reboot is necessary, instead use the Gateworks System Controller to hard reset the board by telling it to put the board to sleep for 1 or more seconds
     48 * reboot via hard reset with GSC if link is not detected
     49== LAGUNA_ERR_3: PCI Bridge can come up in external arbiter mode
     50
    8651Issue:
    87 </p>
    88 <ul><li>In environments with both low temperature and high humidity it has been found that the Texas Instruments XIO2001 PCIe to PCI bridge with its EXT_ARB_EN pin unconnected can power up thinking the pin is asserted. TI found the cause to be that the documented internal pulldown was missing.  The ball routes to a trace on the underside of the BGA to the edge representing a trace which is highly sensitive. When the XIO2001 powers up in external arbiter mode the result is the kernel hangs during PCI enumeration. Depending on software the last thing seen over the serial console can be any of the following lines:
    89 <pre class="wiki">&lt;&lt;0000:02:03.00 00(4)= master_abort on read
    90 </pre><pre class="wiki">PCI: bus1: Fast back to back transfers enabled
    91 </pre></li></ul><p>
     52
     53* In environments with both low temperature and high humidity it has been found that the Texas Instruments XIO2001 PCIe to PCI bridge with its EXT_ARB_EN pin unconnected can power up thinking the pin is asserted. TI found the cause to be that the documented internal pulldown was missing.  The ball routes to a trace on the underside of the BGA to the edge representing a trace which is highly sensitive. When the XIO2001 powers up in external arbiter mode the result is the kernel hangs during PCI enumeration. Depending on software the last thing seen over the serial console can be any of the following lines:
     54{{{#!bash
     55<<0000:02:03.00 00(4)= master_abort on read
     56}}}
     57{{{#!bash
     58PCI: bus1: Fast back to back transfers enabled
     59}}}
    9260Affected Product (all products using the TI XIO2001 (02120635)):
    93 </p>
    94 <ul><li>GW2388 with PCB 02210082 revision 00-05
    95 </li><li>GW2387 with PCB 02210086 revision 00-01
    96 </li></ul><p>
     61* GW2388 with PCB 02210082 revision 00-05
     62* GW2387 with PCB 02210086 revision 00-01
    9763Resolution:
    98 </p>
    99 <ul><li>Existing Product: Applying epoxy around the edges of the XIO2001 covering all exposed copper test traces increases the noise immunity of the EXT_ARB_EN signal and has been shown in extensive testing to eliminate the issue. This has been done at the factory for affected product and you should see black epoxy around the bridge chip as a resolution.
    100 </li><li>On the following product revisions and forward, the EXT_ARB_EN signal was brought out and pulled down to resolve this:
    101 <ul><li>GW2388-4-F (PCB 02210082-05)
    102 </li><li>GW2387-C (PCB 02210086-02)
    103 </li></ul></li></ul><h2 id="LAGUNA_ERR_4:GSCLock">LAGUNA_ERR_4: GSC Lock</h2>
    104 <p>
     64* Existing Product: Applying epoxy around the edges of the XIO2001 covering all exposed copper test traces increases the noise immunity of the EXT_ARB_EN signal and has been shown in extensive testing to eliminate the issue. This has been done at the factory for affected product and you should see black epoxy around the bridge chip as a resolution.
     65* On the following product revisions and forward, the EXT_ARB_EN signal was brought out and pulled down to resolve this:
     66 * GW2388-4-F (PCB 02210082-05)
     67 * GW2387-C (PCB 02210086-02)
     68== LAGUNA_ERR_4: GSC Lock
     69
    10570Issue:
    106 </p>
    107 <ul><li>A glitch on the RST# line can cause the Gateworks System Controller (GSC) to 'hang' requiring its battery to be removed (for 10+ seconds) in order for the board to be able to boot. At high supply voltages (ie 48V) and high powerup load (ie 2x+ DNMA-H92 radios) any bouncing of Vin can cause the CPU to assert its WD output at power-up which keeps the system monitor un-powered and results in a 3.3V signal on the RST# line. In some circumstances this can cause the GSC to get in a locked up state. As the GSC provides the EEPROM device needed to be read by the bootloader, if the GSC is not responsive the board will not boot. The 3.3V Power LED will come on indicating board power but no output over the serial port will be seen.
    108 </li></ul><p>
     71* A glitch on the RST# line can cause the Gateworks System Controller (GSC) to 'hang' requiring its battery to be removed (for 10+ seconds) in order for the board to be able to boot. At high supply voltages (ie 48V) and high powerup load (ie 2x+ DNMA-H92 radios) any bouncing of Vin can cause the CPU to assert its WD output at power-up which keeps the system monitor un-powered and results in a 3.3V signal on the RST# line. In some circumstances this can cause the GSC to get in a locked up state. As the GSC provides the EEPROM device needed to be read by the bootloader, if the GSC is not responsive the board will not boot. The 3.3V Power LED will come on indicating board power but no output over the serial port will be seen.
     72
    10973Affected Product:
    110 </p>
    111 <ul><li>GW2380 with PCB 02210087 revision 00-02
    112 </li><li>GW2382 with PCB 02210105 revision 00-01
    113 </li><li>GW2383 with PCB 02210113 revision 00
    114 </li><li>GW2388 with PCB 02210082 revision 00-03 (other than E.1)
    115 </li></ul><p>
     74
     75* GW2380 with PCB 02210087 revision 00-02
     76* GW2382 with PCB 02210105 revision 00-01
     77* GW2383 with PCB 02210113 revision 00
     78* GW2388 with PCB 02210082 revision 00-03 (other than E.1)
     79
    11680Resolution:
    117 </p>
    118 <ul><li>The following product revisions were modified by replacing a 0ohm resistor with a 7.15kohm resistor, which allows VCC to drop below the 3.08V tripping point but not below 0.65V. This lets the system monitor effectively clamp RST# down to a max of 0.65V, eliminating the large glitch that produces the issue.
    119 <ul><li>GW2380-D (PCB 02210087-03)
    120 </li><li>GW2382-C (PCB 02210105-02)
    121 </li><li>GW2383-B (PCB 02210113-01)
    122 </li><li>GW2388-4-E.1 (02210082-04)
    123 </li></ul></li></ul><h2 id="LAGUNA_ERR_6:PCIeBusClockSignal">LAGUNA_ERR_6: PCIe Bus Clock Signal</h2>
    124 <p>
     81* The following product revisions were modified by replacing a 0ohm resistor with a 7.15kohm resistor, which allows VCC to drop below the 3.08V tripping point but not below 0.65V. This lets the system monitor effectively clamp RST# down to a max of 0.65V, eliminating the large glitch that produces the issue.
     82 * GW2380-D (PCB 02210087-03)
     83 * GW2382-C (PCB 02210105-02)
     84 * GW2383-B (PCB 02210113-01)
     85 * GW2388-4-E.1 (02210082-04)
     86== LAGUNA_ERR_6: PCIe Bus Clock Signal
     87
    12588Issue:
    126 </p>
    127 <ul><li>A high error rate (~10/min) of SERR's and a somewhat lesser rate of PERR's and MABORT's on the PCIe bus can occur at high clock frequencies on designs that have both the CNS3420 600MHz and the TI XIO2001 bridge and under heavy load. This is realized by inspecting PCI configuration register 0x06. This is caused by a CNS3xxx errata where excessive jitter can exist on the PCIe refclk provided by the CPU. The work-arounds suggested by Cavium were followed but turned out to only help reduce the issue and not resolve it completely.
    128 </li></ul><p>
     89* A high error rate (~10/min) of SERRs and a somewhat lesser rate of PERRs and MABORTs on the PCIe bus can occur at high clock frequencies and under heavy load on designs that have both the CNS3420 600MHz and the TI XIO2001 bridge. This can be observed by inspecting PCI configuration register 0x06. This is caused by a CNS3xxx erratum where excessive jitter can exist on the PCIe refclk provided by the CPU. The work-arounds suggested by Cavium were followed but turned out to only reduce the issue and not resolve it completely.
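
The error counts in LAGUNA_ERR_6 are read from PCI configuration offset 0x06, which is the Status register in the standard PCI configuration header. The following is a minimal Python sketch of decoding a 16-bit value read from that register. The bit positions come from the generic PCI header layout; mapping the errata's SERR, PERR, and MABORT to bits 14, 15, and 13 is an assumption, and `decode_pci_status` is a hypothetical helper name. How the raw value is obtained on a given board is not shown.

```python
# Error-related bits of the standard PCI Status register (config offset 0x06).
# Assumption: the errata's SERR / PERR / MABORT correspond to bits 14 / 15 / 13.
STATUS_ERROR_BITS = {
    8: "Master Data Parity Error",
    11: "Signaled Target Abort",
    12: "Received Target Abort",
    13: "Received Master Abort (MABORT)",
    14: "Signaled System Error (SERR)",
    15: "Detected Parity Error (PERR)",
}


def decode_pci_status(value):
    """Return the names of the error bits set in a 16-bit PCI Status value."""
    return [
        name
        for bit, name in sorted(STATUS_ERROR_BITS.items())
        if value & (1 << bit)
    ]


if __name__ == "__main__":
    # Hypothetical sample value with bits 13 and 14 set.
    for name in decode_pci_status(0x6000):
        print(name)
```

These status bits are sticky until software clears them, so a monitoring script would typically read the register, log the decoded names, and then clear the bits before the next sample.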
     90
    12991Affected Product:
    130 </p>
    131 <ul><li>CNS3420 600MHz boards with TI XIO2001 bridge (mini-PCI slots) using PCB 02210082-00 to -05
    132 <ul><li>GW2388 with PCB 02210082 rev 00-05
    133 </li></ul></li></ul><p>
     92* CNS3420 600MHz boards with TI XIO2001 bridge (mini-PCI slots) using PCB 02210082-00 to -05
     93 * GW2388 with PCB 02210082 rev 00-05
     94
    13495Resolution:
    135 </p>
    136 <ul><li>This is resolved in the following models by switching to an external PCIe clock generator:
    137 <ul><li>GW2388-4-G (PCB 02210082-06)
    138 </li><li>This change also required a bootloader update to configure the PCIe controller to use the external clock instead of its own internal clock:
    139 <ul><li><a class="ext-link" href="http://trac.gateworks.com/changeset/521/laguna/trunk/u-boot-2008.10"><span class="icon">​</span>r521</a>
    140 </li><li>board detection is done by looking at GPIOB block - based what is pulled up
    141 </li><li>a new line is output that displays the fact there is a PCI_RST# gpio and what it is, and whether the PCI clock is internal or external:
    142 <pre class="wiki">U-Boot 2008.10-mpcore-svn512 (Feb  5 2014 - 10:15:12)
     96
     97* This is resolved in the following models by switching to an external PCIe clock generator:
     98 * GW2388-4-G (PCB 02210082-06)
     99 * This change also required a bootloader update to configure the PCIe controller to use the external clock instead of its own internal clock:
     100  * board detection is done by looking at GPIOB block - based on what is pulled up
     101  * a new line is output that displays the fact there is a PCI_RST# gpio and what it is, and whether the PCI clock is internal or external:
     102{{{
     103U-Boot 2008.10-mpcore-svn512 (Feb  5 2014 - 10:15:12)
    143104
    144105CPU: Cavium Networks CNS3000
     
    146107CPU ID: 900
    147108PCI:   PERST:GPIOA11 clock:internal
    148 </pre></li></ul></li></ul></li><li>The following workarounds can be used for boards with a dual-core CNS3420 600MHz and an internal PCIe clock:
    149 <ul><li>disable TI bridge Master abort mode. This will cause the bridge to not forward master aborts to the PCI side of the bus and thus would hide the failure from the PCI device/driver (which may have undesirable side-effects depending on the device/driver used)
    150 <pre class="wiki">pciset -s 00:01.0 0x3E.W=0x481
    151 </pre></li><li>lower Vcore loading by disabling L2 Cache (in some I/O bound cases disabling L2 cache improves performance as well). You can disable L2 cache by adding 'nol2x0' to the kernel command-line for the 3.8.x kernel. For the 2.6.39 kernel you must rebuild the kernel with L2X0 disabled.
    152 </li><li>implement proper handling for master aborts in the device driver. PCI bus errors should always be handled properly. The ath9k driver for Atheros AR9xxx 802.11n radios for example has a driver bug that causes a crash when a master abort is detected during an aggregated frame. A patch exists in the Gateworks <a class="wiki" href="/wiki/OpenWrt">OpenWrt</a> BSP (13.06 onward) to take care of this
    153 </li></ul></li></ul><h2 id="LAGUNA_ERR_7:Tamperswitchnon-functional">LAGUNA_ERR_7: Tamper switch non-functional</h2>
    154 <p>
     109}}}
     110* The following workarounds can be used for boards with a dual-core CNS3420 600MHz and an internal PCIe clock:
     111 * disable TI bridge Master abort mode. This will cause the bridge to not forward master aborts to the PCI side of the bus and thus would hide the failure from the PCI device/driver (which may have undesirable side-effects depending on the device/driver used)
     112{{{
     113pciset -s 00:01.0 0x3E.W=0x481
     114}}}
     115 * lower Vcore loading by disabling L2 Cache (in some I/O bound cases disabling L2 cache improves performance as well). You can disable L2 cache by adding 'nol2x0' to the kernel command-line for the 3.8.x kernel. For the 2.6.39 kernel you must rebuild the kernel with L2X0 disabled.
     116 * implement proper handling for master aborts in the device driver. PCI bus errors should always be handled properly. The ath9k driver for Atheros AR9xxx 802.11n radios for example has a driver bug that causes a crash when a master abort is detected during an aggregated frame. A patch exists in the Gateworks [[wiki:OpenWrt#OpenWrt]] BSP (13.06 onward) to take care of this
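
The `pciset` workaround above writes 0x481 to offset 0x3E, which in a standard PCI-to-PCI bridge (type 1) configuration header is the Bridge Control register. The sketch below, in Python, shows how that value relates to the Master Abort Mode bit. It assumes the XIO2001 follows the standard layout in which bit 5 is Master Abort Mode; the helper names and the starting value 0x4A1 are hypothetical and only illustrate the bit arithmetic.

```python
# Assumption: standard type 1 header Bridge Control register (offset 0x3E),
# where bit 5 is Master Abort Mode. Check the bridge data manual to confirm.
MASTER_ABORT_MODE = 1 << 5


def master_abort_mode_enabled(bridge_control):
    """True if the Master Abort Mode bit is set in a Bridge Control value."""
    return bool(bridge_control & MASTER_ABORT_MODE)


def clear_master_abort_mode(bridge_control):
    """Return the 16-bit Bridge Control value with Master Abort Mode cleared."""
    return bridge_control & ~MASTER_ABORT_MODE & 0xFFFF


if __name__ == "__main__":
    # 0x481 is the value written by the pciset workaround.
    print(master_abort_mode_enabled(0x481))
    # 0x4A1 is a hypothetical prior value with the bit still set.
    print(hex(clear_master_abort_mode(0x4A1)))
```

Under that assumption, 0x481 leaves bit 5 clear, which matches the stated intent of disabling master abort forwarding.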
     117== LAGUNA_ERR_7: Tamper switch non-functional
     118
    155119Issue:
    156 </p>
    157 <ul><li>The GW2388-4 has a capacitor loaded on the tamper circuit which makes it un-usable for the GSC tamper feature (Note that it still works as a GSC GPIO)
    158 </li></ul><p>
     120* The GW2388-4 has a capacitor loaded on the tamper circuit which makes it un-usable for the GSC tamper feature (Note that it still works as a GSC GPIO)
     121
    159122Affected Products:
    160 </p>
    161 <ul><li>GW2388-4-B and above
    162 </li></ul><p>
     123* GW2388-4-B and above
     124
    163125Resolution:
    164 </p>
    165 <ul><li>Remove C40, C303 if you wish to use the tamper feature
    166 </li></ul
    167 }}}
     126* Remove C40, C303 if you wish to use the tamper feature