Changes between Version 4 and Version 5 of ventana/encryption


Timestamp:
05/29/2019 11:49:22 PM (5 months ago)
Author:
Tim Harvey
Comment:

link to linux/crypto page and remove redundant info

  • ventana/encryption

    v4 v5  
    11[[PageOutline]]
    22
    3 = i.MX6 Encryption =
     3= i.MX6 Encryption
    44The i.MX6 Processors offer hardware encryption through Freescale's Cryptographic Accelerator and Assurance Module (CAAM, also known as SEC4). It offers the following support:
    55 * Security Control
     
    4848  * CTR for AES
    4949* Symmetric key stream ciphers
    50 * ArcFour (alleged RC4 with 40 - 128 bit keys)
     50* !ArcFour (alleged RC4 with 40 - 128 bit keys)
    5151* Random-number generation
    5252  * Entropy is generated via an independent free running ring oscillator
     
    5656The above features are usable via the CAAM driver, which is available in our Yocto BSPs as well as our [wiki:OpenWrt/building latest OpenWrt] on [https://github.com/Gateworks/openwrt GitHub]. Some of these features can only be used through the Linux Crypto API. The driver is integrated with the kernel Crypto API service, so the algorithms supported by CAAM can replace the native software implementations.
    5757
    58 == References ==
     58References:
    5959 * [https://community.freescale.com/thread/303229]
    6060 * [https://community.freescale.com/thread/319374]
     
    6767 * Also see [wiki:ventana/security Ventana Security]
    6868
    69 == i.MX6 Security Reference Manual ==
     69== i.MX6 Security Reference Manual
    7070Please register on the NXP website and request the document by visiting the link [https://www.nxp.com/webapp/sps/download/mod_download.jsp?colCode=IMX6DQ6SDLSRM&appType=moderatedWithoutFAE here]
    7171
    72 = Driver Information =
     72== Linux Drivers
    7373The Cryptographic Accelerator and Assurance Module (CAAM) driver supports Freescale's hardware crypto engine. It configures the hardware to operate as a DPAA component and creates job ring devices. Please see [https://www.kernel.org/doc/menuconfig/drivers-crypto-caam-Kconfig.html here] for more detail. This driver was added to Linux 4.3, but we have support for it in our Yocto 1.6, Yocto 1.7, Yocto 1.8, and OpenWrt next (our latest OpenWrt branch on [https://github.com/Gateworks/openwrt GitHub]).
    7474
     
    111111We can see that the {{{caamhash}}} module offers the sha1 ahash function. This effectively means that any program requesting this hash through the kernel Crypto API will automatically gain hardware acceleration.
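
As a rough illustration of how one might check which implementation would serve an algorithm, the sketch below parses {{{/proc/crypto}}}-style text and picks the entry with the highest priority for a given name (the kernel generally prefers the highest-priority implementation). The sample entries and the helper names {{{parse_proc_crypto}}} and {{{best_driver}}} are assumptions made for illustration; on a real board, read {{{/proc/crypto}}} itself.

```python
# Sketch: find the highest-priority driver for an algorithm in /proc/crypto text.
# SAMPLE is illustrative only; it was not captured from a real board.

SAMPLE = """\
name         : sha1
driver       : sha1-generic
module       : kernel
priority     : 100
type         : shash

name         : sha1
driver       : sha1-caam
module       : caamhash
priority     : 3000
type         : ahash
"""


def parse_proc_crypto(text):
    """Split /proc/crypto-style text into one dict per blank-line-separated entry."""
    entries = []
    current = {}
    for line in text.splitlines():
        if not line.strip():
            if current:
                entries.append(current)
                current = {}
            continue
        key, _, value = line.partition(":")
        current[key.strip()] = value.strip()
    if current:
        entries.append(current)
    return entries


def best_driver(entries, name):
    """Return the entry with the highest priority for `name`, or None if absent."""
    matches = [e for e in entries if e.get("name") == name]
    if not matches:
        return None
    return max(matches, key=lambda e: int(e.get("priority", "0")))


if __name__ == "__main__":
    best = best_driver(parse_proc_crypto(SAMPLE), "sha1")
    print(best["driver"], best["module"])
```

To try it against a live system, pass the contents of {{{/proc/crypto}}} to {{{parse_proc_crypto}}} instead of the sample string.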
    112112
    113 == cryptodev vs. af_alg vs. ocf-linux ==
    114 {{{cryptodev}}}, {{{af_alg}}}, and {{{ocf-linux}}} are three userspace crypto APIs into the Linux kernel. While both {{{cryptodev}}} and {{{af_alg}}} use the native Linux crypto interface, {{{ocf-linux}}} does not. {{{ocf-linux}}} also conflicts with {{{cryptodev}}} because both create a {{{/dev/crypto}}} interface, so the two drivers cannot co-exist. For these reasons, Gateworks has decided to include {{{cryptodev}}} rather than {{{ocf-linux}}}.
    115 
    116 {{{af_alg}}} and {{{cryptodev}}} both use the native Linux crypto interface, but go about it in different ways. According to the [http://cryptodev-linux.org/comparison.html cryptodev] site, {{{cryptodev}}} outperforms {{{af_alg}}}, mainly due to how each was implemented. Both are acceptable ways of interacting with the kernel, and many programs default to one or the other. Programs such as {{{openssl}}} are able to pick the engine they use. Because {{{cryptodev}}} is not part of the kernel, it must be built out-of-tree; {{{af_alg}}} is part of the kernel, so no special handling is needed there.
    117 
    118 To build {{{cryptodev}}} out-of-tree:
    119 {{{#!bash
    120 # Download cryptodev tarball from here: http://download.gna.org/cryptodev-linux/
    121 wget http://download.gna.org/cryptodev-linux/cryptodev-linux-1.8.tar.gz
    122 tar xvf cryptodev-linux-1.8.tar.gz
    123 cd cryptodev-linux-1.8
    124 # Make sure you have kernel build directory for the kernel you are compiling for and point to it via KERNEL_DIR= (if cross compiling)
    125 KERNEL_DIR=/usr/src/psidhu/linux/linux-imx6 make
    126 make install # Only do this if compiling on target system
    127 }}}
    128 
    129 Gateworks has written an example {{{cryptodev}}} program for the cbc(aes) cipher called [https://github.com/Gateworks/gateworks-sample-apps/tree/master/gw-cryptodev-example gw-cryptodev-example]. To get the source and compile, please follow these instructions:
    130 {{{#!bash
    131 git clone https://github.com/Gateworks/gateworks-sample-apps.git
    132 cd gateworks-sample-apps/gw-cryptodev-example
    133 # (optional) Source your env. if cross compiling. In this case, we'll use the Yocto 1.8 SDK.
    134 . /opt/poky/1.8/environment-setup-cortexa9hf-vfp-neon-poky-linux-gnueabi # Please make sure this is the updated version with cryptodev.h.
    135 make
    136 }}}
    137 
    138 To run:
    139 {{{#!bash
    140 root@ventana:~# ./gw-cryptodev-example
    141 Using cbc-aes-caam driver! Accelerated through SEC4 engine.
    142 Encrypted 'Hello, World!' to '���<�팻�m��5͎'
    143 Decrypted '���<�팻�m��5͎' to 'Hello, World!'
    144 Test passed!
    145 }}}
    146 
    147 An example of using this same cipher, but through {{{af_alg}}}, can be found [http://lwn.net/Articles/410833/ here].
    148 
    149 Note that the main difference between {{{cryptodev}}} and {{{af_alg}}} is how messages are sent to the kernel: {{{cryptodev}}} relies on {{{ioctl}}} calls, while {{{af_alg}}} relies on the kernel's socket address family (called AF_ALG).
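
To make the socket-based path concrete, here is a minimal sketch of cbc(aes) through {{{af_alg}}} using Python's standard {{{socket}}} module (Linux only, Python 3.6 or later). The key, IV, message, and the helper name {{{afalg_aes_cbc}}} are made up for illustration. Whether the request is served by CAAM or by a software implementation depends on which drivers the running kernel has registered.

```python
# Sketch: AES-128-CBC through the kernel crypto API using an AF_ALG socket.
# Requires Linux with AF_ALG enabled; key, IV, and message are illustrative.
import socket


def afalg_aes_cbc(key, iv, data, decrypt=False):
    """Run cbc(aes) in the kernel; len(data) must be a multiple of 16 bytes."""
    op = socket.ALG_OP_DECRYPT if decrypt else socket.ALG_OP_ENCRYPT
    with socket.socket(socket.AF_ALG, socket.SOCK_SEQPACKET, 0) as alg:
        # Select the transform type and algorithm name, then set the key.
        alg.bind(("skcipher", "cbc(aes)"))
        alg.setsockopt(socket.SOL_ALG, socket.ALG_SET_KEY, key)
        # accept() yields the operation socket used to pass data in and out.
        conn, _ = alg.accept()
        with conn:
            conn.sendmsg_afalg([data], op=op, iv=iv)
            return conn.recv(len(data))


if __name__ == "__main__":
    key = bytes(range(16))
    iv = bytes(16)
    # Zero-pad the message to a whole AES block (16 bytes).
    message = b"Hello, World!".ljust(16, b"\0")
    try:
        ciphertext = afalg_aes_cbc(key, iv, message)
        plaintext = afalg_aes_cbc(key, iv, ciphertext, decrypt=True)
        print("round trip ok:", plaintext == message)
    except (OSError, AttributeError) as exc:
        print("AF_ALG not available:", exc)
```

No {{{ioctl}}} calls are involved here: the algorithm is chosen with {{{bind}}}, the key with {{{setsockopt}}}, and the data travels over ordinary socket send and receive calls.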
    150 
    151 * References
    152  * https://en.wikipedia.org/wiki/Crypto_API_%28Linux%29
    153  * http://lwn.net/Articles/410833/
    154  * https://lwn.net/Articles/410536/
    155  * http://cryptodev-linux.org/
    156 
    157 == BSP Support ==
    158 Both Yocto and the [wiki:OpenWrt/building latest OpenWrt] have CAAM support.
    159 
    160 For example, adding the CAAM driver will grant the ability to directly access the hardware random number generator via {{{/dev/hwrng}}}. This tremendously speeds up generation of random garbage as seen below:
     113The CAAM driver will also grant the ability to directly access the hardware random number generator via {{{/dev/hwrng}}}. This tremendously speeds up generation of random garbage as seen below:
    161114{{{#!bash
    162115# Generate 50Mb of data via software
     
    178131As seen above, using the hardware accelerated rng, random data with good entropy was generated almost 17x faster.
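
A comparison like this can also be scripted. The following is only a sketch: it times how long it takes to read a fixed number of bytes from a device node. The helper name {{{read_throughput}}} and the 1 MiB size are assumptions for illustration, and the default path is {{{/dev/urandom}}} so that the sketch runs on any Linux system; point it at {{{/dev/hwrng}}} on a board with the CAAM driver loaded.

```python
# Sketch: time how long it takes to read a fixed number of bytes from a device node.
import time

# Assumption: /dev/urandom is used so the sketch runs anywhere.
# Change to "/dev/hwrng" on a board with the CAAM driver loaded.
DEVICE = "/dev/urandom"


def read_throughput(path, nbytes, chunk=4096):
    """Read up to nbytes from path; return (bytes_read, elapsed_seconds)."""
    total = 0
    start = time.monotonic()
    with open(path, "rb", buffering=0) as dev:
        while total < nbytes:
            block = dev.read(min(chunk, nbytes - total))
            if not block:  # end of file or no data returned
                break
            total += len(block)
    return total, time.monotonic() - start


if __name__ == "__main__":
    count, seconds = read_throughput(DEVICE, 1024 * 1024)
    print(f"{DEVICE}: {count} bytes in {seconds:.3f} s")
```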
    179132
    180 This also means that programs using either {{{cryptodev}}} or {{{af_alg}}} automatically get hardware-accelerated cryptography. Some programs, however, use their own software-based algorithms for portability reasons; {{{openssl}}} is one of them. Note that {{{openssl}}} must be compiled with the following flags in order to use the {{{cryptodev}}} engine: {{{-DHAVE_CRYPTODEV -DUSE_CRYPTODEV_DIGESTS}}}
     133For information on how to use the Linux Kernel Crypto API consult the kernel documentation:
     134- https://www.kernel.org/doc/html/latest/crypto/index.html
    181135
    182 === Yocto ===
    183 In the Yocto BSP, {{{openssl}}} is built with {{{cryptodev}}} support. Please see below for a comparison using the {{{cryptodev}}} engine and without:
    184 * Yocto 1.8 WITHOUT {{{cryptodev}}} (using {{{openssl}}} software based algorithms)
    185 {{{#!bash
    186 root@ventana:~# openssl speed aes-128-cbc
    187 Doing aes-128 cbc for 3s on 16 size blocks: 6008244 aes-128 cbc's in 3.00s
    188 Doing aes-128 cbc for 3s on 64 size blocks: 1608835 aes-128 cbc's in 2.99s
    189 Doing aes-128 cbc for 3s on 256 size blocks: 411309 aes-128 cbc's in 2.99s
    190 Doing aes-128 cbc for 3s on 1024 size blocks: 103187 aes-128 cbc's in 3.00s
    191 Doing aes-128 cbc for 3s on 8192 size blocks: 12923 aes-128 cbc's in 3.00s
    192 OpenSSL 1.0.2d 9 Jul 2015
    193 built on: reproducible build, date unspecified
    194 options:bn(64,32) rc4(ptr,char) des(idx,cisc,16,long) aes(partial) idea(int) blowfish(ptr)
    195 compiler: arm-poky-linux-gnueabi-gcc  -march=armv7-a -marm  -mthumb-interwork -mfloat-abi=hard -mfpu=neon -mtune=cortex-a9 --sysroot=/usr/src/psidhu/gw-yocto-1.8/build/tmp/sysroots/ventana -I. -I.. -I../include  -fPIC -DOPENSSL_PIC -DOPENSSL_THREADS -D_REENTRANT -DDSO_DLFCN -DHAVE_DLFCN_H -DL_ENDIAN    -DTERMIO  -O2 -pipe -g -feliminate-unused-debug-types -Wall -Wa,--noexecstack -DHAVE_CRYPTODEV -DUSE_CRYPTODEV_DIGESTS -DOPENSSL_BN_ASM_MONT -DOPENSSL_BN_ASM_GF2m -DSHA1_ASM -DSHA256_ASM -DSHA512_ASM -DAES_ASM -DBSAES_ASM -DGHASH_ASM
    196 The 'numbers' are in 1000s of bytes per second processed.
    197 type             16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes
    198 aes-128 cbc      32043.97k    34436.60k    35215.75k    35221.16k    35288.41k
    199 }}}
     136For more information on the Linux Kernel Crypto API and how to use it from userspace see:
     137- [wiki:linux/crypto linux/crypto]
    200138
    201 * Yocto 1.8 with {{{cryptodev}}} (using kernel hardware accelerated algorithms)
    202 {{{#!bash
    203 root@ventana:~# openssl speed -evp aes-128-cbc -engine cryptodev
    204 engine "cryptodev" set.
    205 Doing aes-128-cbc for 3s on 16 size blocks: 44146 aes-128-cbc's in 0.14s
    206 Doing aes-128-cbc for 3s on 64 size blocks: 43561 aes-128-cbc's in 0.11s
    207 Doing aes-128-cbc for 3s on 256 size blocks: 39724 aes-128-cbc's in 0.13s
    208 Doing aes-128-cbc for 3s on 1024 size blocks: 30733 aes-128-cbc's in 0.10s
    209 Doing aes-128-cbc for 3s on 8192 size blocks: 9122 aes-128-cbc's in 0.01s
    210 OpenSSL 1.0.2d 9 Jul 2015
    211 built on: reproducible build, date unspecified
    212 options:bn(64,32) rc4(ptr,char) des(idx,cisc,16,long) aes(partial) idea(int) blowfish(ptr)
    213 compiler: arm-poky-linux-gnueabi-gcc  -march=armv7-a -marm  -mthumb-interwork -mfloat-abi=hard -mfpu=neon -mtune=cortex-a9 --sysroot=/usr/src/psidhu/gw-yocto-1.8/build/tmp/sysroots/ventana -I. -I.. -I../include  -fPIC -DOPENSSL_PIC -DOPENSSL_THREADS -D_REENTRANT -DDSO_DLFCN -DHAVE_DLFCN_H -DL_ENDIAN    -DTERMIO  -O2 -pipe -g -feliminate-unused-debug-types -Wall -Wa,--noexecstack -DHAVE_CRYPTODEV -DUSE_CRYPTODEV_DIGESTS -DOPENSSL_BN_ASM_MONT -DOPENSSL_BN_ASM_GF2m -DSHA1_ASM -DSHA256_ASM -DSHA512_ASM -DAES_ASM -DBSAES_ASM -DGHASH_ASM
    214 The 'numbers' are in 1000s of bytes per second processed.
    215 type             16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes
    216 aes-128-cbc       5045.26k    25344.58k    78225.72k   314705.92k  7472742.40k
    217 }}}
    218 
    219 One of the biggest advantages of hardware encryption is reduced CPU load. In the two cases above, we observed the following:
    220  * With {{{cryptodev}}} disabled: 25% usr CPU usage (one core pegged at 100%)
    221  * With {{{cryptodev}}} enabled: 16% sys CPU usage, 2% sirq
    222  * With the {{{cryptodev}}} hardware engine, {{{openssl}}} reported far more bytes per second processed, especially at the larger block sizes
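
When reading these logs, it helps to see how the summary table follows from the counts. The sketch below reproduces a few of the reported figures as operations × block size / reported seconds. Note that the {{{cryptodev}}} runs report times far below the nominal 3 s, which is consistent with {{{openssl speed}}} counting only user CPU time unless {{{-elapsed}}} is given. The final line is therefore a rough cross-check that assumes the run lasted the full 3 s of wall-clock time; that assumption is ours, not something the logs state. The helper name {{{kbytes_per_sec}}} is hypothetical.

```python
# Sketch: reproduce the "1000s of bytes per second" figures from the openssl speed logs.


def kbytes_per_sec(count, block_size, seconds):
    """Operations * block size / time, in units of 1000 bytes per second."""
    return count * block_size / seconds / 1000.0


if __name__ == "__main__":
    # (label, operation count, block size in bytes, reported seconds), copied from the logs
    runs = [
        ("software, 16 B", 6008244, 16, 3.00),
        ("software, 8192 B", 12923, 8192, 3.00),
        ("cryptodev, 16 B", 44146, 16, 0.14),
        ("cryptodev, 8192 B", 9122, 8192, 0.01),
    ]
    for label, count, size, secs in runs:
        print(f"{label:18s} reported: {kbytes_per_sec(count, size, secs):12.2f}k")
    # Assumption: the cryptodev run lasted the nominal 3 s of wall-clock time.
    wall = kbytes_per_sec(9122, 8192, 3.0)
    print(f"cryptodev, 8192 B over 3 s wall clock: {wall:.2f}k")
```

Re-running the benchmark with {{{-elapsed}}} would give wall-clock figures directly and is the more reliable way to compare engine and software throughput.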
    223 
    224 === OpenWrt ===
    225 Our OpenWrt 16.02 BSP added support for CAAM and {{{cryptodev}}}. As with Yocto, {{{openssl}}} can utilize this engine. Please see below for some results:
    226 
    227 * OpenWrt 16.02 WITHOUT {{{cryptodev}}} (using {{{openssl}}} software based algorithms)
    228 {{{#!bash
    229 root@OpenWrt:/# openssl speed aes-128-cbc
    230 Doing aes-128 cbc for 3s on 16 size blocks: 2890377 aes-128 cbc's in 3.00s
    231 Doing aes-128 cbc for 3s on 64 size blocks: 767833 aes-128 cbc's in 2.99s
    232 Doing aes-128 cbc for 3s on 256 size blocks: 196252 aes-128 cbc's in 3.00s
    233 Doing aes-128 cbc for 3s on 1024 size blocks: 49243 aes-128 cbc's in 3.00s
    234 Doing aes-128 cbc for 3s on 8192 size blocks: 6165 aes-128 cbc's in 3.00s
    235 OpenSSL 1.0.2g  1 Mar 2016
    236 built on: reproducible build, date unspecified
    237 options:bn(64,32) rc4(ptr,char) des(idx,cisc,2,long) aes(partial) blowfish(ptr)
    238 compiler: arm-openwrt-linux-muslgnueabi-gcc -I. -I.. -I../include  -fPIC -DOPENSSL_PIC -DZLIB_SHARED -DZLIB -DOPENSSL_THREADS -D_REENTRANT -DDSO_DLFCN -DHAVE_DLFCN_H -I/usr/src/psidhu/openwrt/openwrt-next/staging_dir/target-arm_cortex-a9+neon_musl-1.1.12_eabi/usr/include -I/usr/src/psidhu/openwrt/openwrt-next/staging_dir/target-arm_cortex-a9+neon_musl-1.1.12_eabi/include -I/usr/src/psidhu/openwrt/openwrt-next/staging_dir/toolchain-arm_cortex-a9+neon_gcc-5.2.0_musl-1.1.12_eabi/usr/include -I/usr/src/psidhu/openwrt/openwrt-next/staging_dir/toolchain-arm_cortex-a9+neon_gcc-5.2.0_musl-1.1.12_eabi/include/fortify -I/usr/src/psidhu/openwrt/openwrt-next/staging_dir/toolchain-arm_cortex-a9+neon_gcc-5.2.0_musl-1.1.12_eabi/include -znow -zrelro -DOPENSSL_SMALL_FOOTPRINT -DHAVE_CRYPTODEV -DUSE_CRYPTODEV_DIGESTS -DOPENSSL_NO_ERR -DTERMIOS -Os -pipe -march=armv7-a -mtune=cortex-a9 -mfpu=neon -fno-caller-saves -fno-plt -fhonour-copts -Wno-error=unused-but-set-variable -Wno-error=unused-result -mfloat-abi=hard -iremap /usr/src/psidhu/openwrt/openwrt-next/build_dir/target-arm_cortex-a9+neon_musl-1.1.12_eabi/openssl-1.0.2g:openssl-1.0.2g -fstack-protector -D_FORTIFY_SOURCE=1 -Wl,-z,now -Wl,-z,relro -fpic -fomit-frame-pointer -Wall
    239 The 'numbers' are in 1000s of bytes per second processed.
    240 type             16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes
    241 aes-128 cbc      15415.34k    16435.22k    16746.84k    16808.28k    16834.56k
    242 }}}
    243 
    244 * OpenWrt 16.02 with {{{cryptodev}}} (using kernel hardware accelerated algorithms)
    245 {{{#!bash
    246 root@OpenWrt:/# openssl speed -evp aes-128-cbc -engine cryptodev
    247 engine "cryptodev" set.
    248 Doing aes-128-cbc for 3s on 16 size blocks: 80789 aes-128-cbc's in 0.13s
    249 Doing aes-128-cbc for 3s on 64 size blocks: 67854 aes-128-cbc's in 0.15s
    250 Doing aes-128-cbc for 3s on 256 size blocks: 63909 aes-128-cbc's in 0.21s
    251 Doing aes-128-cbc for 3s on 1024 size blocks: 46740 aes-128-cbc's in 0.06s
    252 Doing aes-128-cbc for 3s on 8192 size blocks: 12239 aes-128-cbc's in 0.03s
    253 OpenSSL 1.0.2g  1 Mar 2016
    254 built on: reproducible build, date unspecified
    255 options:bn(64,32) rc4(ptr,char) des(idx,cisc,2,long) aes(partial) blowfish(ptr)
    256 compiler: arm-openwrt-linux-muslgnueabi-gcc -I. -I.. -I../include  -fPIC -DOPENSSL_PIC -DZLIB_SHARED -DZLIB -DOPENSSL_THREADS -D_REENTRANT -DDSO_DLFCN -DHAVE_DLFCN_H -I/usr/src/psidhu/openwrt/openwrt-next/staging_dir/target-arm_cortex-a9+neon_musl-1.1.12_eabi/usr/include -I/usr/src/psidhu/openwrt/openwrt-next/staging_dir/target-arm_cortex-a9+neon_musl-1.1.12_eabi/include -I/usr/src/psidhu/openwrt/openwrt-next/staging_dir/toolchain-arm_cortex-a9+neon_gcc-5.2.0_musl-1.1.12_eabi/usr/include -I/usr/src/psidhu/openwrt/openwrt-next/staging_dir/toolchain-arm_cortex-a9+neon_gcc-5.2.0_musl-1.1.12_eabi/include/fortify -I/usr/src/psidhu/openwrt/openwrt-next/staging_dir/toolchain-arm_cortex-a9+neon_gcc-5.2.0_musl-1.1.12_eabi/include -znow -zrelro -DOPENSSL_SMALL_FOOTPRINT -DHAVE_CRYPTODEV -DUSE_CRYPTODEV_DIGESTS -DOPENSSL_NO_ERR -DTERMIOS -Os -pipe -march=armv7-a -mtune=cortex-a9 -mfpu=neon -fno-caller-saves -fno-plt -fhonour-copts -Wno-error=unused-but-set-variable -Wno-error=unused-result -mfloat-abi=hard -iremap /usr/src/psidhu/openwrt/openwrt-next/build_dir/target-arm_cortex-a9+neon_musl-1.1.12_eabi/openssl-1.0.2g:openssl-1.0.2g -fstack-protector -D_FORTIFY_SOURCE=1 -Wl,-z,now -Wl,-z,relro -fpic -fomit-frame-pointer -Wall
    257 The 'numbers' are in 1000s of bytes per second processed.
    258 type             16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes
    259 aes-128-cbc       9943.26k    28951.04k    77908.11k   797696.00k  3342062.93k
    260 }}}