113 | | == cryptodev vs. af_alg vs. ocf-linux == |
114 | | {{{cryptodev}}}, {{{af_alg}}}, and {{{ocf-linux}}} are three userspace crypto APIs into the Linux kernel. {{{cryptodev}}} and {{{af_alg}}} both use the native Linux crypto interface; {{{ocf-linux}}} does not. {{{ocf-linux}}} also conflicts with {{{cryptodev}}} because both create a {{{/dev/crypto}}} interface, so the two drivers cannot co-exist. For these reasons, Gateworks includes {{{cryptodev}}} rather than {{{ocf-linux}}}. |
115 | | |
116 | | {{{af_alg}}} and {{{cryptodev}}} both use the native Linux crypto interface, but go about it in different ways. According to the [http://cryptodev-linux.org/comparison.html cryptodev] site, {{{cryptodev}}} outperforms {{{af_alg}}}, mainly due to how each was implemented. Both are acceptable ways of interacting with the kernel, and many programs default to one or the other. Programs such as {{{openssl}}} can select which engine to use. Note that {{{cryptodev}}} must be built out-of-tree because it is not a part of the kernel, whereas {{{af_alg}}} is part of the kernel and needs no special handling. |
117 | | |
118 | | To build {{{cryptodev}}} out-of-tree: |
119 | | {{{#!bash |
120 | | # Download cryptodev tarball from here: http://download.gna.org/cryptodev-linux/ |
121 | | wget http://download.gna.org/cryptodev-linux/cryptodev-linux-1.8.tar.gz |
122 | | tar xvf cryptodev-linux-1.8.tar.gz |
123 | | cd cryptodev-linux-1.8 |
124 | | # Make sure you have the build directory for the kernel you are compiling for, and point to it via KERNEL_DIR= (if cross compiling) |
125 | | KERNEL_DIR=/usr/src/psidhu/linux/linux-imx6 make |
126 | | make install # Only do this if compiling on target system |
127 | | }}} |
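
Once the module is loaded, {{{cryptodev}}} should expose the {{{/dev/crypto}}} character device mentioned above. The following is a minimal sketch of a sanity check, assuming a POSIX shell on the target. The function name {{{check_cryptodev_node}}} is hypothetical, and the {{{modprobe}}} call in the comment assumes the module was installed into the running kernel's module tree.

```shell
# Hypothetical helper: verify that the cryptodev device node exists.
# $1: path to the node (defaults to /dev/crypto)
check_cryptodev_node() {
    node="${1:-/dev/crypto}"
    if [ -c "$node" ]; then
        echo "$node is present"
    else
        echo "$node is missing; is the cryptodev module loaded?" >&2
        return 1
    fi
}

# On the target (assumption: module installed for the running kernel):
#   modprobe cryptodev && check_cryptodev_node
```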
128 | | |
129 | | Gateworks has written an example {{{cryptodev}}} program for the cbc(aes) cipher called [https://github.com/Gateworks/gateworks-sample-apps/tree/master/gw-cryptodev-example gw-cryptodev-example]. To get the source and compile, please follow these instructions: |
130 | | {{{#!bash |
131 | | git clone https://github.com/Gateworks/gateworks-sample-apps.git |
132 | | cd gateworks-sample-apps/gw-cryptodev-example |
133 | | # (optional) Source your environment if cross compiling. In this case, we'll use the Yocto 1.8 SDK. |
134 | | . /opt/poky/1.8/environment-setup-cortexa9hf-vfp-neon-poky-linux-gnueabi # Please make sure this is the updated version with cryptodev.h. |
135 | | make |
136 | | }}} |
137 | | |
138 | | To run: |
139 | | {{{#!bash |
140 | | root@ventana:~# ./gw-cryptodev-example |
141 | | Using cbc-aes-caam driver! Accelerated through SEC4 engine. |
142 | | Encrypted 'Hello, World!' to '���<�팻�m��5͎' |
143 | | Decrypted '���<�팻�m��5͎' to 'Hello, World!' |
144 | | Test passed! |
145 | | }}} |
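
The driver name printed by the example ({{{cbc-aes-caam}}}) can also be cross-checked against the kernel's list of registered algorithms in {{{/proc/crypto}}}. The sketch below lists every driver registered for a given algorithm name. The function name {{{drivers_for}}} is hypothetical, and the parsing assumes the usual {{{key : value}}} layout of {{{/proc/crypto}}}, with each {{{name}}} line preceding its {{{driver}}} line.

```shell
# Hypothetical helper: print the drivers registered for one algorithm.
# $1: algorithm name as the kernel reports it, e.g. 'cbc(aes)'
# $2: file to parse (defaults to /proc/crypto)
drivers_for() {
    awk -v want="$1" -F' *: *' '
        $1 == "name"                   { name = $2 }   # remember the current entry
        $1 == "driver" && name == want { print $2 }    # driver line of a matching entry
    ' "${2:-/proc/crypto}"
}

# On the target:
#   drivers_for 'cbc(aes)'
```

If {{{cbc-aes-caam}}} appears in the output, the CAAM implementation is registered; if only software drivers are listed, the CAAM driver is probably not enabled in the kernel.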
146 | | |
147 | | An example of using this same cipher, but through {{{af_alg}}}, can be found [http://lwn.net/Articles/410833/ here]. |
148 | | |
149 | | The main difference between {{{cryptodev}}} and {{{af_alg}}} is how requests are sent to the kernel: {{{cryptodev}}} relies on {{{ioctl}}} calls on {{{/dev/crypto}}}, while {{{af_alg}}} relies on the kernel's {{{AF_ALG}}} socket family. |
150 | | |
151 | | * References |
152 | | * https://en.wikipedia.org/wiki/Crypto_API_%28Linux%29 |
153 | | * http://lwn.net/Articles/410833/ |
154 | | * https://lwn.net/Articles/410536/ |
155 | | * http://cryptodev-linux.org/ |
156 | | |
157 | | == BSP Support == |
158 | | Both Yocto and the [wiki:OpenWrt/building latest OpenWrt] have CAAM support. |
159 | | |
160 | | For example, adding the CAAM driver will grant the ability to directly access the hardware random number generator via {{{/dev/hwrng}}}. This tremendously speeds up generation of random garbage as seen below: |
| 113 | The CAAM driver will also grant the ability to directly access the hardware random number generator via {{{/dev/hwrng}}}. This tremendously speeds up generation of random garbage as seen below: |
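
One way to measure this yourself is to time a fixed-size read from each random source with {{{dd}}}. This is only a sketch of the measurement, not a recorded result: the function name {{{read_rng}}} is hypothetical, and the block size and count are arbitrary choices.

```shell
# Hypothetical helper: read COUNT blocks of up to 1 KiB from a random
# source and discard them. dd prints its transfer statistics on stderr.
# $1: device to read (defaults to /dev/hwrng)
# $2: number of blocks (defaults to 1024)
read_rng() {
    dev="${1:-/dev/hwrng}"
    count="${2:-1024}"
    dd if="$dev" of=/dev/null bs=1024 count="$count"
}

# On the target, compare for example:
#   time read_rng /dev/hwrng 1024
#   time read_rng /dev/random 1024
```

Note that a read from a character device may return fewer bytes than requested, so compare the byte totals that {{{dd}}} reports rather than assuming every block was full.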
182 | | === Yocto === |
183 | | In the Yocto BSP, {{{openssl}}} is built with {{{cryptodev}}} support. The results below compare {{{openssl}}} with and without the {{{cryptodev}}} engine: |
184 | | * Yocto 1.8 WITHOUT {{{cryptodev}}} (using {{{openssl}}} software based algorithms) |
185 | | {{{#!bash |
186 | | root@ventana:~# openssl speed aes-128-cbc |
187 | | Doing aes-128 cbc for 3s on 16 size blocks: 6008244 aes-128 cbc's in 3.00s |
188 | | Doing aes-128 cbc for 3s on 64 size blocks: 1608835 aes-128 cbc's in 2.99s |
189 | | Doing aes-128 cbc for 3s on 256 size blocks: 411309 aes-128 cbc's in 2.99s |
190 | | Doing aes-128 cbc for 3s on 1024 size blocks: 103187 aes-128 cbc's in 3.00s |
191 | | Doing aes-128 cbc for 3s on 8192 size blocks: 12923 aes-128 cbc's in 3.00s |
192 | | OpenSSL 1.0.2d 9 Jul 2015 |
193 | | built on: reproducible build, date unspecified |
194 | | options:bn(64,32) rc4(ptr,char) des(idx,cisc,16,long) aes(partial) idea(int) blowfish(ptr) |
195 | | compiler: arm-poky-linux-gnueabi-gcc -march=armv7-a -marm -mthumb-interwork -mfloat-abi=hard -mfpu=neon -mtune=cortex-a9 --sysroot=/usr/src/psidhu/gw-yocto-1.8/build/tmp/sysroots/ventana -I. -I.. -I../include -fPIC -DOPENSSL_PIC -DOPENSSL_THREADS -D_REENTRANT -DDSO_DLFCN -DHAVE_DLFCN_H -DL_ENDIAN -DTERMIO -O2 -pipe -g -feliminate-unused-debug-types -Wall -Wa,--noexecstack -DHAVE_CRYPTODEV -DUSE_CRYPTODEV_DIGESTS -DOPENSSL_BN_ASM_MONT -DOPENSSL_BN_ASM_GF2m -DSHA1_ASM -DSHA256_ASM -DSHA512_ASM -DAES_ASM -DBSAES_ASM -DGHASH_ASM |
196 | | The 'numbers' are in 1000s of bytes per second processed. |
197 | | type 16 bytes 64 bytes 256 bytes 1024 bytes 8192 bytes |
198 | | aes-128 cbc 32043.97k 34436.60k 35215.75k 35221.16k 35288.41k |
199 | | }}} |
| 136 | For more information on Linux Kernel Crypto API and how to use in Userspace see: |
| 137 | - [wiki:linux/crypto linux/crypto] |
201 | | * Yocto 1.8 with {{{cryptodev}}} (using kernel hardware accelerated algorithms) |
202 | | {{{#!bash |
203 | | root@ventana:~# openssl speed -evp aes-128-cbc -engine cryptodev |
204 | | engine "cryptodev" set. |
205 | | Doing aes-128-cbc for 3s on 16 size blocks: 44146 aes-128-cbc's in 0.14s |
206 | | Doing aes-128-cbc for 3s on 64 size blocks: 43561 aes-128-cbc's in 0.11s |
207 | | Doing aes-128-cbc for 3s on 256 size blocks: 39724 aes-128-cbc's in 0.13s |
208 | | Doing aes-128-cbc for 3s on 1024 size blocks: 30733 aes-128-cbc's in 0.10s |
209 | | Doing aes-128-cbc for 3s on 8192 size blocks: 9122 aes-128-cbc's in 0.01s |
210 | | OpenSSL 1.0.2d 9 Jul 2015 |
211 | | built on: reproducible build, date unspecified |
212 | | options:bn(64,32) rc4(ptr,char) des(idx,cisc,16,long) aes(partial) idea(int) blowfish(ptr) |
213 | | compiler: arm-poky-linux-gnueabi-gcc -march=armv7-a -marm -mthumb-interwork -mfloat-abi=hard -mfpu=neon -mtune=cortex-a9 --sysroot=/usr/src/psidhu/gw-yocto-1.8/build/tmp/sysroots/ventana -I. -I.. -I../include -fPIC -DOPENSSL_PIC -DOPENSSL_THREADS -D_REENTRANT -DDSO_DLFCN -DHAVE_DLFCN_H -DL_ENDIAN -DTERMIO -O2 -pipe -g -feliminate-unused-debug-types -Wall -Wa,--noexecstack -DHAVE_CRYPTODEV -DUSE_CRYPTODEV_DIGESTS -DOPENSSL_BN_ASM_MONT -DOPENSSL_BN_ASM_GF2m -DSHA1_ASM -DSHA256_ASM -DSHA512_ASM -DAES_ASM -DBSAES_ASM -DGHASH_ASM |
214 | | The 'numbers' are in 1000s of bytes per second processed. |
215 | | type 16 bytes 64 bytes 256 bytes 1024 bytes 8192 bytes |
216 | | aes-128-cbc 5045.26k 25344.58k 78225.72k 314705.92k 7472742.40k |
217 | | }}} |
218 | | |
219 | | One of the biggest advantages of hardware encryption is reduced CPU utilization. In the two cases above, we observed the following: |
220 | | * With {{{cryptodev}}} disabled: 25% usr CPU usage (one core pegged at 100%) |
221 | | * With {{{cryptodev}}} enabled: 16% sys CPU usage, 2% sirq |
222 | | * With the {{{cryptodev}}} hardware engine, the bytes per second reported by {{{openssl}}} increased tremendously, especially for the larger block sizes |
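
When reading these figures, it helps to know how {{{openssl speed}}} derives them: each value in the summary row is the operation count multiplied by the block size, divided by the time printed at the end of the corresponding "Doing" line. As far as we understand, that time is CPU user time by default, not wall-clock time, so a run that spends most of its 3 seconds waiting on the hardware reports a very small divisor. The sketch below reproduces the arithmetic for the 8192-byte row of the Yocto {{{cryptodev}}} run; the function name {{{throughput_k}}} is hypothetical, and the second call only shows what the same count would give if divided by a full 3 seconds (an assumption for illustration, not a measurement).

```shell
# Hypothetical helper: recompute an 'openssl speed' summary figure.
# $1: number of operations, $2: block size in bytes, $3: seconds
throughput_k() {
    awk -v n="$1" -v size="$2" -v secs="$3" \
        'BEGIN { printf "%.2fk\n", n * size / secs / 1000 }'
}

throughput_k 9122 8192 0.01   # prints 7472742.40k, matching the table
throughput_k 9122 8192 3.00   # prints 24909.14k, same count over 3 s
```

To compare software and hardware runs on equal terms, {{{openssl speed}}} accepts the {{{-elapsed}}} option, which measures wall-clock time instead, for example {{{openssl speed -elapsed -evp aes-128-cbc -engine cryptodev}}}.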
223 | | |
224 | | === OpenWrt === |
225 | | Our OpenWrt 16.02 BSP added support for CAAM and {{{cryptodev}}}. As in Yocto, {{{openssl}}} can use this engine. Please see below for some results: |
226 | | |
227 | | * OpenWrt 16.02 WITHOUT {{{cryptodev}}} (using {{{openssl}}} software based algorithms) |
228 | | {{{#!bash |
229 | | root@OpenWrt:/# openssl speed aes-128-cbc |
230 | | Doing aes-128 cbc for 3s on 16 size blocks: 2890377 aes-128 cbc's in 3.00s |
231 | | Doing aes-128 cbc for 3s on 64 size blocks: 767833 aes-128 cbc's in 2.99s |
232 | | Doing aes-128 cbc for 3s on 256 size blocks: 196252 aes-128 cbc's in 3.00s |
233 | | Doing aes-128 cbc for 3s on 1024 size blocks: 49243 aes-128 cbc's in 3.00s |
234 | | Doing aes-128 cbc for 3s on 8192 size blocks: 6165 aes-128 cbc's in 3.00s |
235 | | OpenSSL 1.0.2g 1 Mar 2016 |
236 | | built on: reproducible build, date unspecified |
237 | | options:bn(64,32) rc4(ptr,char) des(idx,cisc,2,long) aes(partial) blowfish(ptr) |
238 | | compiler: arm-openwrt-linux-muslgnueabi-gcc -I. -I.. -I../include -fPIC -DOPENSSL_PIC -DZLIB_SHARED -DZLIB -DOPENSSL_THREADS -D_REENTRANT -DDSO_DLFCN -DHAVE_DLFCN_H -I/usr/src/psidhu/openwrt/openwrt-next/staging_dir/target-arm_cortex-a9+neon_musl-1.1.12_eabi/usr/include -I/usr/src/psidhu/openwrt/openwrt-next/staging_dir/target-arm_cortex-a9+neon_musl-1.1.12_eabi/include -I/usr/src/psidhu/openwrt/openwrt-next/staging_dir/toolchain-arm_cortex-a9+neon_gcc-5.2.0_musl-1.1.12_eabi/usr/include -I/usr/src/psidhu/openwrt/openwrt-next/staging_dir/toolchain-arm_cortex-a9+neon_gcc-5.2.0_musl-1.1.12_eabi/include/fortify -I/usr/src/psidhu/openwrt/openwrt-next/staging_dir/toolchain-arm_cortex-a9+neon_gcc-5.2.0_musl-1.1.12_eabi/include -znow -zrelro -DOPENSSL_SMALL_FOOTPRINT -DHAVE_CRYPTODEV -DUSE_CRYPTODEV_DIGESTS -DOPENSSL_NO_ERR -DTERMIOS -Os -pipe -march=armv7-a -mtune=cortex-a9 -mfpu=neon -fno-caller-saves -fno-plt -fhonour-copts -Wno-error=unused-but-set-variable -Wno-error=unused-result -mfloat-abi=hard -iremap /usr/src/psidhu/openwrt/openwrt-next/build_dir/target-arm_cortex-a9+neon_musl-1.1.12_eabi/openssl-1.0.2g:openssl-1.0.2g -fstack-protector -D_FORTIFY_SOURCE=1 -Wl,-z,now -Wl,-z,relro -fpic -fomit-frame-pointer -Wall |
239 | | The 'numbers' are in 1000s of bytes per second processed. |
240 | | type 16 bytes 64 bytes 256 bytes 1024 bytes 8192 bytes |
241 | | aes-128 cbc 15415.34k 16435.22k 16746.84k 16808.28k 16834.56k |
242 | | }}} |
243 | | |
244 | | * OpenWrt 16.02 with {{{cryptodev}}} (using kernel hardware accelerated algorithms) |
245 | | {{{#!bash |
246 | | root@OpenWrt:/# openssl speed -evp aes-128-cbc -engine cryptodev |
247 | | engine "cryptodev" set. |
248 | | Doing aes-128-cbc for 3s on 16 size blocks: 80789 aes-128-cbc's in 0.13s |
249 | | Doing aes-128-cbc for 3s on 64 size blocks: 67854 aes-128-cbc's in 0.15s |
250 | | Doing aes-128-cbc for 3s on 256 size blocks: 63909 aes-128-cbc's in 0.21s |
251 | | Doing aes-128-cbc for 3s on 1024 size blocks: 46740 aes-128-cbc's in 0.06s |
252 | | Doing aes-128-cbc for 3s on 8192 size blocks: 12239 aes-128-cbc's in 0.03s |
253 | | OpenSSL 1.0.2g 1 Mar 2016 |
254 | | built on: reproducible build, date unspecified |
255 | | options:bn(64,32) rc4(ptr,char) des(idx,cisc,2,long) aes(partial) blowfish(ptr) |
256 | | compiler: arm-openwrt-linux-muslgnueabi-gcc -I. -I.. -I../include -fPIC -DOPENSSL_PIC -DZLIB_SHARED -DZLIB -DOPENSSL_THREADS -D_REENTRANT -DDSO_DLFCN -DHAVE_DLFCN_H -I/usr/src/psidhu/openwrt/openwrt-next/staging_dir/target-arm_cortex-a9+neon_musl-1.1.12_eabi/usr/include -I/usr/src/psidhu/openwrt/openwrt-next/staging_dir/target-arm_cortex-a9+neon_musl-1.1.12_eabi/include -I/usr/src/psidhu/openwrt/openwrt-next/staging_dir/toolchain-arm_cortex-a9+neon_gcc-5.2.0_musl-1.1.12_eabi/usr/include -I/usr/src/psidhu/openwrt/openwrt-next/staging_dir/toolchain-arm_cortex-a9+neon_gcc-5.2.0_musl-1.1.12_eabi/include/fortify -I/usr/src/psidhu/openwrt/openwrt-next/staging_dir/toolchain-arm_cortex-a9+neon_gcc-5.2.0_musl-1.1.12_eabi/include -znow -zrelro -DOPENSSL_SMALL_FOOTPRINT -DHAVE_CRYPTODEV -DUSE_CRYPTODEV_DIGESTS -DOPENSSL_NO_ERR -DTERMIOS -Os -pipe -march=armv7-a -mtune=cortex-a9 -mfpu=neon -fno-caller-saves -fno-plt -fhonour-copts -Wno-error=unused-but-set-variable -Wno-error=unused-result -mfloat-abi=hard -iremap /usr/src/psidhu/openwrt/openwrt-next/build_dir/target-arm_cortex-a9+neon_musl-1.1.12_eabi/openssl-1.0.2g:openssl-1.0.2g -fstack-protector -D_FORTIFY_SOURCE=1 -Wl,-z,now -Wl,-z,relro -fpic -fomit-frame-pointer -Wall |
257 | | The 'numbers' are in 1000s of bytes per second processed. |
258 | | type 16 bytes 64 bytes 256 bytes 1024 bytes 8192 bytes |
259 | | aes-128-cbc 9943.26k 28951.04k 77908.11k 797696.00k 3342062.93k |
260 | | }}} |