Changes between Initial Version and Version 1 of floatingpoint


Timestamp:
10/22/2017 05:28:45 AM (7 years ago)
Author:
trac
Comment:

--

  • floatingpoint

    v1 v1  
     1[[PageOutline]]
     2
     3= Floating Point Math =
     4Historically ARM CPUs lacked a Floating Point Unit (FPU) to perform hardware accelerated floating point calculations.  Modern ARM CPUs have FPUs and in some cases other hardware blocks capable of accelerating floating point calculations.  Using these blocks is typically referred to as 'hardware floating point' or 'hardfloat', as opposed to using software to emulate floating point, which is known as 'softfloat'.
     5
     6The GCC compiler can produce binaries with several options regarding floating point:
     7 * soft - suitable for running on CPUs with no FPU - calculations are done in software by compiler-provided library routines
     8 * softfp - uses FPU instructions (as selected at build time with -mfpu) but keeps the soft-float calling convention, so it can be linked with code built for soft
     9 * hard - suitable for running on CPUs with an FPU only - the most efficient but also the most restrictive as far as binary compatibility goes
     10
     11= Gateworks Product Family CPU details =
     12Most of the Gateworks product families have hardware floating point acceleration capabilities:
     13 * Ventana: Freescale IMX6 SoC
     14  * ARM Cortex-A9 core (1 to 4 cores depending on board options)
     15  * armv7a CPU instruction set
     16  * vfpv3-d16 Vector Floating Point unit (3rd generation, 16 64-bit FPU registers)
     17  * [http://www.arm.com/products/processors/technologies/neon.php NEON General purpose SIMD (Single Instruction, Multiple Data) engine]
     18 * Laguna: Cavium Econa CNS3XXX SoC
     19  * ARM11 core (1 to 2 cores depending on board options)
     20  * armv6k CPU instruction set
     21  * vfpv2 Vector Floating Point unit (2nd generation, 16 64-bit FPU registers)
     22 * Rincon: TI Davinci DM64XX SoC
     23  * ARM9 core
     24  * armv5t CPU instruction set
     25  * c64x DSP but no FPU
     26 * Avila / Cambria: Intel XScale IXP4xx SoC
     27  * armv5te CPU instruction set
     28  * no FPU
     29
     30FPU Notes:
     31 - NEON is a SIMD engine (vector math operations): it can operate on up to 4 single-precision floating-point values in parallel - there are pros and cons to using NEON as the FPU (see [https://wiki.debian.org/ArmHardFloatPort/VfpComparison here])
     32 - VFP is an FPU (floating point unit) - the one in the Cortex-A9 is the [http://www.arm.com/products/processors/technologies/vector-floating-point.php vfpv3]
     33
     34= GCC options and binary compatibility =
     35The GCC compiler has several options that tell it how to deal with floating point in C code for ARM CPUs:
     36 * -mfloat-abi=<name>:
     37   * soft - causes GCC to generate output containing library calls (to the soft-float routines in libgcc) for floating-point operations.
     38   * softfp - allows generation of code using hardware floating-point instructions (based on the -mfpu option) but still uses the soft-float calling conventions.
     39   * hard - allows generation of floating-point instructions '''and''' uses FPU-specific calling conventions.
     40 * -mfpu=<name> - specifies the floating point hardware that is available on the target
     41
     42Note that some other arguments can be used as aliases for the above.  To avoid confusion, this document refers only to the options above:
     43 * -mhard-float - equivalent to -mfloat-abi=hard
     44 * -msoft-float - equivalent to -mfloat-abi=soft
     45
     46The default setting (if not specified) for the above is dictated by the configuration options used to build the gcc cross-toolchain.  See [http://gcc.gnu.org/install/configure.html here] for more details.
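
A quick way to confirm which float ABI a given set of flags actually selects is to look at the compiler's predefined macros.  The sketch below assumes GCC's ARM macros `__ARM_PCS_VFP` (hard-float calling convention) and `__SOFTFP__` (no FPU instructions generated); the function name `float_abi_name` is hypothetical.  It is worth confirming the macros on your own toolchain by preprocessing an empty input with `-dM -E` and the same flags.

```c
/* Report the float ABI this translation unit was compiled for.
 * Assumption: GCC defines __ARM_PCS_VFP for -mfloat-abi=hard and
 * __SOFTFP__ for -mfloat-abi=soft; softfp defines neither. */
const char *float_abi_name(void)
{
#if !defined(__arm__)
    return "not a 32-bit ARM target";
#elif defined(__ARM_PCS_VFP)
    return "hard";
#elif defined(__SOFTFP__)
    return "soft";
#else
    return "softfp";
#endif
}
```

Building this once per -mfloat-abi value and printing the returned string from a small main() should show which ABI the toolchain default resolves to when no option is given.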
     47
     48References:
     49 - https://wiki.debian.org/ArmHardFloatPort/VfpComparison - probably the best explanation of the details of soft/softfp/hard I've seen
     50 - https://wiki.linaro.org/Linaro-arm-hardfloat gc
     51 - http://www.arm.com/products/processors/technologies/vector-floating-point.php
     52 - http://gcc.gnu.org/onlinedocs/gcc/ARM-Options.html
     53 - http://gcc.gnu.org/install/configure.html
     54
     55== soft ==
     56When gcc is used with -mfloat-abi=soft this causes GCC to generate output containing library calls for floating-point operations.
     57
     58These calls go to the GCC [http://gcc.gnu.org/onlinedocs/gccint/Soft-float-library-routines.html software floating point library], which is intended for machines that do not have hardware support for floating point.  This library provides addition, subtraction, multiplication, division, and conversion functions for floats and doubles.
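
To give a feel for what such a library routine has to do, here is a deliberately simplified sketch of a single-precision multiply using only integer operations.  It is '''not''' the GCC implementation: the name `toy_soft_fmul` is hypothetical, it assumes IEEE 754 single-precision floats, it only handles normal, finite, non-zero inputs whose product is also normal, and it truncates instead of rounding.

```c
#include <stdint.h>
#include <string.h>

/* Reinterpret a float as its raw 32-bit pattern and back. */
static uint32_t float_to_bits(float f) { uint32_t u; memcpy(&u, &f, sizeof u); return u; }
static float bits_to_float(uint32_t u) { float f; memcpy(&f, &u, sizeof f); return f; }

/* Toy software multiply (assumption: normal, finite, non-zero operands;
 * result in the normal range; truncation instead of rounding). */
float toy_soft_fmul(float a, float b)
{
    uint32_t ua = float_to_bits(a);
    uint32_t ub = float_to_bits(b);

    /* Sign of the product is the XOR of the operand signs. */
    uint32_t sign = (ua ^ ub) & 0x80000000u;

    /* Add the biased exponents and remove one copy of the bias (127). */
    int32_t exp = (int32_t)((ua >> 23) & 0xFFu) + (int32_t)((ub >> 23) & 0xFFu) - 127;

    /* Restore the implicit leading 1 to get 24-bit significands. */
    uint64_t ma = (ua & 0x007FFFFFu) | 0x00800000u;
    uint64_t mb = (ub & 0x007FFFFFu) | 0x00800000u;

    /* 24 x 24 bit multiply gives a value in [2^46, 2^48). */
    uint64_t prod = ma * mb;

    /* Normalise back to a 24-bit significand. */
    if (prod & (1ULL << 47)) {
        prod >>= 24;
        exp += 1;
    } else {
        prod >>= 23;
    }

    return bits_to_float(sign | ((uint32_t)exp << 23) | ((uint32_t)prod & 0x007FFFFFu));
}
```

Even in this cut-down form the multiply takes a dozen or so integer operations, which is why a call into the soft-float library is much slower than a single FPU multiply instruction.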
     59
     60== softfp ==
     61When gcc is used with -mfloat-abi=softfp this allows generation of code using hardware floating-point instructions (based on the -mfpu option) but still uses the soft-float calling conventions.  This keeps parameter passing and function linking compatible between binaries built with hardware floating-point instructions and binaries built with library calls, as both pass floats using standard non-float instructions and registers.  In other words this '''will''' use hardware floating-point instructions (as specified at build-time with the -mfpu argument) but can also link with binary/lib objects built with soft floating-point.
     62
     63The downside is the performance hit that you take in the function prologue/epilogue when passing floats around (assuming you are).
     64
     65The upside is the binary/lib/kernel compatibility which allows you to support more system types (FPU or no FPU) with the same OS distribution.
     66
     67== hard ==
     68When gcc is used with -mfloat-abi=hard this allows generation of floating-point instructions '''and''' uses FPU-specific calling conventions.  This means FPU specific instructions/registers are used and thus a binary built with hard floating-point cannot link with a binary built with soft/softfp.  Note that this also requires CONFIG_VFP=y in the kernel.
     69
     70The upside is that you get better performance as the function prologue/epilogue doesn't have to spend time moving floats to standard registers - instead the compiler will store arguments in dedicated FPU registers.
     71
     72The downside is the lack of binary/lib compatibility across the system and kernel compatibility (FPU vs no FPU) - which means you reduce the number of systems that your OS can run on.
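
Because hard and soft/softfp objects cannot be mixed, it can be useful to check what a prebuilt binary was built for.  The sketch below decodes the float ABI bits of an ARM ELF header's e_flags field.  The constants are assumed to mirror EF_ARM_ABI_FLOAT_SOFT and EF_ARM_ABI_FLOAT_HARD from <elf.h>; the function name `elf_float_abi` is hypothetical, and older toolchains may leave both bits clear.

```c
#include <stdint.h>

/* Assumed to match EF_ARM_ABI_FLOAT_SOFT / EF_ARM_ABI_FLOAT_HARD in <elf.h>. */
#define ARM_ABI_FLOAT_SOFT 0x200u
#define ARM_ABI_FLOAT_HARD 0x400u

/* Classify an ARM ELF e_flags value by its float ABI marking. */
const char *elf_float_abi(uint32_t e_flags)
{
    if (e_flags & ARM_ABI_FLOAT_HARD)
        return "hard";
    if (e_flags & ARM_ABI_FLOAT_SOFT)
        return "soft or softfp";   /* both use the soft-float calling convention */
    return "unmarked";             /* the linker did not record a float ABI */
}
```

The e_flags value itself can be read from the ELF header, for example from the Flags line that `readelf -h` prints.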
     73
     74== NEON vs VFPv3 ==
     75The IMX6 SoC used on the Ventana product family has two choices for hardware floating point:
     76 * NEON - Note that NEON hardware does not fully implement the IEEE 754 standard for floating-point arithmetic, so the use of NEON instructions may lead to a loss of precision.
     77 * VFPv3
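
One commonly cited example of the IEEE 754 difference is the handling of denormal (subnormal) values, which ARMv7 NEON flushes to zero.  The sketch below shows what such a value looks like; the helper names `is_subnormal` and `smallest_normal_halved` are hypothetical, and the code assumes IEEE 754 single-precision floats.

```c
#include <float.h>
#include <stdint.h>
#include <string.h>

/* A float is subnormal when its exponent field is 0 and its fraction is non-zero. */
int is_subnormal(float f)
{
    uint32_t bits;
    memcpy(&bits, &f, sizeof bits);
    return ((bits >> 23) & 0xFFu) == 0 && (bits & 0x007FFFFFu) != 0;
}

/* Halve the smallest normal float at run time (volatile prevents
 * constant folding).  With full IEEE 754 arithmetic the result is
 * subnormal; hardware that flushes to zero would return 0.0f instead. */
float smallest_normal_halved(void)
{
    volatile float x = FLT_MIN;
    return x / 2.0f;
}
```

Code that depends on very small intermediate values is where a difference between the VFP and NEON results is most likely to show up.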
     78
     79References:
     80 * http://www.arm.com/products/processors/technologies/vector-floating-point.php
     81 * http://gcc.gnu.org/onlinedocs/gcc/ARM-Options.html
     82 * https://wiki.debian.org/ArmHardFloatPort/VfpComparison (information regarding gcc neon support is likely dated)
     83
     84== Exploring an Example ==
     85Consider the simple C application below built with the OpenWrt GCC4.6 compiler (based on Linaro GCC 4.6):
     86{{{
     87#include <stdio.h>
     88#include <stdlib.h>
     89
     90float foo(float f1, float f2) {
     91        return f1 * f2;
     92}
     93
     94int main(int argc, char **argv)
     95{
     96        float a1, a2, r;
     97
     98        if (argc != 3) {
     99                printf("usage: %s <float1> <float2>\n", argv[0]);
     100                exit(1);
     101        }
     102        a1 = strtof(argv[1], NULL);
     103        a2 = strtof(argv[2], NULL);
     104
     105        r = foo(a1, a2);
     106        printf("%f * %f = %f\n", a1, a2, r);
     107}
     108}}}
     109
     110Compiling the application to assembly lets you compare the results (below).  In the soft output, note the branch to __aeabi_fmul for the multiplication and to __aeabi_f2d for the float-to-double conversions needed by printf (both are provided by gcc).  In the softfp output, note the flds/fmuls/fmrs/fcvtds instructions and the use of s14/s15 - these are all VFP instructions/registers.  In the hard output, note that floats are passed to and returned from foo in VFP registers (s0/s1) instead of in core registers.
     111
     112 * soft:
     113{{{
     114$ staging_dir/toolchain-arm_cortex-a9+vfpv3_gcc-4.6-linaro_uClibc-0.9.33.2_eabi/bin/arm-openwrt-linux-gcc \
     115-pipe -march=armv7-a -mtune=cortex-a9 \
     116-mfloat-abi=soft -S -o a.out-soft.s floattest.c
     117
     118$ cat a.out-soft.s
     119        .arch armv7-a
     120        .fpu softvfp
     121        .eabi_attribute 20, 1
     122        .eabi_attribute 21, 1
     123        .eabi_attribute 23, 3
     124        .eabi_attribute 24, 1
     125        .eabi_attribute 25, 1
     126        .eabi_attribute 26, 2
     127        .eabi_attribute 30, 6
     128        .eabi_attribute 34, 1
     129        .eabi_attribute 18, 4
     130        .file   "floattest.c"
     131        .global __aeabi_fmul
     132        .text
     133        .align  2
     134        .global foo
     135        .type   foo, %function
     136foo:
     137        @ args = 0, pretend = 0, frame = 8
     138        @ frame_needed = 1, uses_anonymous_args = 0
     139        stmfd   sp!, {fp, lr}
     140        add     fp, sp, #4
     141        sub     sp, sp, #8
     142        str     r0, [fp, #-8]   @ float
     143        str     r1, [fp, #-12]  @ float
     144        ldr     r0, [fp, #-8]   @ float
     145        ldr     r1, [fp, #-12]  @ float
     146        bl      __aeabi_fmul
     147        mov     r3, r0
     148        mov     r0, r3
     149        sub     sp, fp, #4
     150        ldmfd   sp!, {fp, pc}
     151        .size   foo, .-foo
     152        .section        .rodata
     153        .align  2
     154.LC0:
     155        .ascii  "usage: %s <float1> <float2>\012\000"
     156        .align  2
     157.LC1:
     158        .ascii  "%f * %f = %f\012\000"
     159        .global __aeabi_f2d
     160        .text
     161        .align  2
     162        .global main
     163        .type   main, %function
     164main:
     165        @ args = 0, pretend = 0, frame = 24
     166        @ frame_needed = 1, uses_anonymous_args = 0
     167        stmfd   sp!, {r4, r6, r7, r8, r9, fp, lr}
     168        add     fp, sp, #24
     169        sub     sp, sp, #44
     170        str     r0, [fp, #-48]
     171        str     r1, [fp, #-52]
     172        ldr     r3, [fp, #-48]
     173        cmp     r3, #3
     174        beq     .L3
     175        movw    r3, #:lower16:.LC0
     176        movt    r3, #:upper16:.LC0
     177        ldr     r2, [fp, #-52]
     178        ldr     r2, [r2, #0]
     179        mov     r0, r3
     180        mov     r1, r2
     181        bl      printf
     182        mov     r0, #1
     183        bl      exit
     184.L3:
     185        ldr     r3, [fp, #-52]
     186        add     r3, r3, #4
     187        ldr     r3, [r3, #0]
     188        mov     r0, r3
     189        mov     r1, #0
     190        bl      strtof
     191        str     r0, [fp, #-32]  @ float
     192        ldr     r3, [fp, #-52]
     193        add     r3, r3, #8
     194        ldr     r3, [r3, #0]
     195        mov     r0, r3
     196        mov     r1, #0
     197        bl      strtof
     198        str     r0, [fp, #-36]  @ float
     199        ldr     r0, [fp, #-32]  @ float
     200        ldr     r1, [fp, #-36]  @ float
     201        bl      foo
     202        str     r0, [fp, #-40]  @ float
     203        movw    r4, #:lower16:.LC1
     204        movt    r4, #:upper16:.LC1
     205        ldr     r0, [fp, #-32]  @ float
     206        bl      __aeabi_f2d
     207        mov     r6, r0
     208        mov     r7, r1
     209        ldr     r0, [fp, #-36]  @ float
     210        bl      __aeabi_f2d
     211        mov     r8, r0
     212        mov     r9, r1
     213        ldr     r0, [fp, #-40]  @ float
     214        bl      __aeabi_f2d
     215        mov     r2, r0
     216        mov     r3, r1
     217        strd    r8, [sp]
     218        strd    r2, [sp, #8]
     219        mov     r0, r4
     220        mov     r2, r6
     221        mov     r3, r7
     222        bl      printf
     223        mov     r0, r3
     224        sub     sp, fp, #24
     225        ldmfd   sp!, {r4, r6, r7, r8, r9, fp, pc}
     226        .size   main, .-main
     227        .ident  "GCC: (OpenWrt/Linaro GCC 4.6-2013.05 r39638) 4.6.4"
     228        .section        .note.GNU-stack,"",%progbits
     229}}}
     230  * Notes:
     231   * note the '.global __aeabi_fmul' - this is the routine gcc uses for software float multiply.  The floating point emulation functions are inserted automatically by GCC when it is told not to use hardware floating-point instructions.  There are several such functions: __aeabi_fadd and __aeabi_fmul for example
     232   * note the use of standard ARM instructions and registers when operating on the floats (str/ldr r0,r1,r3)
     233
     234 * softfp:
     235{{{
     236$ staging_dir/toolchain-arm_cortex-a9+vfpv3_gcc-4.6-linaro_uClibc-0.9.33.2_eabi/bin/arm-openwrt-linux-gcc \
     237-pipe -march=armv7-a -mtune=cortex-a9 \
     238-mfloat-abi=softfp -mfpu=vfpv3-d16 -S -o a.out-softfp.s floattest.c
     239
     240$ cat a.out-softfp.s
     241        .arch armv7-a
     242        .eabi_attribute 27, 3
     243        .fpu vfpv3-d16
     244        .eabi_attribute 20, 1
     245        .eabi_attribute 21, 1
     246        .eabi_attribute 23, 3
     247        .eabi_attribute 24, 1
     248        .eabi_attribute 25, 1
     249        .eabi_attribute 26, 2
     250        .eabi_attribute 30, 6
     251        .eabi_attribute 34, 1
     252        .eabi_attribute 18, 4
     253        .file   "floattest.c"
     254        .text
     255        .align  2
     256        .global foo
     257        .type   foo, %function
     258foo:
     259        @ args = 0, pretend = 0, frame = 8
     260        @ frame_needed = 1, uses_anonymous_args = 0
     261        @ link register save eliminated.
     262        str     fp, [sp, #-4]!
     263        add     fp, sp, #0
     264        sub     sp, sp, #12
     265        str     r0, [fp, #-8]   @ float
     266        str     r1, [fp, #-12]  @ float
     267        flds    s14, [fp, #-8]
     268        flds    s15, [fp, #-12]
     269        fmuls   s15, s14, s15
     270        fmrs    r3, s15
     271        mov     r0, r3  @ float
     272        add     sp, fp, #0
     273        ldmfd   sp!, {fp}
     274        bx      lr
     275        .size   foo, .-foo
     276        .section        .rodata
     277        .align  2
     278.LC0:
     279        .ascii  "usage: %s <float1> <float2>\012\000"
     280        .align  2
     281.LC1:
     282        .ascii  "%f * %f = %f\012\000"
     283        .text
     284        .align  2
     285        .global main
     286        .type   main, %function
     287main:
     288        @ args = 0, pretend = 0, frame = 24
     289        @ frame_needed = 1, uses_anonymous_args = 0
     290        stmfd   sp!, {fp, lr}
     291        add     fp, sp, #4
     292        sub     sp, sp, #40
                str     r0, [fp, #-24]
     293        str     r1, [fp, #-28]
     294        ldr     r3, [fp, #-24]
     295        cmp     r3, #3
     296        beq     .L3
     297        movw    r3, #:lower16:.LC0
     298        movt    r3, #:upper16:.LC0
     299        ldr     r2, [fp, #-28]
     300        ldr     r2, [r2, #0]
     301        mov     r0, r3
     302        mov     r1, r2
     303        bl      printf
     304        mov     r0, #1
     305        bl      exit
     306.L3:
     307        ldr     r3, [fp, #-28]
     308        add     r3, r3, #4
     309        ldr     r3, [r3, #0]
     310        mov     r0, r3
     311        mov     r1, #0
     312        bl      strtof
     313        str     r0, [fp, #-8]   @ float
     314        ldr     r3, [fp, #-28]
     315        add     r3, r3, #8
     316        ldr     r3, [r3, #0]
     317        mov     r0, r3
     318        mov     r1, #0
     319        bl      strtof
     320        str     r0, [fp, #-12]  @ float
     321        ldr     r0, [fp, #-8]   @ float
     322        ldr     r1, [fp, #-12]  @ float
     323        bl      foo
     324        str     r0, [fp, #-16]  @ float
     325        movw    r3, #:lower16:.LC1
     326        movt    r3, #:upper16:.LC1
     327        flds    s15, [fp, #-8]
     328        fcvtds  d5, s15
     329        flds    s15, [fp, #-12]
     330        fcvtds  d6, s15
     331        flds    s15, [fp, #-16]
     332        fcvtds  d7, s15
     333        fstd    d6, [sp, #0]
     334        fstd    d7, [sp, #8]
     335        mov     r0, r3
     336        fmrrd   r2, r3, d5
     337        bl      printf
     338        mov     r0, r3
     339        sub     sp, fp, #4
     340        ldmfd   sp!, {fp, pc}
     341        .size   main, .-main
     342        .ident  "GCC: (OpenWrt/Linaro GCC 4.6-2013.05 r39638) 4.6.4"
     343        .section        .note.GNU-stack,"",%progbits
     344}}}
     345  * Notes:
     346   * '''the flds/fmuls s14/s15 VFP instructions/registers are used here because the compiler was told to use the vfpv3-d16 floating point unit'''
     347   * however, when floats are passed as arguments to functions they are placed in the standard ARM core registers (str/mov r0/r1/r3).  This incurs a pipeline stall for each register passed, so time is spent in the function prologue/epilogue copying data back and forth between core and FPU registers (which could be about 20 cycles or more for each float).  Passing floats to or returning floats from functions is where the performance hit occurs.  If the sample code inlined the foo function the code would look the same as the hard float below
     348   * '''binaries built with soft or softfp can be intermixed on a target at the expense of less performance when passing and returning floats to/from functions'''
     349
     350 * hard:
     351{{{
     352$ staging_dir/toolchain-arm_cortex-a9+vfpv3_gcc-4.6-linaro_uClibc-0.9.33.2_eabi/bin/arm-openwrt-linux-gcc \
     353-pipe -march=armv7-a -mtune=cortex-a9 \
     354-mfloat-abi=hard -mfpu=vfpv3-d16 -S -o a.out-hard.s floattest.c
     355
     356$ cat a.out-hard.s
     357        .arch armv7-a
     358        .eabi_attribute 27, 3
     359        .eabi_attribute 28, 1
     360        .fpu vfpv3-d16
     361        .eabi_attribute 20, 1
     362        .eabi_attribute 21, 1
     363        .eabi_attribute 23, 3
     364        .eabi_attribute 24, 1
     365        .eabi_attribute 25, 1
     366        .eabi_attribute 26, 2
     367        .eabi_attribute 30, 6
     368        .eabi_attribute 34, 1
     369        .eabi_attribute 18, 4
     370        .file   "floattest.c"
     371        .text
     372        .align  2
     373        .global foo
     374        .type   foo, %function
     375foo:
     376        @ args = 0, pretend = 0, frame = 8
     377        @ frame_needed = 1, uses_anonymous_args = 0
     378        @ link register save eliminated.
     379        str     fp, [sp, #-4]!
     380        add     fp, sp, #0
     381        sub     sp, sp, #12
     382        fsts    s0, [fp, #-8]
     383        fsts    s1, [fp, #-12]
     384        flds    s14, [fp, #-8]
     385        flds    s15, [fp, #-12]
     386        fmuls   s15, s14, s15
     387        fcpys   s0, s15
     388        add     sp, fp, #0
     389        ldmfd   sp!, {fp}
     390        bx      lr
     391        .size   foo, .-foo
     392        .section        .rodata
     393        .align  2
     394.LC0:
     395        .ascii  "usage: %s <float1> <float2>\012\000"
     396        .align  2
     397.LC1:
     398        .ascii  "%f * %f = %f\012\000"
     399        .text
     400        .align  2
     401        .global main
     402        .type   main, %function
     403main:
     404        @ args = 0, pretend = 0, frame = 24
     405        @ frame_needed = 1, uses_anonymous_args = 0
     406        stmfd   sp!, {fp, lr}
     407        add     fp, sp, #4
     408        sub     sp, sp, #40
     409        str     r0, [fp, #-24]
     410        str     r1, [fp, #-28]
     411        ldr     r3, [fp, #-24]
     412        cmp     r3, #3
     413        beq     .L3
     414        movw    r3, #:lower16:.LC0
     415        movt    r3, #:upper16:.LC0
     416        ldr     r2, [fp, #-28]
     417        ldr     r2, [r2, #0]
     418        mov     r0, r3
     419        mov     r1, r2
     420        bl      printf
     421        mov     r0, #1
     422        bl      exit
     423.L3:
     424        ldr     r3, [fp, #-28]
     425        add     r3, r3, #4
     426        ldr     r3, [r3, #0]
     427        mov     r0, r3
     428        mov     r1, #0
     429        bl      strtof
     430        fsts    s0, [fp, #-8]
     431        ldr     r3, [fp, #-28]
     432        add     r3, r3, #8
     433        ldr     r3, [r3, #0]
     434        mov     r0, r3
     435        mov     r1, #0
     436        bl      strtof
     437        fsts    s0, [fp, #-12]
     438        flds    s0, [fp, #-8]
     439        flds    s1, [fp, #-12]
     440        bl      foo
     441        fsts    s0, [fp, #-16]
     442        movw    r3, #:lower16:.LC1
     443        movt    r3, #:upper16:.LC1
     444        flds    s15, [fp, #-8]
     445        fcvtds  d5, s15
     446        flds    s15, [fp, #-12]
     447        fcvtds  d6, s15
     448        flds    s15, [fp, #-16]
     449        fcvtds  d7, s15
     450        fstd    d6, [sp, #0]
     451        fstd    d7, [sp, #8]
     452        mov     r0, r3
     453        fmrrd   r2, r3, d5
     454        bl      printf
     455        mov     r0, r3
     456        sub     sp, fp, #4
     457        ldmfd   sp!, {fp, pc}
     458        .size   main, .-main
     459        .ident  "GCC: (OpenWrt/Linaro GCC 4.6-2013.05 r39638) 4.6.4"
     460        .section        .note.GNU-stack,"",%progbits
     461}}}
     462  * Notes:
     463   * like softfp, the flds/fmuls s14/s15 VFP instructions/registers are used here because the compiler was told to use the vfpv3-d16 floating point unit
     464   * however, when floats are passed as arguments to functions they stay in VFP registers (s0/s1 above) and don't need to be copied to core registers, saving the pipeline hit of 20 or so cycles per float.
     465   * '''binaries built with hard float can yield better float performance but can not be intermixed with binaries built with soft/softfp'''
     466
     467= Floating Point Support in the Kernel (via exception handling) =
     468If the hardware floating point ABI (-mfloat-abi=hard) is going to be used, the Linux kernel must be built with CONFIG_VFP=y to install the necessary exception handlers.  You can optionally leave this out if you know you're not going to use hard float - the only place I see this as beneficial is if you want to build a multi-arch kernel that runs on CPUs with and without an FPU, in which case userspace must also be built with -mfloat-abi=soft or -mfloat-abi=softfp.  Furthermore, if you don't have hardware floating point, you will need to configure software float emulation in the kernel if you have any userspace apps/libs that use hardware floating point instructions.
     469
     470This is available in the kernel config under 'Enable Floating point emulation' (CONFIG_VFP=y)
     471
     472Depending on the architecture and SoC you can select emulation using various hardware floating-point options.  For example, for the IMX6 you can select either VFPv3 (CONFIG_VFPv3=y) or NEON (CONFIG_NEON=y) emulation.  The kernel floating-point emulation can only be used for architectures that have hardware floating-point.
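
On a running ARM system, the Features line of /proc/cpuinfo lists what the kernel reports for the CPU (for example vfp, vfpv3 or neon).  The sketch below checks such a line for a flag as a whole word; the function name `features_has` is hypothetical, and reading the line from /proc/cpuinfo is left to the caller.

```c
#include <string.h>

/* Return 1 if 'flag' appears as a whole whitespace-separated token in a
 * "Features : ..." line, 0 otherwise.  Whole-token matching means "vfp"
 * does not match "vfpv3". */
int features_has(const char *line, const char *flag)
{
    size_t n = strlen(flag);
    const char *p = strchr(line, ':');

    /* Skip the "Features :" label if present. */
    p = p ? p + 1 : line;

    while (*p) {
        /* Skip separators before the next token. */
        while (*p == ' ' || *p == '\t')
            p++;

        /* Find the end of the current token. */
        const char *start = p;
        while (*p && *p != ' ' && *p != '\t' && *p != '\n')
            p++;

        if (n > 0 && (size_t)(p - start) == n && strncmp(start, flag, n) == 0)
            return 1;

        if (*p == '\n')
            break;
    }
    return 0;
}
```

For example, calling it with a line such as "Features : swp half thumb fastmult vfp edsp neon vfpv3" and the flag "neon" returns 1, while the flag "vfpv4" returns 0.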
     473
     474References:
     475 * http://www.linux-arm.org/LinuxKernel/LinuxVFP
     476 * http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/tree/Documentation/arm/VFP/release-notes.txt
     477 * http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/tree/arch/arm/Kconfig
     478
     479= OpenWrt =
     480There is a menuconfig option in OpenWrt under toolchain options, at '''Advanced configuration options->Target options->Use software floating point''', which configures the following:
     481 * CONFIG_SOFT_FLOAT=y:
     482    * apps are built with -mfloat-abi=softfp
     483    * uClibc is built with UCLIBC_HAS_FLOATS=y, UCLIBC_HAS_SOFT_FLOAT=y
     484    * toolchain is configured to default to -mfloat-abi=soft (if not specified)
     485 * CONFIG_SOFT_FLOAT undefined
     486    * apps are built with -mfloat-abi=hard
     487    * uClibc is built with UCLIBC_HAS_FLOATS=y, UCLIBC_HAS_FPU=y
     488    * toolchain is configured to default to -mfloat-abi=hard (if not specified)
     489
     490When CONFIG_SOFT_FLOAT is undefined, and thus -mfloat-abi=hard is used, you '''must''' have kernel support for VFP; otherwise any VFP instruction (such as those used when passing floats to functions) will cause an unhandled exception and crash your system.
     491
     492= OpenEmbedded / Yocto =
     493The floating point configuration of the OpenEmbedded build system used to build the Yocto BSP for the Gateworks boards varies by Yocto version:
     494 * Yocto 1.4:
     495  * toolchain: gcc v4.7.2 with no default for -mfloat-abi
     496  * libc: eglibc-2.16
     497  * apps: -mfloat-abi=softfp
     498 * Yocto 1.5:
     499  * toolchain: gcc v4.8.1 defaulted to -mfloat-abi=hard
     500  * libc: eglibc-2.18
     501  * apps: -mfloat-abi=hard
     502
     503= Binary distributions =
     504Certain OS distributions may provide multiple builds based on floating point support.  It is common for the suffix 'armhf' to be used for hard floating point (as in Debian's armhf port).