doc improvements and a pragma, which probably does nothing
parent 5a010d3970
commit 540e7e44f8
2 changed files with 8 additions and 3 deletions
COMPILE (6 lines changed)
@@ -15,7 +15,7 @@ flags.
 Fast math
 ---------

-Specifying "-ffast-math" is important for all compilers, since it allows the
+Specifying "-ffast-math" or "-ffp-contract=fast" is important for all compilers, since it allows the
 compiler to fuse multiplications and additions into FMA instructions, which is
 forbidden by the C99 standard. Since FMAs are a central aspect of the algorithm,
 they are needed for optimum performance.

@@ -25,8 +25,8 @@ to the C99 standard, you should still be able to compile libsharp with
 "-ffast-math" without any problems.


-Runtime CPU selection with gcc
-------------------------------
+Runtime CPU selection with gcc and clang
+----------------------------------------

 When using a recent gcc (6.0 and newer) or a recent clang (successfully tested
 with versions 6 and 7) on an x86_64 platform, the build machinery can compile

@@ -42,6 +42,11 @@
 #include "libsharp/sharp_internal.h"
 #include "c_utils/c_utils.h"

+// In the following, we explicitly allow the compiler to contract floating
+// point operations, like multiply-and-add.
+// Unfortunately, most compilers don't act on this pragma yet.
+#pragma STDC FP_CONTRACT ON
+
 typedef complex double dcmplx;

 #define nv0 (128/VLEN)
