Libsharp2 is configured, compiled and installed using GNU autotools.

If you have cloned the libsharp2 repository, you have to run
"autoreconf -i" before starting the configuration. This requires GNU developer
tools such as autoconf and automake to be available on your system.

When using a release tarball, configuration is done via

[CC=...] [CFLAGS=...] ./configure

The following sections briefly describe possible choices for compilers and
flags.

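The difference between a git checkout and a release tarball can be sketched as
a small shell helper. This is only an illustration: the function names
needs_autoreconf and print_configure_steps, and the check for an executable
"configure" script, are assumptions and not part of the libsharp2 build system.

```shell
# Hypothetical helper (not part of libsharp2): succeeds if the source tree in
# directory $1 has no executable "configure" script yet, i.e. it looks like a
# fresh git clone rather than an unpacked release tarball.
needs_autoreconf() {
    [ ! -x "$1/configure" ]
}

# Print the commands to run for the source tree in directory $1.
# CC and CFLAGS may be prepended to ./configure as described above.
print_configure_steps() {
    if needs_autoreconf "$1"; then
        echo "autoreconf -i"
    fi
    echo "./configure"
}
```

Calling print_configure_steps with the path of the source tree lists
"autoreconf -i" only when no configure script is present yet.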

Fast math
---------

Specifying "-ffast-math" or "-ffp-contract=fast" is important for all compilers,
since it allows the compiler to fuse multiplications and additions into FMA
instructions, which compilers may otherwise refrain from doing in strict C99
mode ("-std=c99"). Since FMAs are a central aspect of the algorithm, they are
needed for optimum performance.

If you are calling libsharp2 from other code that requires strict adherence
to the C99 standard, you should still be able to compile libsharp2 with
"-ffast-math" without any problems.


Runtime CPU selection with gcc and clang
----------------------------------------

When using a recent gcc (6.0 and newer) or a recent clang (successfully tested
with versions 6 and 7) on an x86_64 platform, the build machinery can compile
the time-critical functions for several different instruction sets (SSE2, AVX,
AVX2, FMA3, FMA4, AVX512F), and the appropriate implementation will be selected
at runtime.
This is enabled by passing "-DMULTIARCH" as part of the CFLAGS.
If this is enabled, please do _not_ specify "-march=native" or
"-mavx" or similar!
If you are compiling libsharp2 for a particular target CPU only, or if you are
using a different compiler, however, "-march=native" should be used. The
resulting binary will most likely not run on other computers, though.

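The rule above (either "-DMULTIARCH" or "-march=native", never both) can be
captured in a small shell helper. This is a minimal sketch: the function name
sharp_cflags and its "portable"/"native" arguments are made up for this
example, and the remaining flags are simply the ones used elsewhere in this
file.

```shell
# Hypothetical helper (not part of libsharp2): print a CFLAGS string.
#   $1 = "portable" -> runtime CPU selection via -DMULTIARCH
#        "native"   -> code for the build machine only via -march=native
# The two choices are mutually exclusive, as described above.
sharp_cflags() {
    case "$1" in
        portable) echo "-DMULTIARCH -std=c99 -O3 -ffast-math" ;;
        native)   echo "-std=c99 -O3 -march=native -ffast-math" ;;
        *)        echo "usage: sharp_cflags portable|native" >&2
                  return 1 ;;
    esac
}
```

A possible use would be CFLAGS="$(sharp_cflags portable)" ./configure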

OpenMP
------

OpenMP is enabled by default if the selected compiler supports it.
It can be disabled at configuration time by specifying "--disable-openmp" on the
configure command line.
At runtime, OMP_NUM_THREADS should be set to the number of hardware threads
(not physical cores) of the system.
(Usually this is already the default setting when OMP_NUM_THREADS is not
specified.)

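If you prefer to set OMP_NUM_THREADS explicitly, something along the following
lines may work. This is a sketch under assumptions: the helper name
hardware_threads is hypothetical, "getconf _NPROCESSORS_ONLN" is widely
available (e.g. on Linux and macOS) but not guaranteed everywhere, and "nproc"
from GNU coreutils is used as a fallback.

```shell
# Hypothetical helper: print the number of hardware threads (logical CPUs)
# that are currently online. Falls back to "nproc" if getconf does not know
# the _NPROCESSORS_ONLN variable.
hardware_threads() {
    getconf _NPROCESSORS_ONLN 2>/dev/null || nproc
}

# Set OMP_NUM_THREADS explicitly instead of relying on the OpenMP default.
OMP_NUM_THREADS=$(hardware_threads)
export OMP_NUM_THREADS
```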

MPI
---

MPI support is enabled by using the MPI compiler wrapper (typically "mpicc")
_and_ adding the flag "-DUSE_MPI" to the CFLAGS.
When using MPI and OpenMP simultaneously, the product of MPI tasks per node
and OMP_NUM_THREADS should be equal to the number of hardware threads available
on the node. One MPI task per node should result in the best performance.

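The relation between MPI tasks per node, OMP_NUM_THREADS and hardware threads
can be written down as a short shell function. The name omp_threads_per_task
is hypothetical and only serves this example; it rejects combinations where
the tasks do not divide the hardware threads evenly.

```shell
# Hypothetical helper (not part of libsharp2): given the number of hardware
# threads on a node ($1) and the number of MPI tasks per node ($2), print the
# OMP_NUM_THREADS value for which tasks * threads equals the hardware threads.
omp_threads_per_task() {
    if [ "$2" -le 0 ] || [ $(( $1 % $2 )) -ne 0 ]; then
        echo "hardware threads must be a positive multiple of the tasks per node" >&2
        return 1
    fi
    echo $(( $1 / $2 ))
}
```

For a node with 16 hardware threads, "omp_threads_per_task 16 1" prints 16
(the recommended one task per node) and "omp_threads_per_task 16 4" prints 4.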

Example configure invocations
=============================

GCC, OpenMP, portable binary:
CFLAGS="-DMULTIARCH -std=c99 -O3 -ffast-math" ./configure

GCC, no OpenMP, portable binary:
CFLAGS="-DMULTIARCH -std=c99 -O3 -ffast-math" ./configure --disable-openmp

Clang, OpenMP, portable binary:
CC=clang CFLAGS="-DMULTIARCH -std=c99 -O3 -ffast-math" ./configure

Intel C compiler, OpenMP, nonportable binary:
CC=icc CFLAGS="-std=c99 -O3 -march=native -ffast-math -D__PURE_INTEL_C99_HEADERS__" ./configure

MPI support, OpenMP, portable binary:
CC=mpicc CFLAGS="-DUSE_MPI -DMULTIARCH -std=c99 -O3 -ffast-math" ./configure

Additional GCC flags for pedantic warnings and debugging:

-Wall -Wextra -Wshadow -Wmissing-prototypes -Wfatal-errors -pedantic -g
-fsanitize=address