better documentation
parent b5eda7fc0a
commit 3a9705e3cc
1 changed file with 73 additions and 13 deletions
COMPILE
@@ -1,23 +1,83 @@
 Libsharp is configured, compiled and installed using GNU autotools.
-The most complicated step for the user is selecting the appropriate compiler
-flags (and in some cases the compiler).
 
-Here are a few (hopefully helpful) examples:
+If you have cloned the libsharp repository, you have to run
+"autoreconf -i" before starting the configuration, which requires several
+GNU developer tools to be available on your system.
 
-GCC, OpenMP, portable executable:
-CFLAGS="-std=c99 -O3 -ffast-math -flto -fopenmp" ./configure
+When using a release tarball, configuration is done via
 
-GCC, OpenMP, specific optimization for the target CPU:
-CFLAGS="-std=c99 -O3 -march=native -ffast-math -flto -fopenmp" ./configure
+[CC=...] [CFLAGS=...] ./configure
 
-GCC, no OpenMP, specific optimization for the target CPU:
-CFLAGS="-std=c99 -O3 -march=native -ffast-math -flto" ./configure
+The following sections briefly describe possible choices for compilers and
+flags.
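
For a git clone, the steps described above might be scripted roughly as
follows. This is only a sketch: the function name build_libsharp is made up
for illustration, and the final "make" is the usual autotools convention
rather than something this file states explicitly.

```shell
#!/bin/sh
# Hypothetical helper (not part of libsharp): configure and compile from the
# top of the source tree. configure reads CC and CFLAGS from the environment,
# so export them beforehand if you want non-default values.
build_libsharp() {
    # A fresh git clone has no configure script yet; generate it first.
    if [ ! -x ./configure ]; then
        autoreconf -i || return 1
    fi
    ./configure || return 1
    # "make" is the conventional autotools build step (an assumption here);
    # "make install" would conventionally follow.
    make
}
```

To use it, export CC and CFLAGS as desired, change into the source directory
and call build_libsharp. With a release tarball the autoreconf step is skipped
because a configure script is already present.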
 
-Clang:
-CC=clang CFLAGS="-std=c99 -O3 -march=native -ffast-math -flto -fopenmp" ./configure
 
-MPI support:
-CC=mpicc CFLAGS="-DUSE_MPI -std=c99 -O3 -march=native -ffast-math -flto" ./configure
+Fast math
+---------
 
+Specifying "-ffast-math" is important for all compilers, since it allows the
+compiler to fuse multiplications and additions into FMA instructions, which is
+forbidden by the C99 standard. Since FMAs are a central aspect of the algorithm,
+they are needed for optimum performance.
+
+If you are calling libsharp from other code which requires strict adherence
+to the C99 standard, you should still be able to compile libsharp with
+"-ffast-math" without any problems.
+
+
+Runtime CPU selection with gcc
+------------------------------
+
+When using a recent gcc (6.0 and newer) on an x86_64 platform, the build
+machinery will compile the time-critical functions for several different
+architectures (SSE2, AVX, AVX2, FMA3, FMA4, AVX512F), and the appropriate
+implementation will be selected at runtime.
+This only happens if you do _not_ explicitly specify a target architecture via
+the compiler flags. I.e., please do _not_ specify "-march=native" or
+"-mavx" or similar if you want a portable binary that will run
+efficiently on different x86_64 CPUs.
+If you are compiling libsharp for a particular target CPU only, or if you are
+using a different compiler, however, "-march=native" should be used. The
+resulting binary will most likely not run on other computers, though.
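
To see which of the instruction sets listed above a given machine actually
reports, something like the following can be used. It is a sketch that assumes
Linux, where the CPU feature flags appear in /proc/cpuinfo (FMA3 is listed
there as "fma"); the function name cpu_simd_flags is invented for this example.

```shell
#!/bin/sh
# Hypothetical helper: print those of the extensions named in this section
# that the first CPU entry reports, sorted and separated by spaces.
# The optional argument only exists so another file can be inspected instead
# of /proc/cpuinfo (Linux-specific).
cpu_simd_flags() {
    cpuinfo="${1:-/proc/cpuinfo}"
    grep -m1 '^flags' "$cpuinfo" \
        | tr ' ' '\n' \
        | grep -E -x 'sse2|avx|avx2|fma|fma4|avx512f' \
        | sort -u \
        | paste -s -d ' ' -
}
```

Calling cpu_simd_flags without arguments inspects the local machine. This
only shows what the hardware offers; which implementation libsharp picks at
runtime is decided by the library itself.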
+
+
+OpenMP
+------
+
+OpenMP should be switched on for maximum performance, and at runtime
+OMP_NUM_THREADS should be set to the number of hardware threads (not physical
+cores) of the system.
+(Usually this is already the default setting when OMP_NUM_THREADS is not
+specified.)
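
If you prefer to set the thread count explicitly, a minimal sketch could look
like this. It assumes that "getconf _NPROCESSORS_ONLN" is available (it is on
Linux and macOS, although it is not strictly required by POSIX) and that the
number of online logical processors equals the number of hardware threads.

```shell
#!/bin/sh
# Keep a value the caller has already chosen; otherwise use the number of
# online logical processors, i.e. hardware threads rather than physical cores.
: "${OMP_NUM_THREADS:=$(getconf _NPROCESSORS_ONLN)}"
export OMP_NUM_THREADS
echo "OMP_NUM_THREADS=$OMP_NUM_THREADS"
```

Any program linked against an OpenMP-enabled libsharp and started from the
same shell afterwards will inherit this setting.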
+
+
+MPI
+---
+
+MPI support is enabled by using the MPI compiler (typically "mpicc") _and_
+adding the flag "-DUSE_MPI".
+When using MPI and OpenMP simultaneously, the product of MPI tasks per node
+and OMP_NUM_THREADS should be equal to the number of hardware threads available
+on the node. One MPI task per node should result in the best performance.
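
The rule for combining MPI and OpenMP amounts to a simple division. The
helper below is a sketch only; its name and its two arguments (hardware
threads per node, MPI tasks per node) are assumptions made for this example.

```shell
#!/bin/sh
# Hypothetical helper: print the OMP_NUM_THREADS value for which
# (MPI tasks per node) * OMP_NUM_THREADS equals the hardware thread count.
omp_threads_per_task() {
    hw_threads="$1"
    tasks_per_node="$2"
    # Reject task counts that cannot satisfy the product rule exactly.
    if [ "$tasks_per_node" -lt 1 ] || [ $((hw_threads % tasks_per_node)) -ne 0 ]; then
        echo "tasks per node must evenly divide the hardware thread count" >&2
        return 1
    fi
    echo $((hw_threads / tasks_per_node))
}
```

For example, on a node with 16 hardware threads and 4 MPI tasks per node,
omp_threads_per_task 16 4 yields 4. With the recommended single task per node
it simply yields the full hardware thread count.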
+
+
+Example configure invocations
+=============================
+
+GCC, OpenMP, portable binary:
+CFLAGS="-std=c99 -O3 -ffast-math -fopenmp" ./configure
+
+GCC, no OpenMP, portable binary:
+CFLAGS="-std=c99 -O3 -ffast-math" ./configure
+
+Clang, OpenMP, nonportable binary:
+CC=clang CFLAGS="-std=c99 -O3 -march=native -ffast-math -fopenmp" ./configure
+
+Intel C compiler, OpenMP, nonportable binary:
+CC=icc CFLAGS="-std=c99 -O3 -march=native -ffast-math -fopenmp" ./configure
+
+MPI support, nonportable binary:
+CC=mpicc CFLAGS="-DUSE_MPI -std=c99 -O3 -march=native -ffast-math" ./configure
+
 Additional GCC flags for pedantic warning and debugging:
 