Martin Reinecke b27ce30cde switch to new FFT		2020-01-04 12:59:59 +01:00
fortran	add copyright headers	2019-12-06 21:21:20 +01:00
libsharp2	switch to new FFT	2020-01-04 12:59:59 +01:00
m4	missing file	2018-12-12 20:57:34 +01:00
python	update library name	2019-12-06 14:27:56 +01:00
test	runs as C++, no vector support yet	2020-01-03 17:22:31 +01:00
.gitignore	name change to libsharp2	2019-12-06 13:53:27 +01:00
COMPILE	runs as C++, no vector support yet	2020-01-03 17:22:31 +01:00
configure.ac	runs as C++, no vector support yet	2020-01-03 17:22:31 +01:00
COPYING	initial import	2012-06-27 16:44:31 +02:00
Makefile.am	switch to new FFT	2020-01-04 12:59:59 +01:00
README.md	name change to libsharp2	2019-12-06 13:53:27 +01:00
runtest.sh	name change to libsharp2	2019-12-06 13:53:27 +01:00

README.md

Libsharp2

Library for efficient spherical harmonic transforms at arbitrary spins, supporting CPU vectorization, OpenMP and MPI.

Paper

https://arxiv.org/abs/1303.4945

News

January 2019

This update features significant speedups thanks to important algorithmic discoveries by Keiichi Ishioka (https://www.jstage.jst.go.jp/article/jmsj/96/2/96_2018-019/_article and personal communication).

These improvements reduce the fraction of CPU time spent on evaluating the recurrences for Y_lm coefficients, so computing multiple SHTs simultaneously no longer offers a large performance advantage over running them one after the other. As a consequence, libsharp's support for simultaneous SHTs was dropped, which makes its interface much simpler.
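To illustrate what "evaluating a recurrence" means here, the following is a minimal sketch of the textbook three-term recurrence for Legendre polynomials (the m = 0 case). It is not libsharp2's code and not Ishioka's improved scheme; the function name `legendre_p` is hypothetical and chosen only for this example.

```c
#include <stddef.h>

/* Textbook three-term recurrence for Legendre polynomials (m = 0):
 *   l * P_l(x) = (2l - 1) * x * P_{l-1}(x) - (l - 1) * P_{l-2}(x)
 * Fills out[0..lmax] with P_0(x) .. P_lmax(x).
 * Illustration only; hypothetical helper, not part of libsharp2. */
void legendre_p(size_t lmax, double x, double *out)
{
    out[0] = 1.0;               /* P_0(x) = 1 */
    if (lmax == 0)
        return;
    out[1] = x;                 /* P_1(x) = x */
    for (size_t l = 2; l <= lmax; ++l)
        out[l] = ((2.0 * l - 1.0) * x * out[l - 1]
                  - (l - 1.0) * out[l - 2]) / l;
}
```

Each new value depends on the two previous ones, so the loop is inherently sequential in l. That is why the cost of this step matters for the overall transform time.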

With the proper compilers and flags (see the file COMPILE for details), libsharp2 is now built with support for SSE2, AVX, AVX2, FMA3, FMA4 and AVX512f, and the appropriate implementation is selected dynamically at runtime. This should provide a significant performance boost for anyone using pre-compiled portable binaries.
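The general idea of runtime selection can be sketched as follows. This is an assumption-laden illustration, not libsharp2's actual dispatch code: it relies on the GCC/Clang builtins `__builtin_cpu_init` and `__builtin_cpu_supports` on x86, and the names `pick_simd_level`, `dot_scalar` and `select_dot` are hypothetical.

```c
#include <stddef.h>

typedef double (*dot_fn)(const double *, const double *, size_t);

/* Portable fallback kernel. In a real build, each SIMD level would
 * have its own kernel compiled with the matching compiler flags. */
double dot_scalar(const double *a, const double *b, size_t n)
{
    double s = 0.0;
    for (size_t i = 0; i < n; ++i)
        s += a[i] * b[i];
    return s;
}

/* Report the widest instruction set the running CPU supports.
 * Falls back to "generic" on non-x86 targets or other compilers. */
const char *pick_simd_level(void)
{
#if (defined(__GNUC__) || defined(__clang__)) && \
    (defined(__x86_64__) || defined(__i386__))
    __builtin_cpu_init();
    if (__builtin_cpu_supports("avx512f")) return "avx512f";
    if (__builtin_cpu_supports("avx2"))    return "avx2";
    if (__builtin_cpu_supports("avx"))     return "avx";
    if (__builtin_cpu_supports("sse2"))    return "sse2";
#endif
    return "generic";
}

/* Choose a kernel once, at startup. In this sketch every level maps
 * to the scalar kernel, since no vectorized variants are defined. */
dot_fn select_dot(void)
{
    const char *level = pick_simd_level();
    (void)level; /* a real dispatcher would branch on this value */
    return dot_scalar;
}
```

A caller would store the result of `select_dot()` in a function pointer and use it for all subsequent calls, so the CPU check happens only once.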

Compilation

The library uses the standard autotools mechanism for configuration, compilation and installation. See the file COMPILE for configuration hints.