Initial import

This commit is contained in:
Guilhem Lavaux 2023-05-29 10:41:03 +02:00
commit 56a50eead3
820 changed files with 192077 additions and 0 deletions

View file

@ -0,0 +1,52 @@
dm_sheet
========
This is a module for ARES/HADES/BORG.
It adds the algorithms **dm_sheet** to compute cosmological fields from
the dark matter phase-space sheet (in particular, density and velocity
fields from tetrahedra formalism).
``borg_forward`` supports the use of dm_sheet when it is available.
Setup
-----
To use this module, clone `the repository <https://bitbucket.org/bayesian_lss_team/dm_sheet/>`_ in $ARES_ROOT/extra/ (where $ARES_ROOT
represents the root source directory of ARES on your computer).
For example, you can do:
.. code:: bash
cd $ARES_SOURCE/extra
git clone git@bitbucket.org:/bayesian_lss_team/dm_sheet.git dm_sheet
and :ref:`rebuild <building>`.
Use
---
To use dm_sheet in ``borg_forward``, use the flag ``--dmsheet``. New
fields are then added to the :ref:`output files<outputs>`.
Contributors
------------
The main authors of this module are:
- Florent Leclercq
- Guilhem Lavaux
To add more features, please contact these people, or submit pull
requests.
Additional contributions from:
- James Prideaux-Ghee
References
----------
- T. Abel, O. Hahn, R. Kaehler (2012), Tracing the Dark Matter Sheet in Phase Space, arXiv:1111.3944
- O. Hahn, R. Angulo, T. Abel (2015), The Properties of Cosmic Velocity Fields, arXiv:1404.2280
- F. Leclercq, J. Jasche, G. Lavaux, B. Wandelt, W. Percival (2017), The phase-space structure of nearby dark matter as constrained by the SDSS, arXiv:1601.00093

View file

@ -0,0 +1,109 @@
hmclet
======
Guilhem has developped a much smaller variant of the Hamiltonian Markov
Chain algorithm to jointly sample a limited set of parameters (like <
100).
This is **HMCLET**: a small extra HMC framework for |a| to allow sampling a bunch of model parameters
together. It provides a self calibration step to estimate the masses for
the HMC.
Setup
-----
The code is available in "hmclet" module . To use it, clone this
repository into extra/hmclet in ARES source tree. You can for example
do:
.. code:: bash
cd $ARES_SOURCE/extra
git clone https://bitbucket.org/bayesian_lss_team/hmclet.git hmclet
Once it is checked out you can move to the build directory and run
``cmake .``, then ``make`` and you will have the new module compiled.
You can run ``libLSS/tests/test_hmclet`` to check that no error is
triggered and verify the content of "test_sample.h5". It must contain a
chain with 2 parameters for which the first one oscillates around 1 with
a variance of 10, and the other oscillates around 4 with a variance of
2.
Use
---
The Little HMC (HMClet, like Applet) framework consists in two classes
in the namespace ``LibLSS::HMCLet``:
- JointPosterior, which is the one acting like a parent to your class
describing the log-posterior,
- SimpleSampler, which is using an instance of JointPosterior to
generate samples using the HMC algorithm.
There is a demonstration (and test case) available in
libLSS/tests/test_hmclet.cpp, please have a look at it.
To use SingleSampler you have to make a new class derivative of
JointPosterior and implement three functions:
- ``getNumberOfParameters()`` which returns an integer corresponding to
the number of parameters supported by your posterior
- ``evaluate(parameters)`` which returns the opposite of the
log-posterior (i.e. like chi2/2)
- ``adjointGradient(parameters, adjoint_gradient)`` which fills the
adjoint gradient vector corresponding to the given parameters.
An example is as follow:
.. code:: cpp
class MyPosterior: virtual public JointPosterior {
public:
/* Bla bla for constructor and destructor */
virtual size_t getNumberOfParameters() const {
return 1;
}
virtual double evaluate(VectorType const& params) {
return 0.5 * square(params[0]-1)/10.;
}
virtual void adjointGradient(VectorType const& params, VectorType& params_gradient) {
params_gradient[0] = (params[0]-1)/10.;
}
};
The above posterior will represent a Gaussian distribution centered on
one, with a variance of 10. It depends on a single parameter.
The sampling would occur like this:
.. code:: cpp
auto posterior = std::make_shared<MyPosterior>();
SimpleSampler sampler(posterior);
/* Calibrate the mass matrix.
* comm: MPI communication
* rgen: Random number generator
* steps: number of steps to attempt for calibration
* init_params: initial parameters to start calibration
* init_step: typical step size to start with
*/
sampler.calibrate(comm, rgen, steps, init_params, init_step);
/* Generate a sample with HMC
* comm: MPI communication
* rgen: Random number generator
* params: current parameter state
*/
sampler.newSample(comm, rgen, init_params);
Contributors
------------
- Guilhem Lavaux
- Jens Jasche
You can submit pull requests to the BLSS team admin.

File diff suppressed because one or more lines are too long

View file

@ -0,0 +1,591 @@
Python
======
This pages presents the features of the ARES/BORG Python module
Installation
------------
``bash get-aquila-modules.sh --clone`` automatically retrieves the
module.
Use the ``--python`` flag in ``build.sh`` (see :ref:`building <building>`). The
python package installation is automatic if you run ``make install``. At the end
of the make phase, a python module will be installed in the user site-package
directory and made available to python VM. If you also require to run with
python defined likelihood (see :ref:`how to write a likelihood in python
<building_python_likelihood_script>`) with hades then you also need to append
``--hades-python`` while executing ``build.sh``. This requirement will probably
go away later.
.. note::
If you compile with MPI support the Python binding interface
will look for the MPI4PY package. If it is not found, it will just
proceed as usual. However, if it is found, the MPI4PY must have been
compiled with the *same* MPI framework as ARES/BORG. Not doing so will
very likely result in a segmentation fault when importing borg. A
succesfull import will look like the following:
.. code:: python3
>>> import borg
Initializing console.
[INFO ] libLSS version v2.0.0alpha-47-g7d560cc built-in modules ares_fg;borg;dm_sheet;hades;hmclet;python
[INFO S ] Registered forward models:
[INFO S ] - 2LPT_CIC
[INFO S ] - 2LPT_CIC_OPENMP
[INFO S ] - 2LPT_DOUBLE
[INFO S ] - 2LPT_NGP
[INFO S ] - Downgrade
[INFO S ] - EnforceMass
[INFO S ] - HADES_LOG
[INFO S ] - HADES_PT
[INFO S ] - Haar
[INFO S ] - LPT_CIC
[INFO S ] - LPT_CIC_OPENMP
[INFO S ] - LPT_DOUBLE
[INFO S ] - LPT_NGP
[INFO S ] - PATCH_MODEL
[INFO S ] - PM_CIC
[INFO S ] - PM_CIC_OPENMP
[INFO S ] - PRIMORDIAL
[INFO S ] - PRIMORDIAL_FNL
[INFO S ] - Softplus
[INFO S ] - TRANSFER_EHU
[INFO S ] - Transfer
[INFO S ] - Upgrade
[INFO S ] - bias::BrokenPowerLaw
[INFO S ] - bias::DoubleBrokenPowerLaw
[INFO S ] - bias::EFT
[INFO S ] - bias::EFT_Thresh
[INFO S ] - bias::Linear
[INFO S ] - bias::ManyPower_1^1
[INFO S ] - bias::ManyPower_1^2
[INFO S ] - bias::ManyPower_1^4
[INFO S ] - bias::ManyPower_2^2
[INFO S ] - bias::Noop
[INFO S ] - bias::PowerLaw
[INFO ] Found MPI4PY.
[INFO ] CPU features: MMX [!AVX] [!AVX2] SSE SSE2 [!SSE3] [!SSE4.1] [!SSE4.2]
>>>
As you can see there is a line "Found MPI4PY".
Usage
-----
First step:
.. code:: python
import borg
# This retrieve the console management object
console = borg.console()
# This prints at the STD level
console.print_std("Hello!")
# Reduce verbosity
console.setVerboseLevel(2)
.. _building_your_first_chain:
Building your first chain
-------------------------
The BORG python pipeline closely follow the BORGForwardModel v2 API.
This means that the input is assumed to be Gaussian random number with
unit variance in Fourier space. Fortunately the generation of such
numbers is easy:
.. code:: python3
import numpy as np
# Define a physical box (that is optional for this step, but it will be useful later
box = borg.forward.BoxModel()
box.L = (200,200,200)
box.N = (64,64,64)
# Generate gaussian random numbers, Fourier transform them, and rescale to ensure unit-variance
ic = np.fft.rfftn(np.random.randn(*box.N))/box.N[0]**(1.5)
In the above code snippet we have also defined a BORG box, which is at
the moment limited to 3d. ``box.L`` is the physical size (in Mpc/h) of
the box in each direction, while ``box.N`` is the grid size. In the
above you see that the Fourier transformed density has been rescaled by
:math:`1/\sqrt{N^3}`. This comes because of simple linear algebraic
properties, and the requirement of unit variance in the Fourier
representation.
Now we need to create a new chain object:
.. code:: python3
chain = borg.forward.ChainForwardModel(box)
chain.addModel(borg.forward.models.HermiticEnforcer(box))
We have immediately added an element that enforces that the elements of
the input Fourier density field to be self-complex conjugated. This is
not strictly required here as ``ic`` was generated by ``np.fft.rfftn``.
Our first real element of the chain is the injection of primordial
gravity fluctuation:
.. code:: python3
chain.addModel(borg.forward.models.Primordial(box, 0.1))
This multiplies in Fourier space the input density with a function:
:math:`A(k) \propto -k^{n_S/2-2}` The exact constant of proportionality
depends on :math:`\sigma_8` (or :math:`A_S`), the volume and the Hubble
constant. Note the ``0.1`` which indicates the scale factor at which the
potential is seeded in the chain. The next elements depend on that
number.
The next element is to add a physical transfer function to produce
density fluctuations out of this gravitational potential:
.. code:: python3
chain.addModel(borg.forward.models.EisensteinHu(box))
This is a simple Einsenstein & Hu power spectrum, which does not change
the scale factor of the universe.
Now we need to add a real gravity solver. One simple solver is provided
by "BorgLpt" (BORG 1-Lagrangian Perturbation Theory, also known as
Zel'dovich approximation).
.. code:: python3
lpt = borg.forward.models.BorgLpt(box=box, box_out=box, ai=0.1, af=1.0, supersampling=4)
chain.addModel(lpt)
(**Question from Andrija**: What does the supersampling param control?
The ai and af look intuitive enough, for initial scale factor and final
one essentially controlling the time, but supersampling I don't
understand. Also doing help(borg.forward.models.BorgLpt) didn't help me
much in understanding)
In the above case we keep the object ``lpt`` in the current scope to be
able to access more internal state later.
We can now setup the cosmology:
.. code:: python3
cosmo_par = borg.cosmo.CosmologicalParameters()
cosmo_par.default()
print(repr(cosmo_par))
chain.setCosmoParams(cosmo_par)
We have used some sane defaults for the cosmology in the above. The
values of the parameters are printed using the print statement. All the
elements of the chain are being updated with the last statement. They
try to do this "lazily", i.e. if the cosmology has not changed nothing
will happen (as updating the internal cached state may be very costly).
The model is run with ``chain.forwardModel_v2(ic)``, which goes through
the entire chain. The final density field is not yet produced. To do
this we need to request it explicitly:
.. code:: python3
rho = np.empty(chain.getOutputBoxModel().N)
chain.getDensityFinal(rho)
``rho`` holds now density contrast of the simulation. In IPython, one
can show check a slice using:
.. code:: python3
from matplotlib import pyplot as plt
plt.imshow(rho[:,:,chain.getOutputBoxModel().N[2]//2])
plt.show()
Computing the adjoint gradient
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
The evaluation of the adjoint gradient follows the same pattern as for
the forward evaluation. Instead of the pair ``forwardModel_v2`` and
``getDensityFinal``, one must use ``adjointModel_v2`` and
``getAdjointModel``. However keep in mind the shapes of the arrays are
reversed: ``adjointModel_v2`` requires an array according to the output
of the forward model. Thus we have:
.. code:: python3
dlogL_drho = np.empty(chain.getOutputBoxModel().N)
# Here fill up dlogL_drho from the gradient of the likelihood
chain.adjointModel_v2(dlogL_drho)
ic = np.empty(chain.getBoxModel().N)
chain.getAdjointModel(ic)
Note also that we have requested the initial conditions in real
representation (and not Fourier). A Fourier representation may be
requested by providing an adequate sized complex array.
Computing the velocity field
~~~~~~~~~~~~~~~~~~~~~~~~~~~~
BORG comes pre-bundled with velocity field estimator (along with their
adjoint gradient of course). A very simple estimator is provided by the
CIC density estimator. It requires a particle based simulator to
estimate the velocity field from. Such particle based simulators are for
example BorgLpt, Borg2Lpt or BorgPM. If the types are not compatible, an
exception will be thrown.
The usage is simple, here is an example:
.. code:: python3
vmodel = borg.forward.velocity.CICModel(box, lpt)
out_v = np.empty((3,)+box.N)
vmodel.getVelocityField(out_v)
The first statement creates the velocity field estimator, with the
requested box to be produced and the particle based forward model
``lpt`` (same variable as in the :ref:`section "Building your first chain" <building_your_first_chain>`). The second statement
allocates the required memory. The last statement triggers the
computation. The above statements shall be run after executing
``forwardModel_v2`` on the ``chain`` object.
One can then show a slice (here of the x-component), and the check the
compatibility with the density field:
.. code:: python3
plt.imshow(out_v[0,:,:,chain.getOutputBoxModel().N[2]//2])
plt.show()
Computing some bias models directly
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
PyBORG has a submodule called "bias" which provides a direct route to
some of the bundled bias models (in C++ those are the generic bias
models). Not all models are linked in though. The usage is relatively
straightforward. We will use EFTBiasDefault as an example:
.. code:: python3
import numpy as np
import borg
boxm = borg.forward.BoxModel()
model = borg.forward.models.HadesLinear(boxm, 0.1, 1.0)
bias_model = borg.bias.EFTBiasDefault(0.1)
density = np.random.normal(size=boxm.N)
biased_density = np.zeros(boxm.N)
params = np.ones(7)
bias_model.compute(model, 1.0, params, density, biased_density)
The example starts by loading the ``borg`` module. Then we just
construct a forward model element for the example using ``HadesLinear``.
In your code that should be a reasonable element that you used to
produce the matter density field. The bias model may try to discuss
directly with that element so it is a good practice to really provide
meaningful elements. Then we construct a bias model object
``EFTBiasDefault``. This one has a mandatory argument to specify the
``Lambda`` parameter in that specific model, which we set to
:math:`0.1h \mathrm{Mpc}^{-1}` here. The next steps are just
initialization of the field used for ``bias_model.compute``. As can be
directly inferred from the call the following arguments are required:
- a borg forward model (``model``)
- the value of nmean (though it could be ignored depending on the
specific bias model)
- a 1d numpy array of float64 for the parameters of the model
- the 3d density contrast (``density``)
- the output 3d biased density (``biased_density``)
Running with MPI
----------------
Using MPI requires some care that is not completely handled
automatically.
One may initialize the python with MPI like this:
.. code:: python3
import numpy as np
import borg
from mpi4py import MPI
comm = MPI.COMM_WORLD
rank = comm.Get_rank()
size = comm.Get_size()
In rank and size you will now have the rank of the current process in
the MPI communicator, and size will hold the total size. Then a typical
initialization chain of forward models may be constructed as indicated
:ref:`there <building_your_first_chain>`. Assuming that chain is such an
object one may query the expected slabs with ``getMPISlice()``
(for the input) and ``getOutputMPISlice()`` (for the output):
.. code:: python3
startN0,localN0,in_N1,in_N2 = chain.getMPISlice()
out_startN0,out_localN0,out_N1,out_N2 = chain.getOutputMPISlice()
These may be used like this:
.. code:: python3
x = np.zeros((localN0,in_N1,in_N2))
if localN0 > 0:
x[:,:,:] = ref_data[startN0:(startN0+localN0),:,:]
with ``ref_data`` some array that covers the entire box. As you can see,
the ``x`` array requires only the part between startN0 and
startN0+localN0 of that array. In practice that array (``ref_data``) may
not have to exist in memory.
Then ``x`` may be directly provided to ``chain.forwardModel_v2`` as a
first argument. The output density field follow the same rule as the
input density field.
Writing a new forward model
---------------------------
The interface of the forward model in python closely follow the one in
C++. The basic skeleton is given by the following lines of code:
.. code:: python3
import jax
class MyModel(borg.forward.BaseForwardModel):
# Constructor
def __init__(self, box):
super().__init__(box, box)
# IO "preferences"
def getPreferredInput(self):
return borg.forward.PREFERRED_REAL
def getPreferredOutput(self):
return borg.forward.PREFERRED_REAL
# Forward part
def forwardModel_v2(self, input_array):
self.save = jax.numpy.array(input_array)
def getDensityFinal(self, output_array):
output_array[:] = self.save**2
# Adjoint part
def adjointModel_v2(self, input_ag):
self.ag = input_ag
def getAdjointModel(self, output_ag):
output_ag[:] = 2 * self.ag * self.save
There are four main group in the function that needs be implemented:
- the constructor. It is crucial that the constructor of the parent is
explicitly called. Otherwise the interface will not work. The parent
constructor takes two argument: the input box (of type
``borg.forward.BoxModel``) and the output box (same type).
- the function providing the "preferred IO" for the forward and adjoint
functions. In practice the preferrence is enforced for python. This
means that the value indicated here will change the kind of arrays
that are provided to the forward and adjoint part. At the moment two
type of IO are possible:
- PREFERRED_REAL: the model wants a 3d real space representation as
an argument
- PREFERRED_FOURIER: the model wants a 3d fourier space
representation as an argument
- then the forward evaluation part itself has to be implemented in two
pieces: ``forwardModel_v2`` and ``getDensityFinal`` (it is optional
depending on what is put after that model). It is expected that
``forwardModel_v2`` executes the main part of the computation but it
is not fully required.
- finally the computation of the adjoint gradient follows the same
pattern as the forward computation. The difference is that the types
and shapes of arrays are reversed. ``input_ag`` has a shape/type
corresponding to the **output** and ``output_ag`` to the **input**.
Finally, as shown above, the input/output array are using a numpy
interface. They can thus be used in JAX/Tensorflow/whatever. In the
example code above the input array is saved in a jax array and evaluated
later. This is legit, though bear in mind that means there will be
memory that will not be freed while you retain that reference.
.. _building_python_likelihood_script:
Build a python likelihood script
--------------------------------
Ini file
~~~~~~~~
.. code:: text
[python]
likelihood_path=test_likelihood.py
bias_sampler_type=slice
The hades_python initializers
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
A typical python likelihood requires three initialization function. They
must be registered using the helper decorators
borg.registerGravityBuilder (for the forward model),
borg.registerLikelihoodBuilder (for the bias+likelihood part),
borg.registerSamplerBuilder (for extra sampling strategies).
An example of their use is the following piece of code:
.. code:: python3
import borg
@borg.registerGravityBuilder
def build_gravity_model(state, box):
global model
chain = borg.forward.ChainForwardModel(box)
chain.addModel(borg.forward.models.HermiticEnforcer(box))
chain.addModel(borg.forward.models.Primordial(box, 1.0))
chain.addModel(borg.forward.models.EisensteinHu(box))
model = chain
return chain
@borg.registerLikelihoodBuilder
def build_likelihood(state, info):
boxm = model.getBoxModel()
return MyLikelihood(model, boxm.N, boxm.L)
@borg.registerSamplerBuilder
def build_sampler(state, info):
return []
The build_gravity_model function returns a BORGForwardModel object, and
take a MarkovState and a BoxModel as parameters. The build_likelihood
function must return a Likelihood3d object (check
help(borg.likelihood.Likelihood3d)). Finally build_sampler must return a
list of sampler object.
The forward model elements can be either the C++ or Python object and
both work transparently. Likelihoods may also be written in pure python
though MPI is still untested at this time (August 2020).
Writing a likelihood
~~~~~~~~~~~~~~~~~~~~
In the previous section we have seen how to build the objects required
by hades_python to analyze data. We have not approached how to write a
likelihood in python. A lot of likelihood and bias are already available
from the C++ side, for example ``borg.likelihood.GaussianPassthrough``,
``borg.likelihood.GaussianLinear`` or
``borg.likelihood.PoissonPowerLaw``. To create new ones easily in
python, one has to write a class inheriting from
``borg.likelihood.BaseLikelihood`` and implement a number of functions.
An example of a simple gaussian likelihood is shown herebelow:
.. code:: python3
import borg
cons = borg.console()
myprint = lambda x: cons.print_std(x) if type(x) == str else cons.print_std(
repr(x))
class MyLikelihood(borg.likelihood.BaseLikelihood):
def __init__(self, fwd, N, L):
myprint(f" Init {N}, {L}")
super().__init__(fwd, N, L)
def initializeLikelihood(self, state):
myprint("Init likelihood")
self.data = state['galaxy_data_0']
state.newArray3d("my_density_field", True, self.data.shape[0],
self.data.shape[1], self.data.shape[2])
def updateMetaParameters(self, state):
cpar = state['cosmology']
myprint(f"Cosmology is {cpar}")
self.getForwardModel().setCosmoParams(cpar)
def generateMockData(self, s_hat, state):
fwd = self.getForwardModel()
output = np.zeros(fwd.getOutputBoxModel().N)
fwd.forwardModel_v2(s_hat)
fwd.getDensityFinal(output)
state['galaxy_data_0'][:] = output + np.random.normal(
size=output.shape) * sigma_noise
state['my_density_field'][:] = output
like = ((state['galaxy_data_0'][:] - output)**2).sum() / sigma_noise**2
myprint(
f"Initial log_likelihood: {like}, var(s_hat) = {np.var(s_hat)}")
def logLikelihoodComplex(self, s_hat, gradientIsNext):
fwd = self.getForwardModel()
output = np.zeros(fwd.getBoxModel().N)
fwd.forwardModel_v2(s_hat)
fwd.getDensityFinal(output)
L = 0.5 * ((output - self.data)**2).sum() / sigma_noise**2
myprint(f"var(s_hat): {np.var(s_hat)}, Call to logLike: {L}")
return L
def gradientLikelihoodComplex(self, s_hat):
fwd = self.getForwardModel()
output = np.zeros(fwd.getOutputBoxModel().N)
fwd.forwardModel_v2(s_hat)
fwd.getDensityFinal(output)
mygradient = (output - self.data) / sigma_noise**2
fwd.adjointModel_v2(mygradient)
mygrad_hat = np.zeros(s_hat.shape, dtype=np.complex128)
fwd.getAdjointModel(mygrad_hat)
return mygrad_hat
The function ``myprint`` is an helper to create nice output that streams
correctly with the rest of the C++ code. It is not mandatory but
strongly recommended to use the borg.console() object as it will
seemlessly integrate with other BORG tools.
We will now look at each function one after the other:
- ``__init__`` is the constructor. It is crucial that the base
constructor is called in the constructor of the new class: it will
not be done implicitly by the python virtual machine. The base
constructor takes a ``BORGForwardModel`` object, and the grid
specifications ``N`` and ``L`` as tuples.
- ``initializeLikelihood`` is called at the initialization of the chain
and before restoration. If you want to store additional fields in the
mcmc, you should allocate them at that moment in the state object. In
the above example, a new 3d array is allocated to store the density
field after the forward model evaluation.
Note that the forward model has to be evaluated in the log likelihood
and its gradient. Though it is in principle required to implement
logLikelihood and gradientLikelihood (the real counterpart of the
complex functions hereabove), in practice they are not used for the run.
More python jupyter tutorials
-----------------------------
.. toctree::
:maxdepth: 1
extras/python-jupyter/PM-tCOLA
* A notebook to showcase tCOLA and its convergence by considering at :math:`P(k)` is here__.
__ extras/python-jupyter/PM-tCOLA.ipynb

View file

@ -0,0 +1,4 @@
virbius
=======
*To be written...*