map2map

Neural network emulators to transform field/map data

Installation

Install in editable mode

pip install -e .

Usage

After installation, the command m2m.py is available in your $PATH. See the examples in scripts/*.slurm. For all command-line options, read map2map/args.py or run m2m.py -h.

Data

Store each field in its own .npy file. Arrange the array so that the channel axis comes first, followed by the spatial dimensions, e.g. (2, 64, 64) for a 2D vector field of size 64^2 and (1, 32, 32, 32) for a 3D scalar field of size 32^3. Specify the data paths with glob patterns.
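As a minimal sketch of the layout described above, the following writes two fields with the channel axis first and reads one back. The file names and the temporary directory are assumptions for illustration only; any path matched by your glob pattern would do.

```python
import tempfile
from pathlib import Path

import numpy as np

# Hypothetical output directory and file names, for illustration only.
data_dir = Path(tempfile.mkdtemp())

vector_2d = np.zeros((2, 64, 64), dtype=np.float32)      # 2 channels on a 64^2 grid
scalar_3d = np.zeros((1, 32, 32, 32), dtype=np.float32)  # 1 channel on a 32^3 grid

np.save(data_dir / "vector_2d.npy", vector_2d)
np.save(data_dir / "scalar_3d.npy", scalar_3d)

# mmap_mode="r" maps the file read-only instead of loading it all into memory.
loaded = np.load(data_dir / "scalar_3d.npy", mmap_mode="r")
print(loaded.shape)  # (1, 32, 32, 32)
```

Even a scalar field keeps an explicit channel axis of length 1, so every file has one more dimension than the spatial grid.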

During training, pairs of input and target fields are loaded. Both input and target data can consist of multiple fields, which are then concatenated along the channel axis.
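The concatenation can be pictured with plain NumPy. This is only an illustration of the resulting shape, not map2map's loading code, and the field names are hypothetical.

```python
import numpy as np

# Two hypothetical input fields defined on the same 32^3 grid.
displacement = np.zeros((3, 32, 32, 32), dtype=np.float32)  # 3-channel vector field
density = np.ones((1, 32, 32, 32), dtype=np.float32)        # 1-channel scalar field

# The channel axis is axis 0; the spatial shapes must agree.
combined = np.concatenate([displacement, density], axis=0)
print(combined.shape)  # (4, 32, 32, 32)
```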

Data cropping

If a pair of input and target fields is too large to fit in GPU memory, we can crop parts of them to form pairs of samples. Each field can be cropped multiple times along each dimension. See --crop, --crop-start, --crop-stop, and --crop-step. The total sample size is the number of input and target pairs multiplied by the number of cropped samples per pair.
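To make the sample count concrete, here is a rough sketch that assumes the crop anchors along each dimension behave like Python's range(start, stop, step). That reading of the options, and the helper names crop_anchors and crop, are assumptions for illustration; check map2map/args.py for the actual semantics.

```python
import itertools

import numpy as np


def crop_anchors(start, stop, step):
    """List all anchor (corner) positions, one range per spatial dimension."""
    ranges = [range(a, b, s) for a, b, s in zip(start, stop, step)]
    return list(itertools.product(*ranges))


def crop(field, anchor, size):
    """Cut a block of shape `size` starting at `anchor`, keeping all channels."""
    spatial = tuple(slice(a, a + n) for a, n in zip(anchor, size))
    return field[(slice(None),) + spatial]


field = np.zeros((2, 64, 64), dtype=np.float32)  # one 2D vector field
anchors = crop_anchors(start=(0, 0), stop=(64, 64), step=(32, 32))
crops = [crop(field, anchor, size=(32, 32)) for anchor in anchors]

n_pairs = 10  # hypothetical number of input and target pairs
total_samples = n_pairs * len(anchors)
print(len(anchors), crops[0].shape, total_samples)  # 4 (2, 32, 32) 40
```

With two anchors per dimension in 2D, each pair yields four crops, so ten pairs give forty samples.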

Data normalization

Input and target (output) data can be normalized by functions defined in map2map/data/norms/. Also see Customization.
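A normalization is typically a reversible transform, so that model outputs can be mapped back to physical units. The sketch below is a hypothetical example written with NumPy; the function name log_norm and the undo flag are assumptions, and the signature map2map actually expects should be taken from the existing functions in the norms directory.

```python
import numpy as np


def log_norm(x, undo=False):
    """Hypothetical normalization: log1p forward, expm1 to undo it."""
    return np.expm1(x) if undo else np.log1p(x)


density = np.array([[0.0, 1.0], [10.0, 100.0]])
normalized = log_norm(density)
restored = log_norm(normalized, undo=True)
print(np.allclose(restored, density))  # True
```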

Model

Find the models in map2map/models/. Modify the existing models, or write new models somewhere and then follow Customization.

Training

Files generated

  • *.out: job stdout and stderr
  • state_{i}.pt: training state after the i-th epoch including the model state
  • checkpoint.pt: symlink to the latest state
  • runs/: directories of tensorboard logs
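When inspecting a training directory, it can help to locate the most recent state file by epoch number rather than by name, since a plain string sort would place state_10.pt before state_2.pt. A small sketch using only the standard library follows; the helper latest_state and the temporary demo directory are hypothetical.

```python
import re
import tempfile
from pathlib import Path


def latest_state(train_dir):
    """Return (epoch, path) of the highest-numbered state_{i}.pt, or None."""
    pattern = re.compile(r"state_(\d+)\.pt$")
    found = []
    for path in Path(train_dir).glob("state_*.pt"):
        match = pattern.match(path.name)
        if match:
            found.append((int(match.group(1)), path))
    return max(found, default=None)


# Demo on empty placeholder files in a temporary directory.
demo_dir = Path(tempfile.mkdtemp())
for i in (1, 2, 10):
    (demo_dir / f"state_{i}.pt").touch()

epoch, path = latest_state(demo_dir)
print(epoch, path.name)  # 10 state_10.pt
```

In a real run, checkpoint.pt already points at the latest state, and Path("checkpoint.pt").resolve() shows which file the symlink targets.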

Tracking

Install tensorboard and launch it with

tensorboard --logdir PATH --samples_per_plugin images=IMAGES --port PORT
  • Use . as PATH in the training directory, or use the path to some parent directory for tensorboard to search recursively for multiple jobs.
  • Show IMAGES images, or all of them by setting it to 0.
  • Pick a free PORT. For remote jobs, do ssh port forwarding.
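One way to pick a free PORT is to let the operating system choose one by binding to port 0. This is a generic sketch, not part of map2map. The port could in principle be taken by another process between this check and the tensorboard launch, so treat the result as a suggestion.

```python
import socket


def find_free_port():
    """Ask the OS for an unused TCP port by binding to port 0."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.bind(("127.0.0.1", 0))
        return sock.getsockname()[1]


port = find_free_port()
print(f"tensorboard --logdir . --port {port}")
```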

Customization

Models, criteria, optimizers and data normalizations can be customized without modifying map2map. Implement them as callbacks in a user directory and pass that directory with --callback-at. The default locations are searched before the callback directory, so beware of name collisions.

This approach is good for experimentation. For example, one can play with a model Bar in path/to/foo.py, by calling m2m.py with --model foo.Bar --callback-at path/to.
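To illustrate why name collisions matter, here is a rough sketch of how a dotted name such as foo.Bar could be resolved against a user directory. It is not map2map's implementation; the resolve helper and the appended search path are assumptions that mimic the described behavior of searching default locations first.

```python
import importlib
import sys
import tempfile
from pathlib import Path


def resolve(name, callback_at=None):
    """Resolve 'module.Attr' to an object, optionally searching callback_at."""
    module_name, attr = name.rsplit(".", 1)
    if callback_at is not None and callback_at not in sys.path:
        # Appended last, so modules found in earlier locations take precedence.
        sys.path.append(callback_at)
    importlib.invalidate_caches()
    module = importlib.import_module(module_name)
    return getattr(module, attr)


# Demo: a hypothetical user directory containing foo.py with a class Bar.
user_dir = tempfile.mkdtemp()
Path(user_dir, "foo.py").write_text("class Bar:\n    pass\n")

Bar = resolve("foo.Bar", callback_at=user_dir)
print(Bar.__name__)  # Bar
```

If a module with the same name already existed in an earlier search location, that one would be imported instead of the user's file, which is the collision to watch for.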