# map2map

Neural network emulators to transform field/map data

* [Installation](#installation)
* [Usage](#usage)
    * [Data](#data)
        * [Data cropping](#data-cropping)
        * [Data normalization](#data-normalization)
    * [Model](#model)
    * [Training](#training)
        * [Files generated](#files-generated)
        * [Tracking](#tracking)
    * [Customization](#customization)

## Installation

Install in editable mode

```bash
pip install -e .
```

## Usage

After installation, the command `m2m.py` is available in your `$PATH`.
Take a look at the examples in `scripts/*.slurm`.
For all command line options, look at `map2map/args.py` or run `m2m.py -h`.

### Data

Store each field in its own npy file.
Structure your data to start with the channel axis followed by the spatial dimensions, e.g. `(2, 64, 64)` for a 2D vector field of size `64^2` and `(1, 32, 32, 32)` for a 3D scalar field of size `32^3`.
Specify the data path with [glob patterns](https://docs.python.org/3/library/glob.html).

During training, pairs of input and target fields are loaded.
Both input and target data can consist of multiple fields, which are then concatenated along the channel axis.

#### Data cropping

If a pair of input and target fields is too large to fit in GPU memory, parts of them can be cropped to form pairs of samples.
Each field can be cropped multiple times, along each dimension.
See `--crop`, `--crop-start`, `--crop-stop`, and `--crop-step`.
The total sample size is the number of input and target pairs multiplied by the number of cropped samples per pair.

#### Data normalization

Input and target (output) data can be normalized by functions defined in `map2map/data/norms/`.
Also see [Customization](#customization).

### Model

Find the models in `map2map/models/`.
Modify the existing models, or write new models elsewhere and then follow [Customization](#customization).
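
Before launching a training run, it can help to estimate the total sample size described under [Data cropping](#data-cropping). The sketch below is only an illustration: `crops_per_pair` and `num_pairs` are hypothetical names, and it assumes that crops are anchored on a regular grid `range(start, stop, step)` along each spatial dimension, with the start defaulting to 0 and the stop to the field size. The exact semantics of `--crop-start`, `--crop-stop`, and `--crop-step` should be checked with `m2m.py -h`.

```python
from math import prod


def crops_per_pair(spatial_shape, crop_step, crop_start=None, crop_stop=None):
    """Count crop anchors on a regular grid, one range per spatial dimension.

    Assumption (not taken from map2map): anchors along each dimension are
    range(start, stop, step), with start defaulting to 0 and stop to the
    field size along that dimension.
    """
    ndim = len(spatial_shape)
    crop_start = crop_start or (0,) * ndim
    crop_stop = crop_stop or tuple(spatial_shape)
    return prod(
        len(range(start, stop, step))
        for start, stop, step in zip(crop_start, crop_stop, crop_step)
    )


# A 3D scalar field stored channel-first, as described under Data.
field_shape = (1, 32, 32, 32)
spatial_shape = field_shape[1:]  # drop the leading channel axis

num_pairs = 10  # hypothetical number of input/target pairs
per_pair = crops_per_pair(spatial_shape, crop_step=(16, 16, 16))
total_samples = num_pairs * per_pair  # pairs times crops per pair
print(per_pair, total_samples)  # prints "8 80"
```

With a step of 16 along each axis of a `32^3` field, this yields 2 anchors per dimension, hence 8 crops per pair and 80 samples for 10 pairs.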
### Training

#### Files generated

* `*.out`: job stdout and stderr
* `state_{i}.pt`: training state after the i-th epoch, including the model state
* `checkpoint.pt`: symlink to the latest state
* `runs/`: directories of tensorboard logs

#### Tracking

Install tensorboard and launch it with

```bash
tensorboard --logdir PATH --samples_per_plugin images=IMAGES --port PORT
```

* Use `.` as `PATH` in the training directory, or use the path to some parent directory for tensorboard to search recursively for multiple jobs.
* Show `IMAGES` images, or all of them by setting it to 0.
* Pick a free `PORT`. For remote jobs, do ssh port forwarding.

### Customization

Models, criteria, optimizers and data normalizations can be customized without modifying map2map.
They can be implemented as callbacks in a user directory, which is then passed by `--callback-at`.
The default locations are searched before the callback directory, so be aware of name collisions.

This approach is good for experimentation.
For example, one can play with a model `Bar` in `path/to/foo.py` by calling `m2m.py` with `--model foo.Bar --callback-at path/to`.
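
To make the `foo.Bar` example and the name collision warning more concrete, here is a minimal sketch of how a dotted name could be resolved against a callback directory. It is not map2map's implementation: `load_callback`, `resolve`, and the `defaults` dictionary are hypothetical stand-ins, and the `Bar` class written to the temporary `foo.py` is an empty placeholder rather than a real model.

```python
import importlib.util
import os
import tempfile


def load_callback(name, callback_at):
    """Load `module.Attr` from `<callback_at>/module.py` (illustrative helper)."""
    module_name, attr = name.rsplit(".", 1)
    path = os.path.join(callback_at, module_name + ".py")
    spec = importlib.util.spec_from_file_location(module_name, path)
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)  # run foo.py to populate the module
    return getattr(module, attr)


def resolve(name, defaults, callback_at):
    """Check the (hypothetical) default locations first, then the callbacks."""
    if name in defaults:
        # A default with the same name shadows the user callback.
        return defaults[name]
    return load_callback(name, callback_at)


# Write a stand-in for path/to/foo.py into a temporary directory.
callback_dir = tempfile.mkdtemp()
with open(os.path.join(callback_dir, "foo.py"), "w") as f:
    f.write("class Bar:\n    pass\n")

# No default named foo.Bar, so the callback directory is used.
Bar = resolve("foo.Bar", defaults={}, callback_at=callback_dir)
print(Bar.__name__)  # prints "Bar"
```

If `defaults` already contained an entry named `foo.Bar`, `resolve` would return it and the user's class would never be loaded, which is why a distinctive module name is the safer choice for callbacks.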