CPU time is the most pressing issue for this code. One has to think seriously about a
method to divide the sky so that independent regions can be computed simultaneously on
separate CPU/memory boxes, on the grid or on other systems.
If this method is not possible (because of correlations or other constraints), one has to consider
using parallel computing machines.

Test with pp

With pp we can parallelize loops in Python.
We parallelize at the level of the detectors, that is to say we run the computation for the 16 detectors of the focal plane at the same time.
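
The detector-level split described above can be sketched as follows. This is only a minimal illustration and not the actual pp-based code: it uses the standard-library concurrent.futures module as a stand-in for pp, and simulate_detector and run_all_detectors are hypothetical names, with a dummy workload in place of the real per-detector computation.

```python
from concurrent.futures import ProcessPoolExecutor

N_DETECTORS = 16  # number of detectors in the focal plane (from the text)


def simulate_detector(det_id):
    """Hypothetical stand-in for the real per-detector computation.

    Only dummy arithmetic is done here so that the sketch is runnable.
    """
    total = 0
    for i in range(100_000):
        total += (i * (det_id + 1)) % 7
    return det_id, total


def run_all_detectors(n_detectors=N_DETECTORS, max_workers=None,
                      executor_cls=ProcessPoolExecutor):
    """Run simulate_detector once per detector in a pool of workers."""
    with executor_cls(max_workers=max_workers) as pool:
        # map() preserves input order: one (det_id, value) pair per detector.
        results = list(pool.map(simulate_detector, range(n_detectors)))
    return dict(results)


if __name__ == "__main__":
    per_detector = run_all_detectors()
    print(sorted(per_detector))  # detector ids that were processed
```

Each detector is an independent task, so no communication between workers is needed. Each worker process still holds its own copy of whatever data the per-detector function loads.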

Using a small simulation (~100 sources), the gain is only a factor of 5 when splitting it across 16 CPUs.
This could be explained by the fact that the code spends most of its time doing I/O rather than pure CPU work.
Using bigger simulations caused some memory issues...
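
The factor of 5 on 16 CPUs can be put in perspective with Amdahl's law, S = 1 / ((1 - p) + p / n), where p is the fraction of the run time that parallelizes and n is the number of CPUs. The sketch below inverts that formula. It assumes that this simple model applies here, with the non-parallel part standing for I/O and other serialized work; this is an assumption, not something that was measured. The function names are illustrative.

```python
def amdahl_speedup(p, n_workers):
    """Amdahl's law: speedup for parallel fraction p on n_workers CPUs."""
    return 1.0 / ((1.0 - p) + p / n_workers)


def parallel_fraction(speedup, n_workers):
    """Invert Amdahl's law to get p from an observed speedup."""
    return (1.0 - 1.0 / speedup) / (1.0 - 1.0 / n_workers)


if __name__ == "__main__":
    p = parallel_fraction(5.0, 16)   # observed: factor of 5 on 16 CPUs
    limit = 1.0 / (1.0 - p)          # speedup for an unlimited number of CPUs
    print(f"parallel fraction p = {p:.3f}")
    print(f"asymptotic speedup  = {limit:.2f}")
```

Under this model p is about 0.85, so roughly 15% of the run time does not parallelize, and adding more CPUs could not push the gain beyond a factor of about 6.8 for a simulation of this size.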

Test with MPI

TODO: Parallelize in the C code at the level of the sources.