Tutorial: Cluster Finding in Simulated Simons Observatory Maps
The examples/SOSims directory contains scripts for working with the simulated Simons Observatory maps made by the Map-Based Simulation Pipeline Working Group. More tools for handling the simulated maps may be available elsewhere, but here you should be able to find enough to get you going with running Nemo on them.
Making multi-component simulated maps
The map-based simulations are provided in HEALPix format, with one map for each component (CMB, CIB, tSZ, noise, etc.) at each frequency. The combineComponentMaps.py script can be used to add these together. To use it, you will first have to download the needed simulation maps and edit healpixSimsDir in the script to point to the directory where they can be found. You can then run the script with:
python combineComponentMaps.py
As provided, this script will make combined maps at four frequencies (93, 145, 225, 280 GHz) featuring CMB, CIB, tSZ, and noise.
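Conceptually, combining components is a per-pixel sum of maps that share the same pixelization and units. The sketch below illustrates that idea on short Python lists standing in for HEALPix maps. The function name combine_components and the toy pixel values are hypothetical, and this is not the code used in combineComponentMaps.py.

```python
def combine_components(component_maps):
    """Return the per-pixel sum of several maps with identical pixelization."""
    lengths = {len(m) for m in component_maps.values()}
    if len(lengths) != 1:
        raise ValueError("All component maps must have the same number of pixels")
    # zip(*maps) walks through the maps pixel by pixel
    return [sum(pixels) for pixels in zip(*component_maps.values())]

# Toy stand-ins for the component maps at a single frequency (arbitrary values).
components_145 = {
    "cmb": [1.0, 2.0, -1.0, 0.5],
    "cib": [0.5, 0.5, 0.5, 0.5],
    "tsz": [-2.0, 0.0, 0.0, -0.5],
    "noise": [0.25, -0.25, 0.5, 0.0],
}
combined_145 = combine_components(components_145)
print(combined_145)
```

The real maps would be read from disk at each frequency, but the same check applies: all components must be on the same pixel grid before they are summed.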
Making CAR projection maps
Nemo runs on FITS images that have a valid World Coordinate System (WCS) in the header. For Simons Observatory, these use the plate carrée (CAR) projection. However, the map-based simulations are currently provided in HEALPix format. The healpix2CAR.py script (a modified version of a script by Mathew Madhavacheril taken from the TILe-C package) can be used to re-project these to CAR. See the HEALPIX_TO_CAR.sh script for examples of its use.
As one of its inputs, the healpix2CAR.py script takes a plain-text file containing the desired output image header. This defines both the sky area and the pixel scale. A few examples are provided:
template_small.txt: Corresponds to a subsection of the ACT D56 field.
template_E-D56.txt: Contains the full ACTPol E-D56 field footprint.
template_AdvACT.txt: The pixelization used for AdvACT maps.
To use Nemo on very large maps (as in template_AdvACT.txt), you will need to set up the configuration file to break the map into tiles (see below).
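To see how a header template fixes both the pixel scale and the sky area, the sketch below parses FITS-style "KEY = value / comment" lines and derives the pixel size and approximate map extent. It assumes the template uses the standard NAXIS1/NAXIS2 and CDELT1/CDELT2 keywords with values in degrees. The example header text and the function names are hypothetical and are not taken from the supplied template files.

```python
def parse_header_template(text):
    """Parse 'KEY = value / comment' lines into a dict (numbers become floats)."""
    header = {}
    for line in text.splitlines():
        if "=" not in line:
            continue
        key, rest = line.split("=", 1)
        # Drop any trailing comment, then surrounding quotes and whitespace
        value = rest.split("/", 1)[0].strip().strip("'").strip()
        try:
            header[key.strip()] = float(value)
        except ValueError:
            header[key.strip()] = value
    return header

def footprint_summary(header):
    """Return (pixel scale in arcmin, width in degrees, height in degrees)."""
    pix_arcmin = abs(header["CDELT1"]) * 60.0
    width_deg = header["NAXIS1"] * abs(header["CDELT1"])
    height_deg = header["NAXIS2"] * abs(header["CDELT2"])
    return pix_arcmin, width_deg, height_deg

# Hypothetical header for illustration only (0.5 arcmin pixels, 10 x 5 degrees).
example = """\
NAXIS1  =                 1200
NAXIS2  =                  600
CTYPE1  = 'RA---CAR'
CTYPE2  = 'DEC--CAR'
CDELT1  =   -0.0083333333333 / degrees per pixel
CDELT2  =    0.0083333333333 / degrees per pixel
"""
pix_arcmin, width_deg, height_deg = footprint_summary(parse_header_template(example))
print(pix_arcmin, width_deg, height_deg)
```

Headers that describe the pixel scale with a CD or PC matrix instead of CDELT would need a different calculation.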
If you look at the HEALPIX_TO_CAR.sh script, you will see that it uses an option to add a small amount of additional white noise to the output CAR maps. This is currently used to avoid ringing in the CAR maps, which arises because the HEALPix maps are lower resolution than the output CAR maps (when NSIDE = 8192 maps are available, this may not be a problem). With the additional white noise, the matched-filtered maps made by Nemo from these simulations look okay when viewed with, e.g., DS9.
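The resolution mismatch can be checked with the standard HEALPix relation: a map has 12 x NSIDE^2 equal-area pixels, so the typical pixel size is the square root of 4 pi / (12 NSIDE^2) steradians. The sketch below evaluates this for NSIDE = 8192 (mentioned above) and, purely for comparison, NSIDE = 4096. The NSIDE of the currently available maps is not assumed here, and the function name is hypothetical.

```python
import math

def healpix_pixel_size_arcmin(nside):
    """Approximate HEALPix pixel size: square root of the (equal) pixel area."""
    npix = 12 * nside**2
    pixel_area_sr = 4.0 * math.pi / npix
    return math.degrees(math.sqrt(pixel_area_sr)) * 60.0

for nside in (4096, 8192):
    print(nside, round(healpix_pixel_size_arcmin(nside), 3))
```

Comparing these numbers with the pixel scale in the chosen header template shows whether the CAR map oversamples the HEALPix input.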
Making beam files
Nemo understands beams given in a simple plain-text format, as used by ACT. For the map-based simulations, Gaussian beams are used with FWHM set according to the values given in Table 1 of the Simons Observatory: Science Goals and Forecasts paper. To generate the necessary beam files, run:
MAKE_BEAMS.sh
which in turn calls the makeGaussianBeam.py script.
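For reference, a Gaussian beam with a given FWHM has sigma = FWHM / sqrt(8 ln 2) and a peak-normalized profile B(theta) = exp(-theta^2 / (2 sigma^2)). The sketch below tabulates such a profile. It is not the code in makeGaussianBeam.py: the function names are hypothetical, the two-column layout (angle in degrees, response) is an assumption about the beam file format, and the FWHM value is only illustrative.

```python
import math

def gaussian_beam_profile(fwhm_arcmin, theta_max_arcmin=10.0, step_arcmin=0.05):
    """Return (theta_deg, response) pairs for a peak-normalized Gaussian beam."""
    sigma = fwhm_arcmin / math.sqrt(8.0 * math.log(2.0))
    n_steps = int(round(theta_max_arcmin / step_arcmin))
    profile = []
    for i in range(n_steps + 1):
        theta = i * step_arcmin
        response = math.exp(-0.5 * (theta / sigma) ** 2)
        profile.append((theta / 60.0, response))  # store the angle in degrees
    return profile

def format_beam_lines(profile):
    """Format the profile as plain-text rows (assumed layout: degrees, response)."""
    return ["%.6f %.6e" % (theta_deg, response) for theta_deg, response in profile]

profile = gaussian_beam_profile(2.0)  # illustrative FWHM in arcmin
print("\n".join(format_beam_lines(profile)[:3]))
```

By construction the response falls to one half at an angle of FWHM / 2 from the beam centre, which is a quick sanity check on any generated file.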
Running Nemo on small maps
After running the above scripts, you should now have a directory that contains CAR-projected maps (e.g., TOnly_la093_CAR.fits, TOnly_la_093_small_CAR.fits, etc.) and beam files (e.g., beam_gaussian_la093.txt). Various configuration files are provided that you can use to produce matched-filtered maps and catalogs using Nemo. The one that currently seems to work best uses only three frequencies (93, 145, 225 GHz). You can run this with:
nemo MFMF_SOSim_3freq_small.yml
After this finishes, you will find matched-filtered maps under MFMF_SOSim_3freq_small/filteredMaps/, while MFMF_SOSim_3freq_small/MFMF_SOSim_3freq_small_optimalCatalog.fits contains the candidate cluster catalog. Note that this config file filters the maps at only one scale. You can filter at multiple scales by uncommenting the entries under mapFilters in MFMF_SOSim_3freq_small.yml.
A configuration file is also supplied for running Nemo on the nominal full SO survey footprint, by breaking the map into tiles (see below).
Extracting the survey mass limit
If the calcSelFn parameter is set to True in the Nemo configuration file, or if nemo is run using the -S switch, the main nemo script will calculate and output estimates of the 90% mass completeness threshold at some user-set signal-to-noise level (e.g., S/N > 5). You will find various plots related to the survey completeness as a function of redshift under MFMF_SOSim_3freq_small/diagnostics/, including a mass-limit map. These estimates are subject to the assumed scaling relation parameters defined in the configuration file.
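To make the idea of a "90% mass completeness threshold" concrete, the sketch below finds the mass at which a tabulated completeness curve first reaches 0.9, using linear interpolation in log mass. This is only an illustration of the definition, not how Nemo computes its selection function. The function name is hypothetical and the completeness values are invented toy numbers.

```python
def mass_at_completeness(log_masses, completeness, target=0.9):
    """Linearly interpolate the log-mass at which completeness reaches target.

    Assumes completeness increases monotonically with mass.
    """
    points = list(zip(log_masses, completeness))
    for (m0, c0), (m1, c1) in zip(points, points[1:]):
        if c0 < target <= c1:
            return m0 + (target - c0) * (m1 - m0) / (c1 - c0)
    raise ValueError("Completeness never crosses the target value")

# Toy completeness curve at a single redshift (values invented for illustration).
toy_log_m = [14.0, 14.2, 14.4, 14.6, 14.8]
toy_completeness = [0.10, 0.40, 0.80, 0.95, 1.00]
log_m90 = mass_at_completeness(toy_log_m, toy_completeness)
print(round(log_m90, 3))
```

Repeating this at each redshift would give a mass limit as a function of redshift, which is the kind of quantity shown in the diagnostic plots.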
If necessary, you can re-run this part using the nemoSelFn script:
nemoSelFn MFMF_SOSim_3freq_small.yml
The limits within the footprints of other surveys that intersect with the filtered maps can be obtained by supplying survey mask files in the selFnFootprints parameter in the configuration file (these are commented out in MFMF_SOSim_3freq_small.yml).
Running Nemo on large maps
To do this, you will first need to output large-area CAR maps using (for example) the healpix2CAR.py script with template_AdvACT.txt (just running HEALPIX_TO_CAR.sh will do this). You will then need to use a config file that tells Nemo how to break the map up into tiles, such as the supplied MFMF_SOSim_3freq_tiles.yml. The parameters that control the tiling can be found at the end of this file. In this case, the automatic tiling algorithm will attempt to make tiles that cover 10 x 5 degrees on the sky, with a 1 degree overlap between the tiles.
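The geometry of such a tiling can be sketched as follows: a rectangular region is cut into a grid of 10 x 5 degree cells, and each cell is padded by half the overlap on every side so that neighbouring tiles share 1 degree. This is a simplified illustration, not Nemo's tiling algorithm (which works from the survey mask). It ignores RA wrap-around and the cos(dec) factor, and the function name and tile naming scheme are hypothetical.

```python
import math

def make_tiles(ra_min, ra_max, dec_min, dec_max,
               width=10.0, height=5.0, overlap=1.0):
    """Split a rectangular region (degrees) into a grid of padded tiles."""
    n_ra = max(1, math.ceil((ra_max - ra_min) / width))
    n_dec = max(1, math.ceil((dec_max - dec_min) / height))
    pad = overlap / 2.0  # each tile grows by half the overlap on every side
    tiles = []
    for j in range(n_dec):
        for i in range(n_ra):
            ra0 = ra_min + i * width
            ra1 = min(ra0 + width, ra_max)
            dec0 = dec_min + j * height
            dec1 = min(dec0 + height, dec_max)
            tiles.append({
                "name": f"{i}_{j}",
                # Padding is clipped so tiles never extend past the region
                "ra": (max(ra_min, ra0 - pad), min(ra_max, ra1 + pad)),
                "dec": (max(dec_min, dec0 - pad), min(dec_max, dec1 + pad)),
            })
    return tiles

tiles = make_tiles(0.0, 20.0, -5.0, 5.0)
for tile in tiles:
    print(tile["name"], tile["ra"], tile["dec"])
```

The overlap means that objects near a tile edge are fully contained in at least one tile, at the cost of some duplicated filtering work.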
The tiling algorithm makes use of a “survey mask”, which defines the region that Nemo will search for clusters (or sources). The createSurveyMask.py script generates the survey mask FITS image, taking as input a DS9 region file containing polygon-shaped regions (see surveyMask.reg) and a plain-text file that defines the image header (template_AdvACT.txt). You can run this with:
python createSurveyMask.py template_AdvACT.txt surveyMask.reg
This writes the output survey mask to surveyMask.fits. You will need this file to run Nemo using the included MFMF_SOSim_3freq_tiles.yml configuration.
You can run Nemo in parallel using, e.g.,
mpiexec -np $NUM_PROCESSES nemo MFMF_SOSim_3freq_tiles.yml -M
replacing $NUM_PROCESSES with the number of cores you want to run on. Nemo will divide up the tiles between processors as evenly as it can. The file slurm_nemo.sh shows how Nemo can be run on a cluster that uses the Slurm job scheduler.
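One simple way to share tiles "as evenly as possible" is to give every process the same base number of tiles and hand the remainder, one each, to the first few processes. The sketch below shows that scheme. It is an assumption about what an even split could look like, not Nemo's actual implementation, and the function and tile names are hypothetical.

```python
def divide_tiles(tile_names, num_processes):
    """Split a list of tiles into num_processes contiguous, near-equal chunks."""
    base, extra = divmod(len(tile_names), num_processes)
    chunks = []
    start = 0
    for rank in range(num_processes):
        # The first 'extra' ranks each take one additional tile
        size = base + (1 if rank < extra else 0)
        chunks.append(tile_names[start:start + size])
        start += size
    return chunks

tile_names = [f"tile{i}" for i in range(10)]
chunks = divide_tiles(tile_names, 4)
for rank, chunk in enumerate(chunks):
    print(rank, chunk)
```

With a split like this, the run time is set by the processes holding the most tiles, so there is little benefit in requesting more processes than there are tiles.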
Using the input simulation catalogs
The halo catalog for the WebSky simulations is 33 GB in size. You can obtain a smaller version (28 MB; containing only halos more massive than 10^14 MSun) using
wget https://acru.ukzn.ac.za/~mjh/halos.fits https://acru.ukzn.ac.za/~mjh/halos.reg
(this fetches a DS9 region file as well). These were produced using readWebSkyInputCatalog.py (which is a modified version of the WebSky readhalos.py script).
The nemoMass script can be used to obtain mass estimates for cluster candidates, but requires a table of redshifts to match against (by object name, at least for the moment).
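Matching by object name amounts to a dictionary lookup: build a name-to-redshift table, then attach a redshift to every candidate whose name appears in it. The sketch below shows that logic on plain Python dictionaries. The column names ("name", "redshift"), the object names, and the function itself are hypothetical and do not reflect the actual catalog columns used by nemoMass.

```python
def match_redshifts(candidates, redshift_rows):
    """Attach redshifts to candidates by exact name match.

    Returns (matched candidates with a 'redshift' key, names left unmatched).
    """
    z_by_name = {row["name"]: row["redshift"] for row in redshift_rows}
    matched, unmatched = [], []
    for cand in candidates:
        if cand["name"] in z_by_name:
            matched.append({**cand, "redshift": z_by_name[cand["name"]]})
        else:
            unmatched.append(cand["name"])
    return matched, unmatched

# Hypothetical rows for illustration only.
candidates = [
    {"name": "SIM-CL J0001.0-0000", "SNR": 7.2},
    {"name": "SIM-CL J0002.0-0100", "SNR": 5.1},
]
redshift_rows = [{"name": "SIM-CL J0001.0-0000", "redshift": 0.45}]
matched, unmatched = match_redshifts(candidates, redshift_rows)
print(matched, unmatched)
```

Because the match is exact, the names in the redshift table must be written in exactly the same form as those in the candidate catalog.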
To produce the redshifts.fits catalog referred to by the MFMF_SOSim_3freq_tiles.yml configuration file, you can run:
python makeRedshiftsCatalog.py MFMF_SOSim_3freq_tiles/MFMF_SOSim_3freq_tiles_optimalCatalog.fits halos.fits
You will then be able to run nemoMass with:
mpiexec -np $NUM_PROCESSES nemoMass MFMF_SOSim_3freq_tiles.yml -M
again replacing $NUM_PROCESSES with the number of cores you want to run on. The slurm_mass.sh script shows how to run this using Slurm (this takes less than 3 minutes for a catalog of ~30,000 clusters with the settings given).
Note that the cosmological and scaling relation parameters in the massOptions section of both example configuration files given here (MFMF_SOSim_3freq_tiles.yml and MFMF_SOSim_3freq_small.yml) have been set to approximately reproduce those used in the WebSky simulations.