Emulating Galaxy Clusters

Title: Emulating Sunyaev-Zeldovich Images of Galaxy Clusters using Auto-Encoders

Authors: Tibor Rothschild, Daisuke Nagai, Han Aung, Sheridan B. Green, Michelle Ntampaka and John ZuHone

First Author’s Institution: Department of Physics, Yale University, New Haven, CT

Status: Submitted to MNRAS (open access on arXiv)

Galaxy clusters are among the largest gravitationally bound structures in the known Universe. They typically contain hundreds of galaxies and sit within extremely massive dark matter halos. It was Fritz Zwicky who coined the term “dark matter” in 1933 after observing the anomalously high velocities of galaxies in the Coma Cluster. Measuring the masses of these dark matter halos, hereafter referred to as the halo mass, lets us build the halo mass function: the number of halos as a function of mass, and an important diagnostic for constraining cosmological parameters. The environment within a galaxy cluster is so energetic that the space between its galaxies is filled with a hot plasma of ionised gas called the intracluster medium, or ICM.

One way to indirectly infer the halo mass of a galaxy cluster is to observe how the ICM interacts with cosmic microwave background (CMB) radiation. In particular, the low-energy photons of the CMB can be scattered (via inverse Compton scattering) to higher energies by the high-energy electrons present in the hot ICM. This is the Sunyaev-Zeldovich (SZ) effect, and it results in small distortions in the observed CMB spectrum. The combined SZ effect integrated over the entire cluster (denoted YSZ) scales with the halo mass via the so-called YSZ – M relation. However, modelling the SZ effect to obtain SZ maps is a difficult task, often requiring computationally expensive hydrodynamic simulations with enough resolution to account for the complex baryonic physics within the ICM. In particular, the YSZ – M relation is sensitive to the mass accretion rate of the cluster. The authors of today’s paper propose a new approach: generating synthetic SZ maps using machine learning.
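To make "integrated over the entire cluster" a little more concrete, here is a minimal sketch that sums a map of the SZ signal (the dimensionless Compton-y parameter) over all pixels inside a circular aperture. The function name `integrated_y`, the toy map and the pixel scale are assumptions made for illustration; they are not taken from the paper.

```python
import math

def integrated_y(y_map, pixel_size, radius):
    """Sum Compton-y over pixels within `radius` of the map centre.

    y_map      : 2D list of dimensionless Compton-y values
    pixel_size : side length of one pixel (e.g. in Mpc or arcmin)
    radius     : aperture radius, in the same units as pixel_size
    Returns Y in units of pixel_size**2.
    """
    n_rows, n_cols = len(y_map), len(y_map[0])
    # Map centre in pixel coordinates (pixel centres sit at index + 0.5).
    cy, cx = n_rows / 2.0, n_cols / 2.0
    total = 0.0
    for i, row in enumerate(y_map):
        for j, y in enumerate(row):
            # Distance of this pixel centre from the map centre.
            r = math.hypot(i + 0.5 - cy, j + 0.5 - cx) * pixel_size
            if r <= radius:
                total += y * pixel_size ** 2
    return total

# Toy 4x4 map with a uniform y of 1e-5 (hypothetical values).
toy_map = [[1e-5] * 4 for _ in range(4)]
Y_full = integrated_y(toy_map, pixel_size=0.5, radius=10.0)   # all 16 pixels
Y_core = integrated_y(toy_map, pixel_size=0.5, radius=0.5)    # central 4 pixels only
```

A real analysis would choose the aperture from the cluster's physical size, but the idea is the same: one number per cluster, which can then be compared against the halo mass.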

Toward a Generative Model

The aim of machine learning is for a model to learn an abstract representation of some training data in order to carry out a task, such as regression, classification or, in the case of this paper, generation. Generative modelling seeks to create new data that resembles the training data as closely as possible. One technique for achieving this is the conditional variational autoencoder (CVAE). A CVAE has two components: the encoder, trained to represent the data in some abstract parameter space, and the decoder, which takes a random draw from this parameter space and constructs a new image. The "conditional" part means that both components are also given labels describing each image. Here, both the encoder and decoder are convolutional neural networks, which are especially adept at extracting abstract features from images.
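Two ingredients are common to variational autoencoders in general: the encoder outputs a mean and a (log-)variance for each dimension of the parameter space, a sample is drawn from that distribution, and a penalty term keeps the distribution close to a standard normal. The sketch below shows both in plain Python. The function names and the three-dimensional toy vectors are assumptions for illustration, and the paper's actual network layers and loss weighting are not reproduced here.

```python
import math
import random

def reparameterise(mu, log_var, rng):
    """Draw z = mu + sigma * eps with eps ~ N(0, 1), per latent dimension."""
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0)
            for m, lv in zip(mu, log_var)]

def kl_to_standard_normal(mu, log_var):
    """KL divergence from N(mu, sigma^2) to N(0, 1), summed over dimensions."""
    return -0.5 * sum(1.0 + lv - m * m - math.exp(lv)
                      for m, lv in zip(mu, log_var))

rng = random.Random(0)
mu = [0.0, 0.0, 0.0]        # hypothetical encoder output: means
log_var = [0.0, 0.0, 0.0]   # hypothetical encoder output: log-variances
z = reparameterise(mu, log_var, rng)
kl = kl_to_standard_normal(mu, log_var)   # vanishes when posterior equals prior
```

In training, a term like `kl` is added to a reconstruction error between the input map and the decoder's output, so the model is pushed both to reproduce its inputs and to keep the parameter space well behaved enough to sample from later.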

Figure 1: Schematic representation of the CVAE. The training SZ map is fed into the encoder (shaded in red), along with values for the mass and mass accretion rate. The decoder (shaded in green) then creates a new, synthetic SZ map. Figure 1 in the paper.

The training data consists of known SZ maps from the IllustrisTNG hydrodynamical simulations, specifically the TNG300 run. The model is trained with the image data, along with the known values of halo mass and mass accretion rate (MAR). Once fully trained, the decoder is able to generate new images given some desired mass and MAR values.
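Generation with a trained conditional model can be pictured as follows: draw a random point in the parameter space, attach the desired mass and MAR, and hand the result to the decoder. The sketch below only builds those decoder inputs. The helper name `sample_decoder_inputs`, the latent dimension of 4 and the label values are hypothetical, and the decoder network itself is not shown.

```python
import random

def sample_decoder_inputs(n_samples, latent_dim, log_mass, mar, seed=None):
    """Build decoder inputs: a standard-normal latent draw plus the two labels."""
    rng = random.Random(seed)
    inputs = []
    for _ in range(n_samples):
        # Random point in the latent (parameter) space.
        z = [rng.gauss(0.0, 1.0) for _ in range(latent_dim)]
        # Conditioning labels are appended unchanged to every draw.
        inputs.append(z + [log_mass, mar])
    return inputs

batch = sample_decoder_inputs(n_samples=3, latent_dim=4,
                              log_mass=14.5, mar=2.0, seed=1)
# Each row of `batch` would be passed to a trained decoder (not shown here).
```

Because the labels are held fixed while the random part changes, every row corresponds to a different cluster with the same requested mass and MAR.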

Morphological Replicator

Figure 2: Comparison of synthetic SZ maps generated by the CVAE model (left panel) with real, training SZ maps from IllustrisTNG (right panel). Figure 2 in the paper.

Figure 2 gives a side-by-side comparison of synthetic SZ maps and the real SZ maps used to train the model, and the visual agreement is excellent. To ensure that these synthetic maps are genuinely new (i.e. not just duplicates of the training data), the authors define a similarity metric and use it to find each synthetic map's closest match in the training set. This also serves as a check that the model has not simply overfit the training data.
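A nearest-match check of this kind could look like the sketch below. The paper's own similarity metric is not specified in this summary, so the mean squared pixel difference used here is an assumption, as are the function names and the tiny 2x2 toy maps.

```python
def mean_squared_difference(a, b):
    """Mean squared pixel difference between two equally sized 2D maps."""
    total, count = 0.0, 0
    for row_a, row_b in zip(a, b):
        for pa, pb in zip(row_a, row_b):
            total += (pa - pb) ** 2
            count += 1
    return total / count

def closest_training_match(synthetic, training_maps):
    """Return (index, distance) of the training map nearest to `synthetic`."""
    distances = [mean_squared_difference(synthetic, t) for t in training_maps]
    best = min(range(len(distances)), key=distances.__getitem__)
    return best, distances[best]

# Two toy "training maps" and one toy "synthetic map" (hypothetical values).
training = [[[0.0, 0.0], [0.0, 0.0]],
            [[1.0, 1.0], [1.0, 1.0]]]
synthetic = [[0.9, 1.1], [1.0, 0.8]]
index, distance = closest_training_match(synthetic, training)
```

A distance of exactly zero would flag a synthetic map as a copy of a training sample, while a small but non-zero distance suggests a new map that still resembles the training population.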

Emulating the Physics

It’s one thing for a model to generate look-alikes. What matters most is whether the model can reproduce the physics, especially given the strong effect that baryonic physics and mass accretion have on the YSZ – M relation. The authors find that both the slope and scatter of the YSZ – M relation from the synthetic SZ maps agree remarkably well with those of the IllustrisTNG training data, as shown in Figure 3.

Figure 3: (Left panel) Comparison of the best-fit YSZ – M relation for actual clusters and generated clusters (real vs. synthetic SZ maps). (Right panel) Distribution of residuals for true and generated clusters. The curve denotes the best-fit Gaussian distribution, from which the scatter (its standard deviation) can be derived. Combination of Figures 3 and 4 in the paper.
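The slope and scatter compared in Figure 3 come from fitting a straight line in log-log space. Here is a minimal sketch of such a fit using ordinary least squares. The function name and the toy data are assumptions: the toy YSZ values are built from a noiseless power law with the self-similar slope of 5/3 purely so that the result is easy to check, and the paper's fitting procedure may differ.

```python
import math

def fit_power_law(masses, y_values):
    """Least-squares fit of log10(Y) = slope * log10(M) + intercept."""
    xs = [math.log10(m) for m in masses]
    ys = [math.log10(y) for y in y_values]
    n = len(xs)
    x_mean, y_mean = sum(xs) / n, sum(ys) / n
    sxx = sum((x - x_mean) ** 2 for x in xs)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean
    # Residuals: how far each cluster sits above or below the best-fit line.
    residuals = [y - (slope * x + intercept) for x, y in zip(xs, ys)]
    # Scatter: standard deviation of the residuals, whose mean is zero
    # by construction for a least-squares line with an intercept.
    scatter = math.sqrt(sum(r * r for r in residuals) / n)
    return slope, intercept, residuals, scatter

# Toy clusters following an exact power law (hypothetical normalisation).
masses = [1e13, 1e14, 1e15]
y_values = [1e-6 * (m / 1e14) ** (5.0 / 3.0) for m in masses]
slope, intercept, residuals, scatter = fit_power_law(masses, y_values)
```

Running the same fit on real and synthetic maps and comparing `slope` and `scatter` is the kind of test the left and right panels summarise.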

The model also reflects more nuanced physics. At a fixed mass, the higher the accretion rate, the smaller the YSZ value (due to the higher non-thermal pressure resulting from the accretion). Statistically, this should show up as clusters with higher YSZ – M residuals having lower mass accretion rates, a trend that the model accurately reproduces.
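One simple way to quantify such a trend is a correlation coefficient between the mass accretion rate and the residual of each cluster. The sketch below uses the Pearson coefficient, which is an assumption rather than the paper's stated statistic, and the five data points are invented for illustration.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    x_mean, y_mean = sum(xs) / n, sum(ys) / n
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    sxx = sum((x - x_mean) ** 2 for x in xs)
    syy = sum((y - y_mean) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Invented example: residuals fall as the accretion rate rises.
mar = [0.5, 1.0, 1.5, 2.0, 2.5]
residuals = [0.08, 0.03, 0.01, -0.04, -0.07]
r = pearson_r(mar, residuals)
```

A negative `r` corresponds to the expected behaviour: rapidly accreting clusters sit below the best-fit relation, and slowly accreting ones sit above it.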

Synthetic Simulations

Among the major benefits of machine-learning models is that there is no need to explicitly hard-code any initial conditions. This is because the CVAE is model-independent and learns by itself (so-called unsupervised learning) from the training data. Another benefit is efficiency: a model that can generate synthetic SZ maps without running a computationally expensive cosmological simulation is especially valuable for large-scale studies, such as comparisons with observations. In particular, the authors suggest that these synthetic SZ maps can act as statistical priors for observational studies of the cosmic microwave background, such as the upcoming CMB-HD millimetre survey. This may help to constrain cosmological parameters, ultimately refining our understanding of the Universe.

Featured image credit: NASA
Astrobite edited by: Alex Gough

About Mitchell Cavanagh

Mitchell is a PhD student in astrophysics at the University of Western Australia. His research is focused on the applications of machine learning to the study of galaxy formation and evolution. Outside of research, he is an avid bookworm and enjoys gaming, languages and code jams.
