Galactic Alchemy

From Radio-Observables to Simulations and back again

SKA research at
Zurich University of Applied Sciences (ZHAW)

Centre for Artificial Intelligence (CAI)
Institute for Business Information Technology (IWI)
CAI Seminar
Sept 18, 2025
contact_qr.png

Philipp Denzel, Yann Billeter, Frank-Peter Schilling, Elena Gavagnin

Generative AI for galaxy (component) emulation

domain_translation_scheme.png

Outlook

SKACH project

  • SKA project (funding: SERI/SKACH, collab: IVS & IWI):
    • SKACH & Square Kilometre Array Observatory (SKAO)
    • generative modelling of plausible radio skies
    • my interests:
      • generative deep learning,
        probabilistic models,
        simulation-based inference
      • galactic evolution, dark matter,
        gravitational lensing

zhaw_skach_team_25.jpg

Figure 1: SKACH team at the Swiss SKA Days hosted by ZHAW (25-27 Aug 2025)

SKACH after 4 years

skach_summary_2024.png

Figure 2: Credit: SKACH

One Observatory, Two Telescopes, Three Sites

SKAO_Global_Map_August_2025.png

Figure 3: Credit: SKAO

SKA project in numbers

ska_project_in_numbers.png

Figure 4: Credit: SKAO

SKA-Mid in South Africa


Credit: SKAO

Mid Construction


Credit: SKAO

SKA-Low in Western Australia


Credit: SKAO

Science goals

ska_science_goals.png

Figure 5: Credit: SKAO

How to build galaxies

recipe_galaxy.jpg

Figure 6: Recipe for galaxies as imagined by GPT-5

Noise as starting point (just as in diffusion models)

cmb2D_5e-4.png

Figure 7: 2006, Credit: IllustrisTNG

Dark matter takes over


Credit: IllustrisTNG

Simulations as an expression of theory

  • complex, realistic models
  • self-consistent dynamics
  • physics: on a wide range of scales
  • implicit models:
    • what if we want to sample more
      of those galaxy models?

TNG300_compilation_with_radio_halos_2k.png

Figure 8: IllustrisTNG simulations

The cost of IllustrisTNG

Stated usage from Nelson et al. (2017):

  • CPU core time: 55 Mh (million core hours)
  • on Hazel Hen (Cray XC40: typically 0.5 kW per 24-core node)

So, approx. 2.29M node hours @ 0.5 kW \(\rightarrow\) 1+ GWh (570'000 kg CO2e)
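
The estimate above can be retraced in a few lines. This is only a back-of-the-envelope sketch: the core hours, cores per node, and node power are the values quoted on this slide, while the emission factor of 0.5 kg CO2e per kWh is an assumption inferred from the quoted totals, not a stated figure.

```python
# Back-of-the-envelope check of the IllustrisTNG energy estimate.
CORE_HOURS = 55e6        # 55 Mh of CPU core time (from the slide)
CORES_PER_NODE = 24      # Hazel Hen (Cray XC40) node (from the slide)
NODE_POWER_KW = 0.5      # typical draw per node (from the slide)
KG_CO2E_PER_KWH = 0.5    # ASSUMPTION: factor implied by the quoted totals

node_hours = CORE_HOURS / CORES_PER_NODE
energy_kwh = node_hours * NODE_POWER_KW
energy_gwh = energy_kwh / 1e6
co2e_kg = energy_kwh * KG_CO2E_PER_KWH

print(f"node hours: {node_hours / 1e6:.2f} M")
print(f"energy:     {energy_gwh:.2f} GWh")
print(f"emissions:  {co2e_kg / 1e3:.0f} t CO2e")
```

With these inputs the result lands slightly above 1 GWh, in line with the rounded values on the slide.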

(A)I can do better

Our model suite ran on a mix of Nvidia V100/A100/H100/H200 GPUs

  • GAN-based models required 140.25 kWh for training (70 kg CO2e)
    • inference: ~1 kWh
  • diffusion-based models required 520.25 kWh for training (260 kg CO2e)
    • inference: double the amount
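
To put the training budgets in proportion, the following sketch compares them with the IllustrisTNG estimate recomputed from the stated 55 Mh of core time. The emission factor of 0.5 kg CO2e per kWh is an assumption inferred from the quoted numbers. Note that the emulators are trained on simulation outputs, so the ratio describes the cost of sampling additional galaxies, not of replacing the simulation.

```python
# Ratio of simulation energy to emulator training energy.
TNG_ENERGY_KWH = 55e6 / 24 * 0.5   # core hours / cores per node * kW per node
TRAINING_KWH = {"GAN": 140.25, "diffusion": 520.25}  # from the slide
KG_CO2E_PER_KWH = 0.5              # ASSUMPTION: implied by the quoted totals

ratios = {name: TNG_ENERGY_KWH / kwh for name, kwh in TRAINING_KWH.items()}
emissions_kg = {name: kwh * KG_CO2E_PER_KWH for name, kwh in TRAINING_KWH.items()}

for name in TRAINING_KWH:
    print(f"{name:>9}: {emissions_kg[name]:.0f} kg CO2e, "
          f"about 1/{ratios[name]:.0f} of the simulation energy")
```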

Multi-domain galaxy image dataset

  • projected Illustris TNG50-1 galaxies
  • 7 domains: dark-matter, stars, gas,
    HI, temperature, magnetic field, 21cm
  • ∼ 3'000+ galaxies, 6 snapshots,
    4 rotations in 3D, ∼ 504'000 images
  • each galaxy avg. ∼ 100'000+ particles
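
The quoted image count follows from the product of the listed factors. A minimal check, taking the "3'000+" galaxies as the round number 3'000 (an assumption, since the exact count is not given):

```python
# Dataset size as the product of the factors listed above.
n_galaxies = 3000    # ASSUMPTION: round number for "~3'000+"
n_snapshots = 6
n_rotations = 4
n_domains = 7

images_per_domain = n_galaxies * n_snapshots * n_rotations
n_images = images_per_domain * n_domains

print(f"{images_per_domain} images per domain, {n_images} images in total")
```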

domains_directions.png

Generative Deep Learning Models

  • conditional GANs (generative adversarial networks)
  • diffusion-based models
  • combination of both

conditional GANs

pix2pix_schema.png
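
As a rough illustration of the pix2pix objective (Isola et al. 2017), the sketch below combines an adversarial term with an L1 reconstruction term. It is written in plain NumPy on precomputed discriminator logits; the function names are hypothetical, and the weight lam=100 is the pix2pix paper default, not necessarily the value used for the models in this talk.

```python
import numpy as np

def bce_with_logits(logits, target):
    """Numerically stable binary cross-entropy on raw discriminator logits."""
    logits = np.asarray(logits, dtype=float)
    return float(np.mean(np.maximum(logits, 0.0) - logits * target
                         + np.log1p(np.exp(-np.abs(logits)))))

def generator_loss(d_logits_fake, generated, target, lam=100.0):
    """Adversarial term (fool the discriminator) plus weighted L1 distance."""
    adversarial = bce_with_logits(d_logits_fake, 1.0)
    l1 = float(np.mean(np.abs(generated - target)))
    return adversarial + lam * l1

def discriminator_loss(d_logits_real, d_logits_fake):
    """Real pairs should score 1, generated pairs 0; halved as in pix2pix."""
    return 0.5 * (bce_with_logits(d_logits_real, 1.0)
                  + bce_with_logits(d_logits_fake, 0.0))
```

In a conditional setup the discriminator sees the input map (e.g. gas) together with either the simulated or the generated target map, so the logits above would come from such pairs.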

DDPM

skais_diffusion_schema.png
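
For orientation, here is a minimal sketch of the forward (noising) process of a standard DDPM (Ho et al. 2020), where a clean map x0 is mixed with Gaussian noise in closed form. The linear schedule, the 1000 steps, and all function names are assumptions for illustration; the schedule and conditioning used in the actual models are not specified on the slide.

```python
import numpy as np

def linear_beta_schedule(n_steps=1000, beta_start=1e-4, beta_end=2e-2):
    """Noise variances beta_t, increasing linearly (assumed schedule)."""
    return np.linspace(beta_start, beta_end, n_steps)

def q_sample(x0, t, alpha_bar, rng):
    """Draw x_t ~ q(x_t | x_0) = N(sqrt(abar_t) x_0, (1 - abar_t) I)."""
    eps = rng.standard_normal(x0.shape)
    x_t = np.sqrt(alpha_bar[t]) * x0 + np.sqrt(1.0 - alpha_bar[t]) * eps
    return x_t, eps  # eps is the regression target of the denoising network

betas = linear_beta_schedule()
alpha_bar = np.cumprod(1.0 - betas)   # cumulative signal retention

rng = np.random.default_rng(0)
x0 = rng.random((64, 64))             # placeholder for a galaxy map
x_early, _ = q_sample(x0, 10, alpha_bar, rng)    # still close to x0
x_late, _ = q_sample(x0, 999, alpha_bar, rng)    # essentially pure noise
```

Sampling runs this process in reverse: starting from noise, a trained network removes the predicted noise step by step, here presumably conditioned on the input domain.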

Sampling from the models

(input, simulation, DDPM generated)

GasDM.inference_batch.0027.png

Figure 9: Gas ⟶ DM

(input, simulation, GAN generated)

GasStar.inference_batch.0023.png

Figure 10: Gas ⟶ Stars

(input, simulation, DDPM generated)

GasHI.inference_batch.0023.png

Figure 11: Gas ⟶ HI

(input, simulation, GAN generated)

Gas21cm.inference_batch.0023.png

Figure 12: Gas ⟶ mock 21cm brightness temperature

(input, simulation, DDPM generated)

GasTemp.inference_batch.0027.png

Figure 13: Gas ⟶ temperature

(input, simulation, GAN generated)

GasBF.inference_batch.0029.png

Figure 14: Gas ⟶ magnetic field strength

Does measuring quality = plausibility?

  • Pixel-level CV metrics do NOT work well for this:
    • MSE (mean squared error): \[ \text{MSE}\left(x, \hat{x}\right) = \frac{1}{N} \sum_{i=1}^{N} \left(x_i - \hat{x}_i\right)^2 \]
    • PSNR (peak signal-to-noise ratio; \(\text{c}\) is the maximum possible pixel value): \[ \text{PSNR}\left(x, \hat{x}\right) = 10 \cdot \log_{10} \left( \frac{\text{c}^2}{\text{MSE}\left(x, \hat{x}\right)} \right) \]
    • SSIM (structural similarity index measure): \[ \text{SSIM}\left(x, \hat{x}\right) = \frac{\left(2\mu_x\mu_{\hat{x}} + k_1\right)\left(2\sigma_{x\hat{x}} + k_2\right)}{\left(\mu_x^2 + \mu_{\hat{x}}^2 + k_1\right)\left(\sigma_x^2 + \sigma_{\hat{x}}^2 + k_2\right)} \]
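
The three formulas above translate directly into NumPy. This is a sketch under stated assumptions: images are float arrays scaled to [0, c], SSIM is evaluated once over the whole image (common implementations average it over local windows), and the defaults for k1 and k2 follow the usual choice (0.01 c)^2 and (0.03 c)^2.

```python
import numpy as np

def mse(x, x_hat):
    """Mean squared error over all pixels."""
    return float(np.mean((x - x_hat) ** 2))

def psnr(x, x_hat, c=1.0):
    """Peak signal-to-noise ratio in dB; c is the maximum pixel value."""
    return float(10.0 * np.log10(c**2 / mse(x, x_hat)))

def ssim_global(x, x_hat, c=1.0, k1=None, k2=None):
    """SSIM from global image statistics (single window, ASSUMPTION)."""
    k1 = (0.01 * c) ** 2 if k1 is None else k1
    k2 = (0.03 * c) ** 2 if k2 is None else k2
    mu_x, mu_y = x.mean(), x_hat.mean()
    var_x, var_y = x.var(), x_hat.var()
    cov = np.mean((x - mu_x) * (x_hat - mu_y))
    numerator = (2 * mu_x * mu_y + k1) * (2 * cov + k2)
    denominator = (mu_x**2 + mu_y**2 + k1) * (var_x + var_y + k2)
    return float(numerator / denominator)

rng = np.random.default_rng(0)
x = rng.random((64, 64))
x_noisy = np.clip(x + 0.05 * rng.standard_normal(x.shape), 0.0, 1.0)
print(mse(x, x_noisy), psnr(x, x_noisy), ssim_global(x, x_noisy))
```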

Perceptual metrics

  • Fréchet Inception Distance: \[ \|\mu_r - \mu_g\|^2 + \text{Tr}\left(\Sigma_r + \Sigma_g - 2(\Sigma_r \Sigma_g)^{1/2}\right) \]
    • where \(\mu\) and \(\Sigma\) are the mean and covariance matrix
      of features extracted from a neural network (InceptionV3);
      subscripts \(r\) and \(g\) denote real and generated images
  • or LPIPS (Learned Perceptual Image Patch Similarity)
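
Given two sets of feature vectors, the Fréchet distance above can be evaluated with NumPy alone. In this sketch the features are random placeholders; in practice they would be InceptionV3 activations of real and generated images. The trace of the matrix square root is computed from the eigenvalues of the covariance product, which avoids an explicit matrix square root.

```python
import numpy as np

def frechet_distance(feat_r, feat_g):
    """Fréchet distance between Gaussians fitted to two feature sets.

    feat_r, feat_g: arrays of shape (n_samples, n_features).
    """
    mu_r, mu_g = feat_r.mean(axis=0), feat_g.mean(axis=0)
    sigma_r = np.cov(feat_r, rowvar=False)
    sigma_g = np.cov(feat_g, rowvar=False)
    # Tr((S_r S_g)^(1/2)) equals the sum of square roots of the eigenvalues
    # of S_r S_g, which are real and non-negative up to round-off.
    eigvals = np.linalg.eigvals(sigma_r @ sigma_g)
    tr_sqrt = np.sum(np.sqrt(np.clip(eigvals.real, 0.0, None)))
    return float(np.sum((mu_r - mu_g) ** 2)
                 + np.trace(sigma_r) + np.trace(sigma_g) - 2.0 * tr_sqrt)

rng = np.random.default_rng(0)
real = rng.standard_normal((500, 8))              # placeholder features
fake = 1.2 * rng.standard_normal((500, 8)) + 0.3  # placeholder features
print(frechet_distance(real, fake))
```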

Astronomical/astrophysical metrics

  • structural astronomical CAS parameters by Conselice (2003)
    • Asymmetry: compare original and 180-degree-rotated image
    • Smoothness/Clumpiness: compare original and Gaussian-blurred image
    • Concentration: Means of spatial distributions within fixed radii
  • Centre of mass drift
  • Radially averaged profiles
  • Integrated quantities
  • Power spectra
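
Two of the listed diagnostics, radially averaged profiles and power spectra, can be sketched with the same binning routine. The integer-radius binning, the choice of centre, and the function names are assumptions for illustration, not the evaluation code behind the results in this talk.

```python
import numpy as np

def radial_profile(img, centre=None):
    """Mean pixel value in annuli of one-pixel width around `centre`."""
    ny, nx = img.shape
    if centre is None:
        centre = ((ny - 1) / 2.0, (nx - 1) / 2.0)  # geometric image centre
    y, x = np.indices(img.shape)
    r_bin = np.hypot(y - centre[0], x - centre[1]).astype(int)
    sums = np.bincount(r_bin.ravel(), weights=img.ravel())
    counts = np.bincount(r_bin.ravel())
    return sums / np.maximum(counts, 1)

def power_spectrum(img):
    """Radially averaged 2D power spectrum |FFT|^2 as a function of |k|."""
    power = np.abs(np.fft.fftshift(np.fft.fft2(img))) ** 2
    ny, nx = img.shape
    # after fftshift the zero-frequency mode sits at index n // 2
    return radial_profile(power, centre=(ny // 2, nx // 2))

y, x = np.indices((64, 64))
galaxy = np.exp(-np.hypot(y - 31.5, x - 31.5) / 5.0)  # toy exponential disc
profile = radial_profile(galaxy)
spectrum = power_spectrum(galaxy)
```

Comparing these curves between a simulated map and its generated counterpart gives a scale-by-scale view of where the two differ.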

Asymmetry

asymmetry_Conselice_2003.png

Figure 15: I is the original map and R the rotated map;
Asymmetry parameter by Conselice (2003)
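
A simplified version of the asymmetry parameter compares a map with its 180-degree rotation. This sketch rotates about the array centre and omits the background correction and the centre minimisation of Conselice (2003); normalisation conventions also vary between implementations. The deviation helper is a hypothetical definition of how simulated and generated maps might be compared.

```python
import numpy as np

def asymmetry(img):
    """A = sum|I - R| / sum|I|, with R the map rotated by 180 degrees."""
    rotated = np.rot90(img, 2)
    return float(np.sum(np.abs(img - rotated)) / np.sum(np.abs(img)))

def asymmetry_deviation(simulated, generated):
    """Signed difference in asymmetry (ASSUMED definition of the deviation)."""
    return asymmetry(generated) - asymmetry(simulated)

y, x = np.indices((64, 64))
symmetric = np.exp(-0.5 * ((y - 31.5) ** 2 + (x - 31.5) ** 2) / 6.0**2)
lopsided = symmetric + 0.5 * np.exp(-0.5 * ((y - 20) ** 2 + (x - 45) ** 2) / 3.0**2)
print(asymmetry(symmetric), asymmetry(lopsided))
```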

Clumpiness

clumpiness_Conselice_2003.png

Figure 16: I is the original map and B the blurred map;
Clumpiness parameter by Conselice (2003)
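
A simplified clumpiness measure sums the positive residual between a map and its Gaussian-blurred version. The sketch below is NumPy-only, with a separable blur and zero-padded edges; the blur width sigma is an arbitrary assumption, and the background term and normalisation factor of Conselice (2003) are omitted.

```python
import numpy as np

def gaussian_blur(img, sigma):
    """Separable Gaussian blur with zero padding at the image edges."""
    radius = max(1, int(3 * sigma))
    offsets = np.arange(-radius, radius + 1)
    kernel = np.exp(-0.5 * (offsets / sigma) ** 2)
    kernel /= kernel.sum()
    rows = np.apply_along_axis(np.convolve, 1, img, kernel, mode="same")
    return np.apply_along_axis(np.convolve, 0, rows, kernel, mode="same")

def clumpiness(img, sigma=2.0):
    """S = sum(max(I - B, 0)) / sum|I|, with B the blurred map."""
    residual = np.clip(img - gaussian_blur(img, sigma), 0.0, None)
    return float(residual.sum() / np.abs(img).sum())

y, x = np.indices((64, 64))
smooth = np.exp(-0.5 * ((y - 31.5) ** 2 + (x - 31.5) ** 2) / 10.0**2)
clumpy = smooth.copy()
clumpy[20, 20] += 5.0   # compact bright knots
clumpy[40, 45] += 5.0
print(clumpiness(smooth), clumpiness(clumpy))
```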

Concentration

concentration_Conselice_2003.png

Figure 17: (we use only \(2 \times r_{50}\) as proxy for our metric)
Concentration parameter by Conselice (2003)
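
One possible reading of the \(2 \times r_{50}\) proxy is the flux integrated within twice the half-flux radius. The sketch below implements that reading; it is an interpretation, not the confirmed definition used for the results, and it assumes non-negative maps measured about the geometric image centre. All function names are hypothetical.

```python
import numpy as np

def _radii(shape, centre=None):
    """Distance of every pixel from `centre` (default: image centre)."""
    ny, nx = shape
    if centre is None:
        centre = ((ny - 1) / 2.0, (nx - 1) / 2.0)
    y, x = np.indices(shape)
    return np.hypot(y - centre[0], x - centre[1])

def half_flux_radius(img, centre=None):
    """Radius r50 enclosing half of the total flux (non-negative maps)."""
    r = _radii(img.shape, centre).ravel()
    order = np.argsort(r)
    cumulative = np.cumsum(img.ravel()[order])
    idx = np.searchsorted(cumulative, 0.5 * cumulative[-1])
    return float(r[order][idx])

def flux_within(img, radius, centre=None):
    """Total flux of all pixels within `radius` of the centre."""
    return float(img[_radii(img.shape, centre) <= radius].sum())

y, x = np.indices((64, 64))
galaxy = np.exp(-0.5 * ((y - 31.5) ** 2 + (x - 31.5) ** 2) / 5.0**2)
r50 = half_flux_radius(galaxy)
fraction = flux_within(galaxy, 2.0 * r50) / galaxy.sum()
print(r50, fraction)
```

For a circular Gaussian of width sigma, the analytic half-flux radius is sigma * sqrt(2 ln 2), which offers a quick sanity check of the routine.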

Asymmetry deviation (between simulations and GAN-generated)

Gas21cm_GAN_asymmetry.png

Figure 18: Mean asymmetry deviation of the evaluation set (mock 21cm temperature);
Denzel et al. (in prep.)

GasStar_GAN_asymmetry.png

Figure 19: Mean asymmetry deviation of the evaluation set (stellar mass);
Denzel et al. (in prep.)

Clumpiness deviation (between simulations and GAN-generated)

GasDM_GAN_clumpiness.png

Figure 20: Mean clumpiness deviation of the evaluation set (Gas ⟶ DM);
Denzel et al. (in prep.)

DMGas_GAN_clumpiness.png

Figure 21: Mean clumpiness deviation of the evaluation set (DM ⟶ Gas);
Denzel et al. (in prep.)

Centre-of-mass drift (from simulations to DDPM-generated)

GasHI_DDPM_com_dist.png

Figure 22: Mean centre-of-mass drift of the evaluation set (Gas ⟶ HI);
Denzel et al. (in prep.)

GasStar_DDPM_com_dist.png

Figure 23: Mean centre-of-mass drift of the evaluation set (Gas ⟶ Stars);
Denzel et al. (in prep.)
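
The centre-of-mass drift can be sketched as the Euclidean distance, in pixels, between the intensity-weighted centroids of the simulated and the generated map. Function names are hypothetical, and maps are assumed to have non-zero total intensity.

```python
import numpy as np

def centre_of_mass(img):
    """Intensity-weighted centroid (row, column) of a 2D map."""
    y, x = np.indices(img.shape)
    total = img.sum()
    return np.array([(y * img).sum() / total, (x * img).sum() / total])

def com_drift(simulated, generated):
    """Distance in pixels between the two centres of mass."""
    return float(np.linalg.norm(centre_of_mass(generated)
                                - centre_of_mass(simulated)))

sim = np.zeros((32, 32))
gen = np.zeros((32, 32))
sim[10, 10] = 1.0   # toy maps with a single bright pixel each
gen[13, 14] = 1.0
print(com_drift(sim, gen))
```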

Integrated quantities (concentration proxy)

integrated_quantities.png

Figure 24: Denzel et al. (in prep.)

Our Findings

  • Pixel-based metrics work only to a degree: they are
    insensitive to the nuances that determine physical plausibility
  • Perceptual metrics (such as FID) correlate strongly with the astrophysical metrics
  • Updated and tuned GAN architecture matches performance of diffusion models

What's next

  • Investigate perceptual metrics (LPIPS): interpretability?
  • Integrate digital-twin simulations of SKA telescope systematics
  • Expand domain translation from 2D to 3D
  • AI-enhancements for simulations on-the-fly
    (see PASC project ARTS4SKA)
  • Plausible galaxy sampler for gravitational lens modelling
    (collab with UZH)

References & Contact

https://phdenzel.github.io/

talk_qr.png



Email: philipp.denzel@zhaw.ch

Created by phdenzel.