Bridging simulations and observations: new insights into galaxy formation simulations via out-of-distribution detection and Bayesian model comparison : evaluating galaxy formation simulations under limited computing budgets and sparse dataset sizes
| Main Authors: | Zhou, Lingyi; Radev, Stefan T.; Oliver, William H.; Obreja, Aura; Jin, Zehao; Buck, Tobias |
|---|---|
| Format: | Article (Journal) |
| Language: | English |
| Published: | 04 September 2025 |
| In: | Astronomy and astrophysics, Year: 2025, Volume: 701, Pages: 1-18 |
| ISSN: | 1432-0746 |
| DOI: | 10.1051/0004-6361/202453399 |
| Online Access: | Publisher, free of charge, full text: https://doi.org/10.1051/0004-6361/202453399 Publisher, free of charge, full text: https://www.aanda.org/articles/aa/abs/2025/09/aa53399-24/aa53399-24.html |
| Author Notes: | Lingyi Zhou, Stefan T. Radev, William H. Oliver, Aura Obreja, Zehao Jin, and Tobias Buck |
| Summary: | Context: Cosmological simulations are a powerful tool for advancing our understanding of galaxy formation. A question that naturally arises in light of high-quality observational data is the closeness of the models to reality. Because of the high-dimensionality of the problem, many previous studies evaluated galaxy simulations using simplified summary statistics. Aims: We combine a simulation-based Bayesian model comparison with a novel mis-specification detection technique to compare galaxy images of six hydrodynamical models from the NIHAO and IllustrisTNG simulations against observations from SDSS. Methods: Since cosmological simulations are computationally costly, we first trained a k-sparse variational autoencoder on the abundant dataset of SDSS images. The variational autoencoder learned to extract informative latent embeddings and delineated the typical set of real images. To reveal simulation gaps, we performed out-of-distribution detection based on the logit functions of classifiers trained on the embeddings of simulated images. Finally, we performed an amortized Bayesian model comparison using a probabilistic classification to identify the relatively best-performing model along with partial explanations through SHapley Additive exPlanations values (SHAP). Results: We find that all six models are mis-specified compared to SDSS observations and can only explain part of reality. The relatively best-performing model comes from the standard NIHAO simulations without active galactic nucleus physics. Based on our inspection of the SHAP-values, we find that the main difference between NIHAO and IllustrisTNG is given by color and morphology. NIHAO is redder and clumpier than IllustrisTNG. Conclusions: By using explainable AI methods such as SHAP values in combination with innovative methods from a simulation-based Bayesian model comparison and new mis-specification detection techniques, we were able to quantitatively compare costly hydrodynamical simulations with real observations and gain physical intuition about the quality of the simulation models. Hence, our new methods help to explain which physical aspects of a particular simulation cause the simulation to match real observations better or worse. This unique feature helps us to inform simulators to improve their simulation model. |
| Item Description: | Viewed on 29.01.2025 |
| Physical Description: | Online Resource |