Nicolás Astorga

Ph.D. student in Machine Learning

University of Cambridge | van der Schaar Lab

About Me

I’m a Ph.D. student at the University of Cambridge advised by Prof. Mihaela van der Schaar. My research explores reasoning, Bayesian experimental design, and optimization in the context of LLMs. Previously, I completed a dual M.Sc. in Electrical Engineering and Computer Science at the University of Chile. I also worked as an ML Engineer at ALeRCE and interned at Harvard IACS. My research has been published at leading conferences, including NeurIPS, ICML, ICLR (Spotlight), ECCV, and AISTATS.

I am particularly interested in leveraging the inductive biases of LLMs to drive exploration and exploitation, and then using the collected experience to improve those biases (via in-context learning or training). Because this exploration–exploitation is goal-directed, search methods and efficient experimentation also interest me. These ideas admit many implementations, which is part of the fun. I love that LLMs, and deep learning more broadly, are built to search and learn.

Interests

Large Language Models (LLMs)
Reasoning & RL with LLMs
Bayesian Experimental Design, Active Learning & Bayesian Optimization
Autoformulation for optimization with LLMs
Generative Models & Variational Inference

Education

Ph.D. in Machine Learning
University of Cambridge (2023–present)
Dual M.Sc. — Electrical Engineering; Computer Science
University of Chile (2020–2023)
B.Sc. — Computer, Electrical & Mechanical Engineering (Triple Major)
University of Chile (2013–2019)

Publications (Highlighted)

Auto‑formulation of Mathematical Optimisation Models Using Large Language Models

Automatically formulating optimisation models from natural language using LLMs. Accepted at ICML 2025.

Nicolás Astorga, Tennison Liu, Y. Xiao, Mihaela van der Schaar

Jun 1, 2025

Active Task Disambiguation with LLMs

This paper formalizes task ambiguity and frames task disambiguation as Bayesian Experimental Design, yielding effective clarifying question selection for LLMs. Accepted at ICLR 2025.

Katarzyna Kobalczyk, Nicolás Astorga, Tennison Liu, Mihaela van der Schaar

Jan 22, 2025

Active Learning with LLMs for Partially Observed and Cost‑Aware Scenarios

Active learning strategies for partially observed, cost‑aware settings via LLMs. Accepted at NeurIPS 2024.

Nicolás Astorga, Tennison Liu, N. Seedat, Mihaela van der Schaar

Dec 1, 2024

Large Language Models to Enhance Bayesian Optimisation

Leveraging LLMs to improve Bayesian optimisation. Accepted at ICLR 2024.

Tennison Liu, Nicolás Astorga, N. Seedat, Mihaela van der Schaar

May 1, 2024

Publications (All)

Conferences — First author

N. Astorga*, T. Liu*, Y. Xiao, M. van der Schaar (2025). Auto-formulation of Mathematical Optimisation Models Using Large Language Models. ICML 2025. *Equal contribution.
K. Kobalczyk*, N. Astorga*, T. Liu, M. van der Schaar (2025). Active Task Disambiguation with Large Language Models. ICLR 2025 (Spotlight). *Equal contribution.
N. Astorga, T. Liu, N. Seedat, M. van der Schaar (2024). Active Learning with LLMs for Partially Observed and Cost-Aware Scenarios. NeurIPS 2024.
T. Liu*, N. Astorga*, N. Seedat, M. van der Schaar (2024). Large Language Models to Enhance Bayesian Optimisation. ICLR 2024. *Equal contribution.
N. Astorga, P. Huijse, P. Protopapas, P. Estévez (2020). MPCC: Matching Priors and Conditionals for Clustering. ECCV 2020, Glasgow.
N. Astorga, P. Huijse, P. A. Estévez, F. Förster (2018). Clustering of Astronomical Transient Candidates Using Deep Variational Embedding. IJCNN 2018, Rio de Janeiro.

Conferences — Second author

S. Ruhrberg, N. Astorga, M. van der Schaar (2025). Timely Clinical Diagnosis through Active Test Selection. NeurIPS 2025.
H. Amad, N. Astorga, M. van der Schaar (2025). Continuously Updating Digital Twins Using Large Language Models. AISTATS 2025.
J. Piskorz, N. Astorga, J. Berrevoets, M. van der Schaar (2025). Active Feature Acquisition for Personalised Treatment Assignment. ICML 2025.

Journals

G. Cabrera-Vives, D. Moreno-Cartagena, N. Astorga, I. Reyes-Jainaga, et al. (2024). ATAT: Astronomical Transformer for Time Series and Tabular Data. Astronomy & Astrophysics.
M. Pérez-Carrasco, G. Cabrera-Vives, L. Hernández-García, F. Förster, N. Astorga, et al. (2023). Alert Classification for the ALeRCE Broker System: The Anomaly Detector. The Astronomical Journal.
F. Förster, G. Cabrera-Vives, E. Castillo-Navarrete, P. A. Estévez, N. Astorga, et al. (2021). The Automatic Learning for the Rapid Classification of Events (ALeRCE) Alert Broker. The Astronomical Journal.
C. Modarres, N. Astorga, E. Droguett, V. Meruane (2018). Convolutional Neural Networks for Automated Damage Recognition and Damage Type Identification. Structural Control and Health Monitoring.

Workshop

H. Sun, T. Pouplin, N. Astorga, T. Liu, M. van der Schaar (2024). Improving LLM Generation with Inverse and Forward Alignment: Reward Modelling, Prompting, Fine-Tuning, and Inference-Time Optimisation. NeurIPS 2024 Workshop on System-2 Reasoning at Scale.

Professional Experience

Machine Learning Engineer

ALeRCE – Automatic Learning for the Rapid Classification of Events

February 2022 – April 2023 | Santiago, Chile

Deployed production‑grade ML models via Kubernetes to classify LSST astronomical alerts in real time.
Built distributed PySpark pipelines to curate >30M light‑curve observations from multiple catalogues.
Collaborated in the ELAsTiCC challenge; proposed a Transformer‑based model for tabular/time‑series data; work accepted at Astronomy & Astrophysics.

Research Intern

Harvard University — Institute for Applied Computational Science

January 2019 – August 2019 | Cambridge, MA, USA

Proposed MPCC, a GAN–VAE hybrid clustering framework (ECCV 2020) leveraging forward KL divergence and extending BigGAN.

Research Assistant

University of Chile — Lab. of Computational Intelligence

March 2016 – December 2023 | Santiago, Chile

Developed a VAE‑based clustering method for astronomical transient detection (IJCNN 2018).
Integrated normalising flows into variational embeddings, improving ELBO by ≥10%.
Matched fully supervised performance using Gaussian processes in a semi‑supervised setting with only 10% labeled data.

Contact

nja46@cam.ac.uk
Centre for Mathematical Sciences, Wilberforce Rd, Cambridge, CB3 0WA