About
I develop methods to identify real‑world safety risks in AI systems and make models more controllable and trustworthy.
Recent News!
🏝️ I will be at NeurIPS 2025 in San Diego — ping me if you are around!
💡 New papers: LLM jailbreaking, code agents, code evaluation
🎤 Talk at UNIST: “Designing safety systems for LLM-based services.”
Keywords
Safety
Robustness
Reasoning
Self‑Supervised
Selected Publications
Align to Misalign: Automatic LLM Jailbreak with Meta-Optimized LLM Judges
arXiv 2025 • H. Koo, M. Kim, J. Kim
Gistify! Codebase-Level Understanding via Runtime Execution
arXiv 2025 • H. Lee, M. Kim, C. Singh, M. Pereira, A. Sonwane, I. White, E. Stengel-Eskin, M. Bansal, Z. Shi, A. Sordoni, M.-A. Côté, X. Yuan, L. Caccia
BugPilot: Complex Bug Generation for Efficient Learning of SWE Skills
arXiv 2025 • A. Sonwane, I. White, H. Lee, M. Pereira, L. Caccia, M. Kim, Z. Shi, C. Singh, A. Sordoni, M.-A. Côté, X. Yuan
Learning to Solve Complex Problems via Dataset Decomposition
NeurIPS 2025 • W. Zhao, L. Caccia, Z. Shi, M. Kim, X. Yuan, W. Xu, M.-A. Côté, A. Sordoni
Exploring Sparse Adapters for Scalable Merging of Parameter Efficient Experts
CoLM 2025 • S. Y. Arnob, Z. Su, M. Kim, O. Ostapenko, D. Precup, L. Caccia, A. Sordoni
Medical Red Teaming Protocol of Language Models: On the Importance of User Perspectives in Healthcare Settings
arXiv 2025 • J.-P. Corbeil*, M. Kim*, A. Sordoni, F. Beaulieu, P. Vozila
Instilling Parallel Reasoning into Language Models
ICML AI for Math WS 2025 • M. Macfarlane, M. Kim, N. Jojic, W. Xu, L. Caccia, X. Yuan, W. Zhao, Z. Shi, A. Sordoni
Enhancing Variational Autoencoders with Smooth Robust Latent Encoding
arXiv 2025 • H. Lee*, M. Kim*, S. Jang, J. Jeong, S. J. Hwang
debug-gym: A Text-Based Environment for Interactive Debugging
arXiv 2025 • X. Yuan, M. M. Moss, C. El Feghali, C. Singh, D. Moldavskaya, D. MacPhee, L. Caccia, M. Pereira, M. Kim, A. Sordoni, M.-A. Côté
Optimizing Query Generation for Enhanced Document Retrieval in RAG
arXiv 2024 • H. Koo, M. Kim, S. J. Hwang
Protein Representation Learning by Capturing Protein Sequence‑Structure‑Function Relationship
ICLR MLGenX WS 2024 (Spotlight) • E. Ko*, S. Lee*, M. Kim*, D. Kim, S. J. Hwang
Language Detoxification with Attribute‑Discriminative Latent Space
ACL 2023 • M. Kim*, J. M. Kwak*, S. J. Hwang
Context‑dependent Instruction Tuning for Dialogue Response Generation
arXiv 2023 • J. M. Kwak, M. Kim, S. J. Hwang
Lightweight Neural Architecture Search with Parameter Remapping and Knowledge Distillation
AutoML WS 2022 • H. Lee*, S. An*, M. Kim, S. J. Hwang
Learning Transferable Adversarial Robust Representations via Multi‑view Consistency
NeurIPS SafetyML WS 2022 • M. Kim*, H. Ha*, D. B. Lee, S. J. Hwang
MRI‑based classification of neuropsychiatric systemic lupus erythematosus patients with self‑supervised contrastive learning
Frontiers in Neuroscience 2022 • M. Kim*, F. Inglese*, G. Steup‑Beekman, T. Huizinga, M. Van Buchem, J. Bresser, D. Kim, I. Ronen
T1 Image Synthesis with Deep Convolutional Generative Adversarial Networks
OHBM 2018 • M. Kim, C. Han, J. Park, D.-S. Kim
Experience
Postdoctoral Researcher — Microsoft Research–Montréal
Current
Research Internship — ERA–KASL AI Safety Research, University of Oxford
Jun–Aug 2024 • with Philip Torr, David Krueger, Adel Bibi, Fazl Barez
Research Collaboration — Theory Center, Microsoft Research Asia
Jul 2023–May 2024 • with Huishuai Zhang
Talks
AI Seminar, UNIST
Oct. 2025 — “Designing Safety Systems for LLM-based Services”
Mila X MSR, Microsoft
Oct. 2025 — “Learning to Extract Context for Context-aware LLM Inference”
Women in MSR – Project Green, Microsoft
Mar. 2025 — “Unsupervised Context Understanding for Safer LLMs”
Tea Talk, Mila
Feb. 2025 — “Designing safety systems for LLM-based services”
RWE AI Journal Club, Microsoft
Nov. 2024 — “How to obtain safety effectively and efficiently”
Guest Lecture, Korea University
May 2024 — “Automatic Jailbreaking of the Text-to-Image Generative AI Systems”
Academic services
Conference
NeurIPS, ICLR, ICML, ACL, AAAI, ACML, ICCV
Journal
TPAMI, IEEE TNNLS, TMLR, IEEE T-IFS, IEEE CIM
Organizer
WiML @ CoLM (2025), Safety Colloquium (2024), Women in AI/CS/EE at KAIST (2024), Women in AI at KAIST (2022)
Education
- Ph.D., Graduate School of AI, KAIST — Thesis: Towards Safe and Robust Representation with Self‑Supervised Learning (Advisor: Sung Ju Hwang)
- M.S., Electrical Engineering, KAIST — Thesis: Differential representation of face pareidolia (Advisor: Dae‑shik Kim)
- B.S., Bio & Brain Engineering; Computer Science, KAIST
Contact
minseon5113(at)gmail(dot)com