Minseon Kim
I am a postdoctoral researcher at Microsoft Research–Montréal. ☃️ I completed my PhD at KAIST, advised by Professor Sung Ju Hwang.
My current research interests lie in identifying realistic safety risks in AI models and developing adaptable and controllable approaches to enhance the trustworthiness of AI models.
If you're interested in collaborating on research projects related to AI safety, feel free to contact me :)
Publications (*equal contribution)
Enhancing Variational Autoencoders with Smooth Robust Latent Encoding Hyomin Lee*, Minseon Kim*, Sangwon Jang, Jongheon Jeong, Sung Ju Hwang ArXiv 2025, PDF
debug-gym: A Text-Based Environment for Interactive Debugging Xingdi Yuan, Morgane M Moss, Charbel El Feghali, Chinmay Singh, Darya Moldavskaya, Drew MacPhee, Lucas Caccia, Matheus Pereira, Minseon Kim, Alessandro Sordoni, Marc-Alexandre Côté ArXiv 2025, PDF
Exploring Sparse Adapters for Scalable Merging of Parameter Efficient Experts Samin Yeasar Arnob, Zhan Su, Minseon Kim, Oleksiy Ostapenko, Doina Precup, Lucas Caccia, Alessandro Sordoni ICML Workshop on Modularity for Collaborative, Decentralized, and Continual Deep Learning 2025, PDF
Automatic Jailbreaking of the Text-to-Image Generative AI Systems Minseon Kim, Hyomin Lee, Boqing Gong, Huishuai Zhang, Sung Ju Hwang ICML Next Generation of AI Safety Workshop 2024, PDF, Project Page, Code
Optimizing Query Generation for Enhanced Document Retrieval in RAG Hamin Koo, Minseon Kim, Sung Ju Hwang ArXiv 2024, PDF
Protein Representation Learning by Capturing Protein Sequence-Structure-Function Relationship Eunji Ko*, Seul Lee*, Minseon Kim*, Dongki Kim, Sung Ju Hwang ICLR MLGenX workshop 2024 (Spotlight), PDF
Effective Targeted Attacks for Adversarial Self-Supervised Learning Minseon Kim, Hyeonjeong Ha, Sooel Son, Sung Ju Hwang NeurIPS 2023, PDF, Code
Generalizable Lightweight Proxy for Robust NAS against Diverse Perturbations Hyeonjeong Ha*, Minseon Kim*, Sung Ju Hwang NeurIPS 2023, PDF, Code
Language Detoxification with Attribute-Discriminative Latent Space Minseon Kim*, Jin Myung Kwak*, Sung Ju Hwang ACL 2023, PDF
Context-dependent Instruction Tuning for Dialogue Response Generation Jin Myung Kwak, Minseon Kim, Sung Ju Hwang ArXiv 2023, PDF
Meta-Prediction Model for Distillation-aware NAS on Unseen Datasets Hayeon Lee*, Sohyun An*, Minseon Kim, Sung Ju Hwang ICLR 2023 (Spotlight), PDF, Code
Rethinking the Entropy of Instance in Adversarial Training Minseon Kim, Jihoon Tack, Jinwoo Shin, Sung Ju Hwang IEEE SaTML 2023, PDF, Code
Lightweight Neural Architecture Search with Parameter Remapping and Knowledge Distillation Hayeon Lee*, Sohyun An*, Minseon Kim, Sung Ju Hwang AutoML workshop 2022, PDF
Learning Transferable Adversarial Robust Representations via Multi-view Consistency Minseon Kim*, Hyeonjeong Ha*, Dong Bok Lee, Sung Ju Hwang NeurIPS SafetyML workshop 2022, Under review, PDF
Consistency Regularization for Adversarial Robustness Jihoon Tack, Sihyun Yu, Jongheon Jeong, Minseon Kim, Sung Ju Hwang, Jinwoo Shin AAAI 2022, PDF, Code
MRI-based classification of neuropsychiatric systemic lupus erythematosus patients with self-supervised contrastive learning M. Kim*, F. Inglese*, G. Steup-Beekman, T. Huizinga, M. Van Buchem, J. Bresser, D. Kim, I. Ronen Frontiers in Neuroscience 2022 (Impact Factor: 4.67), PDF