Title
[Seminar] Seoul World Model: Grounding World Simulation Models in a Real-World Metropolis / NAVER AI Lab
Author
School of Advanced Computing
Date posted
2026.04.03
Last modified
2026.04.03
Category
Seminar
Link URL
https://seoul-world-model.github.io
Post content



Date and time: Wednesday, April 8, 2026, 1:00 PM

Venue: Engineering Hall 4, Room D502

Speaker: Jin-Hwa Kim(김진화), Ph.D. / Leader of Generation Research at NAVER AI Lab

Title: Seoul World Model: Grounding World Simulation Models in a Real-World Metropolis

Abstract:

Recent advances in generative world models have enabled the synthesis of visually plausible environments, yet most approaches remain limited to fully imagined worlds without grounding in physical reality. In this talk, we present Seoul World Model (SWM), a city-scale world simulation model that generates videos of a real-world metropolis by leveraging large-scale street-view data from Seoul. SWM formulates video generation as a retrieval-augmented process, where nearby street-view images are used to anchor autoregressive generation to real-world geometry and appearance. This grounding, however, introduces unique challenges, including temporal misalignment between retrieved references and dynamic scenes, limited trajectory diversity, and sparsity in vehicle-captured data. To address these issues, we propose cross-temporal pairing for robust supervision, a view interpolation pipeline that transforms sparse street-view images into coherent training videos, and a Virtual Lookahead Sink mechanism that stabilizes long-horizon generation through continuous future re-grounding. We evaluate SWM across multiple cities, including Busan and Ann Arbor, demonstrating its ability to generate spatially faithful and temporally consistent urban videos over long trajectories spanning hundreds of meters. Beyond passive generation, SWM supports diverse camera movements and text-driven scenario control, opening new possibilities for real-world grounded simulation in applications such as autonomous driving, urban planning, and scenario forecasting. Video demos and additional materials are available at https://seoul-world-model.github.io.
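
For readers unfamiliar with retrieval-augmented generation, the retrieval step mentioned in the abstract (selecting street-view images near the camera position) can be illustrated with a minimal sketch. This is not the speaker's implementation: the function names, the (id, latitude, longitude) record layout, the 50 m radius, and the choice of great-circle (haversine) distance are all assumptions made for illustration only.

```python
import math

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius in meters


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two (lat, lon) points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def retrieve_nearby(query, panoramas, k=3, max_dist_m=50.0):
    """Return up to k panorama ids within max_dist_m of query, nearest first.

    query:     (lat, lon) of the current camera position (hypothetical input).
    panoramas: iterable of (pano_id, lat, lon) records (hypothetical layout).
    """
    scored = []
    for pano_id, lat, lon in panoramas:
        d = haversine_m(query[0], query[1], lat, lon)
        if d <= max_dist_m:
            scored.append((d, pano_id))
    scored.sort()  # sort by distance, then by id for ties
    return [pano_id for _, pano_id in scored[:k]]


# Made-up coordinates near central Seoul, for illustration only.
panoramas = [
    ("a", 37.5665, 126.9780),
    ("b", 37.5666, 126.9780),
    ("c", 37.5700, 126.9780),  # several hundred meters away, outside the radius
    ("d", 37.5665, 126.9782),
]
nearby = retrieve_nearby((37.5665, 126.9780), panoramas)
print(nearby)
```

In a full system the retrieved images would then condition the video generator; that part, along with the cross-temporal pairing and Virtual Lookahead Sink components named in the abstract, is beyond what this sketch covers.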


Bio:

Jin-Hwa Kim has been the Leader of Generation Research at NAVER AI Lab since August 2021 and a Guest Associate Professor at the Artificial Intelligence Institute of Seoul National University (SNU AIIS) since August 2022. His research covers multimodal deep learning, multimodal generation, ethical and safe AI, and related topics. In 2018, he received a Ph.D. from Seoul National University under the supervision of Professor Byoung-Tak Zhang for his work on “Multimodal Deep Learning for Visually-grounded Reasoning.” In September 2017, he received the 2017 Google Ph.D. Fellowship in Machine Learning. He also received a Ph.D. Completion Scholarship from Seoul National University and was a runner-up in the VQA Challenge 2018 at the CVPR 2018 VQA Challenge and Visual Dialog Workshop. From January to May 2017, he was a research intern at Facebook AI Research (Menlo Park, CA), mentored by Yuandong Tian, Devi Parikh, and Dhruv Batra.