Hi there, nice to meet you!
My name is Donghu Kim. I am a Master's student at KAIST (advised by Jaegul Choo), studying reinforcement learning and embodied AI.
Toward building a generalist robotic agent, I am a firm believer that RL will be what generates the low-level data [1, 2]. In this direction, I am invested in pushing the absolute limit of efficiency in RL for control: Can we make RL work with only 1K samples? Can we do it within an hour? As far-fetched as the goal may seem, there are so many exciting components to tackle, including feature learning, exploration, architecture design, optimizers, and task transfer.
I still have a long, long way to go; if you want to discuss anything research-related, I'd be more than happy to engage!
Email  /  CV  /  Google Scholar  /  Github
We build a lightweight LLM agent that answers chemical toxicity questions based on the Korean Tox-Info database.
We further regularize the SimBa architecture by projecting both parameters and features onto a hypersphere, leading to better scaling properties in model size and compute.
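For a rough intuition of what hypersphere projection means in practice, here is a minimal sketch in PyTorch. This is my own paraphrase, not the paper's actual implementation; the function name, the radius parameter, and the usage pattern are illustrative assumptions. The core idea it shows: both weight vectors and feature vectors are L2-normalized so they lie on a sphere of fixed radius.

```python
import torch
import torch.nn.functional as F

def project_to_hypersphere(x: torch.Tensor, radius: float = 1.0) -> torch.Tensor:
    # L2-normalize along the last dimension, then rescale to a fixed radius.
    return radius * F.normalize(x, dim=-1)

# Illustrative usage (assumed, not taken from the paper):
layer = torch.nn.Linear(256, 256)

# Project the layer's weight rows back onto the sphere, e.g. after an optimizer step.
with torch.no_grad():
    layer.weight.copy_(project_to_hypersphere(layer.weight))

# Project intermediate features onto the sphere in the forward pass.
features = project_to_hypersphere(layer(torch.randn(32, 256)))
```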
We propose a well-regularized architecture that avoids overfitting, allowing parameters and compute to scale up in RL.
We present DoDont, a skill discovery algorithm that learns diverse behaviors, following the behaviors in "do" videos while avoiding those in "don't" videos.
We investigate which pre-training objectives are beneficial for in-distribution, near-out-of-distribution, and far-out-of-distribution generalization in visual reinforcement learning.
To allow the network to continually adapt and generalize, we introduce the Hare and Tortoise architecture, inspired by the complementary learning systems of the human brain.
Note: These slides are made for study purposes only, and likely have something wrong here and there. If you happen to find any mistakes, feel free to make fun of me via e-mail :).
Template based on Hojoon Lee's website.