Mingen Li

Hierarchical DLO Routing with Reinforcement Learning and In-Context Vision-language Models

Mingen Li, Houjian Yu, Yixuan Huang, Youngjin Hong, Changhyun Choi^†

ICRA (Accepted), 2026; Best Paper Award Finalist on Robot Learning

[Project Page] [Paper] [Video]

We present a fully autonomous hierarchical framework for cable routing that integrates vision–language models (VLMs) for high-level reasoning with reinforcement learning (RL) for low-level control. The system interprets language goals, generates multi-step plans, and recovers from failures, achieving a 92.5% success rate across diverse long-horizon scenarios.

LACY: A Vision-Language Model-based Language-Action Cycle for Self-Improving Robotic Manipulation

Youngjin Hong^*, Houjian Yu^*, Mingen Li, Changhyun Choi

^*Equal contribution

ICRA (Accepted), 2026

[Project Page] [Paper] [Video]

To bridge the gap between robotic action and language understanding, we introduce a unified vision-language framework that learns both language-to-action (L2A) and action-to-language (A2L) mappings within a vision-language model. By jointly training on action generation, explanation, and consistency verification (L2C), LACY can generate new data without human annotation.

Routing Manipulation of Deformable Linear Object Using Reinforcement Learning and Diffusion Policy

Mingen Li, Houjian Yu, Changhyun Choi

ICRA, 2025

[Project Page] [Paper] [Video]

To address the high flexibility of ropes and their frequent contact with uncertain environments, we propose a reinforcement learning and diffusion policy–based framework for robust and delicate deformable linear object (DLO) routing. The framework enables gentle, contact-aware manipulation that prevents rope damage while remaining reliable in rough environments.

A Parameter-Efficient Tuning Framework for Language-Guided Object Grounding and Robot Grasping

Houjian Yu, Mingen Li, Alireza Rezazadeh, Yang Yang, Changhyun Choi

^*Equal contribution

ICRA, 2025

[Project Page] [Paper] [Video]

We propose a CLIP-based parameter-efficient tuning (PET) framework that enables lightweight, adaptable multimodal learning for referring expression segmentation and referring grasping tasks. Our approach introduces a bi-directional vision-language adapter for pixel-level grounding and a depth fusion branch to integrate geometric cues. The model achieves state-of-the-art grounding accuracy while demonstrating strong generalization to spatial reasoning and multi-object scenarios with minimal computation.

Learning for Deformable Linear Object Insertion Leveraging Flexibility Estimation from Visual Cues

Mingen Li, Changhyun Choi

ICRA, 2024

[Project Page] [Paper] [Video]

Manipulating deformable linear objects such as wires, rubber, and ropes is essential for robotic automation but difficult due to their diverse material properties. We propose a two-stage framework that first estimates material flexibility from visual cues, then uses reinforcement learning to perform insertion tasks conditioned on this estimation. The flexibility estimation module learns material characteristics in simulation and generalizes to real-world interaction. Our approach achieves 85.6% success in simulation and 66.7% on real robots, demonstrating strong adaptability across different DLO types.

Robotic Manipulation of Deformable Rope-Like Objects Using Differentiable Compliant Position-Based Dynamics

Fei Liu^*, Entong Su^*, Jingpei Lu, Mingen Li, Michael Yip

^*Equal contribution

RA-L, 2023

[Paper]

Modeling and controlling rope-like deformable objects is vital for tasks such as autonomous suturing but remains challenging due to complex physics and real-to-sim discrepancies. We introduce a differentiable compliant position-based dynamics (XPBD) framework that accurately models rope behavior through geometric constraints capturing stretch, shear, bend, and twist effects. The differentiable formulation enables parameter estimation and real-to-sim adaptation, making it well suited for optimization and learning. Experiments with Baxter and the da Vinci Research Kit (DVRK) validate its robustness and accuracy across diverse rope materials.

Parameter Identification and Motion Control for Articulated Rigid Body Robots Using Differentiable Position-based Dynamics

Fei Liu^*, Mingen Li^*, Jingpei Lu, Entong Su, Michael Yip

^*Equal contribution

[Paper]

Accurate and differentiable simulation modeling is essential for robot control, design, and learning, yet most existing simulators sacrifice speed, stability, or differentiability. We introduce a differentiable position-based dynamics (PBD) framework that unifies articulated robot modeling, optimal design, and model-based control within a single simulation pipeline. The framework provides native gradients through automatic differentiation over positional and angular constraints, enabling efficient optimization across robot parameters and motion. We validate its capability through optimal robot design, torque and stiffness estimation, and real-world impedance control, showing strong accuracy and real-to-sim consistency.
