GUI Agents with Foundation Models: Data Resource, Framework and Application
Tutorial @ IJCAI 2025

Summary

The rapid advancement of foundation models, such as large vision-language models (VLMs), has paved the way for intelligent agents that can autonomously interact with Graphical User Interfaces (GUIs). This tutorial provides a comprehensive overview of the latest innovations in GUI agents and of influential work across data resources, frameworks, and applications.

Schedule

Tutorial Organizers

 

Shuai Wang

Technological Expert

Huawei Noah's Ark Lab

 

Kaiwen Zhou

Technological Expert

Huawei Noah's Ark Lab

 

Rui Shao

Professor

Harbin Institute of Technology (Shenzhen)

 
 

Gongwei Chen

PostDoc

Harbin Institute of Technology (Shenzhen)

 

Yuqi Zhou

Ph.D. Candidate

Renmin University of China