OSWorld: Benchmarking Multimodal Agents for Open-Ended Tasks in Real Computer Environments

OpenAgents: An Open Platform for Language Agents in the Wild

Text2Reward: Automated Dense Reward Function Generation for Reinforcement Learning

Binding Language Models in Symbolic Languages

In-Context Learning for Few-Shot Dialogue State Tracking

UnifiedSKG: Unifying and Multi-Tasking Structured Knowledge Grounding with Text-to-Text Language Models

NL-Augmenter: A Framework for Task-Sensitive Natural Language Augmentation

Don't be Contradicted with Anything! CI-ToD: Towards Benchmarking Consistency for Task-oriented Dialogue System

GL-GIN: Fast and Accurate Non-Autoregressive Model for Joint Multiple Intent Detection and Slot Filling

A Survey on Spoken Language Understanding: Recent Advances and New Frontiers