Mastodawn

Mike Amundsen Sep 12, 2023

Google DeepMind and the University of Tokyo Researchers Introduce WebAgent: An LLM-Driven Agent that can Complete the Tasks on Real Websites Following Natural Language Instructions

https://www.marktechpost.com/2023/07/29/google-deepmind-and-the-university-of-tokyo-researchers-introduce-webagent-an-llm-driven-agent-that-can-complete-the-tasks-on-real-websites-following-natural-language-instructions/

"Thorough research shows that linking task planning with HTML summary in specialized language models is crucial for task performance, increasing the success rate on real-world online navigation by over 50%." -- #AneeshTickoo

For more see: https://arxiv.org/abs/2307.12856

#gen_ai #api360 #webAgent


Large language models (LLMs) can solve a range of natural language tasks, including arithmetic, commonsense and logical reasoning, question answering, text generation, and even interactive decision-making. Drawing on HTML comprehension and multi-step reasoning, LLMs have recently shown strong results in autonomous web navigation, where agents control computers or browse the internet to carry out natural language instructions through a sequence of computer actions. The absence of a predefined action space, HTML observations that are much longer than those in simulators, and the lack of HTML domain knowledge in LLMs have all hurt web navigation on real-world

MarkTechPost