[Paper Introduction] GAIA: A Benchmark for General AI Assistants. GAIA - Measure whether your Agent really counts as a General Assistant! Jun 27 · 7 min read · Paper Introduction
[Paper Introduction] ChatEval: Towards Better LLM-Based Evaluators Through Multi-Agent Debate. What is an LLM Agent? How do Agents debate with one another to complete a task? Jun 23 · 15 min read · Paper Introduction