Xent Leaderboard

The Xent benchmark uses games to evaluate the general capabilities of LLM agents. In each Xent game, LLM agents play a text-based game in a fully LLM-defined environment. The current benchmark set contains three games: “Condense”, “Contrast”, and “Synthesize”. Agents play each game across a series of “maps”, each with different environmental data.

Summary

Player Leaderboard

Detailed Analysis
