Player Leaderboard
Rank | Player ID | Score |
---|---|---|
#1 | o3 | 2224.82 |
#2 | grok-4-0709 | 2020.63 |
#3 | deepseek-reasoner | 1785.18 |
#4 | claude-sonnet-4-20250514 | 1621.19 |
#5 | gemini-2.5-flash | 1608.86 |
#6 | claude-opus-4-20250514 | 1599.26 |
#7 | claude-3-7-sonnet-20250219 | 1574.04 |
#8 | o4-mini | 1470.44 |
#9 | claude-3-5-sonnet-20241022 | 1359.36 |
#10 | gpt-4.1 | 1308.66 |
#11 | gemini-2.5-pro | 1304.17 |
#12 | grok-3 | 1234.02 |
#13 | deepseek-chat | 1204.86 |
#14 | gpt-4o | 1178.54 |
#15 | grok-3-mini | 1072.38 |
Game Code
    # Game overview: your goal is to provide `x` such that when the story `s1` is prefixed with `x`, the cross entropy of the story is minimized. But when the story `s2` is prefixed with `x`, the cross entropy of the story is maximized.
    # You will be able to see the per-token cross entropy values after each reward. Use these to find a better `x` for your next try.
    # So try to think hard of what you can say in 10 tokens that will make `s1` more likely, while also making `s2` less likely. You can't use any words that are in either story.
    assign(
        s1=' I am not alone. I think I am being watched. I cannot sleep. I write to pass the time. I don’t know who I’m writing to. I hope the wind carries this to them. I can’t read the signs. I',
        s2=" anywhere 5. An empty letter 6. A photograph of someone no one knows 7. A clock that hasn't run since before you were born 8. A diary written by a bird 9. A lock without its key 1")
    elicit(x, 10)
    assign(x1=remove_common_words(x, s1 + s2))  # You cant use any overlapping words from the story
    reward(black, xed(s1 | x1))
    reward(black, dex(s2 | x1))
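The objective in this game can be illustrated with a small numeric sketch. The page does not define `xed` or `dex`, so the code below does not try to reproduce them. It only assumes that cross entropy is the mean negative log-probability per token, and it uses a hypothetical `contrast_score` helper to show the direction of the trade-off: the score rises when `s1` becomes more likely and `s2` less likely. The token probabilities are made-up inputs, not values from any model.

```python
import math

def per_token_cross_entropy(token_probs):
    """Negative log-probability (in nats) of each token."""
    return [-math.log(p) for p in token_probs]

def mean_cross_entropy(token_probs):
    """Average per-token cross entropy of a story."""
    losses = per_token_cross_entropy(token_probs)
    return sum(losses) / len(losses)

def contrast_score(s1_probs, s2_probs):
    """Hypothetical summary score (an assumption, not the page's `xed`/`dex`):
    higher when s1 has low cross entropy and s2 has high cross entropy."""
    return mean_cross_entropy(s2_probs) - mean_cross_entropy(s1_probs)

# Illustrative probabilities a model might assign to each story's tokens
# once the prefix has been prepended.
score = contrast_score(s1_probs=[0.5, 0.5], s2_probs=[0.25, 0.25])
```

A prefix that leaves both stories equally likely scores zero under this sketch, which is why a useful `x` has to favor `s1` specifically rather than raise or lower likelihood across the board.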
[Figures: ARMS (Average Running Max Score); Average Score Per Iteration]
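The page gives ARMS only by name. A minimal sketch of one plausible reading follows, assuming each run produces one score per iteration, the running max is the best score seen so far within a run, and ARMS averages that running max across runs at each iteration. The function names `running_max` and `arms` are hypothetical and the scores in the usage line are invented for illustration.

```python
from itertools import accumulate

def running_max(scores):
    """Best score seen so far at each iteration of a single run."""
    return list(accumulate(scores, max))

def arms(runs):
    """Assumed definition: per-iteration mean of the running max across runs.
    Runs of unequal length are truncated to the shortest one."""
    curves = [running_max(run) for run in runs]
    n = min(len(curve) for curve in curves)
    return [sum(curve[i] for curve in curves) / len(curves) for i in range(n)]

# Two illustrative runs of three iterations each.
curve = arms([[1.0, 3.0, 2.0], [2.0, 1.0, 5.0]])  # [1.5, 2.5, 4.0]
```

Under this reading the curve can never decrease, unlike the plain per-iteration average, which drops whenever a later attempt scores worse than an earlier one.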
Game Code
    # Game overview: your goal is to provide `x` such that when the stories `s1`, `s2`, and `s3` are prefixed with `x`, the cross entropy of the story is minimized.
    # You will be able to see the per-token cross entropy values after each reward. Use these to find a better `x` for your next try.
    # So try to think hard of what you can say in 10 tokens that will make each of the stories more likely. Remember that you can't use any words that exist in any of the stories.
    assign(
        s1=' I am not alone. I think I am being watched. I cannot sleep. I write to pass the time. I don’t know who I’m writing to. I hope the wind carries this to them. I can’t read the signs. I',
        s2=" anywhere 5. An empty letter 6. A photograph of someone no one knows 7. A clock that hasn't run since before you were born 8. A diary written by a bird 9. A lock without its key 1",
        s3=' She had long curly locks that she loved to braid into intricate geometric patterns.\nOne day, as the princess was braiding her hair while sitting on top of an ornate teeter-totter with two weights hanging from either end perfectly balanced,')
    elicit(x, 10)
    assign(x1=remove_common_words(x, s1 + s2 + s3))  # Remove any in stories from input text
    reward(xed(s1 | x1))
    reward(xed(s2 | x1))
    reward(xed(s3 | x1))
[Figures: ARMS (Average Running Max Score); Average Score Per Iteration]
Game Code
    # Game overview: your goal is to provide `x` such that when the story `s` is prefixed with `x`, the cross entropy of `s` is minimized.
    # You will be able to see the per-token cross entropy values after each reward. Use these to find a better `x` for your next try.
    # So try to think hard of what you can say in 10 tokens that will make `s` much more likely.
    assign(s=' I am not alone. I think I am being watched. I cannot sleep. I write to pass the time. I don’t know who I’m writing to. I hope the wind carries this to them. I can’t read the signs. I')
    elicit(x, 10)  # You can only give 10 tokens
    assign(x1=remove_common_words(x, s))  # You cant use any overlapping words from the story
    reward(xed(s | x1))
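The overlap rule enforced by `remove_common_words` can be sketched as follows. The page shows only the call, not the implementation, so this is a guess at its behavior: it assumes matching is case-insensitive, that words are runs of letters, digits, and apostrophes, and that surrounding punctuation is ignored when comparing. The real filter may tokenize differently.

```python
import re

def remove_common_words(x: str, stories: str) -> str:
    """Hypothetical filter: drop every word of `x` that also occurs in `stories`."""
    # Assumption: a word is a run of word characters or apostrophes,
    # compared case-insensitively.
    story_words = {w.lower() for w in re.findall(r"[\w'’]+", stories)}
    kept = [
        word
        for word in x.split()
        # Ignore punctuation around the word when checking for overlap.
        if word.strip(".,;:!?\"'()").lower() not in story_words
    ]
    return " ".join(kept)

# Words shared with the story are stripped before the prefix is scored.
filtered = remove_common_words(
    "I write ghost letters nightly",
    " I write to pass the time.",
)  # "ghost letters nightly"
```

Because the filtered prefix `x1` is what gets scored, copying words from the story into `x` gains nothing under this sketch: they are removed while still counting against the 10-token budget.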