|
Dr Abeba Birhane@abeba.bsky.social |
despite all the hype, this recent work shows that LLMs are not good at abstract reasoning arxiv.org/abs/2406.11012
1 replies 13 reposts 34 likes
|
Dodecahedron
@dodechedrononon.bsky.social
"Our results show that even the best-performing LLM, GPT-4o, which has otherwise shown impressive reasoning abilities on a wide variety of benchmarks, can only fully solve 8% of the games." ...impressive reasoning on a wide variety of benchmarks...
0 replies 0 reposts 0 likes