Kenneth Stanley: New AGI benchmark proposal: instead of giving the AI a hard test like AIME, have the AI come up with a corresponding hard test of its own (and answer key), and then give that test to human students of the right level. Was it a good test with novel questions?

Kenneth Stanley kennethstanley.bsky.social
New AGI benchmark proposal: instead of giving the AI a hard test like AIME, have the AI come up with a corresponding hard test of its own (and answer key), and then give that test to human students of the right level. Was it a good test with novel questions?
Dec 21, 2024 19:28

Kenneth Stanley kennethstanley.bsky.social · Dec 21, 2024
Not only will this benchmark help us understand the AI’s human-level creative abilities, but it will eventually save real human effort - if it succeeds then much less need for the committees who write these tests!
