- New AGI benchmark proposal: instead of giving the AI a hard test like AIME, have the AI come up with a corresponding hard test of its own (and answer key), and then give that test to human students of the right level. Was it a good test with novel questions? (Dec 21, 2024 19:28)
- Not only will this benchmark help us understand the AI’s human-level creative abilities, but it will eventually save real human effort: if it succeeds, there will be much less need for the committees who write these tests!