Measuring (machine) intelligence by MCQ?

I’m idly wondering: would Ms Google pass most multiple-choice question (MCQ) tests constructed by academics? If so, should we believe that she is intelligent? Cade Metz, in an article in Wired, gives a partial answer:
“… Clinicians were helping IBM train Watson for use in medical research. But as metaphors go, it wasn’t a very good one. Three years later, our artificially intelligent machines can’t even pass an eighth-grade science test, much less go to medical school.  So says Oren Etzioni, a professor of computer science at the University of Washington and the executive director of the Allen Institute for Artificial Intelligence, the AI think-tank funded by Microsoft co-founder Paul Allen. Etzioni and the non-for-profit Allen Institute recently ran a contest, inviting nearly 800 teams of researchers to build AI systems that could take an eighth grade science test, and today, the Institute released the results: The top performers successfully answered about 60 percent of the questions. In other words, they flunked…”
   Apparently, somewhere in the world, folks have formed the belief that 60% of the possible marks is a fail, no matter how the testing instrument is constructed. However, if this test of machine intelligence were run here, we’d be required – by University policy – to award a C+ pass!
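   To see why the construction of the instrument matters, here is a minimal sketch of the standard correction for guessing, which rescales a raw score so that blind guessing maps to 0 and a perfect paper to 1. The function name and the option counts are my own illustrative assumptions; the Wired excerpt does not say how many choices each question offered.

```python
def chance_corrected(raw_fraction, options_per_question):
    """Rescale a raw MCQ score so blind guessing maps to 0 and a perfect paper to 1.

    Hypothetical helper for illustration; assumes every question has the
    same number of equally likely options.
    """
    chance = 1.0 / options_per_question
    return (raw_fraction - chance) / (1.0 - chance)


# The same raw 60% under different (assumed) numbers of options per question.
for k in (2, 4, 5):
    print(k, round(chance_corrected(0.60, k), 3))
```

   On a true/false paper a raw 60% is barely better than coin-tossing, while on a five-option paper it is worth considerably more, so a fixed pass mark means little without knowing the format.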
   Metz quotes Doug Lenat: “… If you’re talking about passing multiple choice science tests, I always felt that was not actually the test AI should be aiming to pass,” he says. “The focus on natural language understanding – science tests, and so on – is something that should follow from a program being actually intelligent. Otherwise, you end up hitting the target but producing the veneer of understanding.” What a pleasant surprise: I agree with Doug about something!
   It’s an intriguing question to ask of any University: is it certifying only “the veneer of understanding” in its graduates, or do they have some “deep understanding”? More importantly, how might we reliably measure the depth of understanding in a MOOC, or in any semi-automated teaching environment that employs only MCQs, keyword matches, and machine-intelligent testing procedures?
[This post was adapted from an email by my colleague Clark Thomborson.]

from The Universal Machine http://universal-machine.blogspot.com/2016/02/measuring-machine-intelligence-by-mcq.html
