The latest LLMs get a perfect score on the South Korean SAT and can pass the bar. More than pure marketing, if you ask me. That doesn't mean 90% of the businesses that claim AI aren't just marketing, or pretty much just a front end for GPT APIs. LLMs like Claude even check their work for hallucinations. Even if we limited all AI to LLMs, they would still be groundbreaking.
The Korean SAT is highly standardized, in multiple-choice form, and there is an immense library of past exams that both test takers and examiners use. I would be more impressed if the LLMs could also show the step-by-step working for each problem…
Claude 3.5 and o1 might be able to do that; if not, they are close to it. Still better than 99.99% of humans on Earth.