“We invented a new kind of calculator. It usually returns the correct value for the mathematics you asked it to evaluate! But sometimes it makes up wrong answers for reasons we don’t understand. So if it’s important to you that you know the actual answer, you should always use a second, better calculator to check our work.”
Then what is the point of this new calculator?
Fantastic comment, from the article.
It’s not just a calculator, though.
Image generation requires no fact checking whatsoever, and some of the tools can do it well.
That said, LLMs will always have limitations and true AI is still a ways away.
Sure it does. Let’s say IKEA wants to use Midjourney to generate images for its furniture assembly instructions. The instructions are already written, so the prompt is something like “step 3 of assembling the BorkBork kitchen table”.
Would you just auto-insert whatever it generated and send it straight to the printer for 20,000 copies?
Or would you look at the image and make sure that it didn’t show a couch instead?
If you choose the latter, that’s fact checking.
I can’t agree more strongly with this point!
It’s a nascent-stage technology that reflects the world’s words back at you in statistical order by way of parsing user-generated prompts. It’s a reactive system with no autonomy to deviate from a template upon reset. It’s no Roko’s Basilisk inherently, just because
am I understanding correctly that it’s just a fancy random word generator
Yes, but it’s, like, really fancy.
Not random, more precisely probabilistic, which is almost the same thing, granted.
It’s like letting auto complete always pick the next word in the sentence without typing anything yourself. But fancier.
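As a rough sketch of that “fancy autocomplete” idea (a toy vocabulary with made-up probabilities, nothing like the scale or mechanism of a real model):

```python
import random

# Toy "autocomplete": for each word, a made-up distribution over possible
# next words. The probabilities are purely illustrative.
next_word_probs = {
    "the": {"cat": 0.5, "dog": 0.3, "weather": 0.2},
    "cat": {"sat": 0.6, "ran": 0.4},
    "dog": {"barked": 0.7, "slept": 0.3},
}

def generate(start, steps, seed=0):
    """Let 'autocomplete' pick every next word: sample from the
    distribution conditioned on the previous word."""
    rng = random.Random(seed)
    words = [start]
    for _ in range(steps):
        dist = next_word_probs.get(words[-1])
        if dist is None:  # no known continuation, stop early
            break
        choices, weights = zip(*dist.items())
        words.append(rng.choices(choices, weights=weights)[0])
    return " ".join(words)

print(generate("the", 2))
```

Real LLMs condition on the whole preceding context rather than just the last word, but the core loop is the same: pick the next token from a probability distribution, append, repeat.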
It would be a great comment if it represented reality, but as an analogy it’s completely off.
LLM-based AI represents functionality that nothing other than the human mind and extensive research or singular expertise can replicate. There is no already existing ‘second, better calculator’ that has the same breadth of capabilities, particularly in areas involving language.
If you’re only using it as a calculator (which was never the strength of an LLM in the first place), for problems you could already solve with a calculator because you understand what is required, then uh… yeah, I mean, use a calculator; that is the appropriate tool.
Some problems lend themselves to “guess-and-check” approaches. This calculator is great at guessing, and it’s usually “close enough”.
The other calculator can check efficiently, but it can’t solve the original problem.
Essentially this is the entire motivation for numerical methods.
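A classic instance of that guess-and-check pattern is Newton’s method for square roots: refine a rough guess until the cheap check (squaring the answer) says it’s close enough. A minimal sketch:

```python
def guess_sqrt(n, guess, tol=1e-9):
    """Refine a rough guess for sqrt(n) with Newton's method:
    each step averages the guess with n / guess."""
    while abs(guess * guess - n) > tol:
        guess = (guess + n / guess) / 2
    return guess

# Finding the root takes iteration; checking it is one multiplication.
root = guess_sqrt(2.0, guess=1.0)
assert abs(root * root - 2.0) < 1e-9
```

The asymmetry is the point: verifying a candidate answer is far cheaper than producing one from scratch.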
In my personal experience, that’s how I generally manage to shortcut a lot of labour-intensive intellectual tasks: use intuition to guess possible answers/results, then work backwards from them to determine which one is right, and even prove it. This is generally faster because it’s usually quicker to show that a result is correct than to arrive at it (and if it’s not correct, you just do it the old-fashioned way). How often this works out depends on how good one’s intuition is in a given field, which in turn correlates with experience in it.
That said, it’s far from guaranteed to be faster, and for problems with more than one solution it might yield working but sub-optimal ones.
Further, the intuition step alone does not yield a result that can be trusted without validation.
Maybe, by playing the role that intuition plays in this process, LLMs can help accelerate the search for results in subjects where one lacks the experience to have good intuition, but has enough experience (or the domain offers inherent ways or tools) to do the “validation of possible results” part.
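That “guess candidates, validate cheaply” workflow can be sketched with a toy example. Here the hypothetical candidates stand in for intuition’s (or an LLM’s) guesses, and factoring plays the hard-to-find, easy-to-check role:

```python
def validate_factorization(n, p, q):
    """Checking a proposed factorization is a single multiplication,
    even though *finding* p and q can be very hard for large n."""
    return p * q == n and 1 < p < n and 1 < q < n

# Candidate answers (as if guessed), filtered by the cheap validation step.
candidates = [(13, 23), (17, 19), (11, 29)]
good = [(p, q) for p, q in candidates if validate_factorization(323, p, q)]
print(good)  # 17 * 19 == 323
```

The guesser can be unreliable, because the validator, not the guesser, is what makes the final result trustworthy.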