The research from Purdue University, first spotted by news outlet Futurism, was presented earlier this month at the Computer-Human Interaction Conference in Hawaii and looked at 517 programming questions on Stack Overflow that were then fed to ChatGPT.
“Our analysis shows that 52% of ChatGPT answers contain incorrect information and 77% are verbose,” the new study explained. “Nonetheless, our user study participants still preferred ChatGPT answers 35% of the time due to their comprehensiveness and well-articulated language style.”
Disturbingly, programmers in the study didn’t always catch the mistakes being produced by the AI chatbot.
“However, they also overlooked the misinformation in the ChatGPT answers 39% of the time,” according to the study. “This implies the need to counter misinformation in ChatGPT answers to programming questions and raise awareness of the risks associated with seemingly correct answers.”
Billions and billions invested to produce accuracy slightly less than flipping a coin.
Yes, there are mistakes, but if you point it in the right direction, it can give you correct answers.
It can, but it also sometimes can’t unless you ask it, “could the answer be x?”
In my experience, if you have the skills needed to point it in the right direction, you didn’t need to use it in the first place.
Yesterday, I wrote all of this working JavaScript code: https://github.com/igorlogius/gather-from-tabs/discussions/8 And I don’t know a lick of JavaScript. I know other languages, but that barely mattered. I just gave it plain-language instructions and reported the errors until it worked.
It’s just a convenience, not a magic wand. Sure, relying on AI blindly and exclusively is a horrible idea (one that lots of people peddle and quite a few suckers buy), but there’s room for supervised, careful use of AI — the same way we started using Google instead of man pages and (grudgingly, for the older of us) tolerated the addition of syntax highlighting and even some code completion to all but the most basic text editors.
Yeah it’s wrong a lot but as a developer, damn it’s useful. I use Gemini for asking questions and Copilot in my IDE personally, and it’s really good at doing mundane text editing bullshit quickly and writing boilerplate, which is a massive time saver. Gemini has at least pointed me in the right direction with quite obscure issues or helped pinpoint the cause of hidden bugs many times. I treat it like an intelligent rubber duck rather than expecting it to just solve everything for me outright.
Same here. It’s good for writing your basic unit tests, and the explain feature is useful for getting your head wrapped around complex syntax, especially as bad as searching for useful documentation has gotten on Google and DDG.
Not a programmer by any means (haven’t done any since college) but I’ve asked it for help in writing Jira queries or Excel mess and it’s been pretty solid with that stuff.
Sounds low
Yes, and even if it was only right 1% of the time it would still be amazing
Also, hallucinations are not a universally bad thing.
Just like answers on the Internet, you have to read the output and not just paste it blindly. I find the answers are usually useful, even if they aren’t completely accurate. Figuring out the last bit is why we are paid as programmers.
“Major new Technology still in Infancy Needs Improvements”
– headline every fucking day
“Will this technology save us from ourselves, or are we just jerking off?”
Better than Jerry in the next cubicle over.
Who would have thought that an artificial intelligence trained on human intelligence would be just as dumb?
Hm. This is what I got.
I think about 90% of the screenshots we see of LLMs failing hilariously are doctored. Lemmy users really want to believe it’s that bad, though.
Edit:
Yesterday, someone posted a doctored one on here, showing that everyone eats it up even when the poorly doctored photo uses a ridiculous font. People who want to believe are quite easy to fool.
My experience with an AI coding tool today:
Me: Can you optimize this method?
AI: Okay, here’s an optimized method.
Me, seeing the AI completely removed a critical conditional check: Hey, you completely removed this check with variable xyz.
AI: Oops, you’re right, here you go, I fixed it.
It did this 3 times on 3 different optimization requests. It was 0 for 3, although there were some good suggestions once you got past the blatant first error.
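For anyone who hasn’t hit this failure mode: a toy sketch of what “optimizing away” a critical conditional looks like (the `average` function here is invented for illustration, not the actual code from that session):

```javascript
// Original method: averages an array, with a critical guard for empty input.
function average(values) {
  if (values.length === 0) return 0; // the critical conditional check
  return values.reduce((sum, v) => sum + v, 0) / values.length;
}

// The kind of "optimized" rewrite described above: shorter, but the
// empty-array guard is gone, so the behavior silently changes.
function averageOptimized(values) {
  return values.reduce((sum, v) => sum + v, 0) / values.length;
}

console.log(average([]));          // 0
console.log(averageOptimized([])); // NaN (0 / 0) -- silent behavior change
```

The diff looks like a harmless cleanup, which is exactly why it’s easy to overlook in review.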
My favorite is when I ask for something and it gets stuck in a loop, pasting the same comment over and over
I always thought of it as a tool to write boilerplate faster, so no surprises for me
We have to wait a bit before we get a truly useful assistant (though maybe something like Copilot or other more code-focused AIs are better).
People downvote me when I point this out in response to “AI will take our jobs” doomerism.
Well, I do it 99% of the time.
If you ask the wrong questions you get the wrong results. If you don’t check the response for accuracy, you get invalid answers.
It’s just a tool. Don’t use it wrong because you’re lazy.
Lemmy is trying really, really hard to convince you that coding is going to be a viable career in 5 years.
Lemmy is trying real hard to convince you that AI is going to do everyone’s job in 5 years—including yours