Can AI writing detectors be trusted? We dig into the theory behind them.

Clearly the Founding Fathers were not advanced enough to have crafted the US Constitution unaided. It’s only reasonable to imagine that ancient aliens could have landed, given them an AI to assist them, and then departed with nobody the wiser.

I am certain we can find evidence of this if we dig hard enough.

@busturn@lemmy.world

I’ve recently checked my years-old essay using one of these AI plagiarism detectors and it said that the essay was 90% AI written. So either it’s all bs or I’m a time travelling AI.

Flying Squid

CAUGHT!

They only know what they have been fed.

What more likely first/base feeding than the US Constitution’s declarations and its amendments?

@kikuchiyo@lemmy.ml

I’m waiting for new conspiracy theories after that article hahah.

@paddirn@lemmy.world

Obviously the US Constitution was written by AI, we’re living in a simulation. Wake up sheeple, the Matrix is real!


@dethb0y@lemmy.world

Because AI detectors suck and are the modern day equivalent of dowsing rods?

@jocanib@lemmy.world
creator

They’re circular: if the text is too predictable, it was supposedly written by an LLM*, but LLMs are designed to regurgitate the word humans most commonly use next in any given context.

*“AI” is a complete misnomer for this hi-tech Magic 8-Ball
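Detectors in this family typically flag text whose perplexity (average surprise per word under a language model) is low. A toy unigram sketch, with an invented corpus and crude smoothing, shows the circularity: text built from the model’s most common words scores exactly the “too predictable” signal an LLM is trained to produce.

```python
import math
from collections import Counter

def perplexity(text, probs, vocab_size):
    """Exponentiated average negative log-probability per word.
    Unseen words get a small smoothed probability of 1/vocab_size."""
    words = text.lower().split()
    log_p = sum(math.log(probs.get(w, 1.0 / vocab_size)) for w in words)
    return math.exp(-log_p / len(words))

# "Train" unigram probabilities on a tiny made-up reference corpus.
corpus = "the cat sat on the mat the dog sat on the rug".split()
counts = Counter(corpus)
probs = {w: c / len(corpus) for w, c in counts.items()}

# Common-word text scores LOW perplexity (looks "AI-written" to such a
# detector); rare-word text scores high, even if a human wrote both.
low = perplexity("the cat sat on the mat", probs, 10_000)
high = perplexity("zebras juggle quantum flutes", probs, 10_000)
```

Real detectors use a neural language model instead of unigram counts, but the scoring idea (and the circularity objection) is the same.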

@Zeth0s@lemmy.world

Picking the single most commonly used next word would result in a loop of common words. LLMs do not work like that.
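The looping claim is easy to demonstrate with a toy bigram model that always takes the single most frequent next word (greedy decoding); the corpus here is invented for illustration.

```python
from collections import Counter, defaultdict

# Toy bigram "model": count which word most often follows each word.
corpus = "the cat sat on the mat and the dog sat on the rug".split()
follows = defaultdict(Counter)
for a, b in zip(corpus, corpus[1:]):
    follows[a][b] += 1

def greedy_generate(start, steps=8):
    """Always pick the most frequent successor of the last word."""
    out = [start]
    for _ in range(steps):
        nxt = follows[out[-1]].most_common(1)
        if not nxt:
            break
        out.append(nxt[0][0])
    return out

# Deterministic greedy decoding quickly falls into a repeating cycle
# of common words -- which is one reason real LLMs sample instead.
gen = greedy_generate("the")
print(" ".join(gen))
```

With sampling (see the temperature discussion below in the thread), the same model would break out of the cycle at the cost of some randomness.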

@jocanib@lemmy.world
creator

In context. And that is exactly how they work. It’s just a statistical prediction model with billions of parameters.

@Zeth0s@lemmy.world

regurgitate the next word most commonly used by humans in any given context.

is not what it does. That would create nonsensical text (you can try it yourself).

Here is a summary of the method, as generated by GPT-4:


Sure, here is a detailed description of how text is generated with ChatGPT, which is based on the GPT architecture:

  1. Initial Prompt: The process begins with an input prompt. This could be something like “Tell me about the weather today” or any other string of text.
  2. Tokenization: The input text is broken down into smaller parts, called tokens, which can represent words, parts of words, or punctuation. GPT uses a byte pair encoding (BPE) tokenization, which essentially breaks down text into commonly occurring chunks.
  3. Embedding: Each token is then turned into a vector via an embedding. This vector captures semantic information about the token and serves as the input for the model.
  4. Processing the Input: The GPT model processes the input vectors sequentially with a stack of transformer layers. Each layer applies self-attention and feeds its output into the next layer.
  5. Self-Attention Mechanism: The self-attention mechanism in the Transformer model allows it to weigh the importance of different words when predicting the next word. For example, when trying to predict the last word in the sentence “The cat sat on the ____,” the words “cat” and “on” are likely to have more influence on the prediction than “The”. This weighing is learned during training and allows the model to generate more coherent and contextually appropriate responses.
  6. Output Layer: The output from the final transformer layer for the last input token goes through a linear layer followed by a softmax function, which turns it into a probability distribution over the possible next tokens in the vocabulary. Each possible next token is assigned a probability.
  7. Sampling with Temperature: The next token is chosen based on these probabilities. One common method is to sample from this distribution, which introduces some randomness into the process. The temperature parameter controls the amount of randomness: a higher temperature makes the distribution more uniform and the output more random, while a lower temperature makes the model more likely to choose the highest-probability token.
  8. Decoding: The chosen token is then decoded back into text and appended to the output.
  9. Next Iteration: The process then repeats for the next token: the model takes the output so far (including the newly-generated token), processes it, and generates probabilities for the next token. This continues until a maximum length is reached, or an end-of-sequence token is produced.
  10. Post-Processing: Any necessary post-processing is applied, such as cleaning up tokenization artifacts.

In this way, the model generates a sequence of tokens, one at a time, based on the input prompt and the tokens it has generated so far. Please note that while this process typically uses sampling with a temperature parameter, other methods like beam search or top-k sampling can also be used to choose the next token. These methods have different trade-offs in terms of computational efficiency, diversity, and quality of output.
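The softmax-plus-temperature-sampling steps above can be sketched in a few lines; the vocabulary and logit values below are invented for illustration (a real model produces logits over tens of thousands of tokens).

```python
import math
import random

def softmax_with_temperature(logits, temperature=1.0):
    """Convert raw scores into a probability distribution.
    Higher temperature flattens it; lower temperature sharpens it."""
    scaled = [l / temperature for l in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def sample_next_token(vocab, logits, temperature=1.0, rng=random):
    """Draw one token according to the tempered distribution."""
    probs = softmax_with_temperature(logits, temperature)
    return rng.choices(vocab, weights=probs, k=1)[0]

# Hypothetical scores for the blank in "The cat sat on the ____".
vocab = ["mat", "roof", "moon", "equation"]
logits = [4.0, 2.5, 1.0, -2.0]

cold = softmax_with_temperature(logits, temperature=0.2)  # near one-hot
hot = softmax_with_temperature(logits, temperature=5.0)   # near uniform
tok = sample_next_token(vocab, logits, temperature=0.7, rng=random.Random(0))
```

At low temperature this collapses toward greedy decoding; at high temperature it approaches uniform sampling, which is the knob the quoted step 7 describes.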


You are missing the key part where the text is transformed into a vector space of “concepts”, where semantic relationships are represented; that is where the inference happens. The inference is not over words to get the next most commonly used word, otherwise it wouldn’t work. And you also missed the final sampling step that introduces randomness into the word selection.

I don’t understand why you are so upset about a chain of complex mathematical functions that completes an input sentence. Why are you angry?

@jocanib@lemmy.world
creator

You’re agreeing with me but using more words.

I’m more annoyed than upset. This technology is eating resources which are badly needed elsewhere and all we get in return is absolute junk which will infest the literature for decades to come.

@Zeth0s@lemmy.world

I am not agreeing with you, because “regurgitate the next most commonly used word” is not what it does.

That said, the technology is not doing anything wrong; the people using it are. The technology is a great achievement of humankind, possibly one of the greatest. If people decide to use it to print sh*t, that is people’s fault. Quantum mechanics is one of the greatest achievements of humankind; if people decided to use it to kill people, that is the fault of people. Many humans are simply shitty. Don’t blame a clever mathematical function and its clever implementation.

@dan1101@lemmy.world

As expected, they can’t be trusted. And the more AI evolves, the less likely AI content will be detectable IMO.

@jocanib@lemmy.world
creator

It will almost always be detectable if you just read what is written. Especially for academic work. It doesn’t know what a citation is, only what one looks like and where they appear. It can’t summarise a paper accurately. It’s easy to force laughably bad output by just asking the right sort of question.

The simplest approach for setting homework is to give them the LLM output and get them to check it for errors and omissions. LLMs can’t critique their own work and students probably learn more from chasing down errors than filling a blank sheet of paper for the sake of it.

@Zeth0s@lemmy.world

This is not entirely correct, in my experience. With the current version of GPT-4 you might be right, but the initial versions were extremely good. Clearly you have to work with it; you cannot ask it for the whole work.

@jocanib@lemmy.world
creator

That’s not true! There are heaps of early-GPT articles pointing out how much bullshit it regurgitates (e.g. Why does ChatGPT constantly lie?). And no evidence at all that the breathless fanboys have even stopped to check.

@Zeth0s@lemmy.world

I meant the initial versions of ChatGPT 4. ChatGPT isn’t lying, simply because lying implies malevolent intent. GPT-4 has no intent; it just provides an output given an input, and that output can be either wrong or correct. A model able to provide more correct answers is a more accurate model. Computing accuracy for an LLM is not trivial, but GPT-4 is still a good model. The user has to know how to use it, what to expect, and how to evaluate the result. If they are unable to do so, it’s completely their fault.

Why are you so pissed off about a good NLP model?

@Asifall@lemmy.world

I think there’s a big difference between being able to identify an AI by talking to it and being able to identify something written by an AI, especially if a human has looked over it for obvious errors.
