June 20

Brief: 7 tools to detect AI-generated text

This post inaugurates a new type of content: brief posts intended to convey specific information in as little time as possible. For this brief, here are tools you can use to check whether something you are reading was written by a human or generated by AI. But do they work?

  • AI Content Detector. Using algorithms similar to those transformers use to generate text, this tool analyses recognisable probabilistic AI patterns, as well as the element of perplexity that is a tell-tale sign of spontaneous, unpredictable human writing. It boasts a success rate of nearly 98%.
  • GPTZero. This is a classification tool trained on a large corpus of human-written text and AI-generated text to learn to differentiate the two in English prose up to 5000 characters. It provides predictions on a sentence, paragraph, and document level, so you can zero in on specific segments of a text if the document-level analysis shows a mix of human and AI.
  • Content at scale AI Detector. This is a similar tool, in that it was trained over billions of pages to “accurately forecast the most probable word choices that lead to a higher AI detection probability”. It claims to work at a deeper level than most competitors. It requires 25 words at least to perform an analysis.
  • Crossplag AI Content Detector. Crossplag explains that this detector uses a combination of Machine Learning algorithms and natural language processing techniques. It provides a rating on a Human – AI scale, for texts up to 3000 words.
  • Compilatio AI Detector. This works in English, French, Italian or Spanish, and only allows small chunks of text (fewer than 2000 characters).
  • Writer AI Content Detector. Interestingly, this tool was designed to help writers who use AI to create texts evaluate how noticeable that is. Since search engines are likely to demote AI-generated copy, it pays to edit the text to fool Google and friends (good luck with that). But you can also use it to test paragraphs of up to 1500 characters that you didn’t create.
  • ZeroGPT. This is not the same as GPTZero. Based on in-house deep learning algorithms and research, this again analyses a text to gauge the percentage of AI-generated portions in it.
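
Several of the descriptions above mention perplexity, which is roughly a measure of how surprised a language model is by a text. The sketch below illustrates the idea with a toy unigram model and add-one smoothing. It is only an illustration: the function name and the example texts are made up for this post, and the real detectors presumably rely on much larger language models whose internals are not described here.

```python
import math
from collections import Counter


def unigram_perplexity(text: str, reference: str) -> float:
    """Perplexity of `text` under a unigram model fitted on `reference`.

    Toy illustration only (hypothetical helper, not any detector's method).
    Add-one smoothing keeps unseen words from getting zero probability.
    """
    ref_tokens = reference.lower().split()
    tokens = text.lower().split()
    if not tokens:
        raise ValueError("text must contain at least one word")

    counts = Counter(ref_tokens)
    vocab_size = len(set(ref_tokens) | set(tokens))
    total = len(ref_tokens)

    # Sum the log-probability of each word, then average and exponentiate.
    log_prob = 0.0
    for tok in tokens:
        p = (counts[tok] + 1) / (total + vocab_size)
        log_prob += math.log(p)
    return math.exp(-log_prob / len(tokens))


reference = "the cat sat on the mat the cat sat on the mat"
predictable = "the cat sat on the mat"
surprising = "quantum zebras juggle violet equations"

# The predictable sentence scores lower (less surprising) than the odd one.
print(unigram_perplexity(predictable, reference))
print(unigram_perplexity(surprising, reference))
```

In this picture, text that the model finds easy to predict gets a low perplexity, while unexpected word choices push the value up, which is the signal the descriptions above associate with human writing.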
 

How many AI detectors do you need?

In most situations, none. If the text you are reading is entertaining and informative, who cares how it was created? But when you need to assess someone’s work or the trustworthiness of a source (remember, AI-generated text is not yet always truthful), the above will come in handy.

There are two reasons for providing a large list.

First of all, none of these tools is 100% reliable. By combining the results of multiple detectors that use different techniques, you can get a better idea of what you are really working with. AI detectors have been wrongly used in the past to punish students, which is sadly ironic. Combining results might save you that despair.

Secondly, AI technology evolves constantly, and there is no way of knowing which of those tools has caught up to the latest version of the AI generators on the market. With this list, you have a choice.
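
One possible way to combine the results of several detectors is sketched below. It assumes each tool's verdict has already been converted by hand to an "AI probability" between 0 and 1, which is itself a judgment call since the tools report in different formats. The function name, the 0.25 disagreement threshold, the 0.5 voting cut-off and the example scores are all illustrative assumptions, not values taken from any of the tools.

```python
from statistics import mean, pstdev


def combine_detector_scores(ai_scores: dict[str, float],
                            disagreement_threshold: float = 0.25) -> dict:
    """Summarise AI-probability scores (0.0 to 1.0) from several detectors.

    Hypothetical helper: the threshold and the 0.5 cut-off are arbitrary
    illustrative choices, not recommendations.
    """
    if not ai_scores:
        raise ValueError("at least one detector score is required")
    for name, score in ai_scores.items():
        if not 0.0 <= score <= 1.0:
            raise ValueError(f"score for {name!r} must be between 0 and 1")

    values = list(ai_scores.values())
    spread = pstdev(values)  # population standard deviation of the scores
    return {
        "mean_ai_score": mean(values),
        "spread": spread,
        # A large spread means the detectors contradict each other.
        "detectors_agree": spread <= disagreement_threshold,
        "votes_for_ai": sum(v >= 0.5 for v in values),
    }


# Made-up scores for two imaginary detectors that contradict each other.
summary = combine_detector_scores({"detector_a": 0.1, "detector_b": 0.9})
print(summary)
```

When the detectors disagree this strongly, the average on its own says very little, so the spread is worth reading alongside it before drawing any conclusion.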

 

Does it work?

I would have no problem using AI to help me create informative text. But on this page, the only words that aren’t mine are short snippets copied from the respective AI detector pages to provide accurate descriptions. I submitted the text above “Does it work?” to each of these tools. Here are the results:

  • AI Content Detector. “Human score 100%”. Great result.
  • GPTZero. “Your text is likely to be written entirely by a human”. Good job.
  • Content at scale AI Detector. “89% Highly likely to be Human!” Phew.
  • Crossplag AI Content Detector. “0% AI”. Great result.
  • Compilatio AI Detector. “Human 99% reliability”. Considering some of the snippets I copied may have been AI generated, this feels like a perfect result.
  • Writer AI Content Detector. “100% human-generated content. Fantastic”. Indeed.
  • ZeroGPT. “16.86% AI GPT” This seems to be the least reliable performer on this specific test.

Next, I asked ChatGPT to summarise the contents of the text above “Does it work?” and submitted that summary to the various tools. Here are the results:

  • AI Content Detector. “100% Human” Uh oh.
  • GPTZero. “Your text is likely to be written entirely by a human”. Uh oh.
  • Content at scale AI Detector. “97% Highly likely to be human”. Uh oh.
  • Crossplag AI Content Detector. “0% AI” Uh oh.
  • Compilatio AI Detector. “Human, 99% reliability”. Uh oh.
  • Writer AI Content Detector. “100% Human-generated content. Fantastic” Not fantastic at all.
  • ZeroGPT. “0% AI GPT”. So the AI-generated summary feels more human than my original text. That’s not very kind 😉

Oh dear …

The ChatGPT essay used in the test below
 

So, finally (this is supposed to be a brief format, after all), I asked ChatGPT to write a short essay on how Covid was handled throughout the world, then submitted that to the scrutiny of the detectors. Here are the results.

  • AI Content Detector. “HUMAN SCORE: 90 %” No no no …
  • GPTZero. “Your text may include parts written by AI”. Better. GPTZero also highlighted what felt AI-generated.
GPTZero’s results.
  • Content at scale AI Detector. 79% Likely both human and AI-generated content.
Content at scale AI Detector response.
  • Crossplag AI Content Detector. “10% AI. This text is mainly written by a human.” Ouch.
  • Compilatio AI Detector. “Human 99% reliability”. Ouch.
  • Writer AI Content Detector. “98% human-generated content. Fantastic” Ouch.
  • ZeroGPT. “35.74% AI GPT”. One of the best, still not very good.

Well, that did not pan out as I anticipated. As of the date of publication (June 20th, 2023), I would not feel confident using any of those tools. While they do seem able to identify human-written text, they are largely incapable of spotting something that is AI-generated. Maybe ChatGPT has recently gotten so good that these tools need updating. I wouldn’t criticise any of them, as the task they are tackling is both complex and shifting. Maybe they would have performed beautifully on other examples. But I would always advise caution, and testing, before using any of the above for something critical or that puts your responsibility on the line.

As with anything AI-related today, there are no hard truths. You need to test everything for yourself, and fact check, fact check, fact check.

On that bombshell, this largely human blogger bids you farewell.


  1. Thanks Pascal!

    Ahem,
    are any of these detectors produced by an AI?
    If so, your results are not surprising at all…
    😉

    E.g.:
    "… An AI hired a guy online to solve a CAPTCHA, that kind of annoying image test that's supposed to convince a site that you're human. When the guy asked if it was an AI, it lied and said it was a visually impaired person so it could achieve its goal. …"

    ( – Edited Google translate
    from a talk by Max Tegmark.)

    ( P.S.
    Only part of comment visible so editing not so easy on a phone. I guess you’re stuck with that?)

    1. Hi Kristian, I do not think those detectors are produced by AI 😉
      Are you having difficulties leaving comments from a phone? I’m working on the design of the website now, and will try to test this myself.
