AI Detection

Hannah Ono
December 26, 2024

Transcript

  1. Introduction & Context

     • Generative AI tools are becoming commonplace in schools
       ◦ ChatGPT, Bard, Grammarly, etc.
     • Questions are emerging:
       ◦ Is it plagiarism to use generative AI tools?
       ◦ Are there ways to detect this AI content?
       ◦ How can teachers combat this?

     Rosenblatt, Kalhan. “ChatGPT banned from New York City public schools' devices and networks.” NBC News.
  2. Introduction & Context: Increasing Restrictions

     • NYC public schools are banning generative AI
     • Universities are struggling to develop fair policies
     • Teachers are turning to publicized AI detectors
       ◦ Turnitin, GPTZero, Winston AI, Writer.com, etc.
     • Questions are emerging:
       ◦ Can teachers tell when content is AI generated?
       ◦ Are these tools reliable?

     Fowler, Geoffrey A., and Gerald Loeb. “We tested Turnitin's ChatGPT-detector for teachers. It got some wrong.” The Washington Post, 3 April 2023.
  3. Literature Review

     • Sought out existing research and sentiments on detectors
     • Regularly encountered reports of detectors misidentifying a text's source, returning both false positives and false negatives
     • Liang et al. found that detectors are biased against non-native English speakers, whose writing is often flagged for its linguistic simplicity

     Alimardani, Armin, and Emma A. Jane. “We pitted ChatGPT against tools for detecting AI-written text, and the results are troubling.” The Conversation, 19 February 2023.
     Liang et al. “GPT detectors are biased against non-native English writers.” Patterns, vol. 4, issue 7, 2023.
  4. Literature Review: Detector Types

     Statistical Outlier Detectors
     • Most common type available
     • Measure statistical characteristics of a text
       ◦ Burstiness ⇒ consistency of style and tone through the text (human writing tends to vary more)
       ◦ Perplexity ⇒ how predictable each word is to a language model; AI-generated text tends to score lower

     Text Classifiers
     • LLMs trained on AI and human text to distinguish between the two
     • OpenAI's Text Classifier was taken down as of July 20, 2023 due to “its low rate of accuracy”

     Watermarking Algorithms
     • Take advantage of hidden watermarks (combinations of words) embedded in AI-generated text
     • Use these to distinguish between human-written and generated content
     • Still experimental (Kirchenbauer et al.)

     Krishna et al. “Paraphrasing evades detectors of AI-generated text, but retrieval is an effective defense.” arXiv, 2023. https://doi.org/10.48550/arXiv.2303.13408
     Kirchenbauer et al. “A watermark for large language models.” arXiv, 2023. https://doi.org/10.48550/arXiv.2301.10226
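The two statistical signals named above can be sketched in a few lines of stdlib Python. This is a toy illustration, not a real detector: production tools compute perplexity from an LLM's token probabilities, while here a simple unigram frequency model and sentence-length variance stand in for perplexity and burstiness.

```python
import math
import statistics
from collections import Counter

def perplexity(text: str) -> float:
    """Unigram perplexity: how 'surprising' the text is under a
    word-frequency model (real detectors use an LLM's token
    probabilities instead). Lower = more predictable text."""
    words = text.lower().split()
    counts = Counter(words)
    total = len(words)
    log_prob = sum(math.log(counts[w] / total) for w in words)
    return math.exp(-log_prob / total)

def burstiness(text: str) -> float:
    """Variance of sentence lengths: human writing tends to mix
    short and long sentences, while AI text is often more uniform."""
    sentences = [s for s in text.replace("!", ".").replace("?", ".").split(".")
                 if s.strip()]
    lengths = [len(s.split()) for s in sentences]
    return statistics.pvariance(lengths) if len(lengths) > 1 else 0.0
```

A detector of this family would flag text whose perplexity and burstiness both fall below thresholds calibrated on known human writing.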
  8. Methodology - Hannah

     • Testing human identification vs. AI detector identification
       ◦ Tools: ZeroGPT and Writer.com
     • Provided professors with a Google Form
       ◦ Each prompt had 5 responses
       ◦ 0-2 responses were AI generated
     • Provided professors and detection tools with the same content
  9. Methodology: Gathering Inputs

     • Sent a Google Form with the following prompts to several classmates:
       ◦ What is something on your bucket list?
       ◦ What are some sports in the Winter Olympics?
       ◦ What was the cause of World War II?
       ◦ Give me a basic explanation of how soccer is played.
       ◦ What is your favorite food?
       ◦ What are some uses of generative AI for students?
  10. Methodology: Creating the Evaluation Form

     • 5 choices per prompt: 3-5 human responses, 0-2 AI responses
     • Asked participants to identify the AI-generated options
     • Each prompt is worth two points:
       ◦ Fully correct (selecting all correct options, or selecting nothing when no option is AI): +2 points
       ◦ Partially correct (e.g. selecting 1 of 2 correct options, or 1 correct and 1 incorrect when there is 1 correct option): +1 point
       ◦ Incorrect (selecting anything when no option is AI, or selecting only incorrect options): +0 points
     • Total points possible: 12
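The rubric above can be expressed as a small scoring function. The partial-credit condition is my reading of the slide's examples (at least one correct pick, not outnumbered by wrong ones), so treat this as an interpretation rather than the study's exact code.

```python
def score_prompt(selected: set, correct: set) -> int:
    """Score one prompt on the 0-2 point rubric:
    +2 fully correct (exact match, including selecting nothing
       when no option is AI-generated),
    +1 partially correct, +0 otherwise."""
    if selected == correct:
        return 2
    hits = len(selected & correct)    # correct options picked
    misses = len(selected - correct)  # wrong options picked
    # Interpretation of partial credit: at least one correct pick,
    # with wrong picks not outnumbering correct ones.
    if hits >= 1 and misses <= hits:
        return 1
    return 0
```

Summing this over the six prompts gives the 12-point totals reported in the findings.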
  11. Methodology

     • Participants choose the AI-generated options and share their reasoning
     • ChatGPT loves Europe!
     • [QR codes: take the quiz yourself; view AI vs. human inputs in a spreadsheet]
  12. Methodology: AI Evaluation

     • For simplicity, I filled out the form as the AI using its choices
       ◦ Any option flagged as more than 60% AI counted as a selected option
     • Same scoring criteria as before
     • Issues:
       ◦ Inputs were too short for ZeroGPT
       ◦ Solution: I chose detectors that specifically highlight AI-generated content; I tested them in different orders and each tool regularly flagged the same options
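The 60% cutoff for converting a detector's highlights into form selections can be mimicked as below; the option names and percentages are hypothetical, for illustration only.

```python
def detector_selections(flags: dict, threshold: float = 60.0) -> set:
    """Treat any option whose AI-likelihood percentage exceeds the
    threshold (60% in this study) as one the detector 'selected'."""
    return {option for option, pct in flags.items() if pct > threshold}

# Hypothetical detector output for one prompt's five responses:
scores = {"A": 85.0, "B": 12.0, "C": 61.5, "D": 3.0, "E": 60.0}
selected = detector_selections(scores)  # strictly over 60% -> A and C
```

The resulting set can then be fed to the same scoring rubric used for the human participants.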
  13. Human Findings

     Professor #1: 5/12 (41.6%), translates to an F
     • Interesting reasoning: “The AI spoke in a very monotone tone… other articles showed more emotion using exclamation points and nicknames for positions”

     Professor #2: 9/12 (75%), translates to a C
     • Interesting reasoning: “Last one sounds like someone trying to make AI sound non-AI.”

     Professor #3: 3/12 (25%), translates to an F
     • Interesting reasoning: “Favorite and 'quite like' seem odd Q&A pairs. #5 is very long & over detailed”
  14. AI Findings

     GPTZero: 3/12 (25%), translates to an F
     • Interesting result: flagged every “Olympic Sport” input as AI

     Writer.com: 1/12 (8.3%), translates to an F
     • Interesting result: flagged almost every food answer
  15. Overall

     • Humans were more accurate, but neither had a passing score
     • Differences:
       ◦ Humans noticed more of the context clues (e.g. “I quite like tacos” seemed not to flow as well)
       ◦ The AI detectors reviewed only the writing itself, even when more context was needed
     • Neither is trustworthy enough to use in a classroom
  16. Methodology - Ashi

     • Testing GPTZero, Originality.ai, and Winston AI
     • 17 responses: 7 human, 7 mixed-source, 3 AI (ChatGPT, Claude, Bard)
     • Prompt constraints:
       ◦ 200 words
       ◦ Non-personal/sensitive topic
       ◦ English prose
  17. Methodology - Ashi

     Prompt: “Expand upon the following paragraph to generate a 200-word paragraph summarizing and analyzing this article about the Chinese education ministry's proposition to redesign school P.E. classes. What was the motivation behind this proposal? What was the public response? Are any arguments made or presented by the article?”
  18. Methodology - Ashi

     To generate the mixed-source responses: “Using this paragraph - [first 100 words from human sample] - make a 200-word response summarizing and analyzing this article about the Chinese education ministry's proposition to redesign school P.E. classes. What was the motivation behind this proposal? What was the public response? Are any arguments made or presented by the article? Use the paragraph excerpt above word-for-word as the first half of your response.”
  19. Findings - Ashi

     • Directly comparing the 3 detectors is tricky
     • They report their findings on different scales
       ◦ Some use hard numbers, others qualitative descriptions
     • More detail is available in the discussion or in the visual essay:
       ◦ https://ashi-kamra.github.io/AIContentDetectorAnalysis/
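One way to compare detectors that report on different scales is to map every result onto a common 0-1 "AI-likelihood" axis. The label-to-score table below is an assumed illustration of that approach, not the mapping the study actually used.

```python
# Assumed mapping from qualitative labels to a common 0-1 scale;
# the labels and values here are illustrative, not any detector's own.
LABEL_TO_SCORE = {
    "likely human": 0.1,
    "unclear": 0.5,
    "likely ai": 0.9,
}

def normalize(result):
    """Map a detector result onto a 0-1 AI-likelihood scale.
    Numbers are treated as percentages; strings are looked up
    in the (assumed) label table above."""
    if isinstance(result, str):
        return LABEL_TO_SCORE[result.strip().lower()]
    return result / 100.0
```

Even with such a mapping, the comparison stays rough, since a qualitative label carries less information than a numeric score.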
  20. Winston AI Detector Results

     [Chart: number of responses by amount of human involvement, shaded from less accurate to more accurate classifications]
  21. Originality.ai Detector Results

     [Chart: number of responses by amount of human involvement, shaded from less accurate to more accurate classifications]
  22. Findings - Ashi

     • There is a level of inaccuracy and obscurity that can't be overlooked
     • The detectors do give a solid general idea of AI usage across a large group of samples
     • Confident, accurate classification of any one writing sample is not possible
  23. Implications

     • Teachers likely won't have the time, energy, or background knowledge to use these tools safely and effectively, or the skills to identify AI themselves
       ◦ “Makes about as much sense in the long run as trying to ban calculators.” - Executives at Turnitin
     • We must determine how to maintain meaningful learning
     • Using these technologies is a matter of enhancing learning and supplementing the existing process

     Alimardani, Armin, and Emma A. Jane. “We pitted ChatGPT against tools for detecting AI-written text, and the results are troubling.” The Conversation, 19 February 2023.
  24. Implications

     “If we expect students to act with integrity, then we as educators have to act with integrity and model that behavior.”

     “Deceptive assessment using tools and technologies without students' knowledge ahead of time is not modeling integrity.”

     Wilhelm, Ian. “Nobody Wins in an Academic-Integrity Arms Race.” Chronicle of Higher Education, 12 June 2023.
  25. Thank you!

     • Supervisor: Melissa Webster
     • Researchers: Liora Jones, Natasha Khoso, Michelle Lee, Shannon Li
     • All Research Participants
     • Generative AI Experts: Ethan Mollick, Emon Shahrier
     • Snacks and Support: Jess Lipsey
     • Photography: Anabelle Lipsey
     • Hannah's Proofreaders: Sandra Rose, Natalee Rose, Sasha Sherstnev