A new report from plagiarism detector Copyleaks found that 60% of OpenAI’s GPT-3.5 outputs contained some form of plagiarism.

Why it matters: Content creators from authors and songwriters to The New York Times are arguing in court that generative AI trained on copyrighted material ends up spitting out exact copies.

  • themoonisacheese@sh.itjust.works
    link
    fedilink
    English
    arrow-up
    14
    arrow-down
    1
    ·
    8 months ago

    Right? Nod doubt that output can be similar to training data, and I would believe that some of it is plagiarism, but plagiarism detectors are infamous among uni students for being completely unreliable and flagging pronouns, dates and citations. Until someone can go “here’s an example of actual plagiarism” (which is obvious when pointed out), these claims make no sense.

    • linearchaos@lemmy.world
      link
      fedilink
      English
      arrow-up
      4
      arrow-down
      1
      ·
      8 months ago

      If it’s plagiarizing, so are Google search results summaries.

      It’s not like it doesn’t cite where it found the data.