OpenAI now tries to hide that ChatGPT was trained on copyrighted books, including J.K. Rowling’s Harry Potter series::A new research paper laid out ways in which AI developers should try and avoid showing LLMs have been trained on copyrighted material.

  • BURN@lemmy.world
    1 year ago

    Creating an AI model is a commercial undertaking. These models are made to make money, and they depend on other artists’ data to train on. The models would be useless if they weren’t able to train on anything.

    I hold the stance that using copyrighted data as part of a training set is a violation of copyright. That still hasn’t been fully challenged in court, so there’s no specific legal definition yet.

    Because the models require copyrighted material to function, I feel that they are using copyrighted works to build a commercial product.

    Also, AI doesn’t learn. LLMs build statistical models based on the sentence structure of what they’ve seen before. There’s no level of understanding or inherent knowledge, and there’s nothing new being added.