cross-posted from: https://lemmy.intai.tech/post/43759
cross-posted from: https://lemmy.world/post/949452
OpenAI’s ChatGPT and Sam Altman are in massive trouble. OpenAI is getting sued in the US for illegally using content from the internet to train its large language models (LLMs).
It’s wild to see people in the piracy community of all places have an issue with someone benefiting from data they got online for free.
The difference is that they are profiting from other people’s work and property. I don’t profit from watching a movie or playing a game for free; I just save some money.
You do if you make games or movies and those things give you inspiration.
This is just how learning is done though, whether it’s AI or human.
Absolutely not comparable. Inspiration and an amalgamation of everything an LLM consumes are completely different things.
I’d argue that what we do is an amalgamation of what we are exposed to, to a great extent. And we are exposed to way less information than an LLM.
Key difference is that they’re making (a lot of) money off of the stolen work, and in a way that’s only possible for the already filthy rich.
Wouldn’t mind it personally if it was FOSS though, like their name suggests.
FWIW even if it was FOSS I’d still care. For me it’s more about intent. If your business model/livelihood relies on stealing from people there’s a problem. That’s as true on a business level as it is an individual one.
Doesn’t mean I have an answer as sometimes it’s extremely complex. The easy analogy is how we pirate TV shows and movies. Netflix originally proved this could be mitigated by providing the material cheaply and easily. People don’t want to steal (on average).
I find people in general are much more willing to part with their money than the big corps think. I’ll even go so far as to say that we enjoy doing so. Just look at Twitch: tonnes of money are thrown at streamers because it’s fun and convenient, or at TikTok vendors selling useless stuff on live streams. We just don’t like to be lied to and treated like cash cows.
Many of us are sharing without reward and have strong ethical beliefs regarding for-profit distribution of material versus non-profit sharing.
It really isn’t that bonkers. A lot of thinking in software is about licensing. See the GPL, Creative Commons and all that stuff, which is all about how things can be profited from and the responsibilities around it. Benefiting from free data is one thing. Privately profiting at others’ expense, or not sharing the capabilities/advances that came from it, is another. Willing to bet there are GPL violations via the training sets.
Is it even possible to attach licenses to text posts on social media?