cat_fishing@feddit.online · 9 days ago

🤔 Interesting

terranoid@lemmy.cafe · 9 days ago

Finally someone said it. I honestly was wondering why no one was complaining about this… I’ve worked on some open source myself, licensed it GPL, and never intended for it to be used as training data.

Doesn’t the GPL cover shit like this? There should be mass lawsuits hitting any AI that used open source software and didn’t just specifically use BSD projects or something.

If you train an LLM on GPL code, it should be illegal to sell that LLM and use it commercially without revealing ALL THE SOURCE you used and the source to regenerate that model.

AeonFelis@lemmy.world · 9 days ago

If you train an LLM on GPL code, it should be illegal to sell that LLM and use it commercially without revealing ALL THE SOURCE you used and the source to regenerate that model.

Also, if that LLM is used to generate code, that code must also be GPL.

terranoid@lemmy.cafe · 9 days ago

I’d love to see lawsuits force Microsoft and Nvidia and OpenAI to open source everything they had AI touch 😁

Chronographs@lemmy.zip · 9 days ago

Yeah, I mean they train AI on commercially copyrighted stuff like books that they straight up pirate, so if that doesn’t stop them, the open source community certainly won’t.

A_norny_mousse@piefed.zip · 9 days ago

Doesn’t the GPL cover shit like this? There should be mass lawsuits

I hope it’ll happen eventually.

Currently the USA (and that’s where most of this shit comes from) is aggressively pro AI to the point of breaking the law with government support.

BTW, what OP says has happened to Linux (at large) through Google/Android, too. The GPL hasn’t stopped them, but it surely put some limits on their exploitation of FOSS.

merc@sh.itjust.works · 9 days ago

never intended for it to be used as training data.

You could have chosen a different license than the GPL.

Doesn’t the GPL cover shit like this?

No. Didn’t you read the license you used?

balsoft@lemmy.ml · 9 days ago

GPL absolutely should cover shit like this. Training an LLM on your code makes it definitionally a derivative work, therefore it must be licensed under GPL too (with limited fair use exceptions which shouldn’t apply here). The problem is that the US government is not willing to enforce this at all, because it is owned by the same billionaires as the AI companies.

merc@sh.itjust.works · 8 days ago

Training an LLM on your code makes it definitionally a derivative work

If so, then every painter who studied Picasso is making Picasso-derived works. That’s not how copyright works.

The_Decryptor@aussie.zone · 9 days ago

The problem is that the US government is not willing to enforce this at all, because it is owned by the same billionaires as the AI companies.

That’d be an uphill battle; even prominent OSS projects would fight against that, unfortunately.

If the output of an LLM were found to be derivative of the input, that’d cause lots of problems for (e.g.) Linux. They love Claude and have been funneling its output into the kernel for a while now, and they’d rather not think about the licensing situation there.