• merc@sh.itjust.works
    9 days ago

    never intended for it to be used as training data.

    You could have chosen a different license than the GPL.

    Doesn’t the GPL cover shit like this?

    No. Didn’t you read the license you used?

    • balsoft@lemmy.ml
      9 days ago

      GPL absolutely should cover shit like this. Training an LLM on your code makes it definitionally a derivative work, therefore it must be licensed under GPL too (with limited fair use exceptions which shouldn’t apply here). The problem is that the US government is not willing to enforce this at all, because it is owned by the same billionaires as the AI companies.

      • The_Decryptor@aussie.zone
        9 days ago

        The problem is that the US government is not willing to enforce this at all, because it is owned by the same billionaires as the AI companies.

        That’d be an uphill battle; unfortunately, even prominent OSS projects would fight against it.

        If the output of an LLM were found to be derivative of its input, that’d cause lots of problems for (e.g.) Linux: they love Claude and have been funneling its output into the kernel for a while now, and they’d rather not think about the licensing situation there.

      • merc@sh.itjust.works
        9 days ago

        Training an LLM on your code makes it definitionally a derivative work

        If so, then every painter who studied Picasso is making Picasso-derived works. That’s not how copyright works.