• P03 Locke@lemmy.dbzer0.com
    English
    11 upvotes · 8 months ago

    > AI is trained off actual lyrics, which is why companies who create these models are at risk (they don’t own the data they’re feeding into the model.)

    Nobody is “at risk” of anything here. You don’t have to own data to use data, just like you’re not liable for the content of an Internet page because it was downloaded to your browser’s cache.

    Everybody who agrees with these lawsuits has a severe misunderstanding of how LLMs and other AI models work. They are large matrices of weights and numbers, not copies of the data they consume. The entire Stable Diffusion model is a 4 GB file, trained on billions of images. It’s impossible to “copy” petabytes of images and somehow end up with a few gigabytes of numbers. The transformation is a lossy process, and its result is not a copy of the training data in the sense that copyright covers.
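
    To put rough numbers on the size argument, here is a back-of-the-envelope sketch. The 4 GB figure comes from the comment above; the image count is an assumption ("billions" is taken as 2 billion purely for illustration), and `bytes_per_image` is a hypothetical helper name.

```python
# Back-of-the-envelope: average bytes of model weights per training image.
MODEL_SIZE_BYTES = 4 * 10**9   # "4 GB", as stated in the comment
TRAINING_IMAGES = 2 * 10**9    # ASSUMPTION: "billions" taken as 2 billion


def bytes_per_image(model_bytes: int, image_count: int) -> float:
    """Return the average number of weight bytes available per training image."""
    return model_bytes / image_count


budget = bytes_per_image(MODEL_SIZE_BYTES, TRAINING_IMAGES)
print(f"{budget:.1f} bytes of weights per training image")
```

    Under these assumed numbers the average works out to a couple of bytes per image. This is only an average budget; on its own it does not settle whether particular images can be reproduced.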

    • fuzzywolf23@beehaw.org
      5 upvotes · 8 months ago

      That doesn’t make it “not copyright infringement”; that just makes it an efficient compression algorithm. With the right prompt, you can recover copies of the original.

      • P03 Locke@lemmy.dbzer0.com
        English
        4 upvotes · 8 months ago

        > With the right prompt, you can recover copies of the original.

        Clearly somebody who’s never used the software.