• igorlogius@lemmy.world
        1 year ago

        And they sell it to the highest bidder to train their next LLM, which seems to be all the rage at the moment.

    • thisbenzingring@wirebase.org
      1 year ago

      Text data on a compressed drive is tiny. On a modern server, accessing text files on a compressed drive is not a noticeable performance hit. The compression ratio is massive for text and markup files.
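To put a rough number on the compression claim, here is a minimal sketch using Python's standard `zlib` module. The `compression_ratio` helper and the sample string are made up for illustration, and the sample is far more repetitive than real comments, so it overstates what ordinary prose would achieve.

```python
import zlib


def compression_ratio(text: str, level: int = 6) -> float:
    """Return raw size divided by compressed size for UTF-8 encoded text."""
    raw = text.encode("utf-8")
    packed = zlib.compress(raw, level)
    return len(raw) / len(packed)


# Hypothetical sample: one short markup line repeated many times.
# Real forum text is less repetitive and will compress less well.
sample = "<p>Yes, text doesn't take up much space.</p>\n" * 2000

ratio = compression_ratio(sample)
print(f"{len(sample.encode('utf-8'))} bytes raw, about {ratio:.0f}x smaller compressed")
```

Running the same helper over a dump of actual comments would give a more honest figure than this synthetic sample.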

      • thepianistfroggollum@lemmynsfw.com
        1 year ago

        Yes, text doesn’t take up much space, but decades of text can easily take up a lot of space, especially when you track things like edits.

        Not to mention that this data isn’t in text files. It’s going to be in a database, so the number of records that need to be parsed will impact performance. How big that impact is depends on how they set the database up.
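As a rough illustration of how much the database setup matters, here is a small sketch using Python's built-in `sqlite3` module. The `comments` table, its columns, and the `idx_comments_author` index are all hypothetical and say nothing about how any real site stores its data. It compares the query plan for the same lookup before and after adding an index.

```python
import sqlite3


def query_plan(conn: sqlite3.Connection, sql: str) -> str:
    """Return SQLite's query plan for a statement as one string."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    # The last column of each row holds the human-readable plan detail.
    return " | ".join(row[3] for row in rows)


conn = sqlite3.connect(":memory:")
# Hypothetical schema, for illustration only.
conn.execute(
    "CREATE TABLE comments (id INTEGER PRIMARY KEY, author TEXT, body TEXT)"
)
conn.executemany(
    "INSERT INTO comments (author, body) VALUES (?, ?)",
    [(f"user{i % 1000}", "some text") for i in range(10000)],
)

sql = "SELECT body FROM comments WHERE author = 'user42'"

# Without an index, every row has to be examined.
plan_before = query_plan(conn, sql)

conn.execute("CREATE INDEX idx_comments_author ON comments (author)")

# With an index on author, only the matching rows are visited.
plan_after = query_plan(conn, sql)

print("before:", plan_before)
print("after: ", plan_after)
```

The first plan reports a scan of the whole table and the second a search through the index, so the number of stored records matters much less once the lookup column is indexed.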