After two weeks of demonstrations, some media buyers anticipate Reddit turning to alternative revenue streams.

  • QHC@lemmy.world
    1 year ago

    AI scrapers never used the API. That’s just a convenient scapegoat.

    Don’t believe Huffman’s lies!

    • Candelestine@lemmy.world
      1 year ago

      I would love to learn more. Can you share any links? My Google-fu isn’t good enough to cut through the BS without knowing more about which terms to search for.

      • cubism_pitta@lemmy.world
        1 year ago

        It’s just the way a large data-gathering project works. If you want to get data from hundreds of sites, a scraper is universal and works for all of them, while using an API would require custom code for each one (assuming all the sites even HAVE an API).

      • QHC@lemmy.world
        1 year ago

        There’s just no compelling reason for an AI scraper to go through the API.

        The companies behind LLMs like ChatGPT pull data from the entire web and then have humans manually tag and identify it. Getting that data through an API is actually less useful to that end, because they’d need to integrate separately with every individual website.

        Most websites don’t even have an API in the first place, so scraping would still be necessary for them.
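
To make the scraper-versus-API point in the comments above concrete, here is a minimal sketch in Python. It is not how any real crawler or any particular company works. The site names, JSON shapes, and function names (`scrape_text`, `parse_site_a`, `parse_site_b`) are all hypothetical, made up for illustration. The idea is only that one generic HTML text extractor handles any page, while API responses need a separate parser per site.

```python
import json
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collects visible text from any HTML page, skipping script/style."""

    SKIP = {"script", "style"}

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        text = data.strip()
        if text and self._skip_depth == 0:
            self.chunks.append(text)


def scrape_text(html: str) -> str:
    """One function for every site: no knowledge of the site is needed."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(parser.chunks)


# The API route: each site returns its own JSON shape, so each one
# needs its own adapter. Both shapes below are invented examples.
def parse_site_a(payload: str) -> str:
    doc = json.loads(payload)
    return " ".join(child["body"] for child in doc["data"]["children"])


def parse_site_b(payload: str) -> str:
    doc = json.loads(payload)
    return " ".join(post["content"]["text"] for post in doc["posts"])


# Hypothetical registry: grows by one entry for every site with an API.
API_ADAPTERS = {
    "site-a.example": parse_site_a,
    "site-b.example": parse_site_b,
}

if __name__ == "__main__":
    page_a = "<html><body><h1>Hi</h1><script>x()</script><p>there</p></body></html>"
    page_b = "<div><span>Other</span> <b>site</b></div>"
    # The same scraper call works on both pages unchanged.
    print(scrape_text(page_a))
    print(scrape_text(page_b))
```

The scraper side stays the same size no matter how many sites are added, while `API_ADAPTERS` needs a new hand-written entry for each one, and sites without an API cannot be covered that way at all.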