• nialv7@lemmy.world
    English
    arrow-up
    1
    ·
    edit-2
    6 days ago

    We had a trust-based system for so long. No one was forced to honor robots.txt, but most big players did. It almost restored my faith in humanity a little bit. And then AI companies came and destroyed everything. This is why we can’t have nice things.
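To make the "trust-based" part concrete, here is a minimal sketch of what honoring robots.txt looks like from the crawler's side, using Python's standard `urllib.robotparser`. The robots.txt contents, the `ExampleAIBot` and `SomeBrowser` user agents, and the example.org URLs are all hypothetical. Nothing in the protocol enforces the answer: a crawler that never asks the question is not stopped by it.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt contents, for illustration only.
ROBOTS_TXT = """\
User-agent: ExampleAIBot
Disallow: /

User-agent: *
Disallow: /private/
"""


def build_parser(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text without making a network request."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


def may_fetch(parser: RobotFileParser, user_agent: str, url: str) -> bool:
    """A well-behaved crawler calls this before every request and obeys it."""
    return parser.can_fetch(user_agent, url)


parser = build_parser(ROBOTS_TXT)
for agent, url in [
    ("ExampleAIBot", "https://example.org/repo"),
    ("SomeBrowser", "https://example.org/repo"),
    ("SomeBrowser", "https://example.org/private/notes"),
]:
    print(agent, url, may_fetch(parser, agent, url))
```

The check is entirely voluntary, which is the whole point of the comment: compliance depends on the crawler choosing to run it.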

  • mfed1122@discuss.tchncs.de
    English
    arrow-up
    1
    ·
    edit-2
    6 days ago

    Okay what about…what about uhhh… Static site builders that render the whole page out as an image map, making it visible for humans but useless for crawlers 🤔🤔🤔

      • mfed1122@discuss.tchncs.de
        English
        arrow-up
        1
        ·
        6 days ago

        I wasn’t being totally serious, but also, I do think that while accessibility concerns come from a good place, there are practical limitations that must be accepted when building fringe and counter-cultural things. Like, my hidden rebel base can’t have a wheelchair-accessible ramp at the entrance, because then my base isn’t hidden anymore. It sucks that some solutions can’t work for everyone, but if we just throw them out because they won’t work for 5% of people, we end up with nothing. I’d rather have a solution that works for 95% of people than no solution at all. I’m not saying that people who use screen readers are second-class citizens. If crawlers were vision-based then I might suggest matching text to background colors so that only screen readers could understand the site. Because something that works for 5% of people is also better than no solution at all. We need to tolerate having imperfect first attempts and understand that more sophisticated infrastructure comes later.

        But yes my image map idea is pretty much a joke nonetheless

      • notarobot@lemmy.zip
        English
        arrow-up
        1
        ·
        5 days ago

        Those are not the tech bros. The tech bros are the ones who move fast and break things. The internet was built by engineers and developers.

  • zbyte64@awful.systems
    English
    arrow-up
    1
    ·
    7 days ago

    Is there something like Nightshade, but for text and code? Maybe my source headers should include a bunch of special characters that act as a prompt injection. And I could sprinkle some nonsensical code comments before the real code comment.

  • Spaz@lemmy.world
    English
    arrow-up
    0
    ·
    edit-2
    6 days ago

    Is there a migration tool? If not, it would be awesome to have one that migrates everything, including issues and stuff. I bet even more people would move.

    • BlameTheAntifa@lemmy.world
      English
      arrow-up
      1
      ·
      7 days ago

      Codeberg has very good migration tools built in. You need to do one repo at a time, but it can move issues, releases, and everything.

  • SufferingSteve@feddit.nu
    English
    arrow-up
    6
    ·
    edit-2
    7 days ago

    There once was a dream of the semantic web, also known as web2. The semantic web could have made the information on webpages easy to ingest, removing so much of the computation required to extract it, and thus preventing much of the CPU overhead of AI crawling.

    What we got as web2 instead was social media, destroying facts and making people depressed at a never-before-seen rate.

    Web3 was about enabling us to securely transfer value between people digitally and without middlemen.

    What crypto gave us was fraud, expensive jpgs and scams. The term “web” is now so eroded that it has lost much of its meaning. The information age gave way to the misinformation age, where everything is fake.

    • GreenShimada@lemmy.world
      English
      arrow-up
      1
      ·
      7 days ago

      Mr. Internet, tear down these walls! (for all these walled gardens)

      Return the internet to the wild. Let it run feral like dinosaurs on an island.

      Let the grannies and idiots stick themselves in the reservations and asylums run by billionaires.

      Let’s all make Neocities pages about our hobbies and dirtiest, innermost thoughts. With gifs all over.

      • Furbag@lemmy.world
        English
        arrow-up
        1
        ·
        7 days ago

        I’m down with that. Web 1.5? Let’s do it. I’ll get my Geocities page up and then we can rev up that hit counter.

    • tourist@lemmy.world
      English
      arrow-up
      1
      ·
      7 days ago

      Web3 was about enabling us to securely transfer value between people digitally and without middlemen.

      It’s ironic that the middlemen showed up anyway and busted all the security of those transfers

      You want some bipcoin to buy weed drugs on the slip road? Don’t bother figuring out how to set up that wallet shit, come to our nifty token exchange where you can buy and sell all kinds of bipcoins

      oh btw every government on the planet showed up and dug through our insecure records. hope you weren’t actually buying shroom drugs on the slip road

      also we got hacked, you lost all your bipcoins sorry

      At least, that’s my recollection of events. I was getting my illegal narcotics the old fashioned way.

      • Serinus@lemmy.world
        English
        arrow-up
        0
        ·
        7 days ago

        I feel like half of the blame capitalism gets is valid, but the other half is just society. I don’t care what kind of system you’re under, you’re going to have to deal with other people.

        Oh, and if you try the system where you don’t have to deal with people, that just means other people end up handling you.

        • kazerniel@lemmy.world
          English
          arrow-up
          1
          ·
          6 days ago

          It matters a lot though what kind of goal the system incentivises. Imagine if it was people’s happiness and freedom instead of quarterly profits.

        • Amju Wolf@pawb.social
          English
          arrow-up
          1
          ·
          7 days ago

          In this case it is purely the fault of the money incentive though. No one would spend so much effort and computation power on AI if they didn’t think it could make them money.

          The funniest part, though, is that the money is only theoretical anyway: everyone is losing money on it and they’re most likely never gonna make it back.

        • Marshezezz@lemmy.blahaj.zone
          English
          arrow-up
          0
          ·
          6 days ago

          Could you clarify what you mean by “dealing with people”? I’m not really sure what point you’re trying to make with that.

          • Serinus@lemmy.world
            English
            arrow-up
            0
            ·
            6 days ago

            The complaint that got blamed on capitalism was:

            The information age gave way to the misinformation age, where everything is fake.

            and if there’s one entity/person most responsible for that, it’s Putin or the GOP. Most of it is political, and has very little to do with capitalism itself. Except that capitalism surrounds and is intertwined with everything.

            Still, if you get rid of capitalism, it doesn’t get rid of politics. I’d argue that the root of the issue is the GOP trying to hoard power (money and otherwise), and power is going to exist with or without capitalism. Is North Korea capitalist? Do they have issues with disinfo?

            This Christian Sharia Law movement doesn’t exist for money.

            • Marshezezz@lemmy.blahaj.zone
              English
              arrow-up
              0
              ·
              4 days ago

              Capitalism itself is political. But the point I was making was that capitalism was the driving force behind the enshittification of all our technology, which could be used to help us all but instead is only about profit. Which is capitalism…

              • Serinus@lemmy.world
                English
                arrow-up
                1
                ·
                4 days ago

                Two of the largest drivers are religion (Christians wanting their Sharia Law) and Russia taking political control of the US.

                Capitalism is in the top three, sure. It’s also part of the driver of that technology.

                I don’t think we should worship capitalism as we have, but I don’t think getting rid of capitalism as a whole solves more problems than it creates.

                Give me capitalism with heavy socialist controls and political separation please, thanks. The general idea of using money as a measure of what society owes you isn’t terrible. It’s allowing that measure to get so out of whack and have such inordinate control of everything that is the problem.

  • zifk@sh.itjust.works
    English
    arrow-up
    1
    ·
    7 days ago

    Anubis isn’t supposed to be hard to avoid, but expensive to avoid. Not really surprised that a big company might be willing to throw a bunch of cash at it.

    • randomblock1@lemmy.world
      English
      arrow-up
      0
      ·
      7 days ago

      No, it’s expensive to comply (at a massive scale), but easy to avoid. Just change the user agent. There’s even a dedicated extension for bypassing Anubis.

      Even then, AI companies have plenty of compute, so it realistically doesn’t cost much. Maybe like a thousandth of a cent per solve? They’re spending billions on GPU power, they don’t care.

      I’ve been saying this since day 1 of Anubis but nobody wants to hear it.

      • T156@lemmy.world
        English
        arrow-up
        1
        ·
        7 days ago

        The website still has to display to users at the end of the day. It’s a similar problem to trying to stop media piracy. If worst comes to worst, the crawlers could read the page like a person would.

        • acockworkorange@mander.xyz
          English
          arrow-up
          1
          ·
          3 days ago

          You could have a server for open code access with very limited bandwidth and another for authenticated users with higher bandwidth.
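One rough way to picture the two-tier idea is a token bucket per access tier. This is only a sketch under stated assumptions: it limits request rate rather than raw bandwidth, and the tier names and numbers in `LIMITS` are hypothetical, not taken from any real forge.

```python
import time
from typing import Callable

# Hypothetical limits: (tokens refilled per second, burst capacity).
LIMITS = {
    "anonymous": (1.0, 5.0),
    "authenticated": (50.0, 100.0),
}


class TokenBucket:
    """Classic token bucket: each request spends a token, tokens refill over time."""

    def __init__(self, rate: float, capacity: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.rate = rate
        self.capacity = capacity
        self.tokens = capacity  # start with a full burst allowance
        self.clock = clock
        self.last = clock()

    def allow(self, cost: float = 1.0) -> bool:
        now = self.clock()
        # Refill in proportion to elapsed time, capped at the burst capacity.
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False


def bucket_for(tier: str, clock: Callable[[], float] = time.monotonic) -> TokenBucket:
    """Build a bucket for one client in the given (hypothetical) tier."""
    rate, capacity = LIMITS[tier]
    return TokenBucket(rate, capacity, clock)
```

In practice a server would keep one bucket per client key (session for authenticated users, IP or similar for anonymous ones), so an unauthenticated scraper gets throttled quickly while logged-in users rarely notice the limit.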

    • sudo@programming.dev
      English
      arrow-up
      0
      ·
      edit-2
      7 days ago

      This is what I’ve kept saying about PoW being a shit bot management tactic. It’s a flat tax across all users, real or fake. The fake users make money from accessing your site and will just eat the added expense. You can raise the tax to cost more than what your data is worth to them, but that also affects your real users. Nothing about Anubis even attempts to differentiate between bots and real users.

      If the bots take the time, they can set up a pipeline to solve Anubis tokens outside of the browser more efficiently than real users.
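To illustrate why solving outside the browser is straightforward, here is a minimal sketch of a generic hashcash-style proof of work: find a nonce so that the SHA-256 hex digest of challenge plus nonce starts with a given number of zeros. This is an assumed, simplified scheme for illustration, not Anubis's actual challenge format, and the function names are hypothetical.

```python
import hashlib
import itertools


def digest_for(challenge: str, nonce: int) -> str:
    """SHA-256 hex digest of the challenge string followed by the nonce."""
    return hashlib.sha256(f"{challenge}{nonce}".encode()).hexdigest()


def verify(challenge: str, nonce: int, difficulty: int) -> bool:
    """Cheap server-side check: one hash, regardless of difficulty."""
    return digest_for(challenge, nonce).startswith("0" * difficulty)


def solve(challenge: str, difficulty: int) -> int:
    """Brute-force the smallest nonce that satisfies the difficulty target.

    Expected work grows by a factor of 16 for each extra leading hex zero.
    """
    for nonce in itertools.count():
        if verify(challenge, nonce, difficulty):
            return nonce
    raise RuntimeError("unreachable: itertools.count() never ends")


if __name__ == "__main__":
    challenge = "demo-challenge"  # hypothetical value a server might hand out
    nonce = solve(challenge, difficulty=3)
    print(nonce, digest_for(challenge, nonce))
```

The same loop costs every visitor the same amount of work, which is the "flat tax" point: a native solver on a scraper's hardware pays it at least as cheaply as a real user's browser does.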

      • black_flag@lemmy.dbzer0.com
        English
        arrow-up
        0
        ·
        7 days ago

        Yeah, but AI companies are losing money, so in the long run Anubis seems like it should eventually start working again.

        • r00ty@kbin.life
          arrow-up
          1
          ·
          7 days ago

          It’s the usual enshittification tactic. Make AI cheap so companies fire tech workers. Keep it cheap long enough that we all have established careers as McDonald’s branch managers, then whack up the prices once they’re locked in.

  • PhilipTheBucket@piefed.social
    English
    arrow-up
    0
    ·
    7 days ago

    I feel like at some point it needs to be an active response. Phase 1 is a teergrube type of slowness to muck up the crawlers, with warnings in the headers and response body, and then phase 2 is a DDoS in response or maybe just a drone strike and cut out the middleman. Once you’re actively evading Anubis, fuckin’ game on.

    • TurboWafflz@lemmy.world
      English
      arrow-up
      2
      ·
      7 days ago

      I think the best thing to do is to not block them when they’re detected but poison them instead. Feed them tons of text generated by tiny old language models. It’s harder to detect, and it also messes up their training and makes the models less reliable. Of course you would want to do that on a separate server so it doesn’t slow down real users, but you probably don’t need much power since the scrapers probably don’t really care about the speed.

    • BlameTheAntifa@lemmy.world
      English
      arrow-up
      1
      ·
      7 days ago

      The problem is that hundreds of bad actors doing the same thing independently of one another means it does not qualify as a DDoS attack. Maybe it’s time we start legally restricting bots and crawlers, though.