The Cylons were created by man

Tags: cylon, Elon Musk

The war against the bots rages on.

For this website, as well as others I host, I have attempted to install security measures to cut down on various types of bot traffic. Some have worked pretty well; spam comments are way down, for instance. But the newest wave of Internet bots is more tenacious, ruthless, and Borg-like. They adapt.

These are scraper bots: automated scripts that bypass the traditional filters meant to regulate non-human traffic, scan everything written on a web page, and feed it into the enormous training corpus of a Large Language Model, the engine behind so-called "artificial intelligence" (a misnomer, as it is neither) programs. These scraper bots spoof their identifying markers so they appear to be a regular user running a regular web browser on a regular computer, but they're not. In Battlestar Galactica terms, they're skinjobs: they may look like human beings, but they're still Cylons.

Thus far there seems to be no way to adequately combat these plagiarism factories without either adding whole layers of expensive third-party firewall software or forcing every human user to log in with password credentials. I've tried blocking the bot IP addresses; they just cycle through new ones. I managed to eliminate a lot of them by blocking all requests whose browser language was set to Chinese, but within a day or two they were back using English. I've tried blocking their spoofed configurations (generally they prefer to present as macOS running an outdated version of Chrome at an obsolete screen resolution), but that only nails a small fraction of them, since most don't really use such configs; those are the fake IDs shown inside the bar, not the different fake IDs used to get past the bouncer.
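For the curious, the fingerprint-blocking idea above can be sketched as a small server-side check. This is purely illustrative; the regex, the version cutoff, and the function name are my assumptions for the sketch, not the actual rules used on this site (and, as noted, this kind of filter only catches the fraction of bots that actually present this fingerprint):

```python
import re

# Hypothetical sketch: flag user agents claiming macOS plus an old Chrome
# major version, i.e. the spoofed profile described above. The cutoff and
# pattern are illustrative assumptions, not this site's real rules.
OLD_CHROME_CUTOFF = 120  # major versions below this are treated as suspicious

CHROME_ON_MAC = re.compile(r"Macintosh;.*Mac OS X.*Chrome/(\d+)\.")

def looks_like_spoofed_scraper(user_agent: str) -> bool:
    """Return True if the UA matches the outdated-Chrome-on-macOS profile."""
    match = CHROME_ON_MAC.search(user_agent)
    if not match:
        return False
    return int(match.group(1)) < OLD_CHROME_CUTOFF

# Example: an outdated Chrome 90 on macOS is flagged; a current one is not.
spoofed = ("Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) "
           "AppleWebKit/537.36 (KHTML, like Gecko) "
           "Chrome/90.0.4430.93 Safari/537.36")
```

A check like this would sit in front of the page handler (a rewrite rule or middleware) and reject or challenge matching requests before any content is served.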

My latest attempt at blocking them, which I will not explain here, appears to be effective for the moment: no bot traffic for several hours now. But like any good Borg drones, they adapt; I rather expect to check the logs tomorrow and find they've done exactly that.

Fortunately, these bots don't use up a ton of resources; since they don't actually render the site in a browser, the bandwidth used per hit is relatively small. But it adds up. And they're everywhere: some estimates put bots at over 50% of all web traffic today, with as much as 80% of that being "AI" scrapers. Other estimates are less specific, measuring over a third of all web traffic as "bad bots," i.e. malicious actors of one sort or another; whether "AI" scrapers qualify as "bad" depends on who's doing the study.

I blame Elon and Zuck, but let's face it, if it wasn't them it'd be some other entitled asshats thinking they can just do what they want and steal everyone's work with impunity.

I'll now wait and see if my latest mitigation is worth anything, and if so start applying it to client sites.

← Previous: How is it December already? (December 1, 2025) | Next: Better late than never (December 5, 2025) →

Comments

  • Posted by Bill on December 3, 2025

    Is it too late to fire all of cyberspace into another system's Sun?
