Stop crawling my HTML you dickheads – use the API

One of the (many) depressing things about the “AI” future in which we’re living is that it exposes just how many people are willing to outsource their critical thinking. Brute force is preferred to thinking about how to efficiently tackle a problem.

For some reason, my websites are regularly targeted by “scrapers” who want to gobble up all the HTML for their inscrutable purposes. The thing is, as much as I try to make my website as semantic as possible, HTML is not great for this sort of task. It is hard to parse, prone to breaking, and rarely consistent.

Like most WordPress blogs, my site has an API. In the <head> of every page is something like:

<link rel="https://api.w.org/" href="https://shkspr.mobi/blog/wp-json/">

Go visit https://shkspr.mobi/blog/wp-json/ and…
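
For anyone who wants to find the API programmatically, the <link> discovery described above can be sketched roughly as follows. This is a minimal illustration using only Python's standard library, not anything the site itself prescribes; the names ApiLinkFinder and find_api_root are hypothetical, and the sample page is a hard-coded string rather than a live fetch.

```python
from html.parser import HTMLParser
from typing import Optional

# The rel value WordPress uses to advertise its REST API root.
API_REL = "https://api.w.org/"


class ApiLinkFinder(HTMLParser):
    """Hypothetical helper: remembers the first <link> whose rel matches API_REL."""

    def __init__(self) -> None:
        super().__init__()
        self.api_root: Optional[str] = None

    def handle_starttag(self, tag, attrs):
        # Only <link> tags matter, and only until the first match is found.
        if tag != "link" or self.api_root is not None:
            return
        attr_map = dict(attrs)
        # rel may hold several space-separated values.
        rels = (attr_map.get("rel") or "").split()
        if API_REL in rels:
            self.api_root = attr_map.get("href")


def find_api_root(html: str) -> Optional[str]:
    """Return the advertised API root URL, or None if the page has none."""
    finder = ApiLinkFinder()
    finder.feed(html)
    return finder.api_root


# Sample markup standing in for a fetched page (an assumption for this sketch).
page = (
    "<html><head>"
    '<link rel="https://api.w.org/" href="https://shkspr.mobi/blog/wp-json/">'
    "</head><body></body></html>"
)
print(find_api_root(page))
```

In real use you would fetch one page (for example with urllib.request.urlopen), pass its decoded body to find_api_root, and from then on request JSON from the returned root instead of crawling the HTML. That is one polite request for discovery, rather than thousands of brittle ones.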
