• 0 Posts
  • 8 Comments
Joined 1Y ago
Cake day: Jul 13, 2023


Am I missing something in this article? I’m not defending either company, but it doesn’t seem like they actually have any evidence to confirm either is doing this.

The world’s top two AI startups are ignoring requests by media publishers to stop scraping their web content for free model training data, Business Insider has learned.

The article claims this, but then says this about the source of the info:

TollBit, a startup aiming to broker paid licensing deals between publishers and AI companies, found several AI companies are acting in this way and informed certain large publishers in a Friday letter, which was reported earlier by Reuters. The letter did not include the names of any of the AI companies accused of skirting the rule.

So their source doesn’t actually say which companies are doing this, but then they jump straight into this:

AI companies, including OpenAI and Anthropic, are simply choosing to “bypass” robots.txt in order to retrieve or scrape all of the content from a given website or page.

So they’re just concluding that based on nothing and reporting it as fact?


In what way?

Why couldn’t even a basic reinforcement learning model be used to brute force “figure out what input gives desired X output”?


Machine learning could find those strengths and weaknesses and learn to work around them likely better than a human could. It’s just trial and error. There’s nothing about the human brain that makes it better suited to understanding the inner logic of an LLM.


Did you really post this just because it has the cop car light emoji and all caps at the top, without having any idea what it actually means? That’s hilarious.


I don’t have that issue on Mull, so maybe try that?


Are there seriously already people being paid to shill on Lemmy?

Every single one of your posts is about how great GrapheneOS is.



Even with a TV there are options. I know Android/Google TV and LG WebOS both have apps available to sideload.