@gaylord_fartmaster - Lemmy

@gaylord_fartmaster@lemmy.world

Am I missing something in this article? I’m not defending either company, but it doesn’t seem like they actually have any evidence to confirm either is doing this.

The world’s top two AI startups are ignoring requests by media publishers to stop scraping their web content for free model training data, Business Insider has learned.

The article claims this, but then says this about where the info came from:

TollBit, a startup aiming to broker paid licensing deals between publishers and AI companies, found several AI companies are acting in this way and informed certain large publishers in a Friday letter, which was reported earlier by Reuters. The letter did not include the names of any of the AI companies accused of skirting the rule.

So their source doesn’t actually say which companies are doing this, but then they jump straight into this:

AI companies, including OpenAI and Anthropic, are simply choosing to “bypass” robots.txt in order to retrieve or scrape all of the content from a given website or page.

So they’re just concluding that based on nothing and reporting it as fact?
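For context on what "bypass" would even mean here: robots.txt is a voluntary convention, and it only works if the crawler chooses to check it. Here's a minimal sketch of that check using Python's standard `urllib.robotparser`. The bot name `ExampleAIBot`, the rules, and the `allowed` helper are all made up for illustration and aren't taken from any real publisher or company.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt a publisher might serve; the bot name is made up.
ROBOTS_TXT = """\
User-agent: ExampleAIBot
Disallow: /

User-agent: *
Allow: /
"""


def allowed(user_agent: str, url: str) -> bool:
    """Return True if the rules above permit this user agent to fetch url."""
    parser = RobotFileParser()
    # parse() takes the file's lines directly, so no network access is needed.
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    print(allowed("ExampleAIBot", "https://example.com/article"))
    print(allowed("SomeOtherBot", "https://example.com/article"))
```

A crawler that never runs a check like `allowed` hits no technical barrier at all, so the file by itself can't tell you who honored it and who didn't.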

@gaylord_fartmaster@lemmy.world

In what way?

Why couldn’t even a basic reinforcement learning model be used to brute force “figure out what input gives desired X output”?

@gaylord_fartmaster@lemmy.world

Machine learning could find those strengths and weaknesses and learn to work around them likely better than a human could. It’s just trial and error. There’s nothing about the human brain that makes it better suited to understanding the inner logic of an LLM.
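To make the "just trial and error" point concrete, here's a toy sketch. It uses plain hill climbing, not actual reinforcement learning, and `black_box` is a hypothetical stand-in that only hands back a score. A real LLM is far noisier and more expensive to query, so treat this as an illustration of the idea, not a working attack.

```python
import random
import string

ALPHABET = string.ascii_lowercase + " "


def black_box(prompt: str) -> int:
    """Hypothetical opaque system: the searcher only ever sees this score."""
    hidden = "open sesame"  # assumption for the demo, never shown to the searcher
    return sum(a == b for a, b in zip(prompt, hidden))


def trial_and_error(length: int, steps: int, seed: int = 0) -> str:
    """Mutate one character at a time and keep any change that scores higher."""
    rng = random.Random(seed)
    best = "".join(rng.choice(ALPHABET) for _ in range(length))
    best_score = black_box(best)
    for _ in range(steps):
        i = rng.randrange(length)
        candidate = best[:i] + rng.choice(ALPHABET) + best[i + 1:]
        score = black_box(candidate)
        if score > best_score:  # keep only strict improvements
            best, best_score = candidate, score
    return best


if __name__ == "__main__":
    print(trial_and_error(length=11, steps=20000))
```

The search knows nothing about how the score is computed. It just keeps whatever works, which is the whole argument.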

@gaylord_fartmaster@lemmy.world

Did you really post this just because it has the cop car light emoji and all caps at the top, without having any idea what it actually means? That’s hilarious.

@gaylord_fartmaster@lemmy.world

I don’t have that issue on Mull, so maybe try that?

@gaylord_fartmaster@lemmy.world

Are there seriously already people being paid to shill on Lemmy?

Every single one of your posts is about how great GrapheneOS is.

@gaylord_fartmaster@lemmy.world

I haven’t done it personally because I have an Android TV box connected to my LG TV with SmartTubeNext, but here’s an archived Reddit post with a guide for it.

@gaylord_fartmaster@lemmy.world

Even with a TV there are options. I know Android/Google TV and LG WebOS both have apps available to sideload.