People will really yell at you if you suggest AI models should pay for a license to all the works they ingest and cite their sources in their answers. I think that’s partly because they believe in the magic of machine intelligence. But we see more and more undeniable examples of straight-up copying.
There’s no such thing as magic, or creative machines. Creativity is based on the lived experience of being born, living, and dying in a natural world – which machines will lack for a long time to come.
It's called overfitting. The more specific your question, the more likely it'll find a single reference that answers it, and it'll parrot that reference verbatim.
Normally an LLM will blend together all it knows about a topic, but if it only knows one thing, that's what you'll get.
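The degenerate case described above can be sketched with a toy greedy bigram "language model" (my own illustration, not anything an actual LLM vendor publishes): trained on a single document where every continuation is unambiguous, the most likely next word at every step is the source itself, so generation parrots the training text verbatim.

```python
from collections import Counter, defaultdict

def train_bigrams(text):
    """Count word -> next-word transitions in the training text."""
    words = text.split()
    model = defaultdict(Counter)
    for a, b in zip(words, words[1:]):
        model[a][b] += 1
    return model

def generate(model, start, max_len):
    """Greedily follow the most likely next word at each step."""
    out = [start]
    for _ in range(max_len):
        successors = model.get(out[-1])
        if not successors:
            break  # no known continuation
        out.append(successors.most_common(1)[0][0])
    return " ".join(out)

# One document, each word unique -> every transition is unambiguous.
doc = "a single specific question often matches exactly one reference in the training data"
model = train_bigrams(doc)
print(generate(model, "a", 20))
# -> "a single specific question often matches exactly one reference in the training data"
```

With only one source in the "training set," greedy decoding has nothing to blend, so the output is a verbatim copy of the input. Real LLMs are vastly more complex, but the intuition — a sufficiently narrow context leaves one dominant continuation — is the same.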
None of it is A.I., of course. It's machine learning, which is, by definition, copying: "When I see this, put that next to it." Of course they should pay the creator of "that".
Do we charge human authors a license to all the works they ingest? And require them to cite their sources?
Not, I think, for fiction, and the latter mainly for verification, not credit.
Flagrant plagiarism is dealt with when it happens; not in case it might.
At this point, my immediate reaction to LLM engines scraping my publication's stuff is "fuck you, pay me." I never gave consent, and I refuse to grant it to simpering tech bros who sneer that they're "going to put us out of business."
I feel like actually producing true and accurate references is still extremely hard.
I mean, they can fully produce copies of things, but from its point of view it's not referencing any particular thing.
(And you may have seen what happens when you ask for references *in* the output.)
As far as I can tell, Google didn't copy there; it's just a link to Quora. This could be a presentation issue -- I did the experiment in Safari on macOS, and it certainly did show the link.
This has been my experience with AI-generated source code. The AI-generated solution is usually a direct copy of the top Stack Exchange hit. (When it isn't, it's often unusable and will hallucinate things like nonexistent packages or else it mixes up languages.)
The fact that this is how it operates and this is literally all it can do, despite the obvious total loss of context, should be a warning to the people buying in on it. And yet!
OpenAI argues in their copyright defense that getting it to spit out rote copies of training data requires the user to “hack” the platform in violation of the terms, but it happens a lot and it’s trivially easy to induce.
I'm positive lawyers are working on how they can get AI classified as a person so all the scraping can be called education so they don't have to pay anyone shit.
It's because aside from the labor-saving push from the top, most enthusiasm for AI comes from a fundamental disdain for expertise.
Citing sources, providing compensation, such gestures of considering the people who actually produce things ruins the experience of mindless consumption for them.
my understanding has been that fair use ultimately permits this. though feel free to argue with Masnick for the next 48 hours about it, I think I might actually learn something compared to the usual interlocutors telling him that §230 makes Facebook a publisher