A bipartisan group of senators introduced a new bill to make it easier to authenticate and detect artificial intelligence-generated content and protect journalists and artists from having their work gobbled up by AI models without their permission.
The Content Origin Protection and Integrity from Edited and Deepfaked Media Act (COPIED Act) would direct the National Institute of Standards and Technology (NIST) to create standards and guidelines that help prove the origin of content and detect synthetic content, like through watermarking. It also directs the agency to create security measures to prevent tampering and requires AI tools for creative or journalistic content to let users attach information about their origin and prohibit that information from being removed. Under the bill, such content also could not be used to train AI models.
Content owners, including broadcasters, artists, and newspapers, could sue companies they believe used their materials without permission or tampered with authentication markers. State attorneys general and the Federal Trade Commission could also enforce the bill, which its backers say prohibits anyone from “removing, disabling, or tampering with content provenance information” outside of an exception for some security research purposes.
(A copy of the bill is in the article; here is the important part imo:
Prohibits the use of “covered content” (digital representations of copyrighted works) with content provenance to either train an AI-/algorithm-based system or create synthetic content without the express, informed consent and adherence to the terms of use of such content, including compensation)
Ladies and gentlemen of the jury, before you stands 8-year-old Billy Smith. He stands accused of training on copyrighted material. We actually have live video of him looking at and reading books from the library. He trained on the contents of over 100 books this year.
We ask you to enforce the maximum penalty and send his parents to prison.
I get what you’re saying, but there’s something of a difference between someone studying something for months or years then writing about it, and a language model run by one of the tech giants scraping media and immediately generating stuff from it, for commercial use, for the profit of the company that owns it.
It’s kinda like how plagiarising somebody’s book word for word never used to be a crime when it was a painstaking process of manually writing it back out for every copy. When the printing press came out, though? It allowed dodgy businesses to large-scale fuck over authors, and the law had to play catch-up.
I don’t actually think this proposal is that well thought out, but I also don’t think we should think of AI models or corporations as being people - they aren’t people, and they shouldn’t necessarily have the same rights and privileges that we do.
There are a lot of private individuals training models (LoRAs, DoRAs, etc.) / fine-tuning checkpoints and what have you.
Training models is not just giant tech corps anymore
I know, I have one running locally on my PC, it’s neat.
I still don’t think that changes my point, though - that a large AI model, particularly one that can scrape the whole web of any content it can find, then immediately be used to generate a practically infinite amount of content in seconds is very different to the idea of a little 8 year old in a library reading books then writing something himself.
And I still maintain that companies aren’t people and shouldn’t necessarily have the same rights as a person.
What about the images random people generate with software like DALL-E? Those are made from the same training data, and what this policy does is make media creation less accessible even though the technology exists. Also, copying a book word for word by hand isn’t/wasn’t plagiarism, it’s unlicensed duplication. Plagiarism would be changing just the proper nouns and pretending it’s a completely separate book.
No matter how much you’d like for it to be the case, proprietary algorithms owned by big corporations are not remotely comparable to children.
tell that to civitai users lol
If I gave you an arbitrary image from Midjourney and all of the training data from it, I doubt you could match it to the “source art.” AI images are usually transformative.
This, exactly. AI is generating new images. Oh whoop de do, they did it by mixing a bunch of pixels. As though making an image out of tiny photos isn’t literally the same thing and considered transformative. People just have a double standard about a program instead of a person doing it. (Except for that subset of online artists, they’re just berserk about copyright and credit in general.)
Which part of “an even worse and scummier form of plagiarism” did you not understand?
What part of “transformative” did you not understand?
Different scale, but just go on and defend your billion dollar industry, because “what if it was open source,” even though the open source community would never have the ability or the resources to train these models.
What are you talking about? The open source community has trained these kinds of models. They’re out there.
I honestly could not give less of a shit who’s training the models. I’m not gonna boycott C# because it was developed by Microsoft. There are open source implementations of generative AI that make use of freely-available models.
Thanks chatgpt
Pleased to take part in creating the scarcity-free future by letting hustle bros ruin art communities, and letting terminally online people create endless followups to Metropolis Pt. II instead of sending death threats to Dream Theater!