AI commitments and gaps

Deep dream generator: “Post-apocalyptical playground at which kids make their own rules.”

U.S. president Bidenś recently announced AI commitments agreed by US government with Amazon, Anthropic, Google, Inflection, Meta, Microsoft, and OpenAI.

Fun fact: Meta unleashed its Llama2 (even though there are questions on its openness) just before committing to protecting proprietary and unreleased model weights.

In any case; the USA has a totally different approach from the EU with their AI Act. These commitments provide a great opportunity to do an early check on how self-regulation in AI could shape-up.

There are three observations that stand out.

Vagueness

It has already been observed that most of said ‘commitments’ made by big tech are vague, generally non-committal, or confirmation of current practices.

Considering the success of the EU in getting big tech to change (e.g. GDPR, USB-C) I am convinced that in tech, strong legislation does not stifle creativity and innovation; but fuels it.

Data void

There are also notable omissions. The one that sticks out for me is the lack of commitment with respect to training data. And that at a moment that legal cases over data theft and copyright infringement are popping up in various places. In that context, Getty Images hopes that training on licensed content will become a thing.

Admittedly, discussions on data ownership are super interesting. But full clarity on the data going into foundational models (and the policies around it) would also sharpen the extent to which data biases may put model fairness and ethics at risk.

Content validation

By far the most interesting commitment is around identification of AI-generated content:

“The companies commit to developing robust technical mechanisms to ensure that users know when content is AI generated, such as a watermarking system“

Considering the expected amount of generated content, I expect not watermarking of AI-generated content (the vast majority of future data volumes) will be problematic.

And it also addresses the problem from the wrong side. In the future, the question will not be “What is fake?”, but rather “What is real?”

This points in the direction of watermarking of human-produced content to be the way forward. Think of an NFT for every photo you make with your smartphone of digital camera. I didn´t hear Apple, Samsung, or Nikon about this yet. But I wouldn´t be surprised if we see announcements in the near future.