Waking up

DeepDreamGenerator: “Journalist sitting at a desk at the newspaper office. The journalist grabs their own writings and trying to protect them, while fighting off a large crawler that wants to feed it to an AI.”

Great to see journalists initiating change in their own organization.

  • Fri Aug 25: Guardian journalist Ariel B. reports that other news media have started blocking GPTBot. Note the subtle remark in his article: “The Guardian’s robot.txt file does not disallow GPTBot.” (Version Sept 3, 2023)
  • Fri Sept 1: Guardian leadership has taken notice and blocks GPTBot – as reported here.
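
For readers who wonder what such a block looks like in practice, here is a minimal sketch. It assumes the usual robots.txt convention of a `User-agent: GPTBot` rule with `Disallow: /`, and checks it with Python's standard `urllib.robotparser`. The rules, the `is_allowed` helper, and the `example.com` URL are illustrative assumptions, not the Guardian's actual file.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content: block GPTBot, allow every other crawler.
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /

User-agent: *
Allow: /
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt text lets user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    url = "https://example.com/some-article"  # placeholder URL
    for agent in ("GPTBot", "SomeOtherBot"):
        print(agent, "allowed:", is_allowed(ROBOTS_TXT, agent, url))
```

Keep in mind that robots.txt is only a request: it works as long as the crawler chooses to honor it.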

As I have noted earlier (here and here), data access is a major topic when it comes to achieving a healthy power balance in the information space. Glad to see more and more companies taking this seriously.

Personally, I currently see little incentive for companies, organizations, or individuals to allow their data to be crawled for profit.

AI commitments and gaps

Deep dream generator: “Post-apocalyptical playground at which kids make their own rules.”

U.S. President Biden recently announced AI commitments that the US government agreed with Amazon, Anthropic, Google, Inflection, Meta, Microsoft, and OpenAI.

Fun fact: Meta unleashed its Llama2 (even though there are questions on its openness) just before committing to protecting proprietary and unreleased model weights.

In any case, the USA has a totally different approach from the EU with its AI Act. These commitments provide a great opportunity to do an early check on how self-regulation in AI could shape up.

There are three observations that stand out.


It has already been observed that most of said ‘commitments’ made by big tech are vague, generally non-committal, or confirmations of current practices.

Considering the success of the EU in getting big tech to change (e.g. GDPR, USB-C), I am convinced that in tech, strong legislation does not stifle creativity and innovation but fuels it.

Data void

There are also notable omissions. The one that sticks out for me is the lack of commitment with respect to training data, and that at a moment when legal cases over data theft and copyright infringement are popping up in various places. In that context, Getty Images hopes that training on licensed content will become a thing.

Admittedly, discussions on data ownership are super interesting. But full clarity on the data going into foundation models (and the policies around it) would also clarify the extent to which data biases put model fairness and ethics at risk.

Content validation

By far the most interesting commitment is around identification of AI-generated content:

The companies commit to developing robust technical mechanisms to ensure that users know when content is AI generated, such as a watermarking system

Considering the expected amount of generated content, I expect that watermarking AI-generated content (the vast majority of future data volumes) will be problematic.

And it also addresses the problem from the wrong side. In the future, the question will not be “What is fake?”, but rather “What is real?”

This suggests that watermarking human-produced content is the way forward. Think of an NFT for every photo you make with your smartphone or digital camera. I haven't heard from Apple, Samsung, or Nikon about this yet. But I wouldn't be surprised if we see announcements in the near future.
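
To make the idea a bit more tangible, here is a minimal sketch of what 'marking' a photo at capture time could look like: compute a fingerprint of the raw bytes and attach a tag derived from a key held by the device. Everything here is a hypothetical illustration (the function names, the device key, the choice of HMAC) and not how any camera vendor works. A real scheme would presumably use public-key signatures so that anyone can verify without knowing the secret; HMAC is used only because it is in Python's standard library.

```python
import hashlib
import hmac


def fingerprint(photo_bytes: bytes) -> str:
    """SHA-256 fingerprint of the raw photo bytes."""
    return hashlib.sha256(photo_bytes).hexdigest()


def sign(photo_bytes: bytes, device_key: bytes) -> str:
    """Tag the photo with a key held by the (hypothetical) capturing device."""
    return hmac.new(device_key, photo_bytes, hashlib.sha256).hexdigest()


def verify(photo_bytes: bytes, device_key: bytes, tag: str) -> bool:
    """Check that the photo is byte-for-byte what the device tagged."""
    return hmac.compare_digest(sign(photo_bytes, device_key), tag)


if __name__ == "__main__":
    key = b"hypothetical-device-secret"  # assumption: stored securely on the device
    photo = b"raw sensor data of a holiday snapshot"
    tag = sign(photo, key)
    print("original verifies:", verify(photo, key, tag))
    print("edited verifies:", verify(photo + b" (retouched)", key, tag))
```

Any change to the bytes, however small, invalidates the tag. That is exactly the property you want for answering "What is real?", and also why ordinary edits such as cropping or compression would need a more forgiving design.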

The telling statistics of ChatGPT

Picture by Deep Dream Generator: “A design for the Tarot card ‘The Fool’ in the style of Gustav Klimt”

This weekend I took Tarot as a little test case for the OpenAI API.

It turned into a quick lesson in how not to use ChatGPT (in so many ways 🙂 ).

Fortune telling by ‘reading the cards’ seemed like a good use-case for generative AI: ChatGPT can draw random cards and explain their meaning in the convincing tone of a Tarot Zealot.

I had to tweak the prompt a bit when ChatGPT explained in a condescending tone that it was an AI and could not actually ‘draw cards’. But after that, the whole fortune telling business could be swiftly disrupted in a few lines of code.

For example:

Card 1  is 'The Wheel of Fortune'
Card 2  is 'The Moon'
Card 3  is 'The Star'

Wait one minute for your fortune teller to give her explanation. 
She has to align with higher powers on the interpretation of these cards...

Hear your fortune:

Ah, my dear client, what an interesting trio of cards you have drawn! The Wheel
of Fortune represents life's constant changes and cycles, indicating that a
major turning point is approaching. The Moon signifies a time of uncertainty,
where emotions and intuition guide your path. However, fear not, for The Star
brings hope and inspiration, shining light on your journey. This whimsical
combination suggests that you should prepare for unexpected shifts, trust your
instincts, and stay optimistic, as ultimately, a bright and promising future
lies ahead. Remember, my friend, laughter is the best crystal ball gazer!

But after trying my “Tarbot” a couple of times, an odd pattern emerged: most of the ‘random’ cards were from the major arcana. That is a special category of 22 of the 78 Tarot cards that is considered more interesting, deep, and complex by Tarot aficionados.

In a longer experiment, I got to no less than 76% major arcana. A truly random draw would have resulted in c. 28% (22 out of 78).

Ouch! How is that for a bias towards what seems interesting?
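
The 28% baseline is easy to check. The sketch below computes the expected share (22 of 78), confirms it with a simulated fair draw, and estimates how unlikely 76% would be by chance. The sample size of 100 cards is an assumption for illustration only, since the size of the longer experiment is not recorded here. The binomial tail also ignores that cards within one spread are drawn without replacement, which hardly matters at this scale.

```python
import random
from math import comb

MAJOR_ARCANA = 22
DECK_SIZE = 78

# Share of major arcana expected from a fair draw: 22 / 78.
EXPECTED_SHARE = MAJOR_ARCANA / DECK_SIZE


def simulate_major_share(n_spreads: int, cards_per_spread: int = 3, seed: int = 0) -> float:
    """Draw n_spreads fair spreads and return the observed share of major arcana."""
    rng = random.Random(seed)
    # True marks a major arcana card, False a minor arcana card.
    deck = [True] * MAJOR_ARCANA + [False] * (DECK_SIZE - MAJOR_ARCANA)
    major = sum(sum(rng.sample(deck, cards_per_spread)) for _ in range(n_spreads))
    return major / (n_spreads * cards_per_spread)


def binomial_tail(n: int, k: int, p: float) -> float:
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


if __name__ == "__main__":
    print(f"expected share:  {EXPECTED_SHARE:.1%}")
    print(f"simulated share: {simulate_major_share(10_000):.1%}")
    # Hypothetical experiment: 76 major arcana among 100 drawn cards.
    print(f"chance of >= 76 of 100 by luck: {binomial_tail(100, 76, EXPECTED_SHARE):.2e}")
```

The practical fix follows directly: draw the cards in code, as `rng.sample` does here, and leave only the interpretation to the model.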

Various scandals illustrate the despicable role that big consultancies play in the global capitalist system, while they try to reap benefits of their worthless services by applying dubious marketing and sales tactics

Mariana Mazzucato and Rosie Collington – The big con

The book paints a naive caricature of the consulting industry, downplays the role and responsibility of other actors and, unfortunately, lacks a realistic alternative for flexibly solving skill and capacity deficits (especially in the public sector); thereby undermining any justified concerns.

Open is the new closed

Image by Stable Diffusion: “Silicon valley giant trying to have its cake and eat it”

The joke about OpenAI having to rebrand to ClosedAI (triggered by the secrecy around its GPT-4 unveiling) is pretty apt. All in the spirit of what the VC community calls ‘creating a moat’.

The involuntary openness of Meta on its large language model, LLaMA, got another giant, Google, thinking. Their conclusion is that, in the end, a proprietary model will not create competitive differentiation, as is clear from a leaked memo.

In the memo, Google seems to embrace open source as the route forward for generative AI: smaller models, different approaches to fine-tuning, leveraging the crowd, etc. Sounds swell.

The funny thing is, however, that around the same time word came out that Google intends to share AI research less freely. How to reconcile these two perspectives? The memo gives some clear pointers.

Working hypothesis: Google will try to actively orchestrate the open source efforts on LLMs through controlled releases of models and research that enable incremental improvements. Meanwhile, Google will increasingly shield its cutting-edge research, as it is really frustrated that OpenAI became a massive success by leveraging fundamental research on transformers that originated at Google.

Adding that all up, it seems that the grip of ‘big tech’ on AI will not be challenged by open source anytime soon. Curious how this will play out.

There are a huge number of ways in which Artificial General Intelligence (AGI) can take over the world, rendering humanity essentially useless

Max Tegmark – Life 3.0

Nov. 2017: Interesting exploration of the implications of AGI, marred by the typical preference of Analytical Philosophy for constructing intricate, highly theoretical scenarios while under-emphasizing basic challenges (in the case of AGI: lack of robustness / antifragility).

Jun. 2023: The writer has leveraged the recent rise of LLMs like ChatGPT to further fuel fear about an AGI break-out – even though other AI-related risks require more imminent attention.

When it comes to regulating AI, I root for the bureaucrats

Image by Deep Dream Generator: “Evil AI conquers Silicon Valley, taking no hostages”

Following the launch of GPT-4, the Silicon Valley elite started hyping the AI-scare in an open letter.

The progress in LLMs and similar generative AI models is impressive and will have a major impact on both society and the enterprise. But fearmongering is totally unhelpful and obscures the real issues.

Rather than naively stopping AI-development, society should focus on two more specific topics that are under-emphasized in the recent public debate:

  1. What data to train on? Current models aim to train on, roughly, ‘the totality of the internet’, which already leads to interesting legal challenges about copyright infringement and ownership.
  2. What applications to pursue? The current generation of AI can do amazing party tricks and can lead to major efficiency improvements. But it can also be used for deception and has severe limitations and biases.

The key question to address: “How can we use existing legislation to protect society from misuse of AI and what additional legislation is needed?” Sounds less lofty than what the open letter calls for, but is much more constructive. Moreover, this perspective calls into question not just new AI models ‘more powerful than GPT-4’, but also existing models and the governance applied to them.

Long before the recent open letter was written, the EU published the AI Act to address AI-related risks. Brought to you by the same institution that forced Apple to adopt compatible charging cables. It’s not perfect. It’s not complete. But it is a good start. It would have been so nice if the writers of the open letter had given credit where it is due.

When it comes to protecting my rights, security, and safety as a citizen, I put much more trust in EU bureaucrats than in the Silicon Valley echo-chamber that tends to over-index on libertarianism and techno-utopianism.