Open is the new closed

Image by Stable Diffusion: “Silicon valley giant trying to have its cake and eat it”

The joke about OpenAI having to rebrand to ClosedAI (triggered by the secrecy around its GPT-4 unveiling) is pretty apt. All in the spirit of what the VC community calls ‘creating a moat’.

The involuntary openness of Meta on its large language model, LLaMA, got another giant, Google, thinking. Their conclusion is that, in the end, a proprietary model will not create competitive differentiation, as is clear from a leaked memo.

In the memo, Google seems to embrace open source as the route forward for generative AI: smaller models, different approaches to fine-tuning, leveraging the crowd, etc. Sounds swell.

The funny thing is, however, that around the same time word came out that Google intends to share AI research less freely. How to reconcile these two perspectives? The memo gives some clear pointers.

Working hypothesis: Google will try to actively orchestrate the open source efforts on LLMs through controlled releases of models and research that enable incremental improvements. Meanwhile, Google will increasingly shield its cutting-edge research, as it is really frustrated that OpenAI became a massive success by leveraging fundamental research on transformers that originated at Google.

Adding that all up, it seems that the grip of ‘big tech’ on AI will not be challenged by open source anytime soon. Curious how this will play out.

When it comes to regulating AI, I root for the bureaucrats

Image by Deep Dream Generator: “Evil AI conquers Silicon Valley, taking no hostages”

Following the launch of GPT-4, the Silicon Valley elite started hyping the AI-scare in an open letter.

The progress in LLMs and similar generative AI models is impressive and will have a major impact on both society and the enterprise. But fearmongering is totally unhelpful and obscures the real issues.

Rather than naively stopping AI-development, society should focus on two more specific topics that are under-emphasized in the recent public debate:

  1. What data to train on? Current models aim to train on, roughly, ‘the totality of the internet’, which already leads to interesting legal challenges about copyright infringement and ownership.
  2. What applications to pursue? The current generation of AI can do amazing party tricks and can lead to major efficiency improvements. But it can also be used for deception and has severe limitations and biases.

The key question to address: “How can we use existing legislation to protect society from misuse of AI and what additional legislation is needed?” Sounds less lofty than what the open letter calls for, but is much more constructive. Moreover, this perspective calls into question not just new AI models ‘more powerful than GPT-4’, but also existing models and the governance applied to them.

Long before the recent open letter was written, the EU published the AI Act to address AI-related risks. Brought to you by the same institution that forced Apple to adopt compatible charging cables. It’s not perfect. It’s not complete. But it is a good start. It would have been so nice if the writers of the open letter had given credit where it is due.

When it comes to protecting my rights, security, and safety as a citizen, I put much more trust in EU bureaucrats than in the Silicon Valley echo-chamber that tends to over-index on libertarianism and techno-utopianism.

21; or a different number

In COVID times, we are constantly bombarded with figures, statistics, and bold claims. How many people will die? On average, how long do patients stay in ICU? How many ICU beds will be needed? How long will it take to flatten the curve?

Some of these figures will turn out to be true. Others less so. And many will be structurally biased.

This is best illustrated by an innocent example: The most successful weatherman is not the one who makes the most accurate prediction of rain or shine, but the one who predicts rain a bit too often. No-one will blame him when it turns out to be a sunny day. A faulty prediction of sunshine will be less appreciated by his audience.

Similarly, it is best to over-estimate the number of ICU beds you need two weeks from now. That is just proper risk management.
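The reasoning behind the weatherman and the ICU planner can be made concrete with a small sketch. It assumes three made-up demand scenarios and made-up per-bed costs (none of these numbers come from real data), and simply picks the capacity with the lowest expected cost. When a missing bed is far more costly than a spare one, the "best" figure to announce sits above the average.

```python
# Minimal sketch of forecasting under asymmetric costs.
# All scenarios and cost figures are illustrative assumptions.

def expected_cost(capacity, scenarios, cost_short, cost_spare):
    """Expected cost of planning `capacity` beds over (demand, probability) pairs."""
    total = 0.0
    for demand, prob in scenarios:
        shortfall = max(demand - capacity, 0)  # beds we are missing
        spare = max(capacity - demand, 0)      # beds standing empty
        total += prob * (cost_short * shortfall + cost_spare * spare)
    return total


def best_capacity(scenarios, cost_short, cost_spare):
    """Brute-force search over every capacity between the lowest and highest demand."""
    demands = [demand for demand, _ in scenarios]
    candidates = range(min(demands), max(demands) + 1)
    return min(
        candidates,
        key=lambda c: expected_cost(c, scenarios, cost_short, cost_spare),
    )


# Hypothetical ICU demand two weeks from now: (beds needed, probability).
scenarios = [(80, 0.25), (100, 0.50), (120, 0.25)]
mean_demand = sum(demand * prob for demand, prob in scenarios)

# If a missing bed and a spare bed hurt equally, plan for the middle scenario.
symmetric_plan = best_capacity(scenarios, cost_short=1, cost_spare=1)

# If a missing bed is ten times worse than a spare one, plan for the worst case.
cautious_plan = best_capacity(scenarios, cost_short=10, cost_spare=1)

print(mean_demand, symmetric_plan, cautious_plan)
```

Under these assumptions the cautious plan lands on the highest scenario rather than the average. The published number reflects the cost structure of whoever publishes it, not only the underlying distribution.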

Just be aware: correctly interpreting the figures you see in the news may require game theory as much as it requires statistics.

V for Variance

Turn data driven decision making into continuous learning

Most humans dislike change. And continuously ongoing change is even worse.

What does this mean for data driven decision making?

First of all, you can count on a lot of resistance when you roll out AI-driven solutions: Can the models be trusted? Is my professionalism still valued? Will I lose my autonomy? However important to address, managing such concerns is not my topic here.

Suppose that you have made it to a full roll-out unscathed. Analytics drive key decisions. Benefits are measurable. Most likely, your decisions will become more structured. While predictive and prescriptive analytics can unlock great value, these come with a risk.

Stability.

Everyone in your organization will love it when little changes. That is, unless your company is truly digital. Stability will create the suggestion that everything is under control. That targets will be met and nothing can go wrong.

By contrast, statistical models live by change. They need to observe change to predict change. And that means that you should be consciously creating the variance you need to continue learning.

Luckily, there is a lot of change that occurs naturally. Customers change their ways. Stores do not execute recommendations. Suppliers’ price hikes are passed on to customers. You name it. Although it is a good start, this type of variance may be mightily skewed. Or not representative of what you want to learn. In other words, you most likely will need a different kind of variance. To turn data driven decision making into continuous learning, you need to have a strategy for conscious, targeted, and ongoing experimentation and testing.
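Why a model needs variance can be shown with a toy example. The sketch below assumes a hypothetical linear demand response to price (the `demand` function and all numbers are invented for illustration) and fits a plain least-squares slope. With a perfectly stable price there is nothing to learn from; with a small, deliberate price test the sensitivity becomes estimable.

```python
# Minimal sketch: a model cannot learn the effect of something that never changes.
# The demand function and price lists are illustrative assumptions.

def fit_slope(xs, ys):
    """Least-squares slope of ys on xs, or None when xs has no variance."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    var_x = sum((x - mean_x) ** 2 for x in xs)
    if var_x == 0:
        return None  # no variation in x, so the slope is not identifiable
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return cov_xy / var_x


def demand(price):
    """Hypothetical 'true' market response, unknown to the modeller."""
    return 200 - 8 * price


# Stable regime: the recommended price is executed perfectly, every week.
stable_prices = [10.0] * 6
stable_slope = fit_slope(stable_prices, [demand(p) for p in stable_prices])

# Experimentation regime: small, deliberate deviations around the same price.
test_prices = [9.0, 9.5, 10.0, 10.0, 10.5, 11.0]
test_slope = fit_slope(test_prices, [demand(p) for p in test_prices])

print(stable_slope, test_slope)
```

In this noise-free toy setting the stable regime yields no estimate at all, while the test regime recovers the assumed sensitivity. Real data is noisy, so a real experiment would need more observations and a proper design, but the point stands: no variance in, no learning out.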

Remember, remember: learn to love the unexpected.

The Data’s Advocate

How to ensure that insights change business decisions

The Board has decreed that you have to become a data-driven organization. To avoid obsolescence, things need to change. The old way of doing business is no longer viable. Only Data can make you smarter.

So, there you go. An Analytics department is set up. A Big Data platform is put in place. Data scientists are hired. Models are fitted. Insights flood the organization. And after a while all graphs start pointing to the top-right corner. Right?

Nope: Sales wants to sell, Operations wants to operate, and Marketing wants to do whatever it is that Marketing wants to do. No-one ever wants a proper analysis. Especially if the outcome is likely to challenge the status quo. There is a shop to run, a client to manage, and a problem to fix. If Analytics does not directly help to do just that, it is deemed useless. And everyone who does not get that, frankly, does not understand the business. That’s how, in many organizations, Data is side-tracked.

The first priority is, of course, to ensure that your insights are relevant and focus on improving key business decisions. But that is not enough. These insights should actually change the decisions your organization makes. And frankly, many business owners are unable to take an impartial perspective with respect to unexpected challenges – especially when under pressure.

What is needed is someone who can ensure the fact-based perspective is taken into account in decision making.

Someone needs to defend data-driven insights without looking for compromises from the start. Someone needs to be in a position to challenge Category Management on their Sales v. Margin trade-offs. Someone needs to hold firm on the risk assessment in the face of an exciting Sales opportunity.

In short: you need a Data’s Advocate.

That is not to say that Data should always prevail over ‘Gut’, ‘Experience’, ‘Entrepreneurship’, or whatever you call it. But the trade-off should be an explicit one. Both to ensure better decisions, and to build awareness of what it means to be a data driven organization.