Stephen Witt – The thinking machine
When reading a biography that is too current: remember to skip the final chapter, because it will inevitably be speculative and outdated.
It’s always tricky… claiming to be comprehensive. In particular where it concerns LLMs.
And that’s where the paper Decoding Trust [..] stumbles. Right in the title it claims “A Comprehensive Assessment of Trustworthiness in GPT.” Nonetheless, after reading about this research on one of my favorite blogs, I decided to take a closer look.
The authors propose a framework with eight perspectives on trustworthiness:
They then develop this framework into a benchmark for GPT models and present empirical results for GPT-3.5 and GPT-4.
Although the results are interesting, there are some concerns with this type of benchmark approach.
On the positive side, the paper offers organizations plenty of inspiration for shaping their own approach to testing GenAI for trustworthiness. Even if it is not comprehensive, a framework like this is a massively useful and important starting point.