Content with AI content? Not so much.

Feb 6

First published on LinkedIn on January 27, 2023.

Dogs and kids chase birds and other animals. Grownups don't. Somewhere, growing up, we were taught it just wasn't nice to hurt other species or other people. It's time to refresh that lesson, as we consider leveraging AI to help our organizations build content and code.

Because with ChatGPT, the millions of us who've signed up to use it in the past couple of months are actively teaching it as we go. And I'm pretty sure most of us don't realize that nuance as we "test out" or "play with" the capabilities of this massively impressive, yet ethically naive language learning system: taught, tuned and trained for years by very smart engineers, and now turned out into the public for further tuning, training and use.

In a nutshell (and I'm over-simplifying these engineering feats), generative AI platforms like ChatGPT learn by consuming vast collections of language from existing, human-generated content, drawing on patterns they’ve identified in the content they've been fed.

These patterns are why they can write in the style of, say, Hemingway or Shakespeare, having "studied" their works (as only AI can), along with every nuance written about those works in Wikipedia and other online resources posted before about 2021.

A fascinating interview with Yann LeCun, Meta's chief AI scientist, noted:

“OpenAI's program has, moreover, made extensive use of a technique called reinforcement learning through human feedback, which gets human agents to help rank output of the machine in order to improve it, much like Google's Page Rank for the web.”

In every piece of (admittedly) magical content created in 30 seconds or less, there might lie implicit biases, along with errors that those of us not specifically trained in a field won't recognize. And given the sheer volume of works by better-known sources, bias toward them over lesser-known sources is going to happen quite naturally.

So there's that.

It's early days.

As we marketers consider using AI to augment our work, let's use it wisely. Know that you're bound to need to fact-check, edit to make it your own, and stand by your content once it's published. Are you ready? It doesn't always work the way we think it does.

Just ask CNET, which used its own AI engine to publish financial services content on CNET Money, and withdrew the test (and the posts) when relatively simple factual (even mathematical) errors were found.

"We've paused and will restart using the AI tool when we feel confident the tool and our editorial processes will prevent both human and AI errors," Connie Guglielmo wrote on CNET.

Good for them: for testing, for pausing their work to involve more humans in editorial and citation processes, and for sharing their discoveries (good and bad) along the way.

Sidebar: I'd be very interested to hear from PR firms that are prepared today to address this kind of misinformation after the fact. PLEASE ping me, let's talk.

Then there's another potential training issue, that of leaking your own company's content or intellectual property into the public domain through an AI engine. Amazon is busy managing that risk right now, as they've begun to find text generated by ChatGPT that "closely resembles" internal company data.

Given the promise of AI to help engineers write code, it's not surprising that IP is slipping innocently into the public domain. What's sad is that - in a sophisticated organization like Amazon - the folks who manage risk and the folks who train on ethics (in two different areas of any organization) are playing catch up right now.

I think of all the people reading and writing about the magic of AI without the resources of a large organization or great PR firm to protect them, and can only imagine it'll be merely weeks or months before someone without systems of support will be legally at risk for content they've posted without proper curation. And when it happens, will it be their fault?

I found it interesting that big tech companies (Google, Meta et al.) have been busy developing AI for decades, and haven't yet made a sustained release of their products into the public domain like scrappy OpenAI did with theirs. This passage from a fascinating article in the Washington Post (subscription may be required) stood out to me:

“Some AI ethicists fear that Big Tech’s rush to market could expose billions of people to potential harms — such as sharing inaccurate information, generating fake photos or giving students the ability to cheat on school tests — before trust and safety experts have been able to study the risks. Others in the field share OpenAI’s philosophy that releasing the tools to the public, often nominally in a ‘beta’ phase after mitigating some predictable risks, is the only way to assess real world harms.”

Finally, there have been instances of ChatGPT passing a Wharton MBA exam, getting technical interview questions right, and being used to help people flirt better on dating sites and set themselves apart.

Disclosures and more resources:

I've been using ChatGPT myself (not for public consumption), and have truly enjoyed learning more about the history of AI, which stretches back to the 1950s. I support its use.

But more importantly, I support the preparation companies of all sizes will need to make to educate their employees on the use of AI. Marketers, attorneys and HR, caveat emptor! LMK how I might help. (I addressed this kind of thing when social media was a brand new tool for businesses to use.)

History of ChatGPT - thank you, TLDR newsletter, for the archived copy of the Fortune article.
Article on AI content detectors in ZDNet — probably more on this topic later.

Janet Johnson
