By now, you may have heard of generative artificial intelligence (AI) tools or put them to the test yourself. Tools like ChatGPT, Cohere, and DALL-E 2 are popular because they allow organizations to generate images, text, sounds, and other creative content from a prompt. While these tools can provide practical benefits such as improved efficiency and productivity, they also raise privacy risks that are important to mitigate.
In simple terms, generative AI tools create new content using a model, such as a large language model, that has been trained on a very large set of data. The content is created in response to the user’s input prompt, in a manner akin to autocomplete text prediction tools, which predict the next word in your sentence based on the words that precede it. While most of the dataset for tools such as ChatGPT comes from the internet, many generative AI tools explicitly state that user input may also be used to train the model, in which case that input becomes part of the training dataset.
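The autocomplete analogy can be made concrete with a toy sketch. The Python below is an illustration only, not how ChatGPT or any real tool is built: it counts which word follows which in a small training text and predicts the most frequent follower. The function names, the sample text, and the sample prompt are all hypothetical. It also shows, in miniature, why it matters that user input may be added to the training data.

```python
from collections import Counter, defaultdict


def train_bigram_model(corpus):
    """Count, for each word in the training text, which words follow it."""
    follow_counts = defaultdict(Counter)
    words = corpus.lower().split()
    for current_word, next_word in zip(words, words[1:]):
        follow_counts[current_word][next_word] += 1
    return follow_counts


def predict_next(model, word):
    """Return the most frequently seen follower of `word`, or None if unseen."""
    followers = model.get(word.lower())
    if not followers:
        return None
    return followers.most_common(1)[0][0]


# Hypothetical "public" training text.
public_text = "the cat sat on the mat and the cat slept"
model_public = train_bigram_model(public_text)
print(predict_next(model_public, "the"))  # cat
print(predict_next(model_public, "is"))   # None

# Hypothetical user prompt that gets folded into the training data.
user_prompt = "my account number is 12345"
model_with_input = train_bigram_model(public_text + " " + user_prompt)
print(predict_next(model_with_input, "is"))  # 12345
```

In this toy version, once the user's prompt is part of the training text, the model can reproduce a detail from it. Real models are far larger and more complex, but the sketch shows why the use of prompts as training data raises privacy questions.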