February 10, 2023
ChatGPT vs. GDPR – what AI chatbots mean for data privacy
Although OpenAI’s ChatGPT is taking the large language model space by storm, there is much to consider when it comes to data privacy.
If you’ve been browsing LinkedIn in recent weeks, you’ve almost certainly seen a number of opinions about ChatGPT. Developed by OpenAI, which also created generative AI tools such as DALL-E, ChatGPT uses a large language model trained on billions of data points from across the Internet to respond to questions and instructions in a way that mimics a human response. Those who have interacted with ChatGPT have used it to explain scientific concepts, write poetry and produce academic essays. However, as with any technology that offers new and innovative opportunities, there is also serious potential for exploitation and data privacy risks.
ChatGPT has already been accused of spreading misinformation by answering factual questions in misleading or inaccurate ways, but its potential use by cybercriminals and bad actors is also a huge cause for concern.
ChatGPT and the GDPR
The method OpenAI uses to collect the data on which ChatGPT is based has yet to be disclosed, but data protection experts have warned that obtaining training data by simply trawling the Internet is unlawful. In the EU, for example, scraping data points from sites may violate the GDPR (and UK GDPR), the ePrivacy Directive and the EU Charter of Fundamental Rights. A recent example is Clearview AI, which built its facial recognition database using images scraped from the Internet and was consequently served with enforcement notices by several data protection regulators last year.
Under the GDPR, people also have the right to request that their personal data be completely removed from an organization’s records, through what is known as the “right to erasure.” The problem with natural language processing tools like ChatGPT is that the system takes in potentially personal data, which is then turned into a kind of “data soup”, making it impossible to extract an individual’s data.
Therefore, it is not at all clear whether ChatGPT is GDPR-compliant. It does not seem transparent enough, personal data may be collected and processed unlawfully, and data subjects would likely find it difficult to exercise their rights, including the right to be informed and the right to erasure.
The technical risks of ChatGPT
Because ChatGPT is an open tool, the billions of data points on which it is trained are effectively made accessible to malicious actors, who can use this information to carry out any number of targeted attacks. One of ChatGPT’s most worrisome capabilities is its potential to create realistic-sounding conversations for use in social engineering and phishing attacks, such as prompting victims to click on malicious links, install malware or give away sensitive information. The tool also opens up opportunities for more sophisticated impersonation attempts, in which the AI is instructed to impersonate a victim’s colleague or family member to gain trust.
Another attack vector may be to use machine learning to generate large volumes of automated, legitimate-looking messages to spam victims and steal personal and financial information. These types of attacks can be very damaging to businesses. For example, a payroll-diversion Business Email Compromise (BEC) attack, which combines impersonation and social engineering tactics, can have huge financial, operational and reputational consequences for an organization, and some malicious actors will see ChatGPT as a valuable weapon for both.
A force for good?
Fortunately, it’s not all doom and gloom: large language models like ChatGPT also have the potential to be a powerful cybersecurity tool in the future. AI systems with a nuanced understanding of natural language can be used to monitor chat conversations for suspicious activity and to automate the process of retrieving data for GDPR compliance. These automation capabilities and behavioral analysis tools can be used by companies in cyber incident management to speed up some of the manual analysis typically performed by professionals. Realistic generated conversations are also a great educational tool for cyber teams when used to create phishing simulations for training purposes.
Although it is too early to decide whether or not ChatGPT will become a favorite tool of cybercriminals, researchers have already observed code being posted on cybercrime forums that appears to have been created using the tool. As AI continues to evolve and expand, tools such as ChatGPT will indeed change the game for cybersecurity attackers and defenders alike.