People have been using the popular AI chatbot ChatGPT, developed by OpenAI, to help with everything from writing emails and term papers to finding bug fixes in code. That last use case is what has gotten a few Samsung employees in hot water.

Earlier this month, The Economist Korea reported three incidents in which Samsung Electronics corporate information was entered into ChatGPT. In one incident, an employee entered problematic source code and asked for a solution. In another, an employee entered program code and requested optimization. In a third, an employee submitted the contents of a meeting in order to prepare meeting minutes.

After the incidents were discovered, company executives and employees were notified and urged to be careful when using ChatGPT, because once data is uploaded to OpenAI's external servers it is impossible for Samsung to retrieve or delete it. Samsung Electronics is looking into protective measures to prevent future information leaks. If another leak occurs, access to ChatGPT may be blocked.

Cybersecurity leaders weigh in

“This incident isn't actually that surprising. People see ChatGPT as a useful tool, which it is, but forget that anything they share with it is also shared with all of its users. So any proprietary, mildly sensitive, or personal information needs to be scrubbed from interactions with the bot,” said Mike Parkin, Senior Technical Engineer at Vulcan Cyber. “While it's possible for companies to safely deploy tools that leverage the GPT engine in ways that replicate ChatGPT, they have to do it on their own instances, where the data's not shared outside the organization.”
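The kind of deployment Parkin describes generally means routing prompts to a model instance hosted inside the company's own environment rather than to the public ChatGPT site. The sketch below illustrates the idea in Python; the internal endpoint, token variable, and response schema are hypothetical placeholders, not any specific vendor's API.

```python
import os
import requests

# Hypothetical internal endpoint for a company-hosted GPT-style model.
# Prompts sent here stay within the organization's own infrastructure
# instead of going to the public ChatGPT service.
INTERNAL_LLM_URL = "https://llm.internal.example.com/v1/chat"


def ask_internal_model(prompt: str) -> str:
    """Send a prompt to the privately hosted model and return its reply."""
    response = requests.post(
        INTERNAL_LLM_URL,
        headers={"Authorization": f"Bearer {os.environ['INTERNAL_LLM_TOKEN']}"},
        json={"messages": [{"role": "user", "content": prompt}]},
        timeout=30,
    )
    response.raise_for_status()
    # Assumes the internal service mimics the common chat-completion
    # response shape; adjust to whatever schema the deployment exposes.
    return response.json()["choices"][0]["message"]["content"]


if __name__ == "__main__":
    print(ask_internal_model("Suggest a fix for this null pointer exception: ..."))
```

Whether that is a self-hosted model or a private cloud deployment, the design goal is the same: prompts, and any code pasted into them, never leave infrastructure the company controls.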

Parkin said the incidents at Samsung are an example of people failing Risk Management 101 by exposing sensitive information on an insecure channel.

“ChatGPT can be great for a lot of things, but people need to stop pretending that it's confidential or secure. It's not,” he said. “It's not an evil monster that'll destroy the world, but that doesn't mean it can't do some damage when you feed it information it really shouldn't have.”

“ChatGPT is probably the most exciting thing to come out in recent times,” said Bugcrowd Founder and CTO Casey Ellis. “The unfortunate downside is that an excited user is often also a less cautious one, and one that is more likely to make privacy- and security-related mistakes, both personally and on behalf of their employer.”

Ellis said similar behaviors have occurred with tools like GitHub, Stack Overflow, Reddit, and even search engines.

“This isn’t a new phenomenon in human behavior,” Ellis said. “With that in mind, I think it’s wise for security leaders to apply the idea of ‘building it like it’s broken’ and presuming that, at least somewhere inside their organization, incidents like the one at Samsung will take place. In terms of prevention, user awareness training is pretty valuable in these early stages, and I imagine that a fresh look at access controls will be on the radar of many organizations over the coming months.”

“As a general rule, if you wouldn’t share the content you’re typing into ChatGPT with someone who works for your company’s direct competitor, then do not share it with ChatGPT,” said Melissa Bischoping, Director of Endpoint Security Research at Tanium. “Even if you think the data you’re sharing may be generic and non-damaging, you should review any relevant policies with your legal and compliance teams to clear up any doubt. Companies have rapidly started rolling out acceptable use policies for AI/ML tooling if they didn’t have them already, so leaders should prioritize education, Q&A, and crystal-clear understanding of the risk and limitations.”

Bischoping said the issue with sharing data with ChatGPT is that its creators can see everything that is entered and may use it as the model continues to train and grow. In addition, once that information becomes part of the next model, it could be surfaced to another party.
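One practical way to act on that concern, alongside policy and training, is to screen prompts for obviously sensitive strings before they leave the corporate boundary. The following is a minimal, hypothetical sketch of such a pre-submission check in Python; the patterns and example values are illustrative only and would need to reflect an organization's own data-classification rules.

```python
import re

# Illustrative patterns only; a real deployment would rely on the
# organization's own classification rules and a vetted secrets scanner.
SENSITIVE_PATTERNS = {
    "email address": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.-]+\b"),
    "possible API key": re.compile(r"\b(?:sk|key|token)[-_][A-Za-z0-9]{16,}\b"),
    "internal hostname": re.compile(r"\b[\w-]+\.internal\.example\.com\b"),
}


def scrub_prompt(prompt: str) -> tuple[str, list[str]]:
    """Redact obviously sensitive strings and report what was found."""
    findings = []
    cleaned = prompt
    for label, pattern in SENSITIVE_PATTERNS.items():
        if pattern.search(cleaned):
            findings.append(label)
            cleaned = pattern.sub(f"[REDACTED {label.upper()}]", cleaned)
    return cleaned, findings


if __name__ == "__main__":
    draft = "Build fails on ci.internal.example.com, token sk-abc123def456ghi789 attached."
    cleaned, findings = scrub_prompt(draft)
    print(findings)  # e.g. ['possible API key', 'internal hostname']
    print(cleaned)
```

A check like this is no substitute for the acceptable use policies and education Bischoping describes, but it can catch the most obvious slips, such as pasting an API key or an internal hostname into a prompt.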

“As organizations want to enable use of powerful tools like ChatGPT, they should explore options that allow them to leverage a privately trained model so their valuable data is only used by their model, and not leveraged by iterations on training for the next publicly available model,” Bischoping continued. “Large language models are not going anywhere, but with great power comes great responsibility. Educate employees on what information is highly sensitive, how to treat that data with regard to humans or computer systems, and consider investing in a non-public model for your intellectual property’s protection.”