How Secure Is Your Data When You Use AI?

Jon Manderville
September 10, 2024

AI tools are reshaping our world, but with innovation comes a critical question—just how secure is the data you're trusting with these systems? As we rely more on AI, understanding how your personal and business information is protected has never been more vital.

When we consider our business, much of the information we might want to exchange with AI to gain that competitive advantage is private, sensitive, proprietary, or just personal in nature.

The General Data Protection Regulation (GDPR), the Personal Information Protection and Electronic Documents Act (PIPEDA), and the Protection of Personal Information Act (POPIA) are just a sample of the many pieces of legislation across territories. Together they mean we have to be sure where we are sending data, be transparent about it, and only use it where it makes sense to serve a customer need.

What about our financials? We might want insight into trends and patterns, but we most likely won't want that data living out on the internet for anyone to find or in the memory of an insecure LLM.

This is why we often only use AI for casual, non-specific tasks like drafting written text or research. But what if you WANT to utilise it for genuine insight, data processing, feedback loops and more with actual business or customer data? Is this possible? Is this safe or secure?

This blog post aims to dig into that question with four of the leading players in the AI market right now, plus one related Microsoft service worth a look.

TL;DR

You get what you pay for. Free services are much less careful with the data we give AI: they can use it to train their models and to provide answers to others. Paid tiers are generally pretty secure and don't share your data, so it's worth investing in the fees if you have a business return that outweighs them.

The important message, though, is to do your own research at the point of use to be satisfied that what the AI tools offer meets your own standards.

I've given my own scored assessment of each service below as a guide. (Disclaimer: this is a personal opinion based on my research. It is quite possible I have missed an important clause here or there, so do check for yourself.)

Why AI Data Security Matters

When you interact with an AI tool, you input queries, share documents, and receive generated content. You're sharing data with a software program.

Depending on the circumstance, that data could include confidential business information, customer details, or intellectual property. You naturally want to know if this interaction is secure and where the data you have just passed to the software is stored, analysed, or even shared with third parties.

I'll look at the following services to see how they handle this information - differentiating between their free and paid versions - to help you decide if you are happy with the leap from casual use into essential business processes using AI:

  • Claude.ai
  • Google Gemini
  • ChatGPT
  • Copilot
  • Azure OpenAI (Microsoft's OpenAI service)

Claude.ai (Anthropic)

FREE VERSION

Developed by Anthropic, Claude has a more privacy-focused approach than some competitors.

While Anthropic aims to promote ethical AI, free users should be cautious of the fact that inputs may be stored for analysis and/or used to improve the models where the responses are not up to scratch.

PAID SUBSCRIPTION

With Claude's paid or pro versions, privacy is taken seriously as a feature.

Anthropic gives assurances that your data will not be used for training or improving the model.

Anthropic have also committed to minimizing data storage and providing transparency into their practices, which adds an additional layer of security for businesses.

For the paid subscription, here's a summary of how Anthropic handle your information:

  1. Use Is Limited: Anthropic is prohibited from selling, using for advertising, or retaining customer data outside of its business relationship with the customer. This includes a prohibition on sharing data with third parties for monetary gain or behavioural advertising.
  2. "Sub-processors": While Anthropic may use sub-processors (third parties) to handle your data, it must enter into agreements with these sub-processors that maintain the same level of data protection. Customers are also notified of new sub-processors and can object if there are privacy concerns.
  3. Data Retention and Deletion: If you terminate your agreement, Anthropic must return or delete customer data within 30 days unless required by law to retain it.
  4. Transparency: Anthropic have to inform customers about their data handling practices, and customers can request information on how their data is being processed.
  5. Security Measures: Anthropic has implemented strong security protocols, including encryption and access controls, to safeguard customer data from unauthorized access or breaches.
  6. International Transfers: For international data transfers, Anthropic adheres to Standard Contractual Clauses, ensuring compliance with international data protection laws.

Anthropic's DPA establishes clear restrictions and safeguards around data processing. Your sensitive business data cannot be sold, used outside the scope of services, or disclosed to third parties without appropriate agreements. Sub-processors are held to the same standards, and there are clear protocols for data deletion. Overall, the risk of unauthorized sharing or retention of your data seems minimal under these guidelines.

OVERALL

Free: Not fully secure; there are specific scenarios, like legal compliance or third-party interactions, where your data could be shared or accessed. In general, though, even at the free tier there is no use of your data in training its models unless you give feedback that something isn't right (in which case it may be used to improve the model's future responses).

Paid: More secure, with commitments to not use inputs for training purposes. In my review I saw no evidence of their ability to use your data outside providing a service back to you (so no use in training its model or for others to use), which was somewhat reassuring.

Gemini (Google)

FREE VERSION

Google Gemini, part of Google's new AI suite, doesn't offer full data protection in its free version. Data shared with Gemini can be logged and used to refine Google's machine learning models. This means that sensitive or confidential data should not be shared with the free version unless you want to take that risk!

PAID (PART OF GOOGLE WORKSPACE)

For paid users, Google implements stricter data privacy standards. Google's terms state that data is kept private, but the clauses cascade deeper the further you dig, leaving it quite hard to be really sure in the end where your data can get to.

This is one where your own research is REALLY important in my opinion. 

While the paid subscriptions seem to refer to a similar level of security and safety as Claude.ai, there are some interesting notes in the free tier documentation:

  1. Data Collection: Google collects conversations, usage, location information, and feedback - apparently to improve its services and machine learning models.
  2. Retained Data: Conversations are linked to your Google Account and stored for 18 months. You can adjust this to 3 or 36 months if you like, but the important note is that it is stored.
  3. Human Review: Human reviewers can read and annotate your conversations to improve Google's AI products. Conversations are disconnected from your Google Account before review, but Google say confidential information should not be shared in conversations.
  4. Data Retention: Even if you delete your Gemini Apps Activity, conversations reviewed by humans are kept separately for up to three years. If you turn off Gemini Apps Activity, conversations are still saved for up to 72 hours for processing but won’t appear in your activity logs.
  5. Integration with Other Services: Gemini Apps can be integrated with other Google services, which may collect and use data according to their policies. If you interact with third-party services, they handle data according to their privacy policies.
  6. Managing Data: You can delete conversations, request content removal, or export your data. Settings and permissions can be adjusted in your Google Account.
  7. Optional Features: Using additional features (like Gems) may result in more data being collected, which is used to further improve AI, sometimes with human reviewer assistance.

For me, this means the free tier is an absolute no-go area for my business data. The paid tier still has a lot of cascading service agreements built in, which makes finding a definitive answer to the question of security quite a piece of research. I strongly recommend you dig deep with this one if you intend to use it.

OVERALL

Free: Not at all secure; humans can see data you have shared and use it to annotate and improve the models.

Paid: More secure, but still quite a minefield to be completely sure you are not somehow sharing important business information with either the model or third-party connected services.

ChatGPT (OpenAI)

FREE VERSION

ChatGPT's free version - one of the most widely used AI tools in the world - has always been quite transparent that its model will learn from your inputs.

Any data you share may be used to train and improve their AI models.

OpenAI makes it clear in their terms that inputs may be stored, so this version is far from ideal for sharing sensitive business information.

PAID ("PLUS" SUBSCRIPTION)

For paid users, OpenAI offers better privacy protection. They ensure that data input through the paid ChatGPT Plus plan is not used for training IF you opt out. There are higher paid tiers above the entry-level "Plus" plan where opting out is the default, but it is still worth a check.

So my advice is: remember to opt out if you want to enable it for business use.

OVERALL

Free: Your data is used to train the models and improve their responses.

Paid: Secure once you opt out of the model using your data.  

Copilot (Microsoft)

FREE VERSION

Free Copilot claims the same level of rigour as the paid versions: complete security for all of your data, and no use for training.

If this is true, then it genuinely is the safest one to use out there on both free and paid tiers. 

I have however just removed 0.5 of a point because... do I really trust this? (I should... but I will dig deeper anyway.)

COPILOT FOR MICROSOFT 365

Secure, ringfenced use of your data. This has always been the Microsoft statement of intent when connecting to the data within your tenant. But it comes at the highest price amongst all competitors, at around $300 USD per year per person using the tool (no monthly option available at the time of writing).

Copilot was the only service I used in this review which coherently replied to my question about how it handled my personal data. Here's the interaction (summarised for ease):

Me: Give me a statement about how you handle my sensitive business data as copilot

Copilot: I take the handling of your sensitive business data very seriously.

Your data is processed in a secure environment.

I do not store personal data or business information.

Each session is independent, and your data is not retained after our interaction ends

I adhere to strict data protection regulations and guidelines to ensure your information is handled appropriately.

Me: Is this true for the free version of copilot or only for the paid licence of copilot for microsoft 365

Copilot: The principles of privacy and data security apply to both the free and paid versions of Copilot. Regardless of the version you use, your sensitive business data is handled with the same level of care and protection

Me: Do you use my data to train in either circumstance

Copilot: No, your data is not used to train the model in either the free or paid versions of Copilot. Your interactions are processed in real-time to provide you with responses, but they are not stored or used for training purposes

OVERALL

Free: Apparently as secure as the paid versions! The only difference is that the paid Copilot for M365 has direct access to, and can cross-reference, your stored business information, so there's no cut and paste.

Paid: Secure and private according to Microsoft policies.  

Worth A Look: Azure OpenAI (Microsoft)

I've included this here as it is a viable option if you have a user interface which can connect and handle the chat side of matters for you. It is a little different to the others in this sense: you will be interfacing with the same engine as used for Copilot for Microsoft 365, but via an API rather than a browser-style interface, done through Azure as part of a subscription.
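
To make that concrete, here's a minimal sketch of what "interfacing via an API" looks like using the openai Python SDK's Azure client. The endpoint, deployment name, and API version below are placeholder assumptions; substitute the values from your own Azure subscription.

```python
import os

# The openai package (v1+) ships an AzureOpenAI client for exactly this.
from openai import AzureOpenAI

client = AzureOpenAI(
    azure_endpoint="https://YOUR-RESOURCE.openai.azure.com",  # placeholder: your Azure resource
    api_key=os.environ["AZURE_OPENAI_API_KEY"],               # keep keys in the environment, not code
    api_version="2024-02-01",
)

# Requests go to a deployment inside your own Azure subscription rather
# than a public chat interface, which is where the isolation applies.
response = client.chat.completions.create(
    model="my-gpt4-deployment",  # placeholder: the deployment name you created in Azure
    messages=[{"role": "user", "content": "Summarise the sales trends in this quarter's data."}],
)
print(response.choices[0].message.content)
```

Your own user interface sits in front of a call like this, so the chat experience is whatever you build, while the data exchange stays inside your subscription.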

The paid version of Microsoft's OpenAI service gives the same significant security upgrade that you get if you licence the use of Copilot for Microsoft 365 (because it is effectively using the same underlying architecture).

Data shared with the Azure OpenAI service (via the API) is isolated, meaning it isn't used for training the model.

Microsoft explicitly state that your data remains yours, and it is not stored or used outside your interaction with the model.

This means you can have a high level of confidence that your sensitive information remains in your own control and is not visible to anyone or anything else.

OVERALL

Paid: As secure as the Copilots! Still with 0.5 deducted, because do I ever truly believe this? (Such a sceptic.)

How To Approach Using AI With Your Data

There are a couple of clear leaders in the research above for me. Microsoft and Anthropic seem to display the greatest care for our concerns according to their policies so it is most likely I would reach for these first.

Irrespective of the choice, when using business data there are some strategies you can employ to limit your exposure even further, just in case you or I miss an important clause somewhere down the line.

Anonymize or substitute sensitive data:

If you must share data with AI tools, try either anonymizing it (removing email addresses or other specific, relatable data to protect confidentiality) or using a tool like Excel to substitute values according to a rule that makes sense to you. For example, if you are looking at products, use the first three letters or an ID instead of the full product description in the data set you send to the AI. A simple script can do the same job, as in the sketch below.
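
As a minimal sketch of that substitution idea (the file names and column headings here are hypothetical placeholders for your own data set), a short Python script can redact emails and swap product descriptions for stable IDs before anything is sent to an AI tool:

```python
import csv
import re

SOURCE = "sales_raw.csv"            # hypothetical input file
SAFE_COPY = "sales_anonymized.csv"  # the only file you share with the AI

EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")

def scrub(row, product_ids):
    """Mask email addresses and replace product names with short IDs."""
    row["customer_email"] = EMAIL_RE.sub("[REDACTED]", row["customer_email"])
    # Assign each distinct product a stable ID, keeping the mapping
    # locally so you can translate the AI's answers back afterwards.
    name = row["product"]
    product_ids.setdefault(name, f"PROD-{len(product_ids) + 1:03d}")
    row["product"] = product_ids[name]
    return row

product_ids = {}
with open(SOURCE, newline="") as src, open(SAFE_COPY, "w", newline="") as dst:
    reader = csv.DictReader(src)
    writer = csv.DictWriter(dst, fieldnames=reader.fieldnames)
    writer.writeheader()
    for row in reader:
        writer.writerow(scrub(row, product_ids))

# The product_ids mapping never leaves your machine.
```

The same rule-based substitution works in Excel with a lookup table; the point is that the mapping between real values and stand-ins stays under your control.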

Use Enterprise or at least paid tier versions:

Whenever possible, opt for the paid or enterprise versions of AI tools that offer stronger security assurances. You get what you pay for in most cases. 

Review data privacy policies:

Always read the data privacy terms of the AI tool to know exactly how your data is being used. If you intend to use business or sensitive data, read the terms in that context and assure yourself that you understand the data processing roles for that use case.

Implement internal guidelines:

Develop clear policies for AI tool usage within your organization. This means that you can do the research once, implement clear policies for what is ok and what is not and then set review timelines to keep up to date with changes. Catalogue your use cases if you need to provide greater transparency. 

Train employees:

As with a lot of good software use, education is an important tool for getting it right. Share the above, train and listen to feedback. 

Regular audits:

In respect of all of the guidelines above, review regularly. Things change. New policies and services will come online. With an open mind and a structured approach to how you onboard new services, you could be riding the new wave of AI rather than falling behind.

My Closing Opinion

Free AI tools definitely come with more risks when it comes to data security. They often use your inputs to refine their models. If you're handling sensitive or proprietary information, paid or enterprise versions are your best bet.

Couple any use, even paid, with a process wrapped around your business or personal data, and you create a partnership in your business that can genuinely unlock the value of AI without constantly revisiting this concern.

By following best practices and choosing the right tools for your needs, you can harness the power of AI while safeguarding your valuable data. Remember, with great power comes great responsibility (can I say that here without being sued by Spiderman?)

Disclaimer: Readers should verify all privacy claims via the service providers' official documentation for the most current data security details. This post does not constitute legal or evergreen advice regarding the data protection standards used by third parties.
