Is there anything I can do to prevent my company's paying customers from uploading the reports or data feeds they purchase from us into public or private LLMs once they have acquired the data?

VP of IT in Real Estate, 3 months ago
Contractual restrictions are the first thing that comes to mind for me. Next would be to embed some sort of watermark-style content in your data so it is easier to detect if your data appears in a public LLM. I'm not sure there is any way to know whether your data is used in a private LLM, as you would have no access to inspect it. In a Google search I made while considering this, I found a highly technical, PhD-thesis-style discussion of the problem here: https://www.chenwang.net.cn/publications/MeFA-TIFS22.pdf
Sr. Director, Enterprise Applications and IT Services, 3 months ago
Contractual restrictions are the only way. What is the actual concern? LLMs are session-guarded and stateless. This is no different than your data being uploaded to other public and private cloud services.
CISO in Finance (non-banking), 3 months ago

We are concerned about loss of IP and loss of business opportunities if information we put behind a paywall becomes publicly available. Contractual restrictions were the first thing that came to mind for me as well, but I'm interested in whether the peer community has any other suggestions. Thanks for the reply.

VP of IT in Finance (non-banking), 3 months ago
If the purchase agreement gives full ownership to the buyer, then they can use the data as needed.
Head of Demand to Value Data, Digital & Technology in Healthcare and Biotech, 3 months ago
It's impossible to stop completely, and of course the legal implications depend on your agreements, but you can do quite a few things to limit it, prevent it, or at least become aware when it is happening. To name a few:

1. Data Watermarking - embedded invisible markers for tracing to source
2. Encryption - so you need keys to decrypt and access the data (you can look into schemes, such as homomorphic encryption, that allow some data to be 'used' without handing over the decryption keys)
3. Access Controls / Audit Logging - admittedly more complex once you've 'sold' the data
4. Smart Contracts - using for example blockchain to support data usage policy activation and controls

I'm sure there are more. Also, depending on where you host the data, you can apply ML data controls to change or anonymize data when it's pulled down, monitor usage, etc.
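
As a rough illustration of the data watermarking idea in item 1, here is a minimal Python sketch. It does not use invisible markers; instead it appends one synthetic "canary" record per customer, derived with an HMAC so that only the vendor can regenerate it. Everything named here (`SECRET_KEY`, `canary_token`, `watermark_rows`, `identify_source`, and the `id`/`name`/`value` fields) is hypothetical and chosen for illustration only. It would only help trace a leak if the canary text resurfaces more or less verbatim; it does nothing to prevent the upload itself.

```python
import hashlib
import hmac

# Hypothetical secret held only by the data vendor (assumption for this sketch).
SECRET_KEY = b"vendor-secret-key"


def canary_token(customer_id: str, length: int = 12) -> str:
    """Derive a stable, customer-specific token from the vendor secret."""
    digest = hmac.new(SECRET_KEY, customer_id.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()[:length]


def watermark_rows(rows: list[dict], customer_id: str) -> list[dict]:
    """Return a copy of the rows with one synthetic canary record appended."""
    token = canary_token(customer_id)
    # The field names below are made up; a real feed would use its own schema.
    canary = {"id": f"REF-{token}", "name": f"Listing {token}", "value": 0}
    return [dict(r) for r in rows] + [canary]


def identify_source(leaked_text: str, customer_ids: list[str]) -> list[str]:
    """Return the customers whose canary token appears in the leaked text."""
    return [c for c in customer_ids if canary_token(c) in leaked_text]


if __name__ == "__main__":
    feed = [{"id": "A1", "name": "Report A", "value": 10}]
    delivered = watermark_rows(feed, "customer-42")
    # Simulate the delivered copy turning up somewhere public.
    leak = str(delivered)
    print(identify_source(leak, ["customer-41", "customer-42"]))
```

Because each customer receives a different canary, a match points to the copy that leaked, which can then be handled under the contractual restrictions discussed above.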
CISO in Finance (non-banking), 3 months ago
Thank you for your input, everyone!
