Why should we (not) authorize AI solutions' developers to train their AI models on our public data?
Principal Software Engineer, Data Engineering in Energy and Utilities · 3 months ago
It depends on what is expected of the AI model.
1) If the output must include both generic information and private information, training on public data helps; for the private information, RAG can be used.
2) If the output must avoid hallucinated results and is more org/user domain-specific, then RAG is the best approach for contextual grounding.
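The RAG approach described above can be sketched in a few lines. This is a toy illustration only: the document names, the keyword-overlap retriever, and the prompt template are all assumptions standing in for a real vector store and LLM call.

```python
# Minimal sketch of Retrieval-Augmented Generation (RAG) grounding.
# All data and scoring here are illustrative; a production system
# would use embeddings, a vector store, and an LLM completion call.

def score(query: str, doc: str) -> int:
    """Count query terms that appear in the document (toy retriever)."""
    terms = set(query.lower().split())
    return sum(1 for t in terms if t in doc.lower())

def retrieve(query: str, docs: list[str], k: int = 2) -> list[str]:
    """Return the k documents with the highest term overlap."""
    return sorted(docs, key=lambda d: score(query, d), reverse=True)[:k]

def build_prompt(query: str, docs: list[str]) -> str:
    """Ground the model's answer in retrieved private context."""
    context = "\n".join(f"- {d}" for d in retrieve(query, docs))
    return (
        "Answer using ONLY the context below.\n"
        f"Context:\n{context}\n"
        f"Question: {query}"
    )

private_docs = [  # hypothetical org-internal documents
    "Invoice 1042 was paid on 2024-03-01.",
    "The holiday policy grants 25 days leave.",
    "Server maintenance is scheduled for Friday.",
]
print(build_prompt("When was invoice 1042 paid?", private_docs))
```

The point of the pattern is that the private data never enters model weights: it stays in a retrievable store and is injected into the prompt at query time, which keeps the base model trainable on public data alone.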
Information Security Analyst in Government · a month ago
Good morning. Public data has data quality issues, and we need to take that into consideration when building any AI model. There are also unintended biases. We've taken the approach of sharing city data publicly via chatbots, while ensuring we have controls in place to review and limit specific responses to public queries. For example, we want users to focus on the scope of city services and data provided by city agencies, not news outside that scope. Start small and build incrementally.
Senior Director - Partner Solutions in Consumer Goods · a month ago
This is a complex question with an even more complex response, short of simply saying: it depends! Innovation and scientific and economic growth are direct factors that would benefit from allowing our public data to be trained on. But it is more complex, and the "it depends" comes in because . . .
- Let's say your data is public but contains some personal information that may be subject to data privacy laws: who will be responsible?
- Let's say there are copyright considerations in your public data: what are your expectations on fair use?
- If you are in the EU, the lifecycle of the data, once indexed, gets even more complicated with respect to GDPR.
Always take into account any privacy concerns. Public data often includes personal information about individuals, and allowing developers unrestricted access to it for training AI models can compromise people's privacy rights. Even if the data is anonymized, there is always a risk of re-identification through data linkage techniques.
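The re-identification risk mentioned above is easy to demonstrate. The sketch below uses entirely made-up data and a naive exact-match join, but it shows the mechanism: "anonymized" records can be linked back to named individuals when the remaining quasi-identifiers (here assumed to be zip code, birth year, and gender) are unique enough.

```python
# Toy illustration of re-identification by data linkage.
# All records are fabricated; real linkage attacks use the same idea
# at scale, often with fuzzier matching.

anonymized = [  # names removed, so nominally "anonymized"
    {"zip": "02139", "birth_year": 1985, "gender": "F", "diagnosis": "flu"},
    {"zip": "02139", "birth_year": 1990, "gender": "M", "diagnosis": "asthma"},
]

public_list = [  # a separate public dataset that still carries names
    {"name": "Alice", "zip": "02139", "birth_year": 1985, "gender": "F"},
    {"name": "Bob",   "zip": "02143", "birth_year": 1990, "gender": "M"},
]

QUASI_IDS = ("zip", "birth_year", "gender")

def link(anon_rows, public_rows):
    """Match anonymized rows to named rows on shared quasi-identifiers."""
    matches = []
    for a in anon_rows:
        hits = [p for p in public_rows
                if all(p[q] == a[q] for q in QUASI_IDS)]
        if len(hits) == 1:  # a unique match re-identifies the person
            matches.append((hits[0]["name"], a["diagnosis"]))
    return matches

print(link(anonymized, public_list))  # → [('Alice', 'flu')]
```

Only the first record links uniquely, but that is enough: Alice's diagnosis is exposed despite her name never appearing in the "anonymized" dataset. Defenses such as k-anonymity work by ensuring no quasi-identifier combination is that unique.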
Then there is the potential misuse of data. Developers might use public data for purposes that are not in the public interest or that go against ethical standards. There is potential for data to be used in ways that harm individuals or groups, such as discriminatory practices in AI decision-making. And how do we control it? Once public data has been released for training, restricting downstream use is difficult, which is why governance needs to be in place up front.
Then we have the point on responsible AI. Using public data without explicit consent raises ethical questions about fairness and justice. It may disproportionately benefit developers and tech companies without providing adequate benefits or protections to the individuals whose data is being used.
So yes, you can use public data, but build some guardrails into your framework so you don't have to struggle to justify it later on.
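The guardrail idea raised here, and the earlier point about limiting a city chatbot to in-scope queries, can be sketched as a simple pre-filter. This is a minimal sketch assuming a keyword allowlist; the topic list, refusal message, and `answer_fn` hook are all hypothetical, and a production system would use a trained classifier or LLM-based moderation instead.

```python
# Minimal sketch of a scope guardrail in front of a chatbot.
# The allowlist approach is illustrative only; real deployments
# typically combine classifiers, moderation APIs, and human review.

IN_SCOPE_TOPICS = {  # hypothetical city-service topics
    "permit", "trash", "recycling", "parking", "water", "library",
}

REFUSAL = ("I can only answer questions about city services. "
           "Please rephrase your question.")

def guarded_reply(query: str, answer_fn) -> str:
    """Answer only if the query touches an allowed topic."""
    words = {w.strip("?.!,").lower() for w in query.split()}
    if words & IN_SCOPE_TOPICS:
        return answer_fn(query)
    return REFUSAL

# Stand-in for the real chatbot backend.
bot = lambda q: f"Routing city-services question: {q}"

print(guarded_reply("When is trash pickup?", bot))
print(guarded_reply("Who will win the election?", bot))  # refused
```

Putting the check in front of the model, rather than relying on the model to refuse, makes the boundary auditable: you can log which queries were refused and tune the scope list without retraining anything.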