With the advent of LLMs being used more widely, what are your opinions on the introduction of biases into the pool of answers and solutions provided by these models? Will we require more validation of the answers provided by these models down the line? Do you think this marks the advent of new “AI validation” roles in the future?
Great point! Thanks Eda!
Right now, most commercial LLMs munge together the concept of "language" with what we can generally call "data" or "knowledge". So, we expect to put an entire universe of information inside the model. This works fine for slowly changing data with one level of access control, but most valuable scenarios don't fit into that category.
What's happening at a high level is the separation of "language" and "knowledge". Your data will sit next to the LLM and get orchestrated into it, so the model can give a citable answer grounded in that universe of data (see "The Co-pilot System": https://www.youtube.com/watch?v=E5g20qmeKpg).
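To make that separation concrete, here is a minimal sketch of the pattern: the knowledge lives outside the model and gets orchestrated into the prompt so the answer can cite its sources. The keyword-overlap retriever and the document names are stand-ins of my own invention; a real system would use a search index or vector store in their place, and the prompt would go to whatever LLM you've chosen.

```python
# Sketch of separating "language" (the model) from "knowledge" (your data).
# The retriever below is a toy keyword-overlap scorer standing in for a
# real search index or vector store; document names are made up.

def retrieve(question, documents, top_k=2):
    """Rank documents by word overlap with the question."""
    q_words = set(question.lower().split())
    scored = sorted(
        documents,
        key=lambda d: len(q_words & set(d["text"].lower().split())),
        reverse=True,
    )
    return scored[:top_k]

def build_prompt(question, documents):
    """Assemble a grounded prompt with numbered, citable sources."""
    sources = "\n".join(
        f"[{i + 1}] ({d['id']}) {d['text']}" for i, d in enumerate(documents)
    )
    return (
        "Answer using ONLY the sources below, citing them as [n].\n\n"
        f"Sources:\n{sources}\n\nQuestion: {question}\nAnswer:"
    )

docs = [
    {"id": "hr-policy.md", "text": "Employees accrue 20 vacation days per year."},
    {"id": "security.md", "text": "All laptops must use full-disk encryption."},
]
question = "How many vacation days do employees get?"
prompt = build_prompt(question, retrieve(question, docs))
# `prompt` would now be sent to the LLM of your choice; because each
# source is numbered and named, the answer can cite where it came from.
```

The point of the sketch is the shape of the orchestration, not the retriever: the model supplies the language, your data supplies the knowledge, and the citation markers make the answer auditable.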
One of the interesting outcomes of this separation is that the data is now your responsibility. So you don't have to worry about bias, because obviously your own data isn't biased! I'm joking, of course: bias is, and always has been, a part of datasets. The way we've historically mitigated that risk is to make sure humans are still in charge.
When humans cede control of customer-facing AI interactions...bias is the least of your problems, and I still don't think an AI validation role will help.
Great answer, thanks! In light of your answer, do you foresee less use of open-source LLMs in the corporate world, and instead a shift toward custom-built models that ingest their own "knowledge" as you described?
I think models are going to come and go. Right now the OpenAI GPT family is at the head of the table and will be for the foreseeable future, but that doesn't mean it always will be. So you should expect to replace the model every few months for the rest of your career.
Right now, the market wants big models (with lots of data in them), but soon we are going to realize that these open-source models cost about $100k per month in GPU costs to train and operate, and we are going to want small models, or hosted models where we pay per use. I think this is because the people working with them today are comfortable making models and living in a DIY world; once organizations start to orchestrate those resources the same way they do LOB developers, the paradigm will change. It will likely look something like this: https://youtu.be/FyY0fEO5jVY?si=r5d95pA_DXNkAk5A&t=1397
The whole video is good, but I've linked to the relevant part.
I think the model is going to be one small, replaceable part of the gen AI value chain. It's a difficult question to answer all up because it's about more than technology, and I fully expect to get flamed in the comments :) . Good luck!
A knowledge graph is an information-rich structure that provides a view of entities and how they interrelate. Expressing these relationships as a graph can uncover facts that were previously obscured and lead to valuable insights. You can even generate embeddings from the graph (encompassing both its data and its structure) for use in machine learning pipelines or as an integration point to LLMs. This helps solve major challenges with LLMs: the models are, by design, "black box" deep learning models, and as such they lack explainability and transparency. Knowledge graphs add the ability to be transparent, explicit, and deterministic, which is a huge plus for areas of application that demand these qualities. Equally, by training LLMs on an existing knowledge graph, solutions such as chatbots can respond to product and service questions without hallucinating. This allows for adoption of LLMs with greater context.
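To illustrate the "explicit and deterministic" point, here is a toy sketch of grounding a chatbot answer in a knowledge graph: facts are stored as (subject, predicate, object) triples, so every statement fed to the model is traceable to an auditable fact rather than recalled from opaque model weights. The product names and triples are invented for the example.

```python
# Toy knowledge graph as (subject, predicate, object) triples.
# Every fact handed to the LLM is explicit and deterministic:
# you can point to exactly which triple produced each statement.
# All entities and facts below are made up for illustration.

triples = [
    ("WidgetPro", "category", "power tool"),
    ("WidgetPro", "voltage", "18V"),
    ("WidgetPro", "compatible_with", "WidgetDock"),
    ("WidgetDock", "category", "charging station"),
]

def facts_about(entity, graph):
    """Return every triple whose subject or object is the entity."""
    return [t for t in graph if entity in (t[0], t[2])]

def to_context(facts):
    """Render triples as plain-text statements for an LLM prompt."""
    return "\n".join(f"{s} {p.replace('_', ' ')} {o}" for s, p, o in facts)

context = to_context(facts_about("WidgetPro", triples))
# A chatbot prompt would include `context`, constraining answers about
# WidgetPro to these explicit facts instead of the model's own recall.
```

A graph database and graph embeddings would replace the list of tuples in practice, but the principle is the same: the facts are external, enumerable, and auditable, which is what makes hallucination-free product Q&A plausible.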
Knowledge graphs also help manage the risk of bias that may arise from the data the foundation models were trained on. This protects adopters of these technologies from perpetuating and/or amplifying those biases in their own environments.