AI Magazine September 2024

MACHINE LEARNING

Because these systems are so data-hungry, the quality of data is sometimes overlooked in order to meet the quantity these systems need.

Although pre-processing of such data can occur, Brad argues it would be better to be more mindful of the source: “The quality of the written or spoken words on Reddit or social media streams isn’t always the best quality and therefore a good use of data.”

Poor, incomplete or biased training data can lead to skewed results, perpetuating existing inaccuracies, flaws and biases, or creating new ones.

Halting hallucinations

As NLP technology continues to evolve, new approaches are emerging to address these challenges.

One promising solution is known as retrieval-augmented generation (RAG). This technique aims to reduce hallucinations by grounding language models in verified information sources.
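To make the idea concrete, here is a minimal sketch of the retrieval step, assuming a tiny in-memory set of verified passages and a simple word-overlap similarity. The passages, the function names (`retrieve`, `build_prompt`) and the prompt wording are hypothetical and chosen for illustration only; production systems typically rely on learned embeddings and a dedicated search index, and the final call to a language model is left out.

```python
import math
import re
from collections import Counter

# Hypothetical "verified" enterprise passages, invented for this sketch.
DOCS = [
    "The refund window for enterprise customers is 30 days.",
    "Support is available on weekdays from 9am to 5pm.",
    "Invoices are issued on the first day of each month.",
]


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(count * b[token] for token, count in a.items() if token in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve(query, documents, k=2):
    """Return the k passages most similar to the query."""
    query_vec = Counter(tokenize(query))
    ranked = sorted(
        documents,
        key=lambda doc: cosine(query_vec, Counter(tokenize(doc))),
        reverse=True,
    )
    return ranked[:k]


def build_prompt(query, documents, k=2):
    """Place the retrieved passages ahead of the question to ground the answer."""
    context = "\n".join(f"- {passage}" for passage in retrieve(query, documents, k))
    return (
        "Answer using only the context below. "
        "If the answer is not in the context, say so.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {query}"
    )


if __name__ == "__main__":
    print(build_prompt("How long is the refund window?", DOCS, k=1))
```

The resulting prompt would then be passed to whichever language model is in use, so that its answer draws on the retrieved passage rather than on whatever it absorbed during training.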

“RAG is a practical way to overcome the limitations of general LLMs by making enterprise data and information available for LLM processing,” explains Bern. “It is essentially a way to allow targeted information to be retrieved (often via
