AI / ML
LLMs and NLP
Large language models like GPT-4 or Google's LaMDA use Natural Language Processing (NLP) to understand and respond to human-generated text inputs in a conversational manner. NLP is a subfield of AI that focuses on enabling computers to process and understand human language, and these models utilise NLP techniques to analyse and interpret the text input they receive, including tasks such as part-of-speech tagging, named entity recognition, sentiment analysis, and language modelling. These NLP techniques help related tools understand the meaning, context, and intent behind the text input, allowing them to generate relevant, coherent responses in a conversational style.
“GENERATIVE LANGUAGE MODELS CAN BE USED TO SUGGEST DATA QUALITY RULES AND TRANSFORMATIONS IN NATURAL LANGUAGE TEXT THAT BUSINESS STAKEHOLDERS CAN EASILY UNDERSTAND”
DAVIDE PELOSI, MANAGER, SOLUTIONS ENGINEERING, TALEND
quantify the severity of the issues. Then, based on the assessment results, generative language models can be used to suggest data quality rules and transformations in natural language text that business stakeholders can easily understand.”
From there, these proposed rules can be reviewed and validated by data quality experts and business stakeholders, who may accept or reject them or suggest modifications to better align with their business requirements.
“Businesses can also create additional Business Rules simply by asking in natural language, without needing development or complex UIs,” Pelosi adds. “For example, a business user might ask, ‘Please raise the acceptable age to drink alcohol to 18 and mark all the people not following the rule as not being targeted for the spring marketing campaign’, like we do today with Alexa. Once the rules are accepted, they can be converted into executable code, such as Python or SQL, using a similar, template-based approach.
“Of course, before deploying the code to production, it will need to be tested and validated using a sample of data to ensure the rules are working as expected and the data quality metrics are being met. But, once done, the cleaned data can be used for various downstream tasks, from data analysis and visualisation to machine learning and business intelligence.
“Picture this: the world of data management and quality is about to undergo a significant transformation, and we've got a sneak peek at what's coming. Although the use of generative language models in this field is still in its infancy and is being researched by industry experts, there are already some jaw-dropping research projects and prototypes out there that show the mind-boggling potential of this technology.”
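To make the rule-to-code step Pelosi describes more concrete, here is a minimal sketch in Python of how an accepted rule might be rendered into SQL from a template and then checked against a small data sample before deployment. It is an illustration only, not Talend's implementation: the `customers` table, the `age` and `spring_campaign_target` columns, the rule dictionary, and the helper names `render_rule` and `apply_to_sample` are all assumptions made for this example. The step in which a language model turns the user's request into the structured rule is not shown.

```python
import sqlite3
from string import Template

# Hypothetical template for one family of rules: flag rows below a threshold.
# Table and column names are placeholders invented for this sketch.
RULE_TEMPLATE = Template(
    "UPDATE $table SET $flag_column = 0 WHERE $column < $threshold"
)

# Identifiers cannot be bound as SQL parameters, so they are checked
# against an allow-list before being substituted into the template.
ALLOWED_IDENTIFIERS = {"customers", "age", "spring_campaign_target"}


def render_rule(rule):
    """Turn an accepted, structured rule into a SQL statement."""
    for key in ("table", "column", "flag_column"):
        if rule[key] not in ALLOWED_IDENTIFIERS:
            raise ValueError(f"unexpected identifier: {rule[key]!r}")
    return RULE_TEMPLATE.substitute(
        table=rule["table"],
        column=rule["column"],
        flag_column=rule["flag_column"],
        threshold=int(rule["threshold"]),
    )


def apply_to_sample(sql):
    """Run the rendered rule against a small in-memory sample of data."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE customers ("
        "name TEXT, age INTEGER, spring_campaign_target INTEGER DEFAULT 1)"
    )
    conn.executemany(
        "INSERT INTO customers (name, age) VALUES (?, ?)",
        [("Ana", 17), ("Ben", 18), ("Cleo", 42)],
    )
    conn.execute(sql)
    rows = conn.execute(
        "SELECT name, spring_campaign_target FROM customers ORDER BY name"
    ).fetchall()
    conn.close()
    return rows


# Assumed structured form of the request quoted above (minimum age of 18).
rule = {
    "table": "customers",
    "column": "age",
    "flag_column": "spring_campaign_target",
    "threshold": 18,
}
sql = render_rule(rule)
result = apply_to_sample(sql)
print(sql)
print(result)
```

In this sample only the 17-year-old is marked as not targeted, which is the kind of check on sample data that the quote calls for before any generated rule reaches production.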
44 June 2023