THE AI INTERVIEW
And yet, despite mounting regulatory scrutiny and legal challenges over data scraping, many organisations continue the practice – simply because it is “inexpensive and familiar”.
Building a better alternative
Alice’s team has tried to demonstrate that a different approach is possible, through a project called FHIBE (Fair Human-Centric Image Benchmark). FHIBE is a dataset designed from the outset around consent, with images collected directly from participants rather than scraped from the web. It also aims for broader demographic representation, allowing researchers to test how AI systems perform across different groups of people.
Alice says: “Benchmarks such as FHIBE allow practitioners to assess bias more systematically and design systems that perform well across a wider range of users and contexts.”
Building such a dataset is far more demanding than scraping images from the internet, requiring multiple disciplines from within an organisation to join forces.
“It requires close coordination across teams such as legal, privacy, technical and operations, as well as meaningful engagement with data subjects,” Alice continues. “This is inherently far more complex than simply scraping data from the web.”
The payoff, Alice believes, justifies the effort: “Ethical data practices strengthen trust with customers and partners, reduce regulatory and legal risk, and improve model performance through higher quality, more representative and better annotated datasets.”
“I quickly saw the lack of standards or guardrails around bias in AI systems”
Alice Xiang, Global Head of AI Governance, Sony
Why adoption remains slow
So, if FHIBE proves that responsible data collection works, why haven’t more AI companies followed suit?
Alice identifies two barriers. The first relates to the sheer scale of resources required. For Sony, building FHIBE
24 July 2026