6,000 hours of Voice Data
30 Districts
24K+ Unique Speakers
15 minutes/speaker
With Karya's expertise, 6,000 hours of voice data will be gathered from 30 districts. This contributes to one of the largest datasets of Indian dialects, which will total over 150,000 hours of audio upon completion. Karya plays a pivotal role in this initiative by mobilising local communities, training field coordinators, and ensuring fair compensation for data collectors, thus empowering local voices and fostering inclusivity.