By merging Web3 and artificial intelligence, Vivoka is introducing a new way to collect data to train our robot overlords.
Under the leadership of William Simonin and the voice-recognition acumen of Vivoka, the company has just rolled out the private beta for its new project “Ta-Da,” a play on the word data.
Public beta is expected next quarter.
“Through ‘Ta-da,’ we envision a platform where diverse AI firms, transcending just speech recognition, can requisition data, ensuring affordability without compromising on quality,” Simonin told Decrypt.
Tapping blockchain technology, Ta-Da aims to encourage users worldwide to generate and share data by completing various tasks, like reading a sentence aloud, writing a text, or identifying an object.
The collected data, which could include voice recordings, images, videos, and texts, will then be made accessible to businesses to train AI models.
Contributors are rewarded with TADA tokens for their submissions.
Developed on the MultiversX blockchain, the platform aims to address key challenges faced by companies using data to train AI models: namely, high costs and inconsistent data quality.
“We perceive blockchain providers as pivotal technical allies,” Simonin told Decrypt. “Collaborating with MultiversX feels more intimate and prioritized than being one amongst countless projects on alternative platforms.”
Ta-Da’s model also prioritizes user privacy by relying solely on volunteer-generated data, a stark contrast to the practices of companies such as Meta and Amazon.
Meta Platforms utilized public posts from Facebook and Instagram to train its Meta AI virtual assistant, while Amazon leveraged actual user conversations to refine Alexa’s AI model.
Ta-Da AI takes aim at diverse audio data
Given the focus on voice recognition, one of Ta-Da’s main purposes is to amass voice recordings in myriad languages, all intended to fine-tune AI voice recognition systems.
At Vivoka, Simonin spent years crafting a tech solution supporting 42 languages and tailored for voice development kits, enabling businesses in diverse sectors like robotics and logistics to embed it within any speech interface.
The firm currently works with approximately 100 clients worldwide, and its technology is embedded in over 100,000 devices.
It’s through this extensive work that he identified challenges within the nascent voice data collection sector.
The immense volume of data required for refinement can be prohibitively expensive: 1,000 hours of audio can cost as much as $100,000, and it’s common for AI-focused companies to allocate budgets ranging from $100,000 to $1 million annually for this type of data alone.
Furthermore, concerns frequently arise regarding the data’s authenticity and quality. “Only about 5-10% of a dataset undergoes rigorous examination,” noted Simonin, drawing attention to challenges like inferior data quality and inadequate compensation for genuine contributors.
A further challenge is securing a diverse and expansive audio dataset, particularly when training models on complex languages. “An AI trained solely on a male voice might perform exceptionally with that specific input. However, its accuracy could falter when a woman interacts with it,” Simonin explained.
Ta-Da will thus offer higher rewards for rarer voices.
“You will have access to various tasks, each offering different remuneration,” Simonin told Decrypt. “For instance, if you speak a particular language with a specific accent, Ta-Da might pay more for unique requirements, such as someone who can speak Corsican with an English accent.”
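Ta-Da hasn’t published its pricing formula, but the idea of paying more for scarcer voice profiles can be sketched as a simple multiplier over a base payout. Everything below, from the profile names to the numbers, is hypothetical and for illustration only:

```python
# Purely illustrative: one way rarity-weighted task rewards could work.
# Profile names, multipliers, and the base payout are all hypothetical,
# not taken from Ta-Da's actual platform.

BASE_REWARD = 10.0  # hypothetical base payout in TADA tokens per task

# Scarcer voice profiles get larger multipliers.
RARITY_MULTIPLIER = {
    "english_standard": 1.0,
    "french_regional_accent": 1.5,
    "corsican_english_accent": 4.0,  # rare combination like the one Simonin cites
}

def task_reward(profile: str) -> float:
    """Return the hypothetical TADA payout for a recording from a given profile."""
    return BASE_REWARD * RARITY_MULTIPLIER.get(profile, 1.0)

print(task_reward("english_standard"))         # 10.0
print(task_reward("corsican_english_accent"))  # 40.0
```

The point is only the shape of the incentive: a contributor matching a rare requirement earns a multiple of the standard rate, steering the dataset toward the diversity voice models need.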