Type2Vid
25% average score over 1 application evaluation
AI platform for creating realistic videos from text using voice cloning and lip-sync technology, accessible via $TYPE tokens, supporting global Web3 integrations, with a decentralized GPU backbone.

Type2Vid.ai: the first AI technology platform to empower Web3 projects

Type2Vid is a large multimodal model for AI text-to-video generation with a low barrier to entry and a low cost of use. It is an aggregation platform that empowers Web3 projects with AI technology and continuously launches new AI-based projects. In the future, a decentralized GPU-compute AIOS will be developed and $TYPE tokens will be issued to make the platform accessible to users around the world.

Please see our presentation that accompanies this proposal: https://docsend.com/view/k54rixcxuxxk5dfw

Type2Vid.ai uses neural networks and rendering technology to synthesize realistic character videos. The user inputs the text the target speaker is expected to say, and that text is converted into audio in the target speaker's timbre. The output speech is natural, expressive, and difficult to distinguish from the target speaker's real voice. The audio is then used to drive the target speaker's head video, generating a high-quality, lip-synced output video that remains consistent even across different emotions.

  1. Extract source-audio speech features: First, pre-train on the target speaker's voice source files to extract speech features and form an audio corpus. Users can also input text directly; the text is converted to speech via TTS, and the extracted speech features are applied to the text-converted speech file.

  2. Predict facial shape and texture: Next, predict the facial shape and texture from a single image of the target speaker.

  3. Apply lip synchronization to the source video: Finally, the face shapes that match the spoken content, as predicted from the speech file generated in step 1, are applied to the source video to achieve lip synchronization.
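The three steps above can be sketched as a pipeline. This is a minimal illustration, not Type2Vid's implementation: every function body is a placeholder stand-in (the real system would use a trained TTS model, a 3D face predictor, and a lip-sync renderer), and all names here are assumptions for illustration only.

```python
# Illustrative sketch of the three-stage pipeline described above.
# All bodies are placeholders; only the data flow mirrors the text.

from dataclasses import dataclass


@dataclass
class SpeechFeatures:
    speaker_id: str
    phonemes: list  # per-frame units that will drive mouth shapes


def text_to_speech(text: str, speaker_id: str) -> SpeechFeatures:
    """Step 1: convert text to speech in the target speaker's timbre.
    Placeholder: treats each word as one 'phoneme'."""
    return SpeechFeatures(speaker_id=speaker_id, phonemes=text.lower().split())


def predict_face(image_path: str) -> dict:
    """Step 2: predict facial shape and texture from a single image.
    Placeholder: returns a dummy mesh/texture description."""
    return {"source": image_path, "mesh": "3d-face-mesh", "texture": "uv-map"}


def lip_sync(features: SpeechFeatures, face: dict, source_video: str) -> list:
    """Step 3: apply per-frame mouth shapes to the source head video."""
    return [
        {"frame": i, "mouth_shape": p, "face": face["mesh"], "video": source_video}
        for i, p in enumerate(features.phonemes)
    ]


feats = text_to_speech("Hello Web3 world", speaker_id="target-speaker")
face = predict_face("speaker.jpg")
frames = lip_sync(feats, face, "speaker_head.mp4")
print(len(frames))  # one placeholder output frame per word
```

The point of the sketch is the interface between stages: step 1 produces time-aligned speech features, step 2 produces a reusable face model from a single image, and step 3 combines both with the source video.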

Type2Vid.ai provides text-to-video base models for global users and will continue to improve and optimize them. More languages will be supported in the future, including Spanish, Russian, Japanese, and Korean, along with top-tier text-to-speech, voice cloning, fusion of voice with lip movements and expressions, and other cutting-edge AI technologies.

Advantages and applications: The lip-synchronization algorithm proposed by Type2Vid achieves realistic lip sync while preserving the target speaker's expression, head posture, and other features, and it has broad application prospects in video-related industries. It can be applied to domains such as movies, short videos, advertising, education, training, virtual reality, and conferences. For example:

  • In entertainment such as live streams and short videos, a real person records a video once; thereafter, changing only the text quickly generates new live-stream or short-video content without any shooting or editing.
  • In advertising, a spokesperson records a video once, and the advertiser can enter new text at any time to create new ads featuring the spokesperson for different campaign stages, without re-shooting or re-editing.
  • In education and training, real teachers can use it to create interactive learning experiences that improve learning outcomes and make learning fun.
  • In filmmaking, the technology can make film dubbing more realistic.
  • In virtual reality, it can enhance the user's immersion and experience.
  • In scenarios such as remote meetings, it can make communication between participants more natural and authentic.
  • In translation, it can translate an entire video into different languages or modify the voice after the video has been recorded.

Type2Vid.ai will ultimately establish a decentralized GPU computing platform on the Arbitrum One blockchain. Users whose hardware meets the configuration requirements participate in Type2Vid's text-to-video computing work, realizing a decentralized data-computing architecture that achieves the following goals:

  • Consistent use of data and information
  • Data deletion after processing
  • Anonymization of sensitive personal or business data
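The data-handling goals above can be illustrated with a small sketch of a compute worker that anonymizes sensitive fields before processing and deletes the input payload once a result is produced. This is a hypothetical illustration, not Type2Vid's protocol: the job fields, the use of unsalted SHA-256 digests, and the worker logic are all assumptions made for the example.

```python
# Hypothetical worker sketch for the three data-handling goals above:
# anonymize sensitive data, process, then delete the payload.

import hashlib


def anonymize(job: dict, sensitive_keys=("user_id", "email")) -> dict:
    """Replace sensitive values with truncated SHA-256 digests.
    (Unsalted hashing is for illustration only; real pseudonymization
    would use keyed or salted hashing.)"""
    safe = dict(job)
    for key in sensitive_keys:
        if key in safe:
            safe[key] = hashlib.sha256(str(safe[key]).encode()).hexdigest()[:12]
    return safe


def process_job(job: dict) -> dict:
    """Stand-in for GPU text-to-video work; returns only the result."""
    safe = anonymize(job)
    result = {"frames_rendered": len(safe["text"].split())}
    # Data deletion after processing: drop the working copy entirely,
    # so only the result leaves the worker.
    safe.clear()
    return result


out = process_job({"user_id": "alice", "email": "a@b.c", "text": "hello web3 world"})
print(out)
```

The design choice the sketch highlights is that the worker never persists raw inputs: sensitive fields are pseudonymized on entry, and only the computed result survives the job.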
