

Speaking AI is a beta test of a foundation model for generative voice. It lets users capture their unique tone from just 3 seconds of input audio and produces natural-sounding speech.
09 Oct 2023

21 Mar 2024


Image In Words is a generative model for producing ultra-detailed text descriptions from images. It is particularly suited to recognition tasks for large language model (LLM) assistants and to applying AI recognition and description capabilities in more complex scenarios using GPT-4o. It supports English only and was trained on approximately 100,000 hours of English data. Image In Words has demonstrated high quality and naturalness in various tests.
20 Jun 2024