SenseTime, a leading AI company from China, has unveiled its latest large model, SenseNova 5.5, at the 2024 World Artificial Intelligence Conference & High-Level Meeting on Global AI Governance. The release emphasizes SenseTime’s commitment to innovation and practical application across industries.
The SenseNova 5.5 Large Model represents a comprehensive upgrade, integrating the first real-time multimodal model in China, SenseNova 5o. This model introduces a new AI interaction framework comparable to GPT-4’s streaming interaction capabilities. The multimodal nature of SenseNova 5o allows it to process and respond to data across audio, text, image, and video formats in real time, offering users an interactive experience similar to conversing with a human. This feature is valuable for real-time conversation and speech recognition applications, showcasing the model’s adaptability and contextual response capabilities.
One of the key highlights of SenseNova 5.5 is its cost-effective edge-side large model, which reduces the cost per device to as low as RMB 9.90 per year. This affordability facilitates widespread deployment, making advanced AI accessible to a broad range of users and industries. SenseTime’s cloud-to-edge full-stack large model product matrix is continuously updated, providing solutions for generative applications across multiple scenarios and industries. The SenseNova Large Model has already been deployed to over 3,000 government and corporate customers, spanning the technology, healthcare, finance, and programming sectors.
Dr. Xu Li, Chairman of the Board and CEO of SenseTime, highlighted the importance of this upgrade, stating, “This is a critical year for large models as they evolve from unimodal to multimodal. In line with users’ needs, SenseTime is also focused on boosting interactivity. With applications driving the development of models and their capabilities, coupled with technological advancements in multimodal streaming interactions, we will witness unprecedented transformations in human-AI interactions.”
SenseNova 5.5 is underpinned by a hybrid cloud-edge collaborative expert architecture, which optimizes cloud-to-edge synergy and reduces inference costs. The model was trained on over 10TB tokens of high-quality training data, including synthetically generated reasoning chain data that enhances its reasoning capabilities. Compared to its predecessor, SenseNova 5.0, the new model delivers a 30% improvement in overall performance, with enhanced abilities in mathematical reasoning, English proficiency, and instruction-following, aligning closely with GPT-4’s core indicators.
In addition to the large model upgrades, SenseTime has introduced SenseChat Lite-5.5, an edge-side model whose inference time has been reduced to 0.19 seconds, a 40% improvement over SenseChat Lite-5.0. Its inference speed has also increased by 15%, reaching 90.2 words per second. The edge-side model product matrix includes specialized models such as the SenseChat Mini Writing Assistant, the Summary Assistant, and the Encyclopedia Assistant, each tailored to specific business needs.
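As a rough sanity check on these figures, the sketch below back-calculates what they would imply for SenseChat Lite-5.0. It assumes the 40% figure is a reduction in inference time and the 15% figure is an increase in speed, both measured relative to Lite-5.0. That reading is an interpretation of the announcement, not something SenseTime states, and the function and variable names are purely illustrative.

```python
# Back-of-envelope check of the SenseChat Lite-5.5 figures quoted above.
# Assumption: both percentages are relative to SenseChat Lite-5.0.

def baseline_before_reduction(new_value: float, reduction: float) -> float:
    """Return the value before a fractional reduction (0.40 means 40%)."""
    return new_value / (1.0 - reduction)


def baseline_before_increase(new_value: float, increase: float) -> float:
    """Return the value before a fractional increase (0.15 means 15%)."""
    return new_value / (1.0 + increase)


# Figures reported for SenseChat Lite-5.5.
lite55_inference_time_s = 0.19
lite55_speed_wps = 90.2

# Implied (not reported) SenseChat Lite-5.0 figures under the assumption above.
implied_lite50_time_s = baseline_before_reduction(lite55_inference_time_s, 0.40)
implied_lite50_speed_wps = baseline_before_increase(lite55_speed_wps, 0.15)

print(f"Implied Lite-5.0 inference time: {implied_lite50_time_s:.3f} s")      # 0.317 s
print(f"Implied Lite-5.0 speed: {implied_lite50_speed_wps:.1f} words/s")     # 78.4 words/s
```

Under this reading, Lite-5.0 would have had an inference time of roughly 0.32 seconds and a speed of roughly 78 words per second. These are derived estimates only, not published numbers.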
A significant addition to the SenseNova suite is Vimi, SenseTime’s first controllable AI avatar video generator. Vimi can generate short video clips with precise control over facial expressions and upper body movements. It is an ideal tool for long-form video generation in entertainment and interactive applications. This feature underscores SenseTime’s commitment to expanding generative AI applications under the SenseNova Large Model Series, catering to diverse user needs and empowering industries in their digital transformation efforts.
SenseTime has also launched the “Project $0 Go” scheme, offering a free and comprehensive onboarding bundle for enterprise users migrating from the OpenAI platform. This initiative includes a 50-million-token package and API migration consulting services, lowering the barriers to entry for enterprises seeking to leverage the capabilities of the SenseNova Large Model.
In conclusion, 2024 is a pivotal year for large models, and it coincides with SenseTime’s 10th anniversary. The company’s decade-long journey has culminated in a comprehensive full-stack large model product matrix covering cloud-to-edge applications. As SenseTime continues to expand the SenseNova industry ecosystem, it remains dedicated to empowering more businesses and communities in their digital transformation journeys.
Asif Razzaq is the CEO of Marktechpost Media Inc. As a visionary entrepreneur and engineer, Asif is committed to harnessing the potential of Artificial Intelligence for social good. His most recent endeavor is the launch of an Artificial Intelligence Media Platform, Marktechpost, which stands out for its in-depth coverage of machine learning and deep learning news that is both technically sound and easily understandable by a wide audience. The platform has over 2 million monthly views, illustrating its popularity among audiences.