Alibaba Shares Its LLMs Research Achievements


The internal research team at Alibaba is progressing with its large language models

As Chinese Big Tech companies rush into artificial intelligence (AI) to build rivals to OpenAI's ChatGPT, Alibaba Group Holding's internal research division is making headway with its large language models (LLMs).

In a paper posted last week on the online scientific archive arXiv, a team of researchers from DAMO Academy introduced a novel audio-visual language model dubbed Video-LLaMA, which is designed to comprehend both the visual and audio content of videos.

The researchers also made the code publicly available to developers on GitHub. Alibaba owns the South China Morning Post.

LLMs, which are trained using machine learning, are the foundation for AI chatbots such as ChatGPT. They enable chatbots to produce sophisticated text, code, and other content and to respond to complex queries.

According to the three researchers, Zhang Hang, Li Xin, and Bing Lidong, the new DAMO Academy model improves on prior vision-LLMs because it addresses two challenges in video understanding: capturing the temporal changes in visual scenes and integrating audio-visual inputs.

In a case study by the researchers, the model was given a video of a man playing the saxophone on stage and described in text both the visual content and the background sound of clapping. The researchers noted that earlier models, such as MiniGPT-4 and LLaVA, mainly concentrated on static visual understanding.

According to the researchers, the model is still "an early-stage prototype" with some limitations, including a restricted capacity to handle long videos such as movies and television series.

The release is part of Alibaba's broader effort to increase its investment in the development and use of LLMs. It comes as Alibaba undergoes its most significant organizational restructuring to date.

Alibaba's cloud division was among the first Chinese firms to jump on the ChatGPT bandwagon, alongside search engine giant Baidu, which unveiled its Ernie Bot in March. In April, the cloud division revealed Tongyi Qianwen, an alternative to ChatGPT based on DAMO's LLMs. During a meeting with analysts last month, Alibaba Chairman and CEO Daniel Zhang Yong revealed that the service had received more than 200,000 beta testing applications from business clients.

At the World AI Conference in Shanghai in September last year, DAMO's deputy head Zhou Jingren announced the academy's LLM, AliceMind. According to him, it is a multimodal pre-trained language model that can process various inputs, including text, graphics, audio, and video.

Zhang stated that Alibaba has begun collaborating with partners to create AI models tailored to specific industries. For instance, it intends to introduce cloud products and enterprise solutions based on its AI model and to incorporate AI capabilities into several products, such as its DingTalk office communication tool.


Analytics Insight