DeepSeek Unveils Intermediate AI Model Ahead of Next-Gen Launch


Published: 30 Sep 2025

Author: Precedence Research

Chinese AI developer DeepSeek has released a new experimental AI model that it claims trains on and processes long sequences of text more efficiently than its previous large language models. In a post on the developer platform Hugging Face, the company described the release as an intermediate step toward its next-generation architecture. That upcoming architecture is likely to be its most significant product release since V3 and R1, which shook Silicon Valley and technology investors outside of China.

The new model, named V3.2-Exp, introduces a mechanism called DeepSeek Sparse Attention, which is intended to cut computing costs while maintaining model performance at scale. DeepSeek also announced on X, formerly Twitter, that it is cutting API prices by more than 50%. The company wrote, "This experimental release represents our ongoing research into more efficient transformer architectures, particularly focusing on improving computational efficiency when processing extended text sequences."

The new model is built on the previous V3.1 model and adds sparse attention, aiming to make training and inference more efficient, particularly for long texts and complex datasets. Standard attention, the core mechanism in a large language model, scores each word in a sequence against every other word, so its cost grows rapidly as the text gets longer, a barrier to processing long documents. DeepSeek addresses this with its own method, DeepSeek Sparse Attention (DSA), which computes attention weights only for a selected subset of token pairs and therefore needs less computing power and memory on long text sequences. For developers, this translates into faster model performance and lower latency, even when handling large and complex documents.
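The post does not spell out how DSA selects which token pairs to score, so the following is only a minimal illustrative sketch of the general sparse-attention idea, not DeepSeek's actual implementation. It uses a simple local window as the selection rule, a common assumption in sparse-attention designs: each token attends to a fixed number of neighbors rather than the whole sequence, so cost grows roughly linearly with length instead of quadratically. All function names and the window size here are hypothetical.

```python
# Illustrative sketch only: windowed sparse attention vs. dense attention.
# This is NOT DeepSeek's DSA algorithm, whose selection rule is not
# detailed in the announcement; the local window is an assumed stand-in.
import numpy as np

def dense_attention(q, k, v):
    # q, k, v: (n, d) arrays; every token attends to every token -> O(n^2).
    scores = q @ k.T / np.sqrt(q.shape[-1])          # (n, n) score matrix
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)   # row-wise softmax
    return weights @ v

def windowed_sparse_attention(q, k, v, window=64):
    # Each token attends only to tokens within `window` positions of itself,
    # so compute and memory scale with n * window instead of n * n.
    n, d = q.shape
    out = np.empty_like(v)
    for i in range(n):
        lo, hi = max(0, i - window), min(n, i + window + 1)
        scores = q[i] @ k[lo:hi].T / np.sqrt(d)      # scores for the window only
        weights = np.exp(scores - scores.max())
        weights /= weights.sum()                     # softmax over the window
        out[i] = weights @ v[lo:hi]
    return out

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n, d = 512, 32                                   # toy sequence length and head size
    q, k, v = (rng.standard_normal((n, d)) for _ in range(3))
    print(dense_attention(q, k, v).shape)            # (512, 32)
    print(windowed_sparse_attention(q, k, v).shape)  # (512, 32)
```

The point of the sketch is the scaling behavior: the dense version materializes an n-by-n score matrix, while the windowed version touches only n times window entries, which is why sparse approaches become attractive as context lengths grow.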

DeepSeek's new architecture is unlikely to roil global markets the way its releases did in January. However, it could still pressure domestic rivals such as Alibaba's Qwen, as well as U.S. rivals such as OpenAI, if it recreates the success of DeepSeek's R1 and V3 models. To do so, it would need to offer a high-capability model at a significantly lower price than competitors while spending substantially less on model training.
