Talk:DeepSeek

Correct Company Full English Name

The official company English name is Hangzhou DeepSeek Artificial Intelligence Co., Ltd. (refer to https://cdn.deepseek.com/policies/en-US/deepseek-privacy-policy.html). However, this article currently uses a different translation. @Cfls disagrees with this name, so I am inviting more people to discuss it here. Cs haoh (talk) 12:02, 2 March 2025 (UTC)[reply]

K-V caching

The Development and Research section has two mentions of K-V caching, associated with two sources, the papers "DeepSeekMoE: Towards Ultimate Expert Specialization in Mixture-of-Experts Language Models" and "DeepSeek-Coder-V2: Breaking the Barrier of Closed-Source Models in Code Intelligence", but a quick search of both papers did not turn up the word "cache" anywhere. I'm sure there's a source for this somewhere, or maybe I'm missing something, but could somebody either verify that these sources actually support these claims, or provide sources that do? I added two verification-needed tags where it comes up. Truthnope (talk) 22:21, 10 March 2025 (UTC)[reply]
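
For context while checking those sources: K-V caching generally refers to storing the attention key and value projections of tokens that have already been processed, so each new decoding step only computes projections for the newest token instead of recomputing the whole prefix. Below is a minimal, generic Python/PyTorch sketch of the idea; the function names, shapes, and single-head setup are illustrative assumptions on my part, not something taken from the cited papers or from DeepSeek's code.

    # Generic single-head sketch of K-V caching during autoregressive decoding.
    # Illustrative only; not DeepSeek's implementation.
    import torch

    def attend(q, k, v):
        # q: (1, d); k, v: (t, d). Standard scaled dot-product attention.
        scores = q @ k.T / (k.shape[-1] ** 0.5)
        return torch.softmax(scores, dim=-1) @ v

    def generate_step(x_t, w_q, w_k, w_v, cache):
        # Project only the newest token; reuse cached keys/values from earlier
        # steps instead of recomputing them. That reuse is the K-V cache.
        q = x_t @ w_q
        cache["k"] = torch.cat([cache["k"], x_t @ w_k], dim=0)
        cache["v"] = torch.cat([cache["v"], x_t @ w_v], dim=0)
        return attend(q, cache["k"], cache["v"]), cache

    # Usage: start with an empty cache and append one token per decoding step.
    d = 64
    cache = {"k": torch.empty(0, d), "v": torch.empty(0, d)}
    w_q, w_k, w_v = (torch.randn(d, d) for _ in range(3))
    for _ in range(5):
        x_t = torch.randn(1, d)  # embedding of the newly generated token
        out, cache = generate_step(x_t, w_q, w_k, w_v, cache)

Reducing the size of this cache is the usual motivation for the related architectural changes discussed in the article, so ideally the citation would point to a paper that mentions the cache explicitly.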

Open source / open weight

DeepSeek has often been described as open source (example). Yet other sources, and this article, distinguish it from genuine open source; e.g. here it says "The DeepSeek algorithm is ‘open weight,’ which is similar to but different from ‘open source.’" It seems to be a bit more than that, though, and closer to, and enabling, open source; see here: "The engineers said they were compelled to act by DeepSeek’s “black box” release philosophy. Technically, R1 is “open” in that the model is permissively licensed, which means it can be deployed largely without restrictions. However, R1 isn’t “open source” by the widely accepted definition because some of the tools used to build it are shrouded in mystery. Like many high-flying AI companies, DeepSeek is loathe to reveal its secret sauce." And here: "DeepSeek doesn’t disclose … training code used to train its models. […] DeepSeek’s models are similarly opaque, but HuggingFace is trying to unravel the mystery. On 28 January, it announced Open-R1, an effort to create a fully open-source version of DeepSeek-R1. […] Regardless of Open-R1’s success, however, Bakouch says DeepSeek’s impact goes well beyond the open AI community. “The excitement isn’t just in the open-source community."

So this means Category:Open-source artificial intelligence wouldn't be good to add here, even though a) it is highly relevant to open-source AI and b) it has often been called open source, but the category could be added if there was an article on the HuggingFace variant, assuming they make it fully open source, right? What about a category for open-weights AI, and what's the current state of a fully open-source variant of it? Prototyperspective (talk) 22:42, 1 April 2025 (UTC)[reply]

Proposed summary for technical prose

I've been using Google's Gemini 2.5 Pro Experimental large language model to create summaries for the most popular articles with {{Technical}} templates. This article, DeepSeek, has such a template in the "Overview of models" section. Here is the paragraph summary, at a grade 5 reading level, that Gemini 2.5 Pro suggested for that section:

DeepSeek has created several special computer programs called models. Some models, like DeepSeek Coder, are good at helping write computer instructions. Others, like DeepSeek-LLM, are made for general chatting and writing. They also made models just for solving math problems and models called R1 that focus on thinking step-by-step. DeepSeek keeps making newer versions like V2 and V3, which learn from lots of information and sometimes use special tricks to work faster or better. People can use these models, but there might be rules about how much they can change them.

While I have read that summary and may have made some modifications to it, I am not going to add it to the section myself, because I want other editors to review it, revise it if appropriate, and add it instead. This is an experiment with a few dozen articles initially to see how these suggestions are received, and after a week or two, I will decide how to proceed. Thank you for your consideration. Cramulator (talk) 12:15, 2 April 2025 (UTC)[reply]

I am retracting this and the other LLM-generated suggestions due to clear negative consensus at the Village Pump. I will be posting a thorough postmortem report in mid-April to the source code release page. Thanks to all who commented on the suggestions both negatively and positively, and especially to those editors who have manually addressed the overly technical cleanup issue on six, so far, of the 68 articles where these suggestions were posted. Cramulator (talk) 22:05, 4 April 2025 (UTC)[reply]