In just over a week since its launch, DeepSeek has become the most downloaded free app on the App Store, Apple’s application store.
Why so much fuss? The chatbot has achieved scores as high as, or in some cases higher than, those of its most popular rivals, such as OpenAI’s ChatGPT, Anthropic’s Claude or Google’s Gemini. Just one more chatbot? It would go unnoticed were it not for a fundamental detail: it delivers the same results, but it is much cheaper.
According to reports, the Chinese artificial intelligence was developed for a fraction of the cost of the most popular models. It runs on the open-source DeepSeek-V3 model, which was trained at a cost of 6 million dollars, while current models have required much larger sums. In the case of GPT-4, the model behind ChatGPT, training cost 100 million dollars.
Andrej Karpathy, co-founder of OpenAI, former director of AI at Tesla and one of the most respected experts in the sector, described that budget as “a joke” and added: “You have to make sure that we are not being wasteful with what we have, and this model seems like a good demonstration that there is much to review in both data and algorithms”.
DeepSeek’s arrival raises questions about the future of United States dominance in the field of AI and about the strategy that US companies are adopting to secure their investments.
How is it different from ChatGPT?
According to Moisés Meza, a professor in the Department of Engineering at Cayetano Heredia University, DeepSeek and ChatGPT are two of the most advanced language models of the moment. Although both can generate high-quality text and hold coherent conversations, DeepSeek stands out for its efficiency and its ability to adapt to different tasks. Some compare it to the ChatGPT model that stands out for its reasoning.
“DeepSeek uses techniques such as Mixture of Experts (MoE) and Multi-head Latent Attention (MLA). MoE allows the model to specialize in specific tasks, activating only the parts needed for each query. MLA, for its part, improves memory management by compressing information and speeding up processing. These characteristics make DeepSeek a lighter and more efficient model, capable of offering results comparable to ChatGPT’s without requiring as much computational power,” the specialist explains to El Comercio.
The MoE technique activates only the necessary “experts”, while MLA reduces the memory load by compressing data. This makes it possible to maintain high performance with lower consumption of energy and computational resources.
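The memory saving attributed to MLA can be illustrated with a toy sketch. This is not DeepSeek’s implementation: the sizes, the matrix names (`w_down`, `w_up_k`, `w_up_v`) and the random weights are all hypothetical, and the sketch ignores multiple attention heads and positional encoding. It only shows the general idea of caching one small compressed vector per token and rebuilding keys and values from it when needed.

```python
import random

def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]

def random_matrix(rows, cols, rng):
    """Build a rows x cols matrix of random weights (stand-in for trained ones)."""
    return [[rng.uniform(-1, 1) for _ in range(cols)] for _ in range(rows)]

# Hypothetical toy sizes, not DeepSeek's real dimensions.
D_MODEL = 16    # size of each token's hidden state
D_LATENT = 4    # size of the compressed vector kept in the cache
N_TOKENS = 10   # tokens seen so far in the conversation

rng = random.Random(0)
w_down = random_matrix(D_LATENT, D_MODEL, rng)   # compresses the hidden state
w_up_k = random_matrix(D_MODEL, D_LATENT, rng)   # rebuilds keys on demand
w_up_v = random_matrix(D_MODEL, D_LATENT, rng)   # rebuilds values on demand

hidden_states = [[rng.uniform(-1, 1) for _ in range(D_MODEL)]
                 for _ in range(N_TOKENS)]

# Only the small latent vector is stored for each token.
latent_cache = [matvec(w_down, h) for h in hidden_states]

# Keys and values are reconstructed from the latent when attention needs them.
keys = [matvec(w_up_k, c) for c in latent_cache]
values = [matvec(w_up_v, c) for c in latent_cache]

# A conventional cache would keep one full key and one full value per token.
standard_cache_floats = N_TOKENS * 2 * D_MODEL
latent_cache_floats = N_TOKENS * D_LATENT
print(standard_cache_floats, latent_cache_floats)  # 320 40
```

With these made-up sizes the cache shrinks by a factor of eight; the real ratio depends on dimensions the article does not give.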
Every time the user asks a question, the AI model decides whether to activate its expert in medicine, translation, law or science. Classic models activate all of them at once, which wastes energy and computing power. DeepSeek, by contrast, activates only the one the query calls for.
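The routing idea described here can be sketched in a few lines. This is a minimal illustration under stated assumptions, not DeepSeek’s router: the expert names come from the article’s examples, the scores are invented, and real MoE models route each token (usually to several experts) inside the network rather than routing a whole question to a named specialist.

```python
import math

# Hypothetical expert names taken from the article's examples.
EXPERTS = ["medicine", "translation", "law", "science"]

def softmax(scores):
    """Turn raw router scores into probabilities that sum to 1."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def route(router_scores, top_k=1):
    """Return the top_k (expert, weight) pairs, with weights renormalized."""
    probs = softmax(router_scores)
    ranked = sorted(zip(EXPERTS, probs), key=lambda pair: pair[1], reverse=True)
    chosen = ranked[:top_k]
    norm = sum(weight for _, weight in chosen)
    return [(name, weight / norm) for name, weight in chosen]

# Made-up router scores for a legal question: the "law" expert scores highest.
scores = [0.2, 0.1, 2.5, 0.4]
selected = route(scores, top_k=1)
print(selected)  # [('law', 1.0)]
```

Only the selected expert would then run its computation; the other three stay idle, which is where the saving in energy and computing comes from. Raising `top_k` lets several experts share the work, each weighted by its renormalized score.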
“For example, its DeepSeek-V2 model has a Mixture of Experts (MoE) architecture with 236 billion total parameters, of which only 21 billion are activated per token, thus optimizing computational efficiency. This efficiency translates into a 42.5% reduction in training costs and an improvement of up to 5.76 times in generation speed,” Meza comments.
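To put those figures in proportion, here is a small back-of-the-envelope calculation. It takes the 236 billion total and 21 billion active parameters that DeepSeek reports for DeepSeek-V2, together with the 42.5% and 5.76x figures quoted above; the baseline cost of 100 units is a hypothetical placeholder, not a real budget.

```python
TOTAL_PARAMS = 236e9     # total parameters reported for DeepSeek-V2
ACTIVE_PARAMS = 21e9     # parameters activated per token
COST_REDUCTION = 0.425   # 42.5% lower training cost
SPEEDUP = 5.76           # up to 5.76x faster generation

# Share of the model that actually does work for each token.
active_share = ACTIVE_PARAMS / TOTAL_PARAMS

# Hypothetical baseline of 100 cost units, for illustration only.
baseline_cost = 100.0
reduced_cost = baseline_cost * (1 - COST_REDUCTION)

# Time per generated token relative to the baseline.
relative_time = 1 / SPEEDUP

print(f"Active share per token: {active_share:.1%}")            # 8.9%
print(f"Training cost vs. baseline of 100: {reduced_cost:.1f}")  # 57.5
print(f"Time per token vs. baseline: {relative_time:.1%}")       # 17.4%
```

In other words, fewer than one in ten parameters is used for any given token, under these reported figures.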
Besides…
ChatGPT vs. DeepSeek
ChatGPT and DeepSeek are models based on an architecture called Transformer, but with significant differences in their design and purpose. ChatGPT (GPT-4), developed by OpenAI, is a proprietary model optimized with advanced context-tuning techniques and mixture of experts, designed for general tasks such as writing, reasoning and creative generation in several languages. DeepSeek, on the other hand, is an open-source model with a more specialized focus on mathematics, code generation and algorithmic problem solving, with better performance in English and Chinese. While ChatGPT prioritizes fluency and versatility in interaction, DeepSeek prioritizes precision in computational tasks and the structuring of technical information.
Eric Biagioli, director of the Department of Computer Science and Postgraduate Data at UTEC.
An open model
For Wester Zela, dean of engineering programs at the Southern Scientific University, DeepSeek has several key differences from other models such as ChatGPT. The most important is that it is an open-source model, which means that anyone can download it, analyze its code and make modifications.
In addition, its training was carried out with less advanced hardware: DeepSeek used NVIDIA chips from previous generations because of the export restrictions imposed by the US on China. Despite not having the most recent chips, the developers achieved results comparable to OpenAI’s models, demonstrating that the latest technology is not essential for building high-performance models.
“The emergence of DeepSeek represents a great opportunity for developers, startups and entrepreneurs. With a high-performance open-source model, it is now possible to access advanced technology without depending on the proprietary offerings of companies such as OpenAI or Google,” Zela tells this newspaper.
Zela considers that access to open-source technology is a great opportunity for developers in countries like ours. OpenAI and other companies have never published the complete details of their models, while DeepSeek makes its code available to anyone.
“This means that local developers can study, modify and train AI models without depending on proprietary technologies. However, although the code is accessible, it is still necessary to invest in training and in computational infrastructure to make the most of it,” says the dean.
“If more people in our country manage to train in the development of AI models, we could see the creation of local ventures that take advantage of this technology. In the long term, the paradigm has changed: hundreds of millions of dollars are no longer needed to train advanced models, which opens the door to innovation in various parts of the world,” he adds.
DeepSeek’s performance
According to data compiled by Europa Press, the model outperforms other open-source models and achieves performance comparable to that of the leading closed-source models.
- In the language comprehension evaluation (MMLU Pro), DeepSeek-V3 scores 75.9, compared with 78.0 for Claude 3.5 Sonnet, 72.6 for GPT-4o and 73.3 for Llama 3.1 405B.
- In the evaluation of the ability to answer complex graduate-level questions (GPQA Diamond), DeepSeek-V3 scores 59.1, below Claude 3.5 Sonnet (65.0) but above GPT-4o (49.9), Qwen 2.5 72B (49.0) and Llama 3.1 405B (51.1).
- In the mathematical problem-solving test (MATH 500), DeepSeek scores 90.2, surpassing Claude 3.5 Sonnet (78.9), Qwen 2.5 72B (80.0), GPT-4o (74.6) and Llama 3.1 405B (73.8).
- In solving mathematical problems on AIME 2024, DeepSeek scores 39.2, followed by Qwen 2.5 72B and Llama 3.1 405B (23.3), Claude 3.5 Sonnet (16.0) and GPT-4o (9.3).
The United States in check?
At a time when the United States has tightened export restrictions on AI chips, DeepSeek shows that it is possible to develop advanced technology without depending on the most recent processors.
“On the contrary, one of the most obvious consequences of the restrictive measures against Chinese technology markets has been the push to create their own models, simpler but also more powerful. So far, what we have observed is a country that has accelerated its technological independence, partly thanks to this type of restriction,” says Eric Biagioli of UTEC.
The specialists consulted for this article agree that we are looking at a revolutionary technology or, at least, a great first step toward a more prolific future in the field of AI. There is no doubt that the paradigm has changed and that, in some way, this technology is being democratized.
“I think DeepSeek will change the rules of the game. This means that many large corporations will have to develop simpler models, significantly cheaper and with lower hardware consumption, but without sacrificing power. Without a doubt, it is an interesting change that, to some extent, puts large companies in check, forcing them to adapt,” Biagioli comments.
But although US restrictions have so far spurred innovation in China, they could also limit international collaboration in research and development, which would slow the advance of artificial intelligence in general.