How DeepSeek AI is a Threat to U.S. Tech

by Sonia Boolchandani
January 31, 2025

It’s not every day that a small Chinese AI startup pokes a half-trillion-dollar hole in the valuation of the largest chipmaker in the world. But last week, DeepSeek, a Chinese AI company, made headlines by launching an AI model that directly challenged the established norms of the AI industry. The move shook the tech world and sent shockwaves through global markets, with Nvidia feeling the brunt of it.

In a single day, Nvidia saw a record $593 billion evaporate from its market value. It marked the largest single-day loss for any company in Wall Street history. 

The catalyst? 

DeepSeek, a relatively unknown player in the AI scene, launched an open-source AI assistant that claims to perform just as well as established giants like OpenAI, but at a fraction of the cost.

By Monday, the DeepSeek assistant had overtaken ChatGPT in the download charts on Apple’s App Store, and investors were scrambling to reassess the value of AI stocks. Nvidia, which had been riding high on the back of its dominance in the AI chip market, saw its stock fall by nearly 17%, with its market cap shrinking dramatically.

But what exactly happened? How did a small startup from China manage to rattle one of the most influential players in the AI and tech industry?

The DeepSeek Threat: Cutting Costs, Raising Eyebrows

DeepSeek, founded by Liang Wenfeng, a former hedge fund manager, claims to have built an advanced AI reasoning model, R1, that matches the performance of OpenAI’s offerings at a much lower cost. The company says it trained the underlying V3 base model using just 2,048 Nvidia accelerators in under two months, at a reported cost of roughly $6 million.

In comparison, OpenAI’s flagship model, GPT-4, reportedly cost well over $100 million to train. That DeepSeek managed with such a lean setup raised eyebrows, not just in the tech world but also among investors worried about the future of AI chip demand.

DeepSeek claims that the R1 model contains 671 billion parameters, of which only about 37 billion are active for any given token, allowing it to match OpenAI’s models on accuracy benchmarks while using far less computational power. Its pricing is also radically cheaper: where OpenAI charges on the order of $7.50 per million tokens, DeepSeek offers access for as little as $0.14 per million tokens.
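To put that price gap in concrete terms, here is a quick back-of-the-envelope calculation. The per-token prices are the figures cited in this article, and the monthly workload is a hypothetical example; actual API pricing varies by model and by input vs. output tokens.

```python
# Illustrative cost comparison using the per-million-token prices cited above.
openai_price = 7.50    # USD per million tokens (figure cited in this article)
deepseek_price = 0.14  # USD per million tokens (DeepSeek's lowest quoted rate)

monthly_tokens = 500   # hypothetical workload: 500 million tokens per month

openai_cost = openai_price * monthly_tokens
deepseek_cost = deepseek_price * monthly_tokens
ratio = openai_price / deepseek_price

print(f"OpenAI:   ${openai_cost:,.2f}/month")
print(f"DeepSeek: ${deepseek_cost:,.2f}/month")
print(f"DeepSeek is ~{ratio:.0f}x cheaper at these rates")
```

At these rates, the same workload costs $3,750 a month on one API and $70 on the other, a gap of more than 50x, which is why the pricing alone was enough to unsettle investors.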

[Figure: Benchmark comparison of DeepSeek-V3 against GPT-4o, Claude-3.5 Sonnet, and Qwen2.5-72B-Inst across MMLU-Pro (general knowledge), GPQA-Diamond (graduate-level problems), MATH 500 (math reasoning), AIME 2024 (math contest), Codeforces (coding performance), and SWE-bench Verified (software engineering tasks), showing DeepSeek-V3’s competitive results.]

This revelation sent tremors through the AI ecosystem. Meta Platforms, which recently released the Llama 3.1 model with 405 billion parameters, has also been struggling with high training costs. By DeepSeek’s figures, Llama 3.1’s compute expenses were at least ten times higher than what DeepSeek incurred.

How DeepSeek Achieved Its $6 Million Model

At the heart of DeepSeek’s success lies its efficient use of hardware and training techniques. Let’s break down some of the key elements that contributed to the low cost and impressive performance of DeepSeek-R1.

1. Maximizing Existing Hardware Instead of Waiting for the Next Big Thing

The tech world expected export restrictions on advanced AI chips like the NVIDIA H100 to slow DeepSeek down. But the company didn’t wait for new hardware to arrive. Instead, it made do with what it had, likely the export-compliant NVIDIA H800, and optimized it to the max.

By focusing on low-level code optimizations, they squeezed out every bit of performance from their hardware, ensuring that memory usage was as efficient as possible. Rather than relying on the latest tech, DeepSeek made their existing resources work harder, proving that smart software can often outperform fancy hardware.

2. Innovative Training Techniques

A critical factor in DeepSeek’s cost efficiency is its adoption of the Mixture of Experts (MoE) technique. Traditional dense models activate every parameter for every token, which leads to high computational costs. With MoE, only a small subset of the model’s expert sub-networks is activated for any given token. This reduces the computational load, since only a few components of the model do work at each step.

This approach allows DeepSeek’s models to be trained with significantly fewer GPU hours. DeepSeek’s technical report puts the full training run of its V3 base model at roughly 2.8 million GPU hours, far less than the tens of millions of hours required for dense models of similar quality. By using MoE and more efficient training strategies, DeepSeek kept training costs low while still ensuring high-quality results.
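The MoE routing idea can be sketched in a few lines of toy Python. This is an illustrative, scaled-down sketch, not DeepSeek’s implementation: the expert count, dimensions, and linear “experts” here are stand-ins (a real model uses hundreds of experts, each a full feed-forward block).

```python
import math
import random

random.seed(0)

NUM_EXPERTS = 8   # total expert networks in the layer (toy value)
TOP_K = 2         # experts activated per token: the source of the MoE savings
DIM = 4           # toy hidden dimension

# Each "expert" here is just a weight vector; real experts are full FFN blocks.
experts = [[random.gauss(0, 1) for _ in range(DIM)] for _ in range(NUM_EXPERTS)]
gate = [[random.gauss(0, 1) for _ in range(DIM)] for _ in range(NUM_EXPERTS)]

def moe_forward(x):
    """Route token x to its top-k experts; only those experts compute."""
    # Gating scores for each expert, turned into a softmax distribution.
    scores = [sum(w * xi for w, xi in zip(g, x)) for g in gate]
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    probs = [e / sum(exps) for e in exps]
    # Pick the top-k experts for this token.
    top = sorted(range(NUM_EXPERTS), key=lambda i: -probs[i])[:TOP_K]
    # Only k of the NUM_EXPERTS experts run: roughly k/NUM_EXPERTS of the FLOPs.
    out = [0.0] * DIM
    for i in top:
        for j in range(DIM):
            out[j] += probs[i] * experts[i][j] * x[j]
    return out, top

output, active = moe_forward([1.0, 0.5, -0.3, 0.8])
print(f"active experts: {active} ({TOP_K} of {NUM_EXPERTS} in the layer)")
```

The key point is in the loop at the end: for each token, only two of the eight experts do any arithmetic, so the compute per token scales with the activated parameters, not the total parameter count.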

3. Training Only What’s Necessary to Save Time and Resources

Training AI models usually involves updating every single part of the model, even the ones that don’t contribute much. This leads to huge inefficiencies. DeepSeek flipped the script by training only the parts of the model that mattered most.

They used a technique called Auxiliary-Loss-Free Load Balancing, which keeps tokens spread evenly across the model’s experts without the extra balancing loss term that most MoE models rely on. A small per-expert bias steers routing away from overloaded experts and toward underused ones, cutting down on wasted capacity.

Results:

  • Only about 5% of the model’s parameters (roughly 37 billion of 671 billion) are active per token.
  • Roughly 90% fewer GPU hours than comparable dense models, such as Meta’s Llama 3.1.
  • Faster, cheaper training without compromising accuracy.
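The bias-based balancing described above can be sketched as follows. This is a minimal illustration of the idea from DeepSeek’s V3 technical report; the expert count, the update speed `GAMMA`, and the example batch are illustrative stand-ins, not DeepSeek’s actual hyperparameters.

```python
import random

NUM_EXPERTS = 8
TOP_K = 2
GAMMA = 0.01  # how fast the routing bias is nudged each step (toy value)

# Per-expert routing bias: adjusted by a fixed rule, not by gradients.
bias = [0.0] * NUM_EXPERTS

def route(affinities):
    """Select top-k experts using bias-adjusted scores.
    The bias changes WHICH experts are chosen; the raw affinity would
    still be used to weight their outputs, so quality is preserved."""
    adjusted = [a + b for a, b in zip(affinities, bias)]
    return sorted(range(NUM_EXPERTS), key=lambda i: -adjusted[i])[:TOP_K]

def update_bias(expert_loads, tokens_in_batch):
    """After each step: overloaded experts become less attractive,
    underloaded ones more attractive. No auxiliary loss term needed."""
    target = tokens_in_batch * TOP_K / NUM_EXPERTS  # ideal even load
    for i, load in enumerate(expert_loads):
        if load > target:
            bias[i] -= GAMMA
        elif load < target:
            bias[i] += GAMMA

# Example step: route a small batch of tokens, then rebalance.
random.seed(1)
loads = [0] * NUM_EXPERTS
batch = [[random.gauss(0, 1) for _ in range(NUM_EXPERTS)] for _ in range(32)]
for affinities in batch:
    for e in route(affinities):
        loads[e] += 1
update_bias(loads, tokens_in_batch=32)
print(f"expert loads after one batch: {loads}")
```

Because the balancing happens through this routing bias rather than through an extra loss added to the training objective, the model isn’t pulled away from its actual language-modeling goal while staying evenly utilized.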

4. Compression for Faster, Cheaper AI

AI models are memory-hungry beasts, especially during inference, when the model is generating outputs. DeepSeek tackled this challenge with Low-Rank Key-Value (KV) Joint Compression, part of its Multi-head Latent Attention design.

Instead of storing massive key-value pairs in memory, which are crucial for the attention mechanism, they compressed them. This reduced the memory footprint while still maintaining the necessary data integrity. When needed, the compressed data was expanded with minimal loss of accuracy.

Benefits:

  • Lower memory usage: A smaller data footprint without sacrificing performance.
  • Faster inference: Less data means quicker results.
  • Lower costs: More efficient hardware utilization.
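The compression idea can be sketched numerically. This is an illustrative toy, not DeepSeek’s architecture: the dimensions and random projection matrices are stand-ins, and a real model learns these projections during training. The point is the cache-size arithmetic at the end.

```python
import random

random.seed(0)

D_MODEL = 64    # toy hidden size (real models use thousands)
D_LATENT = 8    # compressed latent size: much smaller than D_MODEL

def rand_matrix(rows, cols):
    return [[random.gauss(0, 0.1) for _ in range(cols)] for _ in range(rows)]

def matvec(m, v):
    return [sum(w * x for w, x in zip(row, v)) for row in m]

W_down = rand_matrix(D_LATENT, D_MODEL)  # joint down-projection for K and V
W_up_k = rand_matrix(D_MODEL, D_LATENT)  # expands the latent back into keys
W_up_v = rand_matrix(D_MODEL, D_LATENT)  # expands the latent back into values

hidden = [random.gauss(0, 1) for _ in range(D_MODEL)]

# Cache only the small latent vector instead of the full K and V vectors.
latent = matvec(W_down, hidden)   # this is what sits in the KV cache
k = matvec(W_up_k, latent)        # reconstructed on demand at attention time
v = matvec(W_up_v, latent)

full_cache = 2 * D_MODEL  # floats cached per token without compression (K + V)
compressed = D_LATENT     # floats cached per token with joint compression
print(f"cache per token: {full_cache} -> {compressed} floats "
      f"({full_cache // compressed}x smaller)")
```

In this toy setup, the per-token cache shrinks from 128 floats to 8, and the same trade applies at real scale: a little extra compute to expand the latent buys a much smaller memory footprint during generation.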

5. Reinforcement Learning for Smarter Training

DeepSeek also took a more intelligent approach to training by using reinforcement learning. Instead of relying purely on traditional methods, they focused on tasks with clear, verifiable answers, like math or coding challenges.

Here’s how it worked: the AI tackled tasks, got rewarded for correct answers, and adjusted its approach based on feedback from any mistakes. This focused, task-based learning method improved the model’s accuracy with fewer resources.
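The core of this setup is that the reward comes from a simple rule rather than a learned reward model. Below is a minimal sketch; `fake_model` and the toy problems are hypothetical stand-ins for sampling completions from a real policy, and the policy-update step itself is omitted.

```python
def math_reward(model_answer: str, correct_answer: str) -> float:
    """Rule-based reward: 1.0 for a verifiably correct answer, else 0.0.
    No learned reward model is needed when answers can be checked directly."""
    return 1.0 if model_answer.strip() == correct_answer.strip() else 0.0

# Hypothetical task set with checkable answers (math here; code would use tests).
problems = [("What is 7 * 8?", "56"), ("What is 12 + 30?", "41 + 1?")]

def fake_model(question: str) -> str:
    # Stand-in for the policy: one right answer, one wrong one.
    canned = {"What is 7 * 8?": "56", "What is 12 + 30?": "41"}
    return canned[question]

# Score each attempt; a real RL loop would use these rewards to update weights.
rewards = [math_reward(fake_model(q), a) for q, a in problems]
print(f"rewards: {rewards}")
```

Because the reward signal is exact and cheap to compute, the model can run through enormous numbers of such problems without the cost of human feedback or a separate reward model, which is part of how the training budget stays small.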

The Mystery with DeepSeek’s Model

While the performance of DeepSeek’s model is impressive, there’s a big question on everyone’s mind: how did they achieve this cost efficiency? Some industry experts believe DeepSeek is understating its GPU usage. In a CNBC interview, Scale AI CEO Alexandr Wang suggested that DeepSeek might be using far more H100 chips than it publicly claims, up to 50,000 units according to some reports.

If DeepSeek indeed has access to such a large number of Nvidia’s powerful H100 chips (which are priced at around $30K each), its actual investment in GPU hardware could be much higher than the stated $6 million. This discrepancy raises concerns about the true cost of DeepSeek’s AI development and whether it can maintain this efficiency as it scales.

Additionally, DeepSeek’s open-source approach to AI models has created a further stir. U.S. corporations, which are wary of Chinese technology due to export controls and security concerns, would find it difficult to rely on a DeepSeek model built in a Chinese lab. The company has also restricted new users to Chinese mobile numbers, which further complicates the accessibility of its offerings in the U.S.

The Nvidia Dilemma: Will Demand for GPUs Drop?

For Nvidia, this development presents a dilemma. The company has been riding high on its position as the supplier of choice for AI hardware, particularly for training large language models (LLMs). If DeepSeek’s claims about cost-efficient AI are accurate, far fewer chips would be required to build and run these systems.

There’s a real concern in the market that reduced demand for GPUs could hurt Nvidia’s sales growth. This fear is compounded by the fact that the AI chip company is trading at a relatively rich valuation, which makes it more vulnerable to market sentiment shifts.

However, there are some who argue that DeepSeek’s success doesn’t necessarily spell doom for Nvidia. Instead, it may just be the beginning of a new phase in AI development where cheaper, more efficient models make AI accessible to a broader audience. As Microsoft CEO Satya Nadella pointed out, the Jevons Paradox—where more efficient technology leads to increased usage and greater demand—could be at play here.

In other words, while DeepSeek may have managed to build a cost-effective model, the overall demand for AI is likely to continue growing. As the AI landscape expands, companies like Nvidia could still benefit from the increased need for more advanced hardware capable of supporting cutting-edge artificial general intelligence (AGI) research.

OpenAI’s Business Model: A Bleeding Juggernaut

One of the other crucial elements in the Nvidia vs. DeepSeek debate is OpenAI’s current business model. Despite bringing in impressive revenues (an estimated $3.7 billion in 2024), OpenAI is still struggling with massive operational costs. The company reportedly expected a loss of around $5 billion for the year, largely due to the enormous expense of training its AI models, particularly the cost of GPUs and compute power.

This financial bleed raises questions about the sustainability of the current AI business models. OpenAI, despite being at the forefront of generative AI, is clearly grappling with the high costs of scaling its infrastructure. DeepSeek, on the other hand, could be a sign that the next wave of AI companies will need to rethink their cost structures.

The Bigger Picture: AGI and AI’s Expanding Role

While DeepSeek has made waves by proving that AI can be built at a lower cost, the real battle in the AI market remains focused on achieving artificial general intelligence (AGI). The U.S. tech giants—Microsoft, Meta, Google—are all heavily invested in reaching AGI, which would represent a leap forward in AI capabilities, making current models like GPT-4 look like a small step in comparison.

Nvidia’s core business remains firmly tied to this long-term vision of AGI. Hyperscalers in the U.S. are focused on building the infrastructure necessary to support this kind of leap. This suggests that even if companies like DeepSeek succeed in creating affordable AI models, the demand for Nvidia’s advanced GPUs may not subside. Instead, the need for more powerful hardware could increase as AI continues to scale up in complexity.

In the short term, however, Nvidia’s stock may face challenges. The company’s 60%+ operating margins could be under pressure if GPU demand flattens, which would reduce profits. A dip in margins to the 30% range—more in line with Intel’s past performance—could significantly impact earnings per share.

The Bottom Line: Nvidia’s Investment Outlook

Despite the short-term headwinds, the key takeaway for investors is clear: Nvidia remains a solid long-term bet, particularly if the stock dips further. While the rise of more efficient AI models like DeepSeek’s may pose a challenge, they do not necessarily spell the end of Nvidia’s growth. Instead, more accessible AI technology could pave the way for increased spending in the sector, benefitting companies like Nvidia in the long run.

However, the risk remains that if another company develops better GPUs or alternative AI chips that surpass Nvidia’s offerings, the company could see its dominant position eroded. This is the real threat to Nvidia’s position in the market and one that investors should continue to monitor closely.

For now, Nvidia’s stock is trading at an attractive valuation, with strong growth projections for the coming years. As AI continues to evolve, Nvidia’s role in powering the next generation of models and technologies remains central to the industry’s trajectory.

Conclusion

DeepSeek’s low-cost AI breakthrough may have sparked a momentary panic, but it’s unlikely to dethrone Nvidia as the dominant force in AI hardware—at least not yet. While cheaper AI models may disrupt the current status quo, the increasing demand for more advanced AI technologies will continue to fuel the growth of companies like Nvidia.

For investors, the recent drop in Nvidia’s stock presents a potential opportunity. The AI race is far from over, and Nvidia’s pivotal role in driving the future of artificial intelligence will likely see it recover and thrive in the long run.

Disclaimer – This article draws from sources such as the Financial Times, Bloomberg, and other reputed media houses. Please note, this blog post is intended for general educational purposes only and does not serve as an offer, recommendation, or solicitation to buy or sell any securities. It may contain forward-looking statements, and actual outcomes can vary due to numerous factors. Past performance of any security does not guarantee future results. This blog is for informational purposes only. Neither the information contained herein, nor any opinion expressed, should be construed or deemed to be construed as solicitation or as offering advice for the purposes of the purchase or sale of any security, investment, or derivatives. The information and opinions contained in the report were considered by VF Securities, Inc. to be valid when published. Any person placing reliance on the blog does so entirely at his or her own risk, and VF Securities, Inc. does not accept any liability as a result. Securities markets may be subject to rapid and unexpected price movements, and past performance is not necessarily an indication of future performance. Investors must undertake independent analysis with their own legal, tax, and financial advisors and reach their own conclusions regarding investment in securities markets.
