What is DeepSeek?
Right now, DeepSeek is the name on everyone’s mind, and it has shaken our understanding of AI and its complexities.
There are several misconceptions that need to be cleared up. First, the term AI has been misused and misinterpreted.
Behind AI, there has been an immense effort in natural language processing (NLP), large language models (LLMs), and machine learning (ML) for years – if not decades. Furthermore, advancements in hardware have helped make this a reality, though an expensive one.
I personally dreamed, ten years ago, of a day when I could use an LLM to translate 18th and 19th-century documents from their original sources. Today, that dream is a reality, not just for me but for many others who see these tools as allies rather than threats to progress.
Aside from text processing, AI has also advanced in image generation, using tools like Stable Diffusion, DALL·E, and the upcoming Sora from OpenAI, which will allow us to create movies from text prompts.
The quality of the tool you use matters. Platforms like ChatGPT, Copilot, and Google Gemini have been trained with massive resources and investments. So when DeepSeek entered the field, it truly shook the foundation of what we had been told about AI. And now, at the time of writing, DeepSeek has also released Janus-Pro, which is said to rival DALL·E 3.
At this rate, if things continue like this, DeepSeek may release its equivalent of Sora before OpenAI does!
Why Is DeepSeek Important?
There are three key things to consider:
1. Privacy – How Safe is Your Data?
On the internet, privacy doesn’t truly exist. If something is online, it is virtually impossible to hide it completely. This is why laws are being created to regulate how companies handle your data.
If you provide personal or private information to an LLM service like ChatGPT, it can retain and reuse it – through chat memory today, or training data tomorrow. In the best-case scenario, this can produce interesting results; in the worst case, it could surface in a data breach. (For example, ChatGPT has brought up things I didn’t even remember telling it and has even asked me how my dog is doing. Now you also know I have a dog.)
Using DeepSeek’s online version has considerable security and privacy concerns. Google it (if you still use Google) to understand what that means for you, or read DeepSeek’s fine print.
But do not misunderstand me: data privacy is everyone’s responsibility. Laws and regulations can only go so far.
If you want to try DeepSeek safely, like I did, you can run it offline using Ollama.
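To illustrate, here is a minimal Python sketch of that offline setup. It assumes Ollama is installed and running locally on its default port with a DeepSeek model pulled (the `deepseek-r1:1.5b` tag is one option); the script only talks to Ollama’s local REST endpoint, so nothing leaves your machine:

```python
import json
import urllib.request

# Ollama's default local endpoint for single-shot generation.
OLLAMA_URL = "http://localhost:11434/api/generate"

def build_payload(model: str, prompt: str) -> dict:
    """Assemble a non-streaming request body for Ollama's /api/generate."""
    return {"model": model, "prompt": prompt, "stream": False}

def ask(model: str, prompt: str) -> str:
    """Send a prompt to the local Ollama server and return the response text."""
    data = json.dumps(build_payload(model, prompt)).encode()
    req = urllib.request.Request(
        OLLAMA_URL, data=data, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]

# Example (requires `ollama pull deepseek-r1:1.5b` and a running server):
# print(ask("deepseek-r1:1.5b", "Explain model distillation in one sentence."))
```

The same two functions work with any model Ollama can pull, so swapping DeepSeek for Llama is a one-string change.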
2. Power vs. Efficiency: The Cost of AI Processing
I experimented with Ollama, which allows you to run different LLMs, hoping to build something for a previous employer. The results were interesting, but when the model returned only one letter every 30 seconds, it was clear that this was not the chatbot our company needed.
At the time, I misunderstood the point. Now, I realize that this is why models are sometimes distilled for specific tasks.
What is model distillation? A larger “teacher” model generates answers, which are then used to train a smaller, more efficient “student” model. Companies do this all the time, and depending on the teacher model’s license and terms of service, it is not always permitted.
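The core idea fits in a few lines. This is a toy sketch in plain Python of the classic soft-label loss: the student is trained to match the teacher’s temperature-softened probability distribution over one “next token,” rather than a single hard answer. Real distillation repeats this over millions of teacher outputs; the logits and temperature here are made-up illustration values.

```python
import math

def softmax(logits, temperature=1.0):
    """Turn raw logits into probabilities; temperature > 1 softens the
    distribution, exposing the teacher's relative preferences."""
    exps = [math.exp(x / temperature) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """Cross-entropy between the softened teacher and student distributions.
    Minimizing this pushes the student to imitate the teacher."""
    teacher = softmax(teacher_logits, temperature)
    student = softmax(student_logits, temperature)
    return -sum(t * math.log(s) for t, s in zip(teacher, student))
```

The loss is smallest when the student’s distribution matches the teacher’s exactly, which is what gradient descent drives it toward during training.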
The reality is that training an AI model requires vast resources, especially electricity. The more powerful the model, the more expensive it is to train.
DeepSeek is changing the game by creating something more efficient, open-source (which I always support), and capable of further distillation – something that has already been done.
Benchmark Example:
I am running DeepSeek R1 (1.5B) on an Acer Aspire Lite 16 laptop, and I’m impressed by its speed compared to Llama 3.2 (1B). DeepSeek R1 is much faster, but its responses are much shorter than Llama 3.2’s. You decide which one you prefer.
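If you want to put numbers on that comparison instead of eyeballing it, Ollama’s /api/generate response includes `eval_count` (tokens generated) and `eval_duration` (in nanoseconds), so a rough generation-speed figure is easy to compute. A small sketch (the sample values below are invented for illustration):

```python
def tokens_per_second(resp: dict) -> float:
    """Compute generation speed from an Ollama /api/generate response.

    Ollama reports eval_count (tokens produced) and eval_duration
    (nanoseconds spent producing them) alongside the response text.
    """
    return resp["eval_count"] / (resp["eval_duration"] / 1e9)

# e.g. a run that produced 120 tokens in 3 seconds:
# tokens_per_second({"eval_count": 120, "eval_duration": 3_000_000_000}) -> 40.0
```

Running the same prompt through both models and comparing tokens per second makes the “faster but terser” trade-off concrete.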
3. AI on a Raspberry Pi? Yes, It’s Possible
This challenge perfectly showcases how efficient a model can be. To say I am shocked would be an understatement.
I am still using smaller models since larger ones require significant storage and computational power, which goes against my core idea: testing model efficiency without spending millions.
Again, I noticed that DeepSeek R1 is faster than Llama 3.2, but Llama provides longer, more detailed responses. However, one surprising thing about DeepSeek is that it can censor responses, which I did not expect. Meanwhile, Llama 3.2 tends to go off on lengthy tangents.
Yet, both models work, at speeds I did not expect. I even have older hardware that I could test them on – because why not? I hate e-waste. Just because something is old doesn’t mean it can’t be useful. (Ask my Toshiba Satellite 4015CDS, which has been with me for 15 years!)
The Infamous Sputnik Moment
At the end of the day, why does this matter?
The issue is that DeepSeek, a company from China, has released a model comparable to OpenAI’s GPT-4 – followed by Janus-Pro, which rivals DALL·E 3.
Meanwhile, OpenAI’s ChatGPT and DALL·E 3 require paid subscriptions and keep their code closed. DeepSeek, on the other hand, offers comparable quality as open source.
And what’s even better? You don’t need expensive hardware to run it.
This is a turning point – developers and companies now have access to powerful AI models for free, with no upfront cost beyond their hardware.
The Sputnik Moment? DeepSeek, along with Alibaba and Kimi, has released models rivaling the most advanced Western AI models – those that laid the foundation for modern AI.
Stock market values have dropped significantly, and some are calling it the AI bubble bursting, with close to a trillion dollars in market value wiped out. But this shouldn’t be happening – that money should be reinvested in better hardware and AI research.
At the end of the day, the controversy surrounding this is massive. But for us mere mortals, it simply means more choices and new opportunities.
We may be witnessing the dawn of true AI R&D competition, and if that’s the case, we might see groundbreaking advancements before 2025 is over.
Just wait and see – Sora’s competition might arrive before the end of the year, and who knows what else is on the horizon?