Oh no the Chinese are doing the useless thing that gives wrong answers cheaper

https://www.cnn.com/2025/01/27/tech/deepseek-stocks-ai-china/index.html

A shocking Chinese AI advancement called DeepSeek is sending US stocks plunging

US stocks dropped sharply Monday — and chipmaker Nvidia lost nearly $600 billion in market value — after a surprise advancement from a Chinese artificial intelligence company, DeepSeek, threatened the aura of invincibility surrounding America’s technology industry.

CNN

@scalzi It's interesting IMO because:
- You can run the smallest model on a Ras Pi with no internet connection.
- It prints its "train of thought".
- The models are all open source.

I have no love of China, or of how the tech industry has watered machine learning down to "everything is a shitty chatbot", but I can't deny that this is an impressive breakthrough and I hope it's enough to help drive real innovation in ML.

On the other hand, I also realize this is probably going to buoy the grift even longer.

@faoluin
> You can run the smallest model on a Ras Pi with no internet connection

That is interesting. But I've seen a smaller Llama model run on a MacBook with no net connection. I'm still not convinced that makes it anything beyond a vaguely amusing waste of resources.

> The models are all open source

Including the weights and the data they're generated from?

@scalzi

@strypey @scalzi 1) As I tried to articulate in my post, I think most LLMs today are just parlor tricks, not very useful. I'd like to see more specialized ML tools.
2) I haven't gotten that far yet. I agree that would be preferable.

@faoluin
> [weights and the data being freely licensed] would be preferable

As a number of software freedom activists have pointed out, these are just as necessary to recreate the model as its source code. So it's not just preferable, it's a prerequisite for saying with accuracy that ...

> The models are all open source

@scalzi

@strypey @scalzi If I were king of the world then yeah that would be the case. I'm sorry, but I'm not going to waste time on pedantics on this topic.

@faoluin
> I'm not going to waste time on pedantics on this topic

You specified that the models are "open source" as if it matters. Now you're arguing that distinguishing actual Open Source from openwashing is "pedantics". Not sure how you square that circle.