• by ahzhou on 1/28/2025, 1:22:31 PM

    I might be missing something, but DeepSeek’s recipe is right there in plain sight. Most of the cost efficiency of DeepSeek v3 seems to be attributable to MoE and FP8 training. DeepSeek R1’s improvements are from GRPO-based RL.

    Interesting to note: we have no idea how much R1 cost to train. To speculate: maybe DeepSeek’s release made an upcoming Llama release moot in comparison.

  • by marjann on 1/28/2025, 4:08:28 PM

    What a time to be alive. Chinese companies used to copy everything from the West; now it seems to be the opposite.

  • by bamboozled on 1/28/2025, 12:41:22 PM

    Can anyone explain why Meta's share price was untouched by the DeepSeek announcement? They have spent billions on AI infra.

    According to this article they are rattled in some way...

  • by OfCounsel on 1/28/2025, 5:39:04 PM

    Meta has been aware of DeepSeek for a long time (as Zuckerberg mentioned the company by name in his podcast with Joe Rogan) and a “war room” is just a meeting room.

  • by hulitu on 1/28/2025, 4:19:48 PM

    > Meta scrambling 'war rooms' of engineers to figure out DeepSeek's AI

    "Gentlemen, you can't fight in the war room."