Llama 3.3 (70B) Finetuning - now with 90K context length and fits on <41GB VRAM.
You can now run DeepSeek-R1 on your own local device!
1.58bit DeepSeek R1 - 131GB Dynamic GGUF
I was able to 1-shot prompt the Unsloth "python flappy bird game" test with DeepSeek-R1 distilled 70B. The distilled models deserve more credit.
The new Mistral Small model is disappointing
DeepSeek AI blocked by Italian authorities
Running Deepseek R1 IQ2XXS (200GB) from SSD actually works
Fine Tuning On Completions only using Unsloth
Unsloth made dynamic R1 quants - can be run on as little as 80GB of RAM
I have a 12GB 3060, is it possible to fine-tune ANY model?
[R] Replicating DeepSeek-R1-Zero RL recipe on 3B LLM for <30$, the model develops self-verification and search abilities all on its own
I fixed 4 bugs in Microsoft's open-source Phi-4 model
Deepseek-R1 GGUFs + All distilled 2 to 16bit GGUFs + 2bit MoE GGUFs
Deepseek just uploaded 6 distilled versions of R1 + R1 "full" now available on their website.
let’s goo, DeepSeek-R1 685 billion parameters!
DeepSeek R1 has been officially released!
[P] How I found 8 bugs in Google's Gemma 6T token model
[P] How I found & fixed 4 bugs in Microsoft's Phi-4 model
I accidentally built an open alternative to Google AI Studio
Phi-4 Llamafied + 4 Bug Fixes + GGUFs, Dynamic 4bit Quants
Phi-4 has been released
Phi-4 Finetuning - now with >128K context length + Bug Fix Details
DeepSeek-V3 imatrix quants by team mradermacher
Now that Phi-4 has been out for a while what do you think?
Resources for AI