A bunch of LLMs scheduled to launch at the end of January were cancelled or delayed
Mistral Small 3 one-shotting Unsloth's Flappy Bird coding test in 1 min (vs 3 hrs for DeepSeek R1 running off an NVMe drive)
UMbreLLa: Llama 3.3 70B INT4 on an RTX 4070 Ti, achieving up to 9.6 tokens/s! 🚀
Claude’s reasoning model will be scary
R1+Sonnet set a new SOTA on the aider polyglot benchmark, at 14x lower cost than o1
First RTX 5090 LLM results, compared to the 4090 and 6000 Ada
The first performant open-source byte-level model without tokenization has been released. EvaByte is a 6.5B-param model that also has multibyte prediction for faster inference (vs similarly sized tokenized models)
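(Not EvaByte's actual code; just a minimal sketch of what "no tokenization" means here: input ids are simply the raw UTF-8 bytes, so the vocabulary is fixed at 256 and no learned tokenizer is involved.)

```python
# A byte-level model's "tokenizer" is just UTF-8 encoding: every string
# maps to ids in 0..255, with no merges or vocabulary to learn.
text = "héllo"
ids = list(text.encode("utf-8"))
print(ids)        # [104, 195, 169, 108, 108, 111]
print(len(ids))   # 6 positions for 5 characters
```

The cost is more sequence positions per character than a BPE tokenizer would produce, which is presumably why multibyte prediction matters for inference speed.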
I created a single-prompt benchmark (with 5 questions) that anyone can use to easily evaluate LLMs. Mistral-Next somehow vastly outperformed all others. Prompt and more details in the post.
MiniMax-Text-01 - A powerful new MoE language model with 456B total parameters (45.9B activated)
DDR6 RAM and a reasonable GPU should be able to run 70B models at good speed
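(A back-of-envelope check on that claim: single-stream decode is roughly memory-bandwidth bound, so tokens/s is about bandwidth divided by the bytes touched per token. Both numbers below are assumptions, since DDR6 isn't finalized; a sketch, not a measurement.)

```python
# Back-of-envelope decode speed: generation is memory-bandwidth bound,
# so tokens/s ~= effective bandwidth / bytes read per token.
# Both inputs below are assumptions, not measured figures.
model_bytes = 70e9 * 0.5      # 70B params at ~4 bits/weight ≈ 35 GB
bandwidth_bps = 200e9         # speculative dual-channel DDR6, bytes/s
print(f"~{bandwidth_bps / model_bytes:.1f} tokens/s")  # ≈ 5.7 tokens/s
```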
Why I think that NVIDIA Project DIGITS will have 273 GB/s of memory bandwidth
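(The 273 GB/s figure falls out of standard memory math, sketched below assuming a 256-bit LPDDR5X interface at 8533 MT/s; neither is a confirmed spec.)

```python
# Peak bandwidth = (bus width in bytes) * (transfers per second).
# 256-bit LPDDR5X-8533 is an assumption about Project DIGITS, not a spec.
bus_width_bits = 256
transfers_per_s = 8533e6      # LPDDR5X-8533
bandwidth_bps = (bus_width_bits / 8) * transfers_per_s
print(f"{bandwidth_bps / 1e9:.0f} GB/s")  # -> 273 GB/s
```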
I don't get the hype around NVIDIA Project DIGITS?
Llama 4 compute estimates & timeline
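(For rough sizing, the usual estimate is ~6 FLOPs per parameter per training token; the sketch below plugs in purely hypothetical Llama 4 numbers just to show the shape of the calculation.)

```python
# Standard transformer training-compute estimate: FLOPs ≈ 6 * N * D,
# i.e. ~6 FLOPs per parameter per training token.
# All concrete numbers here are hypothetical, for illustration only.
params = 400e9                  # hypothetical parameter count
tokens = 30e12                  # hypothetical training tokens
total_flops = 6 * params * tokens          # 7.2e25 FLOPs
per_gpu = 989e12 * 0.4          # H100 BF16 peak at ~40% utilization
gpu_seconds = total_flops / per_gpu
print(f"{gpu_seconds / 86400 / 1e6:.1f}M GPU-days")  # ≈ 2.1M GPU-days
```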
Are you gonna wait for DIGITS or get the 5090?
Now THIS is interesting
RTX 5000 series official specs
RTX 5090 rumored to have 1.8 TB/s memory bandwidth
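(That rumor is consistent with a 512-bit GDDR7 bus at 28 Gbps per pin; a quick sketch of the arithmetic.)

```python
# Peak bandwidth = (bus width in bits) * (data rate per pin) / 8.
# 512-bit GDDR7 at 28 Gbps/pin matches the rumored configuration.
bus_width_bits = 512
gbps_per_pin = 28e9           # bits per second per pin
bandwidth_bps = bus_width_bits * gbps_per_pin / 8
print(f"{bandwidth_bps / 1e12:.2f} TB/s")  # -> 1.79 TB/s
```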
RTX 5090 Blackwell - Official Price
Killed by LLM – I collected data on AI benchmarks we thought would last years
A new Microsoft paper lists sizes for most of the closed models
For 2025
DeepSeek does not need 5 hours to generate $1 worth of tokens. Due to batching, they can do it in about 1 minute
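(A sketch of that arithmetic, using $2.19 per million output tokens, roughly R1's API price at the time, and an assumed ~25 tokens/s per stream.)

```python
# How long does $1 of output tokens take to generate?
price_per_token = 2.19 / 1e6              # $ per output token (R1 API)
tokens_per_dollar = 1 / price_per_token   # ≈ 457,000 tokens
single_stream_tps = 25                    # assumed per-stream speed
print(f"{tokens_per_dollar / single_stream_tps / 3600:.1f} hours")  # 5.1

# Batching: one server decodes many streams concurrently. At an
# assumed 300 streams, the aggregate rate is 7,500 tokens/s.
batched_tps = 300 * single_stream_tps
print(f"{tokens_per_dollar / batched_tps:.0f} seconds")  # 61
```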
I don't get it.
Aider has released a new, much harder code-editing benchmark since their previous one was saturated. The Polyglot benchmark now tests across 6 languages (C++, Go, Java, JavaScript, Python, and Rust).
Day 10 🙂