Leaderboard | Coding

Last updated about 1 month ago

Rank	Overall	Name	VIBE Score	Confidence Interval	Votes	Downvote %	Abort %	Speed	Latency	Context	Cost (Input)	Cost (Output)
321	234	Llama 3.3 70B Instruct Turbo	851	±19	1.2K	6.0%	2.0%	78 tps	1.0s	131K	$0.88	$0.88
322	234	Command R 7B	849	±15	3.3K	4.8%	1.1%	76 tps	0.4s	128K	$0.04	$0.15
323	312	Wikipedia	846	±7	9.8K	4.9%	<0.1%	47 tps	2.1s	32K	$0	$0
324	324	MAI-DS-R1	842	±7	3.5K	11.7%	<0.1%	73 tps	3.2s	64K	$0.10	$0.40
325	234	GPT-3.5 Turbo 16k	838	±10	2.7K	3.6%	<0.1%	22 tps	0.6s	16K	$3.00	$4.00
326	324	Cogito V2 671B	838	±17	1.6K	5.9%	<0.1%	41 tps	0.6s	164K	$1.25	$1.25
327	234	ERNIE 4.5 21B A3B Thinking	838	±23	1.1K	6.9%	1.8%	87 tps	1.5s	120K	$0.07	$0.28
328	240	DeepSeek-R1 Distill Llama 70B	835	±9	3.4K	5.2%	3.6%	27 tps	1.6s	32K	$0.73	$0.95
329	324	Typhoon 2 70B Instruct	835	±15	1.4K	4.0%	<0.1%	19 tps	0.1s	8K	$0.88	$0.88
330	240	GLM 4.5 Flash	834	±37	520	8.8%	12.2%	15 tps	2.2s	131K	$0	$0
331	324	OLMo 2 0425 1B Instruct	833	±21	560	2.6%	<0.1%	68 tps	0.0s	4K	$0	$0
332	324	NVIDIA Llama 3.1 Nemotron Ultra 253B v1	832	±16	2.2K	4.1%	<0.1%	40 tps	0.8s	128K	$0.30	$0.90
333	240	Mixtral-8x7B Instruct v0.1	832	±23	1.3K	4.6%	1.3%	54 tps	0.4s	33K	$0.60	$0.60
334	240	Qwen 2.5 7B	831	±17	2K	5.1%	3.7%	40 tps	1.9s	131K	$0.08	$0.27
335	240	Sky T1 32B Preview	829	±14	2.4K	4.5%	7.8%	73 tps	0.6s	16K	$0.12	$0.18
336	240	LFM2 2.6B	826	±26	810	10.0%	6.7%	184 tps	0.4s	33K	$0.01	$0.02
337	240	Krutrim 2	825	±10	2.3K	2.3%	12.5%	33 tps	2.1s	128K	$1.00	$1.00
338	240	Ministral 8B	825	±17	2.2K	5.5%	1.4%	177 tps	0.4s	128K	$0.14	$0.14
339	240	C4AI Aya Expanse 32B	821	±7	3.8K	4.0%	1.5%	43 tps	0.5s	128K	$0.50	$1.50
340	240	Moonshot V1 32k	820	±17	950	3.1%	1.4%	53 tps	1.4s	33K	$1.00	$3.00
341	240	LFM2 8B A1B	818	±18	825	11.3%	<0.1%	142 tps	0.3s	33K	$0.01	$0.02
342	240	Gemma 2 27B	815	±17	1.5K	4.1%	1.4%	44 tps	1.4s	8K	$0.80	$0.80
343	337	GLM 4.1V 9B Thinking	813	±16	1.1K	4.2%	<0.1%	69 tps	1.3s	66K	$0.04	$0.14
344	252	Ministral 3B	806	±16	2.3K	5.1%	0.8%	248 tps	0.4s	131K	$0.08	$0.08
345	337	Qwen 2 72B Instruct	805	±19	1K	3.3%	<0.1%	3 tps	N/A	33K	$0.90	$0.90
346	346	Magistral Medium (Thinking)	804	±10	2.2K	5.7%	<0.1%	67 tps	0.8s	41K	$2.00	$5.00
347	252	Magistral Small 2509	802	±18	1.8K	7.5%	2.7%	116 tps	0.6s	131K	$0.50	$1.50
348	252	Gemma 3 1B	802	±11	2K	6.1%	0.6%	176 tps	1.0s	33K	$0.06	$0.10
349	252	WizardLM-2 8x22B	801	±12	1.9K	3.1%	11.6%	11 tps	2.5s	66K	$0.77	$0.77
350	252	Phi 4	798	±16	1.7K	3.4%	5.1%	28 tps	1.3s	128K	$0.10	$0.32
351	252	Hermes 4 405B FP8	797	±21	815	8.4%	3.5%	31 tps	0.9s	131K	$0.52	$1.73
352	346	Magistral Medium 2507	795	±25	665	14.2%	<0.1%	86 tps	0.7s	41K	$2.00	$5.00
353	252	Mercury Coder	793	±27	510	3.8%	<0.1%	247 tps	2.2s	32K	$0.25	$1.00
354	252	GPT-3.5 Turbo Instruct	787	±9	2K	2.7%	<0.1%	46 tps	1.2s	4K	$1.50	$2.00
355	252	Mistral Large	785	±16	1.1K	5.8%	1.5%	54 tps	0.7s	33K	$2.00	$6.00
356	252	Hermes 4 70B	781	±29	460	8.9%	1.1%	67 tps	0.6s	131K	$0.12	$0.39
357	262	Command R	778	±18	2.2K	4.9%	5.8%	54 tps	0.6s	128K	$0.30	$0.99
358	262	Baichuan-M2-32B	770	±30	740	10.8%	<0.1%	32 tps	3.3s	131K	$0.07	$0.07
359	262	Mistral Small	770	±12	1.2K	4.5%	1.7%	142 tps	0.6s	32K	$0.43	$1.30
360	354	OLMo 3 7B Think	763	±21	770	7.8%	4.2%	77 tps	0.4s	66K	$0.12	$0.20

View All (404 models)