Leaderboard | Text

Last updated about 1 month ago

Rank	Overall	Name	VIBE Score	Confidence Interval	Votes	Downvote %	Abort %	Speed	Latency	Context	Cost (Input)	Cost (Output)
41	26	Claude Haiku 4.5 (Extended Thinking)	1129	±5	3.6K	1.6%	1.4%	115 tps	0.7s	200K	$1.00	$5.00
42	42	Qwen3 Max Instruct Preview	1126	±4	4.3K	2.8%	1.1%	31 tps	1.7s	256K	$1.43	$6.61
43	44	Gemini 2.5 Pro	1126	±4	16.2K	1.5%	2.3%	45 tps	2.6s	1M	$1.25	$10.00
44	44	Grok 4.1 Fast Reasoning	1119	±6	5.4K	1.5%	1.5%	58 tps	7.3s	2M	$0.20	$0.50
45	37	Claude Sonnet 4.5	1116	±6	5K	3.1%	1.4%	41 tps	1.3s	200K	$1.80	$9.00
46	56	DeepSeek V3.1 Turbo	1114	±6	4K	2.1%	0.9%	173 tps	1.3s	164K	$2.00	$3.75
47	40	DeepSeek V3.2	1113	±5	3.6K	0.8%	1.4%	83 tps	5.1s	131K	$0.43	$1.09
48	101	Gemini 2.5 Flash Lite	1112	±6	7.6K	1.7%	1.3%	210 tps	0.7s	1M	$0.10	$0.40
49	93	Qwen Max	1111	±6	7.6K	1.4%	1.5%	49 tps	1.5s	33K	$1.60	$6.40
50	95	Qwen3 32B	1111	±17	515	1.9%	3.9%	30 tps	3.1s	41K	$0.12	$0.42
51	22	GLM 5	1110	±7	1.8K	0.8%	3.4%	36 tps	2.7s	200K	$0.72	$2.55
52	84	GPT-5 Mini Minimal	1107	±10	970	3.5%	1.2%	63 tps	1.4s	400K	$0.25	$2.00
53	93	DeepSeek V3 0324 Turbo	1103	±5	4.4K	1.9%	6.3%	12 tps	2.4s	164K	$0.73	$1.79
54	121	Qwen3 32B Fast	1098	±8	9K	1.0%	11.6%	30 tps	3.1s	41K	$0.10	$0.25
55	86	DeepSeek V3.1 Chat	1097	±7	1.9K	2.3%	2.8%	21 tps	1.6s	131K	$0.38	$1.00
56	56	DeepSeek V3.2 Thinking	1096	±6	3.8K	0.9%	9.0%	30 tps	2.6s	131K	$0.28	$0.42
57	111	LongCat Flash Chat	1095	±7	1.7K	2.8%	0.8%	85 tps	0.9s	131K	$0.14	$0.68
58	121	QwQ 32B	1091	±5	9.9K	0.9%	5.4%	41 tps	2.1s	16K	$0.43	$0.56
59	33	Kimi K2.5	1090	±6	4.5K	0.7%	6.5%	33 tps	1.7s	262K	$0.34	$2.57
60	86	Nemotron 3 Nano (Thinking)	1089	±9	1.5K	0.7%	2.0%	200 tps	0.5s	256K	$0	$0
61	62	GPT-5.1 Instant	1085	±6	3.7K	1.1%	1.3%	50 tps	1.9s	400K	$1.25	$10.00
62	106	DeepSeek V3 0324	1084	±4	5.7K	1.4%	5.8%	12 tps	2.7s	164K	$0.38	$0.93
63	52	GPT-5	1083	±5	7.6K	2.2%	3.1%	78 tps	23.1s	400K	$1.25	$9.67
64	86	Claude Sonnet 4	1083	±5	12K	1.6%	1.8%	49 tps	1.3s	200K	$3.00	$15.00
65	60	MiniMax M2.1	1080	±6	5.2K	0.6%	2.1%	66 tps	2.6s	205K	$0.30	$1.20
66	48	Grok 4 Fast Reasoning	1077	±6	3.3K	2.8%	2.1%	102 tps	3.1s	2M	$0.30	$0.75
67	121	NVIDIA Llama 3.3 Nemotron Super 49B v1.5	1076	±12	755	1.9%	2.0%	50 tps	0.6s	131K	$0.09	$0.33
68	52	Claude Haiku 4.5	1076	±8	4.2K	2.2%	1.1%	100 tps	0.9s	200K	$1.00	$5.00
69	52	Grok 4 Fast Non-Reasoning	1075	±6	2.9K	3.3%	1.5%	93 tps	0.6s	2M	$0.27	$0.67
70	95	DeepSeek-R1 Turbo	1075	±6	1.9K	2.4%	2.6%	29 tps	1.8s	64K	$2.85	$4.75
71	56	Gemini 3.1 Flash Lite Preview Thinking	1071	±13	560	1.8%	1.7%	75 tps	4.7s	1M	$0.25	$1.50
72	68	Grok 4	1070	±4	13.8K	1.6%	3.9%	29 tps	11.1s	256K	$3.00	$15.00
73	95	Gemini 2.5 Flash	1068	±4	11.2K	1.2%	1.3%	2 tps	3.7s	1M	$0.30	$2.50
74	56	MiniMax M2.1 Lightning	1067	±12	855	0.6%	1.7%	52 tps	2.1s	205K	$0.30	$2.40
75	79	Qwen3 Max Thinking Preview	1067	±6	3.1K	1.4%	3.1%	40 tps	2.1s	256K	$1.20	$6.00
76	71	Gemini 2.5 Flash Lite Preview 0925	1066	±6	3.3K	2.8%	1.2%	209 tps	0.7s	1M	$0.25	$0.35
77	124	Qwen3 235B A22B Thinking 2507	1065	±7	1.8K	1.9%	2.5%	53 tps	1.6s	131K	$0.59	$5.70
78	44	Kimi K2 Thinking Turbo	1065	±6	3K	1.9%	2.0%	75 tps	1.4s	262K	$1.15	$8.00
79	65	Mistral Large 3	1064	±7	1.8K	2.2%	2.1%	51 tps	1.0s	256K	$0.50	$1.50
80	118	GPT-4.1 mini	1062	±5	5.5K	1.8%	1.1%	67 tps	0.9s	1M	$0.34	$1.60

Page 2 of 5

View All (193 models)