Leaderboard | Text

Last updated about 1 month ago

Rank	Overall	Name	VIBE Score	Confidence Interval	Votes	Downvote %	Abort %	Speed	Latency	Context	Cost (Input)	Cost (Output)
41	48	Grok 4 Fast Reasoning	1142	±3	14.5K	2.0%	2.1%	102 tps	3.1s	2M	$0.30	$0.75
42	17	GPT-5.4 mini	1141	±14	545	1.8%	0.8%	148 tps	0.5s	400K	$0.75	$4.50
43	56	DeepSeek V3.1 Turbo	1134	±3	9.5K	1.2%	0.9%	173 tps	1.3s	164K	$2.00	$3.75
44	56	MiniMax M2.1 Lightning	1129	±5	3.6K	1.4%	1.7%	52 tps	2.1s	205K	$0.30	$2.40
45	71	DeepSeek V3.1	1125	±4	4.4K	1.1%	0.8%	197 tps	0.4s	164K	$0.55	$1.60
46	60	MiniMax M2.1	1124	±3	24.4K	1.0%	2.1%	66 tps	2.6s	205K	$0.30	$1.20
47	26	Claude Haiku 4.5 (Extended Thinking)	1121	±3	14.1K	1.8%	1.4%	115 tps	0.7s	200K	$1.00	$5.00
48	60	Gemini 2.5 Flash Preview 0925	1118	±3	14.4K	2.2%	1.2%	5 tps	0.9s	1M	$0.13	$0.97
49	52	GPT-5	1117	±2	31.1K	1.7%	3.1%	78 tps	23.1s	400K	$1.25	$9.67
50	81	OpenAI o3-pro	1116	±5	3.2K	2.8%	5.2%	22 tps	70.8s	200K	$20.00	$80.00
51	68	Grok 4	1110	±1	98.8K	0.9%	3.9%	29 tps	11.1s	256K	$3.00	$15.00
52	17	Claude Opus 4.5	1110	±4	12.9K	2.2%	1.5%	45 tps	1.5s	200K	$5.00	$25.00
53	62	MiniMax M2	1110	±3	17.2K	2.5%	2.2%	39 tps	2.3s	205K	$0.21	$0.85
54	71	Qwen3.5 397B A17B	1107	±6	5.1K	1.6%	4.3%	57 tps	1.4s	256K	$0.52	$3.00
55	86	DeepSeek V3.1 Nex N1	1107	±8	1.5K	1.3%	3.4%	24 tps	7.2s	131K	$0.14	$0.50
56	79	Qwen3 Max Thinking Preview	1106	±4	13.3K	2.0%	3.1%	40 tps	2.1s	256K	$1.20	$6.00
57	101	DeepSeek V3 (Turbo)	1105	±5	3.7K	1.5%	1.5%	32 tps	1.5s	64K	$0.40	$1.30
58	56	Gemini 3.1 Flash Lite Preview Thinking	1105	±8	2K	1.7%	1.7%	75 tps	4.7s	1M	$0.25	$1.50
59	68	GLM 4.7	1105	±3	21K	1.0%	5.8%	40 tps	1.5s	200K	$0.77	$1.73
60	86	Amazon Nova 2 Lite	1099	±4	10.5K	2.7%	1.0%	137 tps	0.6s	300K	$0.35	$2.95
61	68	Qwen Plus (Aug'24)	1098	±2	50.5K	1.1%	1.4%	53 tps	1.3s	30K	$0.40	$1.20
62	101	GPT-5 (Low)	1097	±7	1.5K	1.0%	1.8%	75 tps	8.2s	400K	$1.25	$10.00
63	62	GPT-5.1 Instant	1096	±3	14.9K	1.5%	1.3%	50 tps	1.9s	400K	$1.25	$10.00
64	84	GPT-5 Mini Minimal	1094	±3	4.9K	3.0%	1.2%	63 tps	1.4s	400K	$0.25	$2.00
65	95	Kimi K2 Thinking	1092	±4	5.4K	2.0%	4.2%	61 tps	5.9s	262K	$0.24	$1.03
66	37	Claude Sonnet 4.5	1092	±2	25.2K	2.2%	1.4%	41 tps	1.3s	200K	$1.80	$9.00
67	71	MiniMax M2.5 FP8	1092	±10	2.1K	1.6%	3.6%	33 tps	1.7s	205K	$0.45	$1.75
68	81	GPT-4o	1091	±2	23.5K	0.7%	1.0%	49 tps	2.4s	128K	$3.71	$12.57
69	71	Seed 1.8 251228	1090	±3	14.9K	1.5%	3.7%	41 tps	2.1s	256K	$0.25	$2.00
70	52	Claude Haiku 4.5	1089	±3	20.4K	2.1%	1.1%	100 tps	0.9s	200K	$1.00	$5.00
71	71	Gemini 2.5 Flash Lite Preview 0925	1087	±2	15.1K	2.5%	1.2%	209 tps	0.7s	1M	$0.25	$0.35
72	86	DeepSeek V3.1 Chat	1087	±3	10.7K	1.8%	2.8%	21 tps	1.6s	131K	$0.38	$1.00
73	71	GPT-5 Mini	1087	±3	11.3K	2.1%	2.6%	66 tps	14.2s	400K	$0.25	$2.00
74	95	Qwen3 32B	1085	±5	2.6K	1.5%	3.9%	30 tps	3.1s	41K	$0.12	$0.42
75	93	Qwen Max	1084	±2	54.8K	0.9%	1.5%	49 tps	1.5s	33K	$1.60	$6.40
76	93	DeepSeek V3 0324 Turbo	1081	±3	50.9K	1.4%	6.3%	12 tps	2.4s	164K	$0.73	$1.79
77	133	Nemotron 3 Nano	1076	±8	1.6K	1.9%	1.3%	216 tps	0.8s	256K	$0.05	$4.94
78	65	GLM 4.6	1075	±3	11.7K	2.8%	5.4%	39 tps	1.5s	200K	$0.42	$1.66
79	121	NVIDIA Llama 3.3 Nemotron Super 49B v1.5	1074	±6	3.5K	2.2%	2.0%	50 tps	0.6s	131K	$0.09	$0.33
80	71	Gemini 3.1 Flash Lite Preview	1073	±11	1.3K	2.2%	1.0%	8 tps	1.2s	1M	$0.25	$1.50
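
The Confidence Interval column gives a margin around each VIBE Score, so adjacent ranks with close scores may not be meaningfully separated. The following is a minimal Python sketch of that comparison, assuming the margin is symmetric around the score; the page does not state the confidence level or how the interval is computed. The `Entry` type and the `interval` and `overlaps` helpers are hypothetical names introduced for illustration, and the values are copied from the table rows above.

```python
from typing import NamedTuple


class Entry(NamedTuple):
    name: str
    score: int   # "VIBE Score" column
    margin: int  # "Confidence Interval" column, the number after the ± sign


def interval(entry: Entry) -> tuple[int, int]:
    """Return (low, high) for score ± margin (assumed symmetric)."""
    return entry.score - entry.margin, entry.score + entry.margin


def overlaps(a: Entry, b: Entry) -> bool:
    """True when the two score intervals share at least one point."""
    a_low, a_high = interval(a)
    b_low, b_high = interval(b)
    return a_low <= b_high and b_low <= a_high


# Values copied from the leaderboard table.
grok_4_fast = Entry("Grok 4 Fast Reasoning", 1142, 3)
gpt_54_mini = Entry("GPT-5.4 mini", 1141, 14)
deepseek_turbo = Entry("DeepSeek V3.1 Turbo", 1134, 3)

# Intervals 1139..1145 and 1127..1155
print(overlaps(grok_4_fast, gpt_54_mini))
# Intervals 1139..1145 and 1131..1137
print(overlaps(grok_4_fast, deepseek_turbo))
```

Under this reading, Grok 4 Fast Reasoning and GPT-5.4 mini have overlapping intervals, largely because GPT-5.4 mini has far fewer votes (545) and a wider margin, while Grok 4 Fast Reasoning and DeepSeek V3.1 Turbo do not overlap.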

Page 2 of 6 (203 models in total)