Leaderboard | Text

Last updated about 1 month ago

Rank	Overall	Name	VIBE Score	Confidence Interval	Votes	Downvote %	Abort %	Speed	Latency	Context	Cost (Input)	Cost (Output)
121	95	Gemini 2.5 Flash	1068	±4	11.2K	1.2%	1.3%	2 tps	3.7s	1M	$0.30	$2.50
122	68	Grok 4	1070	±4	13.8K	1.6%	3.9%	29 tps	11.1s	256K	$3.00	$15.00
123	56	Gemini 3.1 Flash Lite Preview Thinking	1071	±13	560	1.8%	1.7%	75 tps	4.7s	1M	$0.25	$1.50
124	95	DeepSeek-R1 Turbo	1075	±6	1.9K	2.4%	2.6%	29 tps	1.8s	64K	$2.85	$4.75
125	52	Grok 4 Fast Non-Reasoning	1075	±6	2.9K	3.3%	1.5%	93 tps	0.6s	2M	$0.27	$0.67
126	52	Claude Haiku 4.5	1076	±8	4.2K	2.2%	1.1%	100 tps	0.9s	200K	$1.00	$5.00
127	121	NVIDIA Llama 3.3 Nemotron Super 49B v1.5	1076	±12	755	1.9%	2.0%	50 tps	0.6s	131K	$0.09	$0.33
128	48	Grok 4 Fast Reasoning	1077	±6	3.3K	2.8%	2.1%	102 tps	3.1s	2M	$0.30	$0.75
129	60	MiniMax M2.1	1080	±6	5.2K	0.6%	2.1%	66 tps	2.6s	205K	$0.30	$1.20
130	86	Claude Sonnet 4	1083	±5	12K	1.6%	1.8%	49 tps	1.3s	200K	$3.00	$15.00
131	52	GPT-5	1083	±5	7.6K	2.2%	3.1%	78 tps	23.1s	400K	$1.25	$9.67
132	106	DeepSeek V3 0324	1084	±4	5.7K	1.4%	5.8%	12 tps	2.7s	164K	$0.38	$0.93
133	62	GPT-5.1 Instant	1085	±6	3.7K	1.1%	1.3%	50 tps	1.9s	400K	$1.25	$10.00
134	86	Nemotron 3 Nano (Thinking)	1089	±9	1.5K	0.7%	2.0%	200 tps	0.5s	256K	$0	$0
135	33	Kimi K2.5	1090	±6	4.5K	0.7%	6.5%	33 tps	1.7s	262K	$0.34	$2.57
136	121	QwQ 32B	1091	±5	9.9K	0.9%	5.4%	41 tps	2.1s	16K	$0.43	$0.56
137	111	LongCat Flash Chat	1095	±7	1.7K	2.8%	0.8%	85 tps	0.9s	131K	$0.14	$0.68
138	56	DeepSeek V3.2 Thinking	1096	±6	3.8K	0.9%	9.0%	30 tps	2.6s	131K	$0.28	$0.42
139	86	DeepSeek V3.1 Chat	1097	±7	1.9K	2.3%	2.8%	21 tps	1.6s	131K	$0.38	$1.00
140	121	Qwen3 32B Fast	1098	±8	9K	1.0%	11.6%	30 tps	3.1s	41K	$0.10	$0.25
141	93	DeepSeek V3 0324 Turbo	1103	±5	4.4K	1.9%	6.3%	12 tps	2.4s	164K	$0.73	$1.79
142	84	GPT-5 Mini Minimal	1107	±10	970	3.5%	1.2%	63 tps	1.4s	400K	$0.25	$2.00
143	22	GLM 5	1110	±7	1.8K	0.8%	3.4%	36 tps	2.7s	200K	$0.72	$2.55
144	95	Qwen3 32B	1111	±17	515	1.9%	3.9%	30 tps	3.1s	41K	$0.12	$0.42
145	93	Qwen Max	1111	±6	7.6K	1.4%	1.5%	49 tps	1.5s	33K	$1.60	$6.40
146	101	Gemini 2.5 Flash Lite	1112	±6	7.6K	1.7%	1.3%	210 tps	0.7s	1M	$0.10	$0.40
147	40	DeepSeek V3.2	1113	±5	3.6K	0.8%	1.4%	83 tps	5.1s	131K	$0.43	$1.09
148	56	DeepSeek V3.1 Turbo	1114	±6	4K	2.1%	0.9%	173 tps	1.3s	164K	$2.00	$3.75
149	37	Claude Sonnet 4.5	1116	±6	5K	3.1%	1.4%	41 tps	1.3s	200K	$1.80	$9.00
150	44	Grok 4.1 Fast Reasoning	1119	±6	5.4K	1.5%	1.5%	58 tps	7.3s	2M	$0.20	$0.50
151	44	Gemini 2.5 Pro	1126	±4	16.2K	1.5%	2.3%	45 tps	2.6s	1M	$1.25	$10.00
152	42	Qwen3 Max Instruct Preview	1126	±4	4.3K	2.8%	1.1%	31 tps	1.7s	256K	$1.43	$6.61
153	26	Claude Haiku 4.5 (Extended Thinking)	1129	±5	3.6K	1.6%	1.4%	115 tps	0.7s	200K	$1.00	$5.00
154	42	GPT-5.2 (Extra High)	1131	±5	3.7K	0.9%	13.2%	17 tps	20.5s	400K	$1.75	$14.00
155	10	Claude Sonnet 4.5 (Thinking)	1136	±5	6.8K	2.7%	1.9%	44 tps	1.1s	200K	$3.00	$15.00
156	29	Nova Experimental Chat 12-10	1138	±9	1.9K	0.5%	2.4%	84 tps	12.9s	98K	$0	$0
157	37	Qwen3 Omni 30B A3B Thinking	1139	±7	1.6K	1.2%	3.7%	67 tps	1.2s	66K	$0.97	$1.79
158	48	Step 3.5 Flash	1140	±15	965	0.5%	2.2%	109 tps	0.6s	256K	$0.05	$0.15
159	37	Kimi K2.5 Instant	1140	±10	1.1K	1.4%	2.9%	32 tps	3.0s	262K	$0.50	$3.00
160	17	Claude Opus 4.5	1144	±8	2.4K	2.1%	1.5%	45 tps	1.5s	200K	$5.00	$25.00

Page 4 of 5 (193 models in total)