Leaderboard | Text

Last updated about 1 month ago

Rank	Overall	Name	VIBE Score	Confidence Interval	Votes	Downvote %	Abort %	Speed	Latency	Context	Cost (Input)	Cost (Output)
121	90	DeepSeek V3 0324	1100	±4	15.1K	4.3%	5.8%	12 tps	2.7s	164K	$0.38	$0.93
122	90	Qwen3 Coder 480B A35B Instruct	1099	±8	3.1K	4.5%	3.3%	61 tps	2.0s	262K	$0.71	$1.34
123	98	Gemini 2.5 Flash	1098	±4	35.9K	3.2%	1.3%	2 tps	3.7s	1M	$0.30	$2.50
124	98	Grok 3	1098	±4	19.1K	5.5%	1.5%	53 tps	0.6s	1M	$3.67	$18.33
125	98	DeepSeek V3 0324 Turbo	1093	±5	15.5K	5.7%	6.3%	12 tps	2.4s	164K	$0.73	$1.79
126	98	Qwen3 235B A22B	1093	±6	4.5K	8.0%	5.3%	71 tps	0.9s	41K	$0.23	$0.63
127	123	Nova Experimental Chat 10-09	1091	±7	3.2K	10.7%	<0.1%	59 tps	6.1s	98K	$0	$0
128	123	Sherlock Dash Alpha	1090	±19	835	6.7%	<0.1%	68 tps	0.7s	2M	$0	$0
129	98	OpenAI o3-pro	1090	±8	5.4K	4.3%	5.2%	22 tps	70.8s	200K	$20.00	$80.00
130	98	DeepSeek V3.1	1089	±12	2.3K	4.7%	0.8%	197 tps	0.4s	164K	$0.55	$1.60
131	98	DeepSeek V3.2 Exp Thinking	1089	±7	5.9K	3.5%	7.2%	26 tps	3.0s	131K	$0.28	$0.42
132	132	Claude Sonnet 3.5	1088	±10	2.9K	4.9%	1.0%	40 tps	2.7s	200K	$3.00	$15.00
133	132	Qwen Plus 0728 (Thinking)	1087	±9	1.2K	8.9%	<0.1%	56 tps	1.1s	1M	$0.40	$4.00
134	105	GPT-4.1 mini	1087	±5	19.7K	4.2%	1.1%	67 tps	0.9s	1M	$0.34	$1.60
135	105	GPT-4.1 nano	1085	±5	17K	5.0%	0.6%	175 tps	0.5s	1M	$0.10	$0.40
136	105	Qwen3 Omni 30B A3B Instruct	1085	±13	775	4.3%	3.9%	65 tps	1.2s	66K	$0.35	$0.97
137	105	DeepSeek V3 (Turbo)	1082	±20	1.5K	5.1%	1.5%	32 tps	1.5s	64K	$0.40	$1.30
138	132	Solar Pro 2 250710	1081	±5	10.6K	6.9%	<0.1%	9 tps	N/A	66K	$0.50	$0.50
139	105	Seed 1.8 251228	1081	±10	3.2K	3.1%	3.7%	41 tps	2.1s	256K	$0.25	$2.00
140	105	Mistral Medium	1080	±4	9.6K	5.6%	1.8%	48 tps	0.6s	33K	$1.48	$4.55
141	105	Qwen3 Max Thinking	1080	±18	1.5K	2.0%	13.5%	32 tps	2.3s	256K	$1.20	$6.00
142	112	GLM 4.5	1075	±5	6K	7.0%	3.7%	46 tps	1.4s	131K	$0.43	$1.63
143	112	Kimi K2 0905	1074	±7	8.7K	4.3%	4.0%	30 tps	1.4s	262K	$0.63	$2.39
144	112	Kimi K2 Fast	1073	±5	35K	6.4%	0.8%	365 tps	0.5s	131K	$1.00	$3.00
145	112	GPT-5 (Low)	1070	±14	690	3.5%	1.8%	75 tps	8.2s	400K	$1.25	$10.00
146	112	Kimi K2 0905 Turbo	1070	±6	7.5K	9.1%	0.7%	373 tps	0.5s	262K	$1.70	$6.50
147	112	gpt-oss-20b	1066	±6	7.7K	7.1%	0.5%	216 tps	0.5s	131K	$0.06	$0.26
148	144	Qwen Turbo	1064	±5	10K	6.0%	<0.1%	53 tps	1.1s	1M	$0.05	$0.20
149	112	Grok 4.20 Beta Non-reasoning	1063	±36	500	4.8%	1.1%	151 tps	0.6s	2M	$2.00	$6.00
150	119	OpenAI o1	1062	±6	9.9K	3.3%	4.2%	92 tps	5.5s	200K	$15.00	$60.00
151	119	DeepSeek V3.1 Terminus Thinking	1061	±9	2.9K	9.4%	5.9%	27 tps	1.8s	131K	$0.56	$1.68
152	119	OpenAI o1-pro	1061	±20	680	7.5%	5.2%	33 tps	72.8s	200K	$150.00	$600.00
153	119	Gemini 2.5 Flash Lite Thinking	1061	±4	9.8K	6.2%	1.0%	118 tps	4.4s	1M	$0.03	$0.13
154	151	GLM 4.5 FP8	1060	±18	610	8.3%	<0.1%	59 tps	1.2s	131K	$0.41	$1.65
155	151	Llama 3 8B Turbo	1059	±24	600	1.6%	<0.1%	97 tps	0.1s	8K	$0.12	$0.13
156	119	Seed 2.0 Lite (Medium)	1058	±20	525	3.7%	6.6%	33 tps	1.6s	256K	$0.25	$2.00
157	119	LongCat Flash Chat	1058	±12	2.7K	5.9%	0.8%	85 tps	0.9s	131K	$0.14	$0.68
158	151	OpenAI Codex Mini	1057	±5	9.8K	3.3%	<0.1%	46 tps	2.1s	200K	$1.50	$6.00
159	119	GPT-5.1 Codex Mini (Medium)	1057	±15	1.9K	4.9%	4.6%	69 tps	4.1s	400K	$0.25	$2.00
160	119	GPT-5.1 Codex Mini (High)	1054	±15	2.2K	3.9%	5.9%	70 tps	4.6s	400K	$0.25	$2.00
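
The Confidence Interval column gives a rough way to judge whether two VIBE Scores in the table are meaningfully different. The following is a minimal sketch, not the leaderboard's own method: it assumes the ± value is a symmetric half-width around the VIBE Score (the page does not define it), and `Entry`, `parse_row`, and `intervals_overlap` are hypothetical helper names. Overlapping intervals are only a heuristic, not a formal significance test.

```python
from typing import NamedTuple


class Entry(NamedTuple):
    name: str
    score: int       # "VIBE Score" column
    half_width: int  # "Confidence Interval" column, assumed to be a symmetric ± half-width


def parse_row(line: str) -> Entry:
    """Parse one tab-separated leaderboard row (hypothetical helper)."""
    cells = line.split("\t")
    # Column order: Rank, Overall, Name, VIBE Score, Confidence Interval, ...
    return Entry(
        name=cells[2],
        score=int(cells[3]),
        half_width=int(cells[4].lstrip("±")),
    )


def intervals_overlap(a: Entry, b: Entry) -> bool:
    """True if [score - half_width, score + half_width] of a and b intersect."""
    return abs(a.score - b.score) <= a.half_width + b.half_width


# Three rows copied from the table above, truncated after the Votes column.
rows = [
    "121\t90\tDeepSeek V3 0324\t1100\t±4\t15.1K",
    "135\t105\tGPT-4.1 nano\t1085\t±5\t17K",
    "149\t112\tGrok 4.20 Beta Non-reasoning\t1063\t±36\t500",
]
entries = [parse_row(r) for r in rows]

for other in entries[1:]:
    verdict = "overlaps" if intervals_overlap(entries[0], other) else "is separated from"
    print(f"{entries[0].name} {verdict} {other.name}")
```

Under that assumption, a model with few votes and a wide interval (here ±36 on 500 votes) can overlap a model ranked well above it, while two high-vote models 15 points apart do not overlap.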

