Leaderboard | Text

Last updated about 1 month ago

Rank	Overall	Name	VIBE Score	Confidence Interval	Votes	Downvote %	Abort %	Speed	Latency	Context	Cost (Input)	Cost (Output)
81	93	DeepSeek V3 0324 Turbo	1038	±9	2.2K	1.8%	6.3%	12 tps	2.4s	164K	$0.73	$1.79
82	113	GLM 4.5	1038	±12	915	3.2%	3.7%	46 tps	1.4s	131K	$0.43	$1.63
83	37	Qwen3 Omni 30B A3B Thinking	1040	±20	750	2.0%	3.7%	67 tps	1.2s	66K	$0.97	$1.79
84	101	Gemini 2.5 Flash Lite	1042	±6	7.8K	4.3%	1.3%	210 tps	0.7s	1M	$0.10	$0.40
85	86	Amazon Nova 2 Lite	1042	±23	690	2.1%	1.0%	137 tps	0.6s	300K	$0.35	$2.95
86	113	Mistral Medium	1043	±11	1.1K	3.1%	1.8%	48 tps	0.6s	33K	$1.48	$4.55
87	118	GPT-4.1 mini	1045	±8	3.4K	2.5%	1.1%	67 tps	0.9s	1M	$0.34	$1.60
88	48	Grok 4 Fast Reasoning	1049	±10	2.3K	3.6%	2.1%	102 tps	3.1s	2M	$0.30	$0.75
89	106	Claude Sonnet 3.5 v2	1055	±22	770	3.8%	<0.1%	46 tps	1.4s	200K	$3.00	$15.00
90	40	DeepSeek V3.2	1056	±15	1.4K	1.4%	1.4%	83 tps	5.1s	131K	$0.43	$1.09
91	81	Qwen3.5 27B	1056	±17	665	2.9%	3.7%	55 tps	2.6s	256K	$0.30	$2.40
92	44	DeepSeek V3.1 Terminus Chat	1056	±13	955	2.1%	3.4%	27 tps	1.5s	131K	$0.86	$1.80
93	26	Grok 4.1 Fast Non-Reasoning	1058	±19	2K	4.1%	0.9%	101 tps	0.5s	2M	$0.20	$0.50
94	71	Gemini 3.1 Flash Lite Preview	1060	±22	1.2K	3.3%	1.0%	8 tps	1.2s	1M	$0.25	$1.50
95	52	Qwen3.5 122B A17B	1063	±14	980	1.5%	1.5%	82 tps	1.4s	256K	$0.40	$3.20
96	71	Qwen3.5 397B A17B	1067	±15	1.3K	2.2%	4.3%	57 tps	1.4s	256K	$0.52	$3.00
97	48	Step 3.5 Flash	1067	±20	630	2.3%	2.2%	109 tps	0.6s	256K	$0.05	$0.15
98	56	DeepSeek V3.1 Turbo	1070	±12	1.3K	2.6%	0.9%	173 tps	1.3s	164K	$2.00	$3.75
99	113	Gemini 2.5 Flash Lite Thinking	1071	±10	2.3K	3.2%	1.0%	118 tps	4.4s	1M	$0.03	$0.13
100	68	GLM 4.7	1071	±12	1.9K	2.1%	5.8%	40 tps	1.5s	200K	$0.77	$1.73
101	65	DeepSeek V3.2 Exp Chat	1072	±12	755	2.6%	2.6%	29 tps	1.5s	131K	$0.27	$0.39
102	44	Kimi K2 Thinking Turbo	1072	±17	1.3K	2.2%	2.0%	75 tps	1.4s	262K	$1.15	$8.00
103	48	gpt-oss-120b	1074	±6	3K	2.6%	0.7%	213 tps	0.5s	131K	$0.11	$0.50
104	33	Qwen3 Next 80B A3B Instruct	1083	±16	1.5K	2.6%	0.6%	84 tps	1.1s	256K	$0.20	$1.42
105	33	Qwen3 30B A3B Instruct 2507	1084	±8	2.3K	3.2%	1.2%	55 tps	1.3s	131K	$0.13	$0.72
106	71	DeepSeek V3.1	1085	±14	690	3.5%	0.8%	197 tps	0.4s	164K	$0.55	$1.60
107	95	Gemini 2.5 Flash Lite Thinking Preview 0925	1086	±8	3.4K	2.7%	1.5%	152 tps	3.0s	1M	$0.10	$0.40
108	33	Kimi K2.5	1090	±13	4.3K	2.1%	6.5%	33 tps	1.7s	262K	$0.34	$2.57
109	37	Kimi K2.5 Instant	1093	±12	1.4K	2.7%	2.9%	32 tps	3.0s	262K	$0.50	$3.00
110	68	Grok 4	1102	±7	7.8K	3.3%	3.9%	29 tps	11.1s	256K	$3.00	$15.00
111	42	Qwen3 Max Instruct Preview	1103	±13	2.2K	2.0%	1.1%	31 tps	1.7s	256K	$1.43	$6.61
112	95	Gemini 2.5 Flash	1104	±7	9.8K	2.7%	1.3%	2 tps	3.7s	1M	$0.30	$2.50
113	29	Nova Experimental Chat 12-10	1110	±25	710	1.4%	2.4%	84 tps	12.9s	98K	$0	$0
114	81	GPT-4o	1113	±8	2.2K	3.6%	1.0%	49 tps	2.4s	128K	$3.71	$12.57
115	52	GPT-5	1115	±8	5.4K	3.7%	3.1%	78 tps	23.1s	400K	$1.25	$9.67
116	26	GPT-5 (High)	1115	±7	3.7K	3.6%	4.5%	81 tps	35.9s	400K	$1.25	$10.00
117	48	Claude Sonnet 4 (Thinking)	1116	±7	5.3K	4.2%	1.5%	52 tps	1.5s	200K	$3.00	$13.67
118	60	Gemini 2.5 Flash Preview 0925	1124	±9	3.4K	3.4%	1.2%	5 tps	0.9s	1M	$0.13	$0.97
119	40	Qwen3 235B A22B Instruct 2507	1126	±8	2.1K	2.5%	6.8%	13 tps	1.9s	262K	$0.13	$0.52
120	22	GLM 5	1132	±12	1.7K	1.4%	3.4%	36 tps	2.7s	200K	$0.72	$2.55
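
The Confidence Interval column can be read as a band around each VIBE Score. The sketch below uses a few rows copied from the table and checks whether two models' bands overlap. It assumes the ± value is a symmetric half-width; the page does not state the confidence level or how the interval is computed, so overlap here is only a rough reading aid, not a significance test. The helper names `interval` and `intervals_overlap` are hypothetical.

```python
# Rows copied from the table above: (name, VIBE score, ± half-width).
rows = [
    ("DeepSeek V3 0324 Turbo", 1038, 9),
    ("GPT-5", 1115, 8),
    ("Claude Sonnet 4 (Thinking)", 1116, 7),
    ("GLM 5", 1132, 12),
]


def interval(score, half_width):
    """Return (low, high), assuming the ± value is a symmetric half-width."""
    return (score - half_width, score + half_width)


def intervals_overlap(a, b):
    """True if the score bands of rows a and b share at least one point."""
    lo_a, hi_a = interval(a[1], a[2])
    lo_b, hi_b = interval(b[1], b[2])
    return lo_a <= hi_b and lo_b <= hi_a


if __name__ == "__main__":
    # Compare every pair once and report whether their bands overlap.
    for i, a in enumerate(rows):
        for b in rows[i + 1:]:
            verdict = "overlap" if intervals_overlap(a, b) else "do not overlap"
            print(f"{a[0]} vs {b[0]}: bands {verdict}")
```

Under this reading, models whose bands overlap should not be treated as clearly separated by their rank alone.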

Page 3 of 4 (154 models in total)