Leaderboard | Text

Last updated about 1 month ago

Rank	Overall	Name	VIBE Score	Confidence Interval	Votes	Downvote %	Abort %	Speed	Latency	Context	Cost (Input)	Cost (Output)
201	86	Nemotron 3 Nano (Thinking)	938	±14	1.3K	7.6%	2.0%	200 tps	0.5s	256K	$0	$0
202	139	GLM 4.6V	938	±11	2.5K	6.1%	6.4%	21 tps	1.8s	128K	$0.38	$0.90
203	148	OpenAI o4-mini-high	937	±6	6K	14.9%	1.9%	117 tps	15.9s	200K	$1.10	$4.40
204	175	OpenAI o3-mini-low	937	±4	6.1K	13.6%	0.7%	139 tps	1.5s	200K	$1.10	$4.40
205	177	Llama 3 70B Turbo	935	±9	1.9K	2.1%	<0.1%	31 tps	0.0s	8K	$0.73	$0.83
206	133	Qwen3 14B	933	±11	2.7K	17.1%	1.7%	109 tps	0.8s	41K	$0.04	$0.15
207	270	Arcee AI Virtuoso-Medium	932	±15	490	6.7%	<0.1%	3 tps	N/A	131K	$0.50	$0.80
208	121	Qwen3 32B Fast	932	±5	4.5K	12.9%	11.6%	30 tps	3.1s	41K	$0.10	$0.25
209	265	Qwen 2.5 VL 72B Instruct	929	±12	1.2K	7.9%	5.3%	25 tps	3.7s	128K	$1.01	$2.79
210	209	Qwen 2.5 14B Instruct	928	±13	910	11.7%	2.4%	40 tps	1.6s	1M	$0.40	$1.61
211	219	Grok 3 Mini Beta	926	±11	1.1K	1.3%	<0.1%	75 tps	0.5s	131K	$0.45	$2.25
212	133	DeepSeek V3.2 Speciale	924	±12	1.6K	6.3%	6.0%	43 tps	1.4s	131K	$0.84	$1.52
213	201	Devstral Small	924	±16	570	12.3%	2.4%	180 tps	0.6s	131K	$0.10	$0.30
214	211	Gemini 1.5 Pro	922	±8	1.8K	3.0%	<0.1%	15 tps	0.0s	2M	$0.78	$3.13
215	194	Magistral Small 2506	920	±15	2K	6.9%	1.6%	156 tps	0.5s	40K	$0.37	$1.10
216	179	Inception Mercury	919	±10	2.8K	11.5%	0.4%	257 tps	1.1s	32K	$0.25	$1.00
217	148	Qwen3 30B A3B Thinking 2507	919	±10	1.3K	3.6%	0.5%	124 tps	1.2s	131K	$0.16	$1.70
218	229	Magistral Medium 2509	918	±8	2.1K	11.3%	4.0%	58 tps	0.9s	131K	$2.00	$5.00
219	153	Qwen 2.5 32B Instruct	916	±9	1.9K	18.0%	2.5%	48 tps	1.0s	131K	$0.21	$0.25
220	179	Amazon Nova Pro 1.0	916	±16	2.1K	10.3%	0.9%	96 tps	0.7s	300K	$0.80	$1.70
221	186	Mistral Small 3.2 24B Instruct	915	±22	525	9.5%	1.9%	113 tps	1.1s	131K	$0.02	$0.08
222	214	Llama 3.3 70B Instruct Turbo	914	±24	640	11.7%	2.0%	78 tps	1.0s	131K	$0.88	$0.88
223	160	Llama 4 Scout	911	±6	8K	9.6%	0.6%	88 tps	5.1s	131K	$0.18	$0.46
224	179	Baichuan-M2-32B	911	±25	505	13.7%	<0.1%	32 tps	3.3s	131K	$0.07	$0.07
225	170	Kimi K2 0711	911	±8	3.2K	9.2%	1.6%	29 tps	1.3s	131K	$0.72	$2.60
226	170	Mistral Small 3.2 24B	911	±10	2K	12.4%	2.8%	141 tps	0.7s	33K	$0.02	$0.08
227	182	Fauna Fox	909	±11	2.5K	10.1%	<0.1%	194 tps	0.3s	128K	$0.04	$0.15
228	253	R1 1776	908	±17	870	14.3%	<0.1%	61 tps	1.0s	128K	$2.00	$8.00
229	214	OpenAI o3-mini-high	908	±14	1.1K	6.5%	2.4%	231 tps	10.5s	200K	$1.10	$4.40
230	246	DeepSeek-R1 Distill Llama 70B	907	±23	535	7.0%	3.6%	27 tps	1.6s	32K	$0.73	$0.95
231	186	Gemma 3n E4B	905	±10	2.6K	8.4%	2.0%	30 tps	0.5s	8K	$0.01	$0.02
232	201	ERNIE 4.5 VL 424B A47B	905	±12	805	7.5%	4.9%	36 tps	3.5s	123K	$0.42	$1.25
233	292	NVIDIA Llama 3.1 Nemotron Ultra 253B v1	905	±14	785	11.3%	<0.1%	40 tps	0.8s	128K	$0.30	$0.90
234	209	Llama 3.3 Swallow 70B Instruct	904	±8	1.6K	15.2%	1.4%	153 tps	1.3s	131K	$0.13	$0.39
235	186	Grok 3 Mini	903	±5	6K	12.8%	1.2%	43 tps	0.5s	131K	$0.30	$0.50
236	157	Qwen3 Next 80B A3B Thinking	903	±7	4.9K	11.2%	0.6%	175 tps	1.3s	256K	$0.21	$2.26
237	161	Qwen3 8B	902	±12	2K	17.8%	2.4%	61 tps	1.4s	41K	$0.02	$0.07
238	186	Gemma 3 27B	902	±18	615	13.4%	1.8%	35 tps	1.1s	66K	$0.06	$0.10
239	186	Grok 3 Mini Fast	897	±7	5.2K	14.9%	1.6%	44 tps	0.5s	131K	$0.60	$4.00
240	277	Dobby Unhinged Llama 3.3 70B	896	±22	775	1.9%	<0.1%	41 tps	0.4s	128K	$0.90	$0.90
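Many adjacent rows above have scores that differ by less than their confidence intervals, so the rank order alone can overstate the gap between two models. The sketch below parses a few rows copied from the table and checks whether two score ranges overlap. It assumes the ± value is a symmetric half-width around the VIBE Score; the page does not state how the interval is computed, so treat the overlap check as a rough heuristic rather than a significance test. The helper names `parse_row` and `intervals_overlap` are illustrative, not part of any leaderboard API.

```python
def parse_row(line):
    """Split one tab-separated leaderboard row into the fields used below."""
    cells = line.split("\t")
    return {
        "rank": int(cells[0]),
        "name": cells[2],
        "score": int(cells[3]),
        # Assumption: the "±N" column is a symmetric half-width around the score.
        "ci": int(cells[4].lstrip("±")),
    }


def intervals_overlap(a, b):
    """True when the ranges [score - ci, score + ci] of a and b share a value."""
    return abs(a["score"] - b["score"]) <= a["ci"] + b["ci"]


# Three rows copied verbatim from the table above.
rows = [
    "201\t86\tNemotron 3 Nano (Thinking)\t938\t±14\t1.3K\t7.6%\t2.0%\t200 tps\t0.5s\t256K\t$0\t$0",
    "204\t175\tOpenAI o3-mini-low\t937\t±4\t6.1K\t13.6%\t0.7%\t139 tps\t1.5s\t200K\t$1.10\t$4.40",
    "235\t186\tGrok 3 Mini\t903\t±5\t6K\t12.8%\t1.2%\t43 tps\t0.5s\t131K\t$0.30\t$0.50",
]
models = [parse_row(r) for r in rows]

for i in range(len(models)):
    for j in range(i + 1, len(models)):
        a, b = models[i], models[j]
        verdict = "overlap" if intervals_overlap(a, b) else "do not overlap"
        print(f"{a['name']} vs {b['name']}: intervals {verdict}")
```

Under that assumption, the rows at ranks 201 and 204 are not separated by their intervals, while rank 235 sits clearly below both.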

Page 6 of 8 (312 models in total)