Leaderboard | Text

Last updated about 1 month ago

Rank	Overall	Name	VIBE Score	Confidence Interval	Votes	Downvote %	Abort %	Speed	Latency	Context	Cost (Input)	Cost (Output)
41	148	OpenAI o4-mini-high	915	±10	4.7K	1.5%	1.9%	117 tps	15.9s	200K	$1.10	$4.40
42	133	DeepSeek V3.2 Speciale	922	±25	780	6.6%	6.0%	43 tps	1.4s	131K	$0.84	$1.52
43	111	Grok 3 Fast	922	±12	530	0.9%	1.7%	52 tps	2.4s	131K	$5.00	$25.00
44	161	Mistral Small 3.1	924	±36	615	2.4%	7.4%	13 tps	2.6s	32K	$0.17	$0.28
45	106	DeepSeek V3.1 Terminus Thinking	927	±17	840	4.5%	5.9%	27 tps	1.8s	131K	$0.56	$1.68
46	170	Mistral Small 3.2 24B	929	±14	985	2.5%	2.8%	141 tps	0.7s	33K	$0.02	$0.08
47	170	Kimi K2 0711	932	±12	2.4K	1.7%	1.6%	29 tps	1.3s	131K	$0.72	$2.60
48	179	GLM 4.7 Flash	932	±26	690	2.1%	5.8%	61 tps	2.8s	128K	$0.07	$0.39
49	165	Qwen3 4B	940	±14	1.7K	5.2%	1.9%	94 tps	1.5s	128K	$0.01	$0.01
50	165	Pixtral Large	941	±18	940	3.6%	2.5%	57 tps	1.3s	128K	$1.50	$4.50
51	153	OpenAI o1	946	±11	3.6K	1.0%	4.2%	92 tps	5.5s	200K	$15.00	$60.00
52	143	Gemini 2.0 Flash Lite	948	±7	3.5K	1.7%	<0.1%	42 tps	0.5s	1M	$0.08	$0.30
53	86	Amazon Nova 2 Lite	948	±21	970	7.6%	1.0%	137 tps	0.6s	300K	$0.35	$2.95
54	157	GPT-5 Nano	955	±18	1.2K	4.1%	3.2%	113 tps	20.9s	400K	$0.05	$0.40
55	133	GPT-4.1 nano	958	±8	4K	1.4%	0.6%	175 tps	0.5s	1M	$0.10	$0.40
56	129	Command A	959	±8	6.2K	1.2%	2.2%	42 tps	0.8s	256K	$2.00	$7.33
57	71	DeepSeek V3.1	962	±20	750	3.2%	0.8%	197 tps	0.4s	164K	$0.55	$1.60
58	143	Gemini 2.0 Flash	965	±12	1.8K	1.3%	<0.1%	76 tps	0.5s	1M	$0.14	$0.56
59	148	OpenAI o3	968	±12	1.6K	1.6%	0.9%	85 tps	6.8s	128K	$7.33	$29.33
60	139	OpenAI o4-mini	971	±8	2.2K	2.6%	1.4%	97 tps	7.0s	128K	$1.10	$4.40
61	65	Mistral Large 3	974	±20	1.2K	5.5%	2.1%	51 tps	1.0s	256K	$0.50	$1.50
62	126	DeepSeek V3	975	±10	5.5K	0.5%	0.9%	69 tps	1.1s	64K	$0.59	$1.49
63	129	DeepSeek V3.1 Thinking	984	±13	1.4K	3.7%	7.1%	18 tps	1.8s	131K	$0.23	$0.75
64	126	Qwen3 30B A3B	986	±16	1.9K	3.4%	5.1%	163 tps	1.0s	41K	$0.06	$0.21
65	113	GLM 4.5	986	±15	1.5K	2.0%	3.7%	46 tps	1.4s	131K	$0.43	$1.63
66	161	Qwen3 8B	991	±11	1.4K	2.5%	2.4%	61 tps	1.4s	41K	$0.02	$0.07
67	71	Seed 1.8 251228	997	±13	2.4K	1.3%	3.7%	41 tps	2.1s	256K	$0.25	$2.00
68	86	Qwen3 235B A22B	998	±18	1.4K	3.2%	5.3%	71 tps	0.9s	41K	$0.23	$0.63
69	118	GPT-4.1 mini	999	±10	5.1K	1.4%	1.1%	67 tps	0.9s	1M	$0.34	$1.60
70	65	GLM 4.6	1001	±15	1.3K	4.4%	5.4%	39 tps	1.5s	200K	$0.42	$1.66
71	153	Qwen 2.5 32B Instruct	1004	±14	1.2K	1.7%	2.5%	48 tps	1.0s	131K	$0.21	$0.25
72	86	Claude Sonnet 4	1013	±7	10.4K	1.2%	1.8%	49 tps	1.3s	200K	$3.00	$15.00
73	119	ERNIE 4.5 300B A47B	1014	±11	4K	1.1%	4.7%	23 tps	2.3s	123K	$0.28	$1.10
74	121	QwQ 32B	1015	±7	4.6K	1.4%	5.4%	41 tps	2.1s	16K	$0.43	$0.56
75	124	Kimi K2 0905 Turbo	1017	±12	2.1K	2.3%	0.7%	373 tps	0.5s	262K	$1.70	$6.50
76	71	Qwen3.5 397B A17B	1021	±22	910	1.1%	4.3%	57 tps	1.4s	256K	$0.52	$3.00
77	129	Qwen3 Max Thinking	1022	±12	1.3K	1.1%	13.5%	32 tps	2.3s	256K	$1.20	$6.00
78	62	MiniMax M2	1027	±9	2.5K	5.2%	2.2%	39 tps	2.3s	205K	$0.21	$0.85
79	126	Qwen3 VL 235B A22B Thinking	1027	±13	935	4.1%	4.3%	47 tps	3.0s	127K	$0.47	$3.31
80	48	Claude Sonnet 4 (Thinking)	1028	±15	3.7K	2.9%	1.5%	52 tps	1.5s	200K	$3.00	$13.67
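
When comparing two rows, the Confidence Interval column matters as much as the VIBE Score itself. The sketch below parses rows copied from the table and checks whether two models' intervals overlap. It assumes the "±" value is a symmetric half-width around the score; the page does not say how the intervals are computed, so treat an overlap only as a rough hint that the two scores may not be clearly separated. The names `Entry`, `parse_row`, and `intervals_overlap` are hypothetical helpers, not part of any Yupp API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entry:
    name: str
    score: float
    half_width: float  # the "±" value, assumed to be a symmetric half-width

    def interval(self) -> tuple:
        """Return (low, high) under the symmetric-interval assumption."""
        return (self.score - self.half_width, self.score + self.half_width)


def parse_row(row: str) -> Entry:
    """Parse one tab-separated leaderboard row.

    Column order follows the table header:
    Rank, Overall, Name, VIBE Score, Confidence Interval, ...
    """
    cells = row.split("\t")
    return Entry(
        name=cells[2],
        score=float(cells[3]),
        half_width=float(cells[4].lstrip("±")),
    )


def intervals_overlap(a: Entry, b: Entry) -> bool:
    """True if the two score intervals share at least one point."""
    lo_a, hi_a = a.interval()
    lo_b, hi_b = b.interval()
    return lo_a <= hi_b and lo_b <= hi_a


# Three rows copied from the table above (tabs written as \t).
o4_mini_high = parse_row(
    "41\t148\tOpenAI o4-mini-high\t915\t±10\t4.7K\t1.5%\t1.9%\t117 tps\t15.9s\t200K\t$1.10\t$4.40"
)
sonnet = parse_row(
    "72\t86\tClaude Sonnet 4\t1013\t±7\t10.4K\t1.2%\t1.8%\t49 tps\t1.3s\t200K\t$3.00\t$15.00"
)
sonnet_thinking = parse_row(
    "80\t48\tClaude Sonnet 4 (Thinking)\t1028\t±15\t3.7K\t2.9%\t1.5%\t52 tps\t1.5s\t200K\t$3.00\t$13.67"
)

# 1013 ± 7 spans 1006..1020 and 1028 ± 15 spans 1013..1043, so these overlap.
print(intervals_overlap(sonnet, sonnet_thinking))
# 915 ± 10 spans 905..925, which lies entirely below 1006..1020.
print(intervals_overlap(o4_mini_high, sonnet))
```

Under this assumption, Claude Sonnet 4 and Claude Sonnet 4 (Thinking) have overlapping intervals despite the 15-point score gap, while OpenAI o4-mini-high and Claude Sonnet 4 do not.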

Page 2 of 5 (173 models in total)