Leaderboard | Text

Last updated about 1 month ago

Rank	Overall	Name	VIBE Score	Confidence Interval	Votes	Downvote %	Abort %	Speed	Latency	Context	Cost (Input)	Cost (Output)
41	52	GPT-5	1023	±14	1.1K	1.7%	3.1%	78 tps	23.1s	400K	$1.25	$9.67
42	48	Claude Sonnet 4 (Thinking)	1031	±20	775	1.3%	1.5%	52 tps	1.5s	200K	$3.00	$13.67
43	52	Grok 4 Fast Non-Reasoning	1031	±29	600	0.8%	1.5%	93 tps	0.6s	2M	$0.27	$0.67
44	44	DeepSeek V3.1 Terminus Chat	1034	±22	590	1.7%	3.4%	27 tps	1.5s	131K	$0.86	$1.80
45	44	Grok 4.1 Fast Reasoning	1036	±25	1.2K	1.2%	1.5%	58 tps	7.3s	2M	$0.20	$0.50
46	17	GPT-5.2 (High)	1036	±20	1.3K	1.1%	6.7%	18 tps	16.3s	400K	$1.75	$14.00
47	68	Qwen Plus (Aug'24)	1046	±18	1.5K	1.3%	1.4%	53 tps	1.3s	30K	$0.40	$1.20
48	40	Qwen3 235B A22B Instruct 2507	1050	±17	795	1.2%	6.8%	13 tps	1.9s	262K	$0.13	$0.52
49	56	DeepSeek V3.2 Thinking	1050	±23	695	1.4%	9.0%	30 tps	2.6s	131K	$0.28	$0.42
50	93	DeepSeek V3 0324 Turbo	1054	±16	1.5K	2.0%	6.3%	12 tps	2.4s	164K	$0.73	$1.79
51	93	Qwen Max	1056	±11	1.6K	2.4%	1.5%	49 tps	1.5s	33K	$1.60	$6.40
52	42	Qwen3 Max Instruct Preview	1059	±19	980	1.0%	1.1%	31 tps	1.7s	256K	$1.43	$6.61
53	106	Claude Sonnet 3.5 v2	1069	±23	630	1.6%	<0.1%	46 tps	1.4s	200K	$3.00	$15.00
54	62	GPT-5.1 Instant	1070	±21	705	1.4%	1.3%	50 tps	1.9s	400K	$1.25	$10.00
55	44	Gemini 2.5 Pro	1076	±12	1.9K	1.3%	2.3%	45 tps	2.6s	1M	$1.25	$10.00
56	10	Claude Sonnet 4.5 (Thinking)	1080	±15	1K	1.0%	1.9%	44 tps	1.1s	200K	$3.00	$15.00
57	40	DeepSeek V3.2	1083	±25	640	0.8%	1.4%	83 tps	5.1s	131K	$0.43	$1.09
58	17	Claude Opus 4.5	1086	±21	625	0.8%	1.5%	45 tps	1.5s	200K	$5.00	$25.00
59	32	Gemini 2.5 Pro High	1109	±20	1.2K	1.3%	1.5%	48 tps	2.3s	1M	$1.25	$10.00
60	81	GPT-4o	1121	±14	895	1.6%	1.0%	49 tps	2.4s	128K	$3.71	$12.57
61	106	DeepSeek V3 0324	1125	±15	1.4K	1.8%	5.8%	12 tps	2.7s	164K	$0.38	$0.93
62	33	Qwen3 Next 80B A3B Instruct	1135	±25	670	0.7%	0.6%	84 tps	1.1s	256K	$0.20	$1.42
63	48	gpt-oss-120b	1136	±19	775	1.3%	0.7%	213 tps	0.5s	131K	$0.11	$0.50
64	17	Gemini 3 Flash Preview	1137	±20	615	0.8%	1.3%	138 tps	1.4s	1M	$0.50	$3.00
65	33	Qwen3 30B A3B Instruct 2507	1139	±16	780	1.3%	1.2%	55 tps	1.3s	131K	$0.13	$0.72
66	37	Claude Sonnet 4.5	1139	±17	1.3K	1.1%	1.4%	41 tps	1.3s	200K	$1.80	$9.00
67	8	GPT-5.1 (High)	1145	±19	1.1K	3.6%	3.2%	76 tps	6.9s	400K	$1.25	$10.00
68	52	Claude Haiku 4.5	1169	±17	960	1.0%	1.1%	100 tps	0.9s	200K	$1.00	$5.00
69	26	Grok 4.1 Fast Non-Reasoning	1183	±28	980	1.0%	0.9%	101 tps	0.5s	2M	$0.20	$0.50
70	33	Kimi K2.5	1186	±38	760	0.7%	6.5%	33 tps	1.7s	262K	$0.34	$2.57
71	7	Claude Opus 4.5 (Thinking)	1188	±22	1.1K	1.8%	1.8%	49 tps	1.4s	200K	$5.00	$25.00
72	14	Gemini 3 Flash Preview Thinking	1195	±20	950	1.6%	1.6%	3 tps	6.2s	1M	$0.50	$3.00
73	26	Claude Haiku 4.5 (Extended Thinking)	1200	±17	610	0.8%	1.4%	115 tps	0.7s	200K	$1.00	$5.00
74	16	GPT-5.2	1203	±26	680	0.7%	4.1%	18 tps	2.7s	400K	$1.75	$14.00
75	14	Gemini 3 Pro (Low)	1211	±21	1K	0.9%	2.4%	51 tps	3.5s	1M	$2.00	$12.00
76	10	GPT-5.2 Instant	1214	±21	1.3K	0.4%	1.7%	52 tps	2.0s	400K	$1.75	$14.00
77	10	Gemini 3 Pro	1235	±13	1.6K	1.2%	2.1%	50 tps	3.6s	1M	$2.00	$12.00
78	8	GPT-5.1	1239	±21	960	0.5%	2.3%	71 tps	1.4s	400K	$1.42	$11.33
79	22	GPT-5 Chat	1313	±8	2.4K	0.6%	1.3%	95 tps	0.9s	400K	$1.25	$10.00

Page 2 of 2 (79 models in total)