Matt Shumer February 2026

The gap between the AI you see and the real AI is growing wider

A widening gap separates the AI capabilities the tech industry experiences daily from what the general public perceives. We use real data to show why this gap is dangerous.

Perception Gap

Same Era, Completely Different Realities

Most people use the free version of AI. Matt Shumer describes this as "judging the smartphone era with a flip phone." Free AI is more than a year behind the latest paid models.

TECH INDUSTRY

Latest paid model users

"When you ask AI to build an entire app, it tests itself, iterates, and completes it. It shows not just coding ability, but judgment."
GENERAL PUBLIC

Free version users

"It gives useful answers sometimes, but gets a lot wrong too. Isn't it just a chatbot that can't handle complex tasks?"

AI Capability Awareness by Group

How accurately each group understands AI's current level

AI Researchers: 92%
Tech Workers: 72%
Tech-Adjacent: 35%
General Public: 12%

Exponential Growth

AI progress follows an exponential curve — and it's accelerating

According to METR benchmarks, the length of tasks AI can complete autonomously doubles every ~6.5 months (197 days) overall, and the pace has accelerated to every ~89 days (~3 months) since 2024. GPT-5.3-Codex and Claude Opus 4.6, both released on Feb 5, 2026, suggest this acceleration is intensifying.

197 days: capability doubling time (overall average)
89 days: accelerated doubling time since 2024
6.6 hrs: METR official best (GPT-5.2)
~113x: capability increase vs GPT-4 over 2.7 years
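
The headline figures above can be cross-checked with a little arithmetic. The sketch below is illustrative only: it uses the ~3.5-minute GPT-4 horizon and the 394-minute GPT-5.2 horizon quoted elsewhere on this page, and the 30.44-day average month and the function names are assumptions introduced here for the conversion.

```python
import math

# Figures quoted on this page (METR 50% time horizons).
GPT4_HORIZON_MIN = 3.5     # GPT-4, in minutes
GPT52_HORIZON_MIN = 394.0  # GPT-5.2, in minutes

# Assumption for unit conversion only: average calendar month length.
DAYS_PER_MONTH = 30.44


def growth_factor(start_min: float, end_min: float) -> float:
    """How many times longer the later horizon is than the earlier one."""
    return end_min / start_min


def doublings(factor: float) -> float:
    """Number of doublings needed to produce the given growth factor."""
    return math.log2(factor)


factor = growth_factor(GPT4_HORIZON_MIN, GPT52_HORIZON_MIN)
print(f"capability increase: ~{factor:.0f}x")
print(f"number of doublings: {doublings(factor):.1f}")
print(f"197 days = {197 / DAYS_PER_MONTH:.1f} months")
print(f"89 days  = {89 / DAYS_PER_MONTH:.1f} months")
print(f"394 min  = {GPT52_HORIZON_MIN / 60:.1f} hours")
```

The ratio of the two horizons comes out at about 113, which corresponds to just under seven doublings.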

METR Benchmark: Autonomous Task Duration by Model

Task duration AI can complete with 50% probability without human expert help (50% time horizon)

Source: METR Time Horizon 1.1 (updated 2026.01.29). GPT-5.3-Codex and Opus 4.6 were released after this update, so no official METR evaluation has been published for them yet. Applying the 89-day doubling trend puts the latest models' time horizons at an estimated 10+ hours.
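
The 10+ hour estimate is an extrapolation, and it can be sketched as follows. This is a minimal illustration that assumes clean exponential growth from GPT-5.2's 6.6-hour horizon; the function names are hypothetical, and because the exact number of days between the two releases is not stated here, the code solves for how many days are needed to reach 10 hours rather than assuming one.

```python
import math


def extrapolate_horizon(base_hours: float, days_elapsed: float, doubling_days: float) -> float:
    """Project a time horizon forward, assuming pure exponential growth."""
    return base_hours * 2 ** (days_elapsed / doubling_days)


def days_to_reach(base_hours: float, target_hours: float, doubling_days: float) -> float:
    """Days needed to grow from base_hours to target_hours at a given doubling time."""
    return doubling_days * math.log2(target_hours / base_hours)


BASE_HOURS = 6.6  # GPT-5.2, the METR official best quoted on this page

for doubling in (89, 197):
    needed = days_to_reach(BASE_HOURS, 10.0, doubling)
    print(f"{doubling}-day doubling: 10 hours reached after ~{needed:.0f} days")
```

Under the 89-day trend the 10-hour mark arrives roughly 53 days after the GPT-5.2 measurement. Under the 197-day overall trend it would take roughly 118 days, so the estimate depends heavily on which trend line is applied.
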
2026. 2. 5

Two models launched on the same day

OpenAI's GPT-5.3-Codex and Anthropic's Claude Opus 4.6 were released on the same day. GPT-5.3-Codex was announced as "the first model to contribute to building itself," while Opus 4.6 nearly doubled its predecessor's score on ARC-AGI-2. Official METR evaluations haven't been published yet, but benchmark improvements show the curve steepening.

Benchmark | GPT-5.2 | GPT-5.3-Codex (new) | Opus 4.5 | Opus 4.6 (new)
SWE-bench Verified | 80.0% | - | 80.9% | 80.8%
SWE-Bench Pro | 56.4% | 56.8% | - | -
Terminal-Bench 2.0 | 64.0% | 77.3% | 59.8% | 65.4%
OSWorld | 38.2% | 64.7% | 66.3% | 72.7%
ARC-AGI-2 | - | - | 37.6% | 68.8%
GPQA Diamond | - | - | 87.0% | 91.3%
HLE (with tools) | - | - | 43.4% | 53.1%
GDPval-AA Elo | ~1462 | - | 1416 | 1606
BrowseComp | - | - | 67.8% | 84.0%

Source: OpenAI, Anthropic official announcements (2026.02.05). GPT-5.3-Codex: 25% faster + 400K context. Opus 4.6: 1M context + adaptive thinking.

Feb 5 Models vs Previous Gen — Key Benchmark Comparison

How much they improved over the previous generation on the same benchmarks
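
The generation-over-generation gains can be recomputed from the published scores. The sketch below is illustrative: it uses only the rows where the previous-generation and new scores are unambiguous on this page (Terminal-Bench 2.0, OSWorld, and ARC-AGI-2), and the `improvement` helper is a hypothetical name introduced here.

```python
# (benchmark, model pair) -> (previous-generation score, Feb 5 model score), in percent.
SCORES = {
    ("Terminal-Bench 2.0", "GPT-5.2 -> GPT-5.3-Codex"): (64.0, 77.3),
    ("Terminal-Bench 2.0", "Opus 4.5 -> Opus 4.6"): (59.8, 65.4),
    ("OSWorld", "GPT-5.2 -> GPT-5.3-Codex"): (38.2, 64.7),
    ("OSWorld", "Opus 4.5 -> Opus 4.6"): (66.3, 72.7),
    ("ARC-AGI-2", "Opus 4.5 -> Opus 4.6"): (37.6, 68.8),
}


def improvement(prev: float, new: float) -> tuple[float, float]:
    """Return (gain in percentage points, relative gain in percent)."""
    points = round(new - prev, 1)
    relative = round((new - prev) / prev * 100, 1)
    return points, relative


for (bench, pair), (prev, new) in SCORES.items():
    points, relative = improvement(prev, new)
    print(f"{bench:<20} {pair:<26} +{points} pp ({relative:+}% relative)")
```

Reporting both numbers matters because a gain in percentage points and a relative gain can tell different stories: the ARC-AGI-2 jump is 31.2 points, which is an increase of about 83% over the earlier score.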

Exponential Growth in AI Autonomous Task Duration

50% time horizon change by model release date (log scale). The curve has steepened since 2024.

Source: METR Time Horizon 1.1 / Epoch AI. Gray dashed line: overall trend (197-day doubling). Red dashed line: post-2024 acceleration (89-day doubling).

Timeline

What Happened in 4 Years

From AI that couldn't do basic arithmetic to AI that autonomously completes complex, expert-level tasks.

2022
ChatGPT launches. The world is amazed, but it frequently gets basic arithmetic wrong. Autonomous task time: unmeasurable.
2023. 3
GPT-4 launches. Passes the US bar exam. METR autonomous task time: ~3.5 minutes.
2024. 10
Claude 3.5 Sonnet. Can write complete software. Autonomous task time grows to ~20 minutes.
H1 2025
Claude 3.7 Sonnet: ~1 hour. o3: ~1.6 hours. Top engineers begin delegating most coding to AI.
H2 2025
Gemini 3 Pro: ~4 hours. Claude Opus 4.5: ~5.3 hours. MathArena Apex: 1% to 23%, a more than 20x leap.
2025. 12
GPT-5.2 — ~6.6 hours (394 min) on METR. All-time record. "Everything before feels like a different era."
2026. 2. 5 — Present
GPT-5.3-Codex and Claude Opus 4.6 launch on the same day. GPT-5.3-Codex gains 13.3 percentage points on Terminal-Bench 2.0 and is announced as "the first model to contribute to building itself." Opus 4.6 leaps from 37.6% to 68.8% on ARC-AGI-2 in one generation and offers a 1M-token context. The post-2024 doubling time stands at ~89 days, with official METR evaluations of both models still pending.
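
The acceleration in this timeline can be roughly checked by computing the doubling time implied by pairs of entries. The sketch below is an approximation under stated assumptions: it uses only three of the timeline's data points, treats the listed months as whole-month dates, and assumes a 30.44-day average month. The names are hypothetical, and this two-point calculation is not METR's fitting method.

```python
import math

# (model, year, month, 50% time horizon in minutes), taken from the timeline.
POINTS = [
    ("GPT-4", 2023, 3, 3.5),
    ("Claude 3.5 Sonnet", 2024, 10, 20.0),
    ("GPT-5.2", 2025, 12, 394.0),
]

DAYS_PER_MONTH = 30.44  # assumption: average calendar month length


def doubling_days(earlier: tuple, later: tuple) -> float:
    """Doubling time in days implied by two (model, year, month, minutes) points."""
    months = (later[1] - earlier[1]) * 12 + (later[2] - earlier[2])
    n_doublings = math.log2(later[3] / earlier[3])
    return months * DAYS_PER_MONTH / n_doublings


for earlier, later in zip(POINTS, POINTS[1:]):
    days = doubling_days(earlier, later)
    print(f"{earlier[0]} -> {later[0]}: doubling every ~{days:.0f} days")
```

With these inputs the first interval implies a doubling time of roughly 230 days and the second roughly 99 days. That matches the direction of the 197-day and 89-day trend lines without reproducing them exactly, which is expected given the coarse dates and the small number of points.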

The Dangerous Gap

Why the Growing Gap Is Dangerous

Actual AI capability is rising exponentially, but public perception barely moves. Society faces shocks it's not prepared for.

Actual AI Capability vs. Public Perception

The wider the red area, the greater the societal shock

WARNING

"Within the next 1–5 years, 50% of entry-level office jobs could be eliminated. We are only 1–2 years away from AI being able to build models that are fundamentally far superior to the current generation."

— Dario Amodei, CEO of Anthropic, 2025

Impact

Jobs Already Under Threat

Customer Service
95%
With complex multi-step problem solving now possible, AI is replacing not just simple inquiries but expert consulting roles.
Software Development
90%
AI auto-generates hundreds of thousands of lines of code. Even top engineers delegate most coding to AI. The developer's role is shifting from "writing code" to "directing AI."
Legal / Law
85%
AI performs contract review, case analysis, and brief drafting at or above human lawyer level. Law firms are already reducing junior hires.
Finance / Analysis
80%
Financial modeling, data analysis, and report writing are being automated. AI generates investment analysis reports in minutes.
Content / Marketing
75%
Ad copy, blog posts, social media content, and translation — writing-based work is being automated at scale.
Medical / Diagnostics
70%
AI has achieved specialist-level accuracy in image reading and initial diagnosis. Radiology and pathology are especially impacted.

What You Can Do

What You Can Do Now

Recognizing this gap already puts you one step ahead.

1. Try the latest AI yourself
A $20/month subscription can close a 1+ year technology gap.

2. Apply it to real work
Integrate AI into daily tasks like writing, analysis, and coding.

3. Invest 1 hour every day
Spend one hour a day experimenting with AI tools. In a month, you'll be a completely different person.

4. Build adaptability
The ability to learn, rather than any specific skill, becomes your most valuable asset.

5. Spread the word
The fastest way to close the gap is to share this information.