Study finds AI agents complete less than 3% of real freelance tasks

New research testing six leading AI agents on real freelance work finds that none could acceptably complete even 3% of assigned tasks, with even the best performer earning just $1,810 out of a possible $143,991 in simulated projects. The study, by the nonprofit research organization Center for AI Safety (CAIS) and the data annotation company Scale AI, exposes a wide gap between AI industry promises and actual performance, suggesting that widespread job automation remains far from reality despite aggressive corporate adoption.

What you should know: The Remote Labor Index benchmark tested AI agents across diverse real-world freelance projects spanning game development to data analysis.

  • China-based startup Manus performed best with only a 2.5% automation rate, meaning it could acceptably complete just 2.5% of assigned projects.
  • Grok 4, from Elon Musk’s xAI, and Anthropic’s Claude Sonnet 4.5 tied for second place at 2.1%, despite Claude being marketed as the “best coding model in the world.”
  • OpenAI’s GPT-5, touted for “PhD level” intelligence, managed just a 1.7% completion rate.

The performance rankings: Even the most advanced AI models struggled dramatically with basic freelance tasks.

  • ChatGPT Agent, OpenAI’s dedicated AI agent tool, barely reached 1.3% completion.
  • Google’s Gemini 2.5 Pro performed worst at 0.8%, leaving less than two percentage points between the weakest and strongest agents.
  • No AI agent exceeded 3% task completion across any category tested.
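
For readers who want to check the arithmetic, the short Python sketch below ranks the six completion rates quoted above and works out what share of the available payout the $1,810 figure represents. It uses only numbers stated in this article. The variable and function names are illustrative assumptions and are not taken from the study.

```python
# Completion ("automation") rates as quoted in this article, in percent.
# These are transcribed from the text above, not loaded from the study's data.
automation_rate_pct = {
    "Manus": 2.5,
    "Grok 4": 2.1,
    "Claude Sonnet 4.5": 2.1,
    "GPT-5": 1.7,
    "ChatGPT Agent": 1.3,
    "Gemini 2.5 Pro": 0.8,
}

# Dollar figures as quoted in this article.
EARNED_USD = 1_810
AVAILABLE_USD = 143_991


def earnings_share_pct(earned: float, available: float) -> float:
    """Return earned as a percentage of the total available payout."""
    return 100 * earned / available


# Rank agents from highest to lowest quoted completion rate.
ranked = sorted(automation_rate_pct.items(), key=lambda kv: kv[1], reverse=True)

for name, rate in ranked:
    print(f"{name:<18} {rate:.1f}%")

share = earnings_share_pct(EARNED_USD, AVAILABLE_USD)
print(f"Share of available payout earned: {share:.2f}%")
```

The earnings share comes out to roughly 1.26%, lower than the 2.5% top completion rate. One possible explanation is that the completed projects were worth less than average, though the article does not break that down.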

Why this matters: Companies are aggressively replacing human workers with AI despite mounting evidence that automation isn’t delivering promised productivity gains.

  • One MIT study found 95% of companies piloting AI initiatives saw no meaningful revenue growth.
  • Research shows AI tools often create “workslop”—low-quality output requiring extensive human revision that creates workplace tension.
  • Many executives who replaced employees with AI have been forced to rehire them after discovering the technology’s limitations.

The big picture: AI agents face fundamental technical barriers that prevent effective job replacement.

  • “They don’t have long-term memory storage and can’t do continual learning from experiences. They can’t pick up skills on the job like humans,” CAIS director Dan Hendrycks explained.
  • The gap between AI marketing claims and real-world performance suggests current automation capabilities are vastly oversold.
  • Despite these findings, AI-related layoffs continue accelerating across industries.

What they’re saying: Researchers emphasize the importance of realistic AI capability assessments.

  • “I should hope this gives much more accurate impressions as to what’s going on with AI capabilities,” Hendrycks told Wired.
  • “We have debated AI and jobs for years, but most of it has been hypothetical or theoretical,” noted Scale AI’s director of research Bing Liu.
