Reddit sues Perplexity for stealing user content to build $20B AI company

Reddit has filed a lawsuit against Perplexity and several data scraping companies, accusing them of stealing valuable user-generated content without permission to build AI products. The lawsuit claims the companies bypassed Reddit’s digital security measures and kept scraping data even after being told to stop, and that Perplexity built its $20 billion valuation on that content.

What you should know: Reddit alleges that Perplexity and other defendants used sophisticated methods to circumvent the platform’s anti-scraping protections.

  • The lawsuit, filed Wednesday in Manhattan federal court, claims Perplexity used Reddit comments to generate AI responses even after agreeing not to scrape Reddit’s data.
  • Reddit sent a cease-and-desist letter to Perplexity in May 2024, but the company’s citations to Reddit content increased “forty-fold after Reddit told it to stop,” according to the filing.
  • Other defendants include data scraping firms Oxylabs UAB, AWMProxy, and SerpApi, which allegedly sell scraped data to AI companies.

How the alleged scheme worked: According to the lawsuit, Perplexity appears to have used third-party scrapers to access Reddit content indirectly through Google search results.

  • “Perplexity’s business model is effectively to take Reddit’s content from Google search results, feed them into a third party’s LLM, and call it a new product,” the lawsuit states.
  • Reddit compared the defendants to “would-be bank robbers, who, knowing they cannot get into the bank vault, break into the armored truck carrying the cash instead.”
  • The social media company has spent tens of millions of dollars on anti-scraping systems that these companies allegedly circumvented.

In plain English: Instead of directly accessing Reddit’s servers (which would be easily detected), Perplexity allegedly used Google as a middleman, searching for Reddit content through Google’s results and then feeding that information into AI systems to create responses for users.

What they’re saying: Both sides defended their positions, with Perplexity emphasizing user rights and Reddit highlighting the scale of alleged theft.

  • Perplexity spokesperson Jesse Dwyer said the company “will always fight vigorously for users’ rights to freely and fairly access public knowledge.”
  • Reddit’s chief legal officer Ben Lee called the other defendants “textbook examples” of illegal scrapers, stating: “Scrapers bypass technological protections to steal data, then sell it to clients hungry for training material.”
  • Lee added that “Reddit is a prime target because it’s one of the largest and most dynamic collections of human conversation ever created.”

Why this matters: The case highlights growing tensions over data rights as AI companies seek training material while platforms try to monetize their user-generated content.

  • Reddit has successfully negotiated licensing deals with major AI companies like Google and OpenAI, demonstrating a path for legitimate data access.
  • The lawsuit suggests that some AI companies are choosing to circumvent these legitimate channels, potentially undermining the value of user-generated content platforms.
