PayPal spent 25 years building one of the most valuable financial data assets in the world. The problem was that by 2024, that data lived across a dozen siloed systems, among them Teradata, Hadoop, Redshift, and Snowflake, each one a remnant of an acquisition or an infrastructure decision made when the next decade wasn’t yet in scope. That added up to 400 petabytes of data, none of it talking to the rest.
The practical consequences showed up everywhere. Building a unified view of a merchant using PayPal for online checkout and Venmo for in-person sales required complex, expensive ETL processes. Fraud models and personalization engines hit the same fragmentation wall. Growth had created a data landscape that was impressive in aggregate and difficult to actually use.
The Migration
PayPal chose BigQuery as the destination. Working with Google Cloud Consulting, they migrated more than 300 petabytes with zero downtime and zero customer impact. For a payments company operating across 200 markets around the clock, those constraints were non-negotiable. The migration consolidated what is believed to have been the world’s largest Teradata deployment, alongside Hadoop, Redshift, and Snowflake. Data infrastructure vendors went from four to one. Around 25% of workloads were decommissioned in the process.
Real-time streaming analytics moved too. PayPal replaced self-managed Flink with Google Cloud Dataflow, integrated directly with BigQuery. That shift meant immediate insights for merchant monitoring and failed transaction analytics, instead of batch-delayed data too stale to act on. The streaming migration alone changed how PayPal’s operations teams could respond to problems in real time, rather than discovering them after the fact in a morning report.
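To make the batch-versus-streaming difference concrete, here is a minimal sketch of the kind of windowed aggregation a streaming pipeline performs for failed transaction monitoring. It is plain Python, not Dataflow or Flink code, and it is not PayPal's pipeline: the `TxnEvent` fields, the 60-second window, and the 0.5 alert threshold are all assumptions chosen for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class TxnEvent:
    """Hypothetical transaction event; field names are illustrative only."""
    merchant_id: str
    timestamp_s: int  # event time, in seconds
    succeeded: bool


def failure_rates(events, window_s=60):
    """Bucket events into fixed (tumbling) windows per merchant.

    Returns {(merchant_id, window_start_s): failed / total}.
    """
    totals = defaultdict(int)
    failures = defaultdict(int)
    for e in events:
        # Round the event time down to the start of its window.
        window_start = e.timestamp_s - (e.timestamp_s % window_s)
        key = (e.merchant_id, window_start)
        totals[key] += 1
        if not e.succeeded:
            failures[key] += 1
    return {key: failures[key] / totals[key] for key in totals}


def alerts(rates, threshold=0.5):
    """Return the windows whose failure rate meets or exceeds the threshold."""
    return sorted(key for key, rate in rates.items() if rate >= threshold)


events = [
    TxnEvent("m1", 0, True),
    TxnEvent("m1", 10, False),
    TxnEvent("m1", 70, False),
    TxnEvent("m1", 80, False),
    TxnEvent("m2", 5, True),
]
rates = failure_rates(events)
print(alerts(rates))
```

A managed streaming service evaluates windows like these continuously as events arrive, so a spike in one merchant's failures can surface within about a window length rather than in the next batch report.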
The execution timeline is worth noting. PayPal began the BigQuery migration in May 2021, starting with analytics warehouses. By early 2026, the migration was largely complete. That’s more than four years of sustained infrastructure work, running in parallel with a live payments operation serving hundreds of millions of users. That it completed with zero customer impact shows that a migration of this size can run alongside production without disrupting it.
What It Unlocked
The numbers from the migration are specific. Data for AI model training is now 16x fresher. Queries run 2.5x to 10x faster, including the complex ones data scientists use for feature engineering. Vertex AI now optimizes logistics planning across more than 5,000 daily shipments. Personalized recommendations and fraud signals all build from a single governed source rather than reconciling across disconnected systems.
The less quantifiable outcome matters more. When data infrastructure isn’t in the way, the time between “we want to build this AI feature” and “it’s in production” compresses. That velocity is the actual return on the migration. Every model PayPal trains now starts from clean, fresh, unified data. Before, it started from a negotiation with infrastructure.
Data freshness deserves particular attention. A 16x improvement in freshness isn’t just a number; it changes what’s possible in fraud detection and personalization. Fraud signals that would have been hours old are now minutes old. Personalization models that trained on yesterday’s behavior now train on this morning’s. At PayPal’s transaction volume, that difference in signal quality translates directly into better outcomes for customers and merchants.
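As a rough illustration of what a 16x factor means in practice, the sketch below divides a few baseline data lags by 16. The baselines of 4, 8, and 24 hours are hypothetical, not PayPal's figures, and reading "16x fresher" as "lag divided by 16" is itself an assumption.

```python
def lag_after_speedup(baseline_lag_minutes, factor=16):
    """Data lag after a freshness improvement, assuming lag scales as 1/factor."""
    return baseline_lag_minutes / factor


# Hypothetical baseline lags, in hours.
for hours in (4, 8, 24):
    after = lag_after_speedup(hours * 60)
    print(f"{hours} h lag -> {after:g} min")
```

Under those assumptions, an 8-hour lag shrinks to 30 minutes, which is the hours-to-minutes shift described above.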
The Partnership That Followed
In September 2025, Google and PayPal announced a multi-year strategic partnership. Together, they’re developing AI-driven shopping and payment experiences. They’re also building standards for “agentic commerce”: systems that conduct transactions autonomously on behalf of users. Additionally, PayPal Enterprise Payments became a payment provider across Google products including Google Cloud, Google Ads, and Google Play.
None of this is coincidental. The BigQuery migration is the foundation the AI partnership sits on. Without unified, fresh, accessible data, the agentic commerce work can’t run at the quality either company needs. The $300 million infrastructure commitment PayPal made over several years was, in retrospect, the prerequisite for everything that followed. You don’t build an AI partnership of this scope on top of fragmented infrastructure. You build it on a unified data platform that took four years to create.
What This Means Beyond PayPal
PayPal’s situation (too much data, too many systems, no tolerance for downtime) isn’t unique to a company of its size. The scale is unusual, but the underlying problem isn’t. Most ISVs and enterprises carrying fragmented data infrastructure from years of growth and acquisition face the same tension: valuable data that’s hard to use for AI at the speed AI requires.
The technology clearly works. One of the world’s largest payment networks moved more than 300 petabytes with zero customer impact. BigQuery is fully managed. The migration path is documented. For ISVs building on Google Cloud, the PayPal story is useful in customer conversations precisely because it answers the question every enterprise data team eventually asks: has anyone actually done this at real scale? The answer is yes, and the specifics are public.
Want to go deeper?
- PayPal’s historic data migration (Google Cloud Blog): PayPal’s own account of the migration, the technical choices, and the AI outcomes.
- PayPal’s Dataflow migration (Google Cloud Blog): how PayPal replaced self-managed Flink infrastructure with Dataflow for real-time streaming analytics.
- Google and PayPal multi-year partnership announcement: the September 2025 announcement covering agentic commerce, AI shopping experiences, and payment integration across Google products.
