Google has achieved, for the first time, complex geospatial reasoning on a global scale, transforming the Earth into a “computable object.”

Building on decades of experience in world modeling and combining it with the advanced reasoning capabilities of Gemini, Google has significantly upgraded Earth AI. This upgrade covers a wide range of applications, from environmental monitoring to disaster response.

Google Earth AI: A Suite of Geospatial AI Models and Datasets

Google Earth AI encompasses a series of geospatial AI models and datasets, including a geospatial reasoning model powered by Gemini. This model can automatically connect different Earth AI models—such as weather forecasting, population maps, and satellite imagery—and answer various questions.

Google Earth Gets an “External Plug-in”: Beyond Gemini as a Guide

While individual AI capabilities are growing increasingly powerful, real-world problems often require the integration of cross-disciplinary knowledge.

Where might a typhoon make landfall? Which communities are most vulnerable? How should we prepare for a typhoon?

Answering such questions requires the coordinated processing of image, population, and environmental data, along with comprehensive reasoning.

This year, Google introduced Earth AI for precisely this purpose. By combining powerful foundational models with a Gemini-powered geospatial reasoning agent, Google can now, for the first time, reason about complex real-world problems on a global scale.

Grounded in real-world data, the foundational models provide a deep understanding of the Earth. The agent acts as an orchestrator: it breaks a complex problem into a multi-step plan, executes the plan by calling foundational models, querying large databases, and using geospatial tools, and then integrates the intermediate results into a single answer.

Today, Google Unveils New Innovations in Earth AI:

  • Release of a new generation of image and population foundational models, along with technical details and evaluation reports.
  • Introduction of a spatial reasoning agent.

Research shows that, with geospatial reasoning, analysts can not only predict storm paths but also identify the most vulnerable communities and high-risk infrastructure in one go.

For example, the non-profit organization GiveDirectly precisely located disaster-affected groups in urgent need of assistance by integrating flood data with population density information, improving disaster relief efficiency.

Google states that its integrated conversational feature, piloted since last year, helps users discover objects and patterns in satellite imagery. For instance, users can simply input “find algal blooms” to have Google Earth monitor drinking water sources.

What makes this research exciting is that it is already driving important AI applications:

  • Precision community health interventions conducted by Boston Children’s Hospital.
  • GiveDirectly’s rapid identification of the most vulnerable groups during disasters.
  • The World Health Organization Africa Region’s actions to predict cholera outbreak risk zones.
  • Airbus’s detection of vegetation encroaching on power lines to help clients prevent power outages.
  • The University of Chicago’s use of models to predict the onset of India’s monsoon season and its collaboration with India’s Ministry of Agriculture and Farmers’ Welfare to send accurate forecasts to 38 million farmers.

Google also offers practical AI applications:
During the 2025 California wildfires, Google sent crisis alerts to 15 million people in the Los Angeles area and displayed real-time locations of available shelters on maps.

Behind these achievements is Google’s deep experience in geospatial AI. Its models are used not only for flood and wildfire warnings but also for cyclones, air quality, and many other scenarios.

Google Achieves Global-Scale Reasoning for the First Time

In its latest technical paper, Google publicly unveiled, for the first time, its “remote sensing foundational model” and “population dynamics foundational model,” showcasing the powerful capabilities of the geospatial reasoning agent:

  • Intelligent Geographic Reasoning: Based on Gemini, the agent coordinates multi-dimensional Earth AI models to answer complex cross-modal questions.
  • Enhanced Deep Insights: Google Earth integrates Earth AI models with Gemini functionality, letting users search for objects in satellite imagery using natural language.
  • Cloud-Based Open Access: Through the Google Cloud platform, core Earth AI models (image, population, environment) are directly accessible to trusted testers.

Earth AI is built on multi-source, multi-modal geospatial data and tools (left side of the diagram below). Subsequently, sub-agents and models in three vertical domains—image, population, and environment—process this data (middle of the diagram). Finally, the Earth AI geospatial reasoning agent (right side of the diagram) performs global integration, enabling comprehensive geospatial analysis and insight generation.

Three Foundational Models: Image, Population, Environment

The remote sensing foundational model simplifies and accelerates satellite image analysis through three core capabilities: a vision-language model, an open-vocabulary object detection model, and a pre-trained Vision Transformer (ViT) encoder.

The training dataset is built mainly from synthetic annotations and data collected from the web.

The trained vision-language model and open-vocabulary detection model can be applied directly to classification, detection, and retrieval tasks, while the ViT encoder can be fine-tuned to improve performance on specific downstream tasks.

The training and application process of the remote sensing foundational model centers on the vision-language model, open-vocabulary object detection model, and pre-trained ViT encoder.

Users can pose queries in natural language and receive quick, accurate responses, such as “find roads submerged in images after heavy rain.”

Jointly trained on massive high-resolution aerial imagery and textual descriptions, the remote sensing foundational model has achieved breakthrough performance in multiple public Earth observation benchmark tests—improving average performance by over 16% in text-based image retrieval tasks and doubling the zero-shot detection accuracy for new object categories compared to baseline models.
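As a rough illustration of text-based image retrieval, the sketch below ranks image tiles by cosine similarity between a query embedding and image embeddings. It assumes, without confirmation from the article, that retrieval works by comparing vectors in a shared text-image space; the tile names and the 3-dimensional vectors are made-up placeholders, not outputs of Google's model.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def retrieve(query_vec, image_vecs, top_k=2):
    """Return the top_k image names, most similar to the query first."""
    ranked = sorted(
        image_vecs,
        key=lambda name: cosine(query_vec, image_vecs[name]),
        reverse=True,
    )
    return ranked[:top_k]

# Hypothetical embeddings; a real system would obtain these from the model.
image_vecs = {
    "tile_flooded_road": [0.9, 0.1, 0.0],
    "tile_dry_road": [0.1, 0.9, 0.0],
    "tile_forest": [0.0, 0.1, 0.9],
}
# Stand-in for the embedding of "find roads submerged in images after heavy rain".
query_vec = [0.8, 0.2, 0.1]

top = retrieve(query_vec, image_vecs, top_k=2)
print(top)
```

With these toy vectors the flooded-road tile ranks first because its embedding points in nearly the same direction as the query.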

To gain a deeper understanding of the complex interactions between human activities and the geographical environment, research in areas such as “Mobility AI” and “Population Dynamics Foundations” is essential.

In this study, the population dynamics foundational model introduces two key innovations:

  1. A globally unified embedding representation covering 17 countries.
  2. Human activity embeddings that are updated monthly.

These new features are particularly important for time-sensitive predictions, as they can more precisely capture the rhythm of changes in human behavior.

Training occurs in two stages:

  • Offline Training: By integrating diverse geospatial data (map data, search trends, human mobility activity, and environmental conditions), compact regional embedding representations are generated.
  • Dynamic Fine-Tuning: Using pre-trained embeddings, fine-tuning is performed for specific downstream tasks, enabling spatial interpolation, extrapolation, super-resolution reconstruction, and trend prediction of local statistical data.
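The value of the second stage can be illustrated with a deliberately simple stand-in. The sketch below treats the stage-one embeddings as fixed inputs and estimates a missing local statistic for one region from its nearest neighbours in embedding space, which is one way to picture spatial interpolation. A nearest-neighbour average replaces the fine-tuned downstream model here, and all embeddings and values are hypothetical.

```python
import math

def knn_interpolate(target_vec, labeled, k=2):
    """Estimate a statistic for an unlabeled region as the mean value of its
    k nearest labeled regions, measured by Euclidean distance between embeddings."""
    nearest = sorted(labeled, key=lambda item: math.dist(target_vec, item[0]))[:k]
    return sum(value for _, value in nearest) / len(nearest)

# Hypothetical (embedding, known statistic) pairs for regions with data.
labeled = [
    ([0.0, 0.0], 100.0),
    ([0.1, 0.0], 120.0),
    ([1.0, 1.0], 900.0),
]

# A region with an embedding but no recorded statistic.
estimate = knn_interpolate([0.05, 0.0], labeled, k=2)
print(estimate)
```

The distant third region is ignored, so the estimate is the average of the two similar regions.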

Dual-Stage Framework of the Population Dynamics Foundational Model

Google internally evaluated the “population dynamics foundational model” using data from 17 countries. Its R² scores (typically between 0 and 1, with higher values indicating a better fit) for predicting population density, tree cover, nighttime light intensity, and elevation were consistently high across countries.
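For readers unfamiliar with the metric, the following sketch computes R² using the standard definition, one minus the ratio of residual to total sum of squares. The numbers are toy values, not Google's evaluation data. A perfect fit scores 1, a model that always predicts the mean scores 0, and a model that does worse than the mean scores below 0.

```python
def r_squared(actual, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot

actual = [1.0, 2.0, 3.0, 4.0]  # toy observations

perfect = r_squared(actual, actual)                # predictions match exactly
mean_only = r_squared(actual, [2.5] * 4)           # always predict the mean
reversed_fit = r_squared(actual, [4.0, 3.0, 2.0, 1.0])  # worse than the mean

print(perfect, mean_only, reversed_fit)
```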

Based on U.S. ZIP codes, Google visualized the similarity of various dimensions of the population dynamics foundational embedding vectors. Patterns across different dimensions reflect the diverse characteristics of the U.S. population.

Independent research results have also validated the model’s powerful performance.

For example, researchers at the University of Oxford significantly improved the accuracy of long-term predictions of dengue fever transmission in Brazil by introducing the embedding representations provided by this model—the 12-month R² value (an indicator measuring the model’s ability to explain actual incidence rates) increased from 0.456 to 0.656.

Google had previously achieved technological breakthroughs in medium-range weather forecasting, monsoon onset prediction, air quality monitoring, and river flood warning.

Recently, it upgraded its environmental model to support global precipitation nowcasting and expanded the coverage of major river flood warnings to 2 billion people.

Geospatial Reasoning Agent: Unlocking Earth’s Potential

Solving real-world problems requires the integration of insights from multiple specialized models.

The future of geospatial AI lies not in isolated single models but in an integrated, multi-modal ecosystem coordinated by advanced AI.

Google’s newly introduced Gemini-powered geospatial reasoning agent can intelligently coordinate the diverse capabilities of Earth AI.

Research confirms: Multi-model integration enhances predictive capabilities.

The ultimate goal of Earth AI is to help users answer complex real-world questions that require multi-model, multi-data source reasoning.

Such queries can be categorized into three levels of complexity:

  • Descriptive and Retrieval Queries: Fact-finding, such as “What was the highest temperature recorded in New York in August 2020?”
  • Analytical and Associative Queries: Revealing patterns and associations across different data sources, such as “When Hurricane Katrina made landfall, how many hospitals in Louisiana were located in areas severely affected by the storm?”
  • Predictive or Inferential Queries: Information prediction, such as “By November this year, which cities in India will have the most vulnerable populations facing the highest risk of flood impacts?”

To address these three categories of complex queries, Google has specifically designed the “geospatial reasoning agent.”

To refine its responses iteratively, the agent repeats a cycle of “thinking and planning → data manipulation, model inference, or training → reflection and correction” until it produces a final answer grounded in reliable evidence.
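The cycle described above can be pictured as a small control loop. The sketch below is only a structural illustration: `run_agent`, `plan`, `act`, and `reflect` are hypothetical names, and the toy callables stand in for Gemini's planning, the model and tool calls, and the self-check. Nothing here reflects Google's actual implementation.

```python
def run_agent(question, plan, act, reflect, max_rounds=5):
    """Repeat plan -> act -> reflect until reflect accepts the evidence
    or the round budget is exhausted."""
    evidence = []
    for round_no in range(1, max_rounds + 1):
        step = plan(question, evidence)              # thinking and planning
        evidence.append(act(step))                   # data, model, or tool call
        done, answer = reflect(question, evidence)   # reflection and correction
        if done:
            return answer, round_no
    return None, max_rounds

# Toy stand-ins: a fixed three-step plan that succeeds once every step has run.
STEPS = ["wind_footprint", "county_population", "overlay"]

def plan(question, evidence):
    return STEPS[len(evidence)]

def act(step):
    return f"result of {step}"

def reflect(question, evidence):
    if len(evidence) == len(STEPS):
        return True, "answer from " + ", ".join(evidence)
    return False, None

answer, rounds = run_agent("Which communities are most at risk?", plan, act, reflect)
print(rounds, answer)
```

Capping the number of rounds is a design choice in this sketch so that the loop always terminates, even when the reflection step never accepts the evidence.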

Operational Framework of the Geospatial Reasoning Agent

For example, when a user needs to identify specific vulnerable populations threatened by a storm, the agent achieves precise analysis through the following transparent reasoning steps:

  • Environmental Risk Modeling: Call on the environmental model to precisely delineate the geographical area threatened by hurricane-force winds.
  • Population Density Analysis: Query the Data Commons population statistics database to identify high-population-density counties within the predicted landfall area.
  • Administrative Boundary Matching: Obtain official administrative boundary data for target counties from the BigQuery public dataset.
  • Spatial Overlay Calculation: Perform geometric spatial intersection operations on the wind impact area and administrative boundaries.
  • Vulnerable Area Location: Using the population dynamics foundational model and machine learning models trained on the fly with county-level statistical data, pinpoint the most vulnerable ZIP code areas.
  • Critical Facility Identification: Use the object detection capability of the remote sensing foundational model to automatically identify critical infrastructure such as hospitals and shelters in satellite imagery of the most vulnerable areas.
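The spatial overlay step in the list above can be sketched with plain geometry. This example simplifies every region to an axis-aligned rectangle and assumes population is spread evenly within each county; both are simplifying assumptions made for illustration, and the county names, coordinates, and populations are invented. A production workflow would intersect real polygons with a geospatial library.

```python
def intersection_area(a, b):
    """Overlap area of two boxes given as (min_x, min_y, max_x, max_y)."""
    width = min(a[2], b[2]) - max(a[0], b[0])
    height = min(a[3], b[3]) - max(a[1], b[1])
    return max(width, 0.0) * max(height, 0.0)

def box_area(box):
    return (box[2] - box[0]) * (box[3] - box[1])

# Hypothetical wind impact area and county boundaries.
wind_zone = (0.0, 0.0, 4.0, 4.0)
counties = {
    "county_a": {"box": (3.0, 0.0, 5.0, 2.0), "population": 40000},
    "county_b": {"box": (1.0, 1.0, 3.0, 3.0), "population": 10000},
    "county_c": {"box": (6.0, 6.0, 8.0, 8.0), "population": 90000},
}

# Exposed population = county population x share of county area inside the wind zone.
exposed = {}
for name, county in counties.items():
    share = intersection_area(wind_zone, county["box"]) / box_area(county["box"])
    exposed[name] = county["population"] * share

ranking = sorted(exposed, key=exposed.get, reverse=True)
print(ranking, exposed)
```

Note that the most populous county ranks last here because it lies entirely outside the wind zone, which is exactly the kind of result that the overlay step exists to surface.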

This hierarchical reasoning mechanism transforms raw data into decision-making knowledge for disaster response, demonstrating the breakthrough application of multi-modal AI in emergency management.

To evaluate agent performance, Google developed two evaluation methods: a question-answering benchmark built from publicly available data with verifiable ground-truth answers, and crisis response case studies for complex prediction scenarios.

The Earth AI agent combines weather models, population data, and geospatial analysis to achieve automated disaster risk identification and visual reasoning.

In the question-answering benchmark, the geospatial reasoning agent achieved an overall accuracy of 0.82, significantly outperforming the baseline agents, Gemini 2.5 Pro (0.50) and Gemini 2.5 Flash (0.39).

This result underscores the critical value of equipping the agent with specialized geospatial models and tools for handling such queries.
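To put those three scores in perspective, the short calculation below derives the absolute gap and the ratio between the agent and each baseline. It uses only the figures quoted above; the variable names are illustrative.

```python
scores = {
    "Geospatial reasoning agent": 0.82,
    "Gemini 2.5 Pro": 0.50,
    "Gemini 2.5 Flash": 0.39,
}

agent = scores["Geospatial reasoning agent"]

# For each baseline: (absolute accuracy gap, agent accuracy / baseline accuracy).
gaps = {
    name: (round(agent - score, 2), round(agent / score, 2))
    for name, score in scores.items()
    if name != "Geospatial reasoning agent"
}

for name, (gap, ratio) in gaps.items():
    print(f"{name}: +{gap:.2f} absolute, {ratio:.2f}x")
```

The agent leads Gemini 2.5 Pro by 0.32 (1.64 times its accuracy) and Gemini 2.5 Flash by 0.43 (about 2.1 times).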

Innovation Knows No Bounds

Over the past two weeks, Google Research has achieved a series of new breakthroughs, spanning from genomics and quantum computing to geospatial cognition.

These breakthroughs perfectly exemplify the “magic loop” research model proposed by Google Research head Yossi Matias: addressing global challenges and opportunities through foundational research and directly translating them into real-world application solutions.

These solutions not only benefit hundreds of millions of people worldwide but also continuously reveal new topics worthy of exploration.

Research is essential to improving daily life, tackling social challenges, and seizing the opportunities of the times; in that sense, scientific research and innovation know no bounds.

Driven by more powerful models and intelligent tools, the magic loop is accelerating, producing cross-disciplinary chain reactions.