About

With the rapid growth and increasing availability of geospatial data, both academia and industry are looking for ways to leverage big data and AI technologies to create new products, improve efficiency, and provide novel solutions to existing problems. However, despite the widespread interest, there is a lack of communication between researchers in academia and industry. This limits advancements at the intersection and causes redundant or siloed efforts on both sides.

Academia often has limited access to rich and potentially useful big geospatial datasets, sufficient computational resources, and related real-world problems. In addition, solutions proposed by academic researchers alone are usually developed at a small scale and under many assumptions, leaving an under-addressed gap between methods and their applicability at scale in industrial applications. Industry, on the other hand, has the data, computational resources, and problems at scale. However, since existing research is often not on par with industry-scale needs, industry researchers may lean towards traditional approaches developed without spatial consideration (e.g., ignoring spatial and temporal dependencies), and project teams have limited time and resources to dive deep into developing novel techniques that are high-risk but high-potential.

This opens up opportunities for synergistic collaboration between industrial practitioners and academic researchers. The 3rd ACM SIGSPATIAL International Workshop on Spatial Big Data and AI for Industrial Applications (GeoIndustry 2024) offers a forum for exchanging thoughts and ideas between industry and academia, and aims to reduce siloed efforts by exploring synergies between researchers on both sides. We envision that these collaborations, via invited talks and panel discussions, can not only accelerate the research-to-impact cycle but also foster workforce development for future geospatial researchers.

Organization Committee

Program Chairs
Emre Eftelioglu (Amazon)
Heba Aly (Amazon)
Jinmeng Rao (Google DeepMind)
Song Gao (University of Wisconsin-Madison)
Yiqun Xie (University of Maryland)

Program Committee
Chenxi Lin (PAII Inc)
Badrinath Srinivas (Amazon)

Schedule

Tuesday, October 29, 2024 - Atlanta, Georgia, USA

Eastern Time (ET) Title
08:00 - 08:10 Opening Remarks



08:10 - 09:00 [Keynote Talk] The Rise of the Data Science Assistant: LLM Agents in Action

Speaker: Dr. Hongxu Ma, Staff AI Research Scientist at Google

Abstract: The way we work is undergoing a revolution thanks to AI, and with this revolution comes incredible new opportunities. Join me as I introduce our latest innovation: the Google Data Science Agent. Discover how this powerful LLM-powered tool is transforming data analysis and get a glimpse into the future of work. We'll also explore how industry and academia can work together more effectively in this exciting new era of LLMs.
09:00 - 09:10 Map Stitcher: Graph Sampling-based Map Conflation

Erfan Hosseini Sereshgi and Carola Wenk

Abstract: The integration of two geometric graphs is a fundamental task that arises in various fields, including automated cartography, image processing, and computer graphics. Maintaining precise and up-to-date roadmaps may pose a significant hurdle for new companies, especially in the early stages when resources are limited. A notable application of this task is the process of map conflation. Map conflation, or map merging, involves combining roadmap data from two separate sources to create data with higher coverage and accuracy than either source, which is essential for accurate navigation systems. An ideal solution would be a scalable automated technique capable of handling vast areas while safeguarding key geometric and topological details. In this work, we explore the application of graph sampling, a common method for evaluating reconstructed maps, to the task of map conflation. Our approach employs a partial matching technique, allowing segments to be matched fractionally through graph sampling. Unlike existing methods, which require a segment to be either fully matched or not matched at all, our technique enables nuanced matching. This leads to more detailed conflated maps, avoiding excessive selectivity when adding edges and allowing more customization. Hence, we introduce Map Stitcher, an automated tool that seamlessly integrates new information from a secondary map into a primary one. Our approach offers adaptable algorithms, along with source code and sample datasets, that address various scenarios encountered in roadmap data, allowing for tailored choices based on the specific datasets. Furthermore, we evaluate our method on three roadmap datasets, demonstrating its effectiveness in maintaining accuracy.
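To make the idea of fractional matching through sampling concrete, here is a minimal Python sketch. It is not taken from Map Stitcher: the function names, the sampling step, and the distance tolerance are all hypothetical, and the sketch only illustrates how sampling points along a secondary-map segment can yield a matched fraction rather than an all-or-nothing decision.

```python
import math

def sample_segment(p, q, step):
    """Evenly spaced points along segment p->q (both ends included), at most `step` apart."""
    n = max(1, math.ceil(math.dist(p, q) / step))
    return [(p[0] + (q[0] - p[0]) * i / n, p[1] + (q[1] - p[1]) * i / n)
            for i in range(n + 1)]

def point_segment_distance(pt, a, b):
    """Euclidean distance from point pt to the closed segment a-b."""
    dx, dy = b[0] - a[0], b[1] - a[1]
    denom = dx * dx + dy * dy
    if denom == 0:  # degenerate segment: a and b coincide
        return math.dist(pt, a)
    # Project pt onto the line through a and b, clamped to the segment.
    t = ((pt[0] - a[0]) * dx + (pt[1] - a[1]) * dy) / denom
    t = max(0.0, min(1.0, t))
    return math.dist(pt, (a[0] + t * dx, a[1] + t * dy))

def matched_fraction(segment, primary_segments, step=5.0, tol=2.0):
    """Share of sample points on `segment` lying within `tol` of the primary map."""
    samples = sample_segment(segment[0], segment[1], step)
    hits = sum(
        1 for s in samples
        if any(point_segment_distance(s, a, b) <= tol for a, b in primary_segments)
    )
    return hits / len(samples)

# Hypothetical data: the secondary road runs alongside the primary road for
# its first half, then continues into an area the primary map does not cover.
primary = [((0.0, 0.0), (10.0, 0.0))]
secondary_segment = ((0.0, 1.0), (20.0, 1.0))
print(matched_fraction(secondary_segment, primary))  # 0.6
```

In this toy case three of the five sample points fall within tolerance, so the segment is 60% matched; the unmatched remainder is the part a conflation step might consider adding to the primary map.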
09:10 - 09:20 Towards a Trajectory-powered Foundation Model of Mobility

Shushman Choudhury, Abdul Rahman Kreidieh, Ivan Kuznetsov and Neha Arora

Abstract: This paper advocates for a geospatial foundation model based on human mobility trajectories in the built environment. Such a model would be widely applicable across many important societal domains currently addressed independently, including transportation networks, data-driven urban planning, tourism, and sustainability. Unlike existing large vision-language models, trained primarily on text and images, this foundation model should integrate the complex spatiotemporal and multimodal data inherent to mobility. This paper motivates this challenging research agenda, outlining many downstream applications that would be significantly impacted and enabled by such a model. It then explains the critical spatial, temporal, and contextual factors that such a model must capture in trajectories. Finally, it concludes with several research questions and directions, laying the foundations for future exploration in this exciting and emerging field.
09:20 - 09:30 TorchSpatial: A Python Package for Spatial Representation Learning and Geo-Aware Model Development

Qian Cao, Nemin Wu, Zhangyu Wang, Zeping Liu, Yanlin Qi, Jielu Zhang, Joshua Ni, Xiaobai Yao, Hongxu Ma, Lan Mu, Stefano Ermon, Tanuja Ganu, Akshay Nambi, Ni Lao and Gengchen Mai

Abstract: Spatial representation learning (SRL) focuses on developing spatial embeddings from various forms of spatial data, such as points, polylines, polygons, graphs, networks, and images, without any additional feature engineering or data conversion step. Effective spatial representation is fundamental for a wide range of downstream geospatial applications, including species distribution modeling, satellite image classification, point cloud classification and segmentation, trajectory synthesis, building footprint extraction, and cartographic generalization. Despite the widespread use of SRL as a cornerstone for many spatially-aware AI models, there is still no comprehensive package shared across the community that provides ready-made code to support the implementation and reproduction of SRL model development. To fill this void, we present TorchSpatial, a Python package designed to support the encoding of spatial data, starting with location (point) encoding, a fundamental data type in SRL. TorchSpatial includes two key components: 1) We present TorchSpatial, an SRL framework supporting the development of location encoders. TorchSpatial now integrates 15 widely-used encoders and essential encoder components, ensuring scalability and reproducibility for future developments; 2) We establish a ready-to-use workflow that takes the input hyperparameters and outputs the model inference results and evaluation across geo-aware image classification and regression tasks with access to 17 datasets. We believe TorchSpatial will foster future advancement of SRL and spatial fairness in GeoAI research. The TorchSpatial SRL framework and inference models are available at https://github.com/seai-lab/TorchSpatial.
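For readers unfamiliar with location encoding, the following Python sketch shows one common style of encoder: mapping a (longitude, latitude) pair to sinusoidal features at several wavelengths. It does not use or reproduce TorchSpatial's code or API; the function name `encode_location` and all parameter defaults are assumptions chosen for illustration.

```python
import math

def encode_location(lon, lat, num_scales=4, min_wavelength=1.0, max_wavelength=360.0):
    """Multi-scale sinusoidal features for a point given in degrees.

    Returns 4 * num_scales values: (sin, cos) of each coordinate at each scale.
    """
    features = []
    for s in range(num_scales):
        # Wavelengths grow geometrically from min_wavelength to max_wavelength.
        frac = s / max(1, num_scales - 1)
        wavelength = min_wavelength * (max_wavelength / min_wavelength) ** frac
        for coord in (lon, lat):
            angle = 2.0 * math.pi * coord / wavelength
            features.extend((math.sin(angle), math.cos(angle)))
    return features

# Arbitrary example coordinates.
embedding = encode_location(-84.39, 33.75)
print(len(embedding))  # 16
```

In a learned model, a fixed feature vector like this would typically be passed through trainable layers; that part is omitted here.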
09:30 - 10:00 Coffee Break



10:00 - 10:50 [Keynote Talk] Reimagining Automated Urban Planning: A Generative AI Perspective

Speaker: Dr. Yanjie Fu, Associate Professor at Arizona State University

Abstract: Urban planning is an interdisciplinary and complex process that involves public policy, social science, engineering, architecture, landscape, and other related fields. Effective urban planning can help to mitigate the operational and social vulnerability of an urban system, such as high taxes, crime, traffic congestion and accidents, pollution, depression, and anxiety. In this talk, we will discuss two key research questions: (1) how can we quantify a land-use configuration plan? (2) how can we develop a machine learning framework that can learn the good and the bad of existing urban communities in terms of land-use configuration? In addition, we will introduce several technical frameworks (e.g., adversarial generative land use planning, conditional variational generative planning, and language-instructed deep hierarchical generative planning). Finally, we will discuss limitations and future work.
10:50 - 11:00 Multi-source data fusion for filling gaps in satellite Aerosol Optical Depth (AOD) using generative models

Anusha Srirenganathan Malarvizhi and Phoebe Pan

Abstract: Aerosol Optical Depth (AOD) is a crucial parameter for monitoring air quality, but satellite-based measurements often suffer from significant gaps due to cloud cover and other obstructions. These missing data points, often categorized as Missing Not At Random (MNAR), pose challenges for accurate air quality assessments. This study applies a Generative Adversarial Imputation Network (GAIN) to impute missing AOD data from the MODIS MAIAC dataset across the Northeast United States, addressing these challenges by leveraging relevant meteorological covariates, such as cloud cover, relative humidity, and temperature. The GAIN model was trained using data from 2021 to 2022, with hyperparameter tuning conducted to optimize performance. The tuning process revealed that a low learning rate and minimal weight decay yielded the most stable and accurate results. The model was validated against AERONET data, achieving a correlation coefficient (R) of 0.89, demonstrating strong alignment between imputed and observed AOD values. The GAIN model also demonstrated strong predictive accuracy, achieving an average R² of 0.94, MSE of 0.0046, and RMSE of 0.0676. Cross-validation confirmed the robustness and generalizability of the model across various datasets. The model’s performance was compared with traditional imputation methods like MICE and MissForest. GAIN outperformed both models, showing superior performance in handling MNAR data and minimizing error across all metrics. This comparative analysis emphasizes the GAIN model's ability to effectively capture complex spatial and temporal dependencies in the dataset. In addition to filling data gaps, the GAIN model preserved the spatial distribution of AOD, showing higher concentrations in urban areas and regions with elevated pollution. During the 2023 Canadian wildfire event, the model successfully imputed AOD levels, capturing the sharp rise in aerosol concentrations. 
This study demonstrates the effectiveness of GAIN in handling complex MNAR scenarios, offering a reliable solution for improving AOD data coverage and enhancing the accuracy of air quality assessments.
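The evaluation metrics named in this abstract (R, R², MSE, RMSE) can be computed from held-out observations and their imputed counterparts as in the sketch below. This is a generic illustration with made-up numbers, not the study's evaluation code; the function name `imputation_metrics` is hypothetical.

```python
import math

def imputation_metrics(observed, imputed):
    """Return MSE, RMSE, R^2, and Pearson R for paired observed/imputed values."""
    n = len(observed)
    mean_obs = sum(observed) / n
    mean_imp = sum(imputed) / n
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, imputed))  # residual sum of squares
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)            # spread of observations
    ss_imp = sum((p - mean_imp) ** 2 for p in imputed)             # spread of imputations
    cov = sum((o - mean_obs) * (p - mean_imp) for o, p in zip(observed, imputed))
    mse = ss_res / n
    return {
        "mse": mse,
        "rmse": math.sqrt(mse),
        "r2": 1.0 - ss_res / ss_tot,
        "r": cov / math.sqrt(ss_tot * ss_imp),
    }

# Made-up values standing in for held-out measurements and model imputations.
metrics = imputation_metrics([1.0, 2.0, 3.0, 4.0], [1.5, 2.0, 2.5, 4.0])
print(metrics)
```

For a single evaluation set, RMSE is the square root of MSE; averages taken over several folds need not satisfy that relation exactly.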
11:00 - 11:10 Convenience Store Geospatial Location Optimization Analytics Using Deep Reinforcement Learning

Shaohua Wang, Dachuan Xu, Junyuan Zhou, Cheng Su, Xiao Li, Xiaojian Liang, Chunxiang Cao, Chang Liu and Yang Zhong

Abstract: Convenience stores, as a rapidly developing new retail format, have a significant impact on both consumer convenience and the brand’s commercial profits and logistics costs. With the rise of online food delivery and other services, convenience stores need to optimize their location selection and spatial layout to meet modern consumers' demands and enhance market competitiveness. This paper takes the Everyday Chain as the research subject, analyzing its spatial distribution and influencing factors in the main urban area of Xi'an. A Maximum Coverage Location Model is adopted, combined with deep reinforcement learning and genetic algorithms for optimization. The study results indicate that deep reinforcement learning outperforms genetic algorithms in terms of solution efficiency and coverage performance, offering a new approach and reference for convenience store location optimization. This can better enhance the rationality of service layout and the market competitiveness of convenience stores.
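As background on the Maximum Coverage Location Model mentioned above: given candidate sites, the demand points each site can serve, and a budget of k sites, the goal is to pick the k sites that cover the most demand. The Python sketch below uses a simple greedy heuristic as a baseline. It is not the deep reinforcement learning or genetic algorithm approach from the paper, and the site names and demand weights are invented.

```python
def greedy_max_coverage(candidates, demand, k):
    """Pick up to k sites, each time adding the site with the largest newly covered demand.

    candidates: dict mapping site id -> set of demand point ids it can serve
    demand:     dict mapping demand point id -> weight (e.g., population)
    """
    chosen, covered = [], set()
    for _ in range(k):
        best_site, best_gain = None, 0
        for site, reach in candidates.items():
            if site in chosen:
                continue
            # Only demand not yet covered counts towards the gain.
            gain = sum(demand[d] for d in reach - covered)
            if gain > best_gain:
                best_site, best_gain = site, gain
        if best_site is None:  # no remaining site adds coverage
            break
        chosen.append(best_site)
        covered |= candidates[best_site]
    return chosen, sum(demand[d] for d in covered)

# Hypothetical instance: three candidate sites, four demand points.
demand = {"a": 10, "b": 5, "c": 8, "d": 2}
candidates = {"s1": {"a", "b"}, "s2": {"b", "c"}, "s3": {"c", "d"}}
print(greedy_max_coverage(candidates, demand, k=2))  # (['s1', 's3'], 25)
```

The greedy rule picks s1 first (covering 15), then s3 rather than s2, because s2 overlaps with s1 on demand point b and so adds less new coverage.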
11:10 - 11:20 Geospatial Optimization Analytics for Bubble Tea Shops Location Service

Jiayi Zheng, Shaohua Wang, Haojian Liang, Chunxiang Cao, Junyuan Zhou, Xiao Li, Min Xu, Xinwei Yang and Jiahui Ji

Abstract: With the rapid development of the bubble tea industry, optimizing the location selection of bubble tea shops has become crucial. Traditional location methods often oversimplify the use of population density data and neglect the overlap of service areas, potentially leading to resource waste or untapped market potential. This paper proposes a location optimization method for bubble tea shops based on the calculation of relative service population, utilizing Monte Carlo simulation and a genetic algorithm. By considering the overlap of service areas and the fine-grained distribution of population density, the method aims to improve service coverage and the scientific basis of location decisions. Using the area within Beijing's Fifth Ring Road as a case study, experimental results demonstrate that the proposed method effectively avoids redundant coverage of service areas and enhances the accuracy of evaluating potential service populations, providing valuable insights for practical location selection in the bubble tea industry.
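One way to see why service-area overlap matters is a Monte Carlo estimate of the population covered by the union of circular service areas, so that residents reachable from two shops are counted once. The Python sketch below illustrates that idea only. It is not the paper's formulation of relative service population, and the function name, density function, and parameters are assumptions.

```python
import math
import random

def covered_population(shops, radius, density, bounds, n_samples=20000, seed=0):
    """Monte Carlo estimate of population within `radius` of at least one shop.

    shops:   list of (x, y) shop locations
    density: function (x, y) -> people per unit area
    bounds:  (xmin, ymin, xmax, ymax) study area
    """
    xmin, ymin, xmax, ymax = bounds
    area = (xmax - xmin) * (ymax - ymin)
    rng = random.Random(seed)  # fixed seed for reproducible estimates
    total = 0.0
    for _ in range(n_samples):
        x, y = rng.uniform(xmin, xmax), rng.uniform(ymin, ymax)
        # A sample counts once, no matter how many service areas contain it.
        if any(math.hypot(x - sx, y - sy) <= radius for sx, sy in shops):
            total += density(x, y)
    return area * total / n_samples

def uniform_density(x, y):
    """Hypothetical flat density of 100 people per unit area."""
    return 100.0

bounds = (0.0, 0.0, 10.0, 10.0)
one_shop = covered_population([(5.0, 5.0)], 2.0, uniform_density, bounds)
two_colocated = covered_population([(5.0, 5.0), (5.0, 5.0)], 2.0, uniform_density, bounds)
print(one_shop, two_colocated)  # the two estimates are identical
```

A second shop placed at the same location adds nothing to the estimate, which is the redundant-coverage effect a location optimizer would want to penalize.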
11:20 - 12:00 [Panel Discussion] Academia and Industry in the Era of AI

Panelists: Dr. Hongxu Ma (Google), Dr. Yanjie Fu (Arizona State University), Dr. Heba Aly (Amazon), Dr. Jinmeng Rao (Google DeepMind)

Moderator: Dr. Yiqun Xie (University of Maryland)

Call For Papers

The workshop seeks high-quality regular (8-10 pages) and short (4 pages) papers that have not been published in other academic outlets and are not concurrently under peer review. Interested participants should submit a paper in the ACM format. Once a paper is accepted, at least one author is required to register for the workshop and the ACM SIGSPATIAL conference, and to attend the workshop to present the accepted work, which will then appear in the ACM Digital Library.

The topics include but are not limited to (in the context of industrial or related problems, such as delivery, routing, recommendation, mapping, resource allocation, and more):

  • Applications of AI

  • Big Data systems

  • Geospatial AI foundation models

  • Problems and benchmark datasets

  • Machine learning and deep learning

  • Computer vision and earth observation

  • Generative models and simulation

  • Map generation techniques

  • Heterogeneous data integration and analysis

  • Small data learning approaches

  • Citizen science and data collection

  • Spatial query processing

  • Spatial data management and integration

  • Ethical issues in geospatial data and research

  • Perspectives on the future of geospatial data and research

  • Emerging topics and trends

Important Dates

Submission deadline: September 14, 2024 (anywhere on earth)

Author notification: September 24, 2024 (anywhere on earth)

Workshop date: October 29, 2024 (ET)

Submission site: https://easychair.org/conferences/?conf=geoindustry2024