13 Using Google Earth Engine to Access Geospatial Data (Covariate Development)
December 17, 2024 Meeting Transcript Analysis
13.1 Part One
13.1.1 1. Initial Project Organization
- Created a Google Doc for prompt engineering work in the urban soil mapping shared folder
- Document named “prompt engineering” placed in shared project folder
- Discussed organizing project files within lab shared drive structure
13.1.2 2. QGIS Bounding Box Creation
- Created new project with following steps:
- Added XYZ tile layer for Google satellite imagery as base map
- Set coordinate system to NAD 83 UTM Zone 15N
- Added TCMA urban soil survey update draft AOI shapefile
- Created new shapefile for bounding box
- Used center and point method to create square boundary around the AOI
- Ensured coverage extended beyond immediate AOI
- Bounding Box Specifications:
- File named: “bounding-box-draft-v2-17December2024”
- Designed to have 2km overlap between tiles
- Configured to extend beyond final boundary limits
- Created in UTM coordinate system for consistent tile sizing
13.1.3 3. Google Earth Engine Setup
Registration Process:
- Accessed through umn.edu account
- Selected unpaid usage for education
- Project type: Research
- Created new Google Cloud project with Earth Engine integration
- Completed Earth Engine registration
13.1.4 4. Google Cloud Console Configuration
- Accessed console.cloud.google.com
- Created project under organizational structure
- Attempted to set up cloud storage buckets
- Note: Encountered payment verification issues that need resolution
13.1.5 5. MSI (Minnesota Supercomputing Institute) Environment
Jupyter Lab Setup:
- Accessed through interactive apps
- Selected Python 3.8.3 environment
- Created new project folder: “tcma_covar_development”
- Installed required package (e.g., geopandas)
13.1.6 Technical Considerations
- Need to ensure 2km overlap in tiles for processing
- File organization split between:
- Google Cloud Storage (for large datasets)
- MSI Tier 1 & 2 storage
- Google Drive (for documentation)
- Working environment includes:
- Google Earth Engine for satellite data
- MSI for computing
- QGIS for spatial analysis
- Python/Jupyter Lab for processing
13.2 Part Two
13.2.1 1. Google Cloud Project Setup
- Created access to project environment
- Added user as editor to existing project
- Confirmed access to cloud console
- Set up bucket structure for data storage
- Created Storage Structure:
- Created main bucket “tcma_covars”
- Created initial directory: “dem-3dep-1”
- Configured with multi-region storage
- Enabled standard storage settings
- Enforced public access hazard prevention
13.2.2 2. Jupyter Lab Environment Configuration
- Package Installation:
- Attempted Conda installation of GeoPandas
- Switched to Pip install when Conda had conflicts
- Successfully installed GeoPandas using Pip
- Added local bin directory to path for module access
- Environment Organization:
- Created clear file naming conventions
- Set up documentation of one-time setup steps
- Organized notebooks with clear prefixes for setup vs. regular usage
- Added annotations for process documentation
13.2.3 3. Google Earth Engine Setup
- Asset Management:
- Uploaded shapefile as Earth Engine asset
- Required coordinate system conversion to EPSG:4326
- Created project structure for scripts
- Set up folders for different data types
- Script Development:
- Created visualization script for tiles
- Set up data processing framework
- Implemented naming conventions for exported files
- Format: “tcma_3dep_1meter_tile_h[XX]v[YY]”
13.2.4 4. Data Processing Framework
- Initial Setup:
- Imported tiles asset
- Added 3DEP elevation data
- Configured NAIP imagery access
- Set up processing functions for tiles
- Key Components:
- Tile processing function
- Export naming convention setup
- Visualization parameters
- Data extraction framework
13.2.5 Technical Notes
- Modified tile size to handle ~1.5GB files instead of smaller segments
- Reduced number of tiles from 169 (13x13) to 49 (7x7)
- Implemented 2km overlap between tiles for processing continuity
- Configured proper coordinate system transformations
13.3 Part Three
13.3.1 1. Data Visualization Troubleshooting
- Identified Issues with Attribute Table:
- Found shapefile attribute table was not correctly formatted
- Made necessary adjustments to tile ID properties
- Added proper ID fields to enable processing
- Script Refinements:
- Modified JavaScript to handle tile naming conventions
- Adjusted bucket paths for correct data storage
- Set up proper directory structure for output files
13.3.2 2. Image Processing Configuration
- NAIP Imagery Setup:
- Configured date range for 2021 imagery
- Extended collection window to May-September
- Confirmed NAIP imagery is typically cloud-free
- Set up composite parameters
- Sentinel-2 Configuration:
- Set date range: 2019-2024
- Configured growing season window: June 1 - August 31
- Added cloud masking procedures
- Set up 5-year composite parameters
13.3.3 3. File Structure and Organization
- Google Cloud Storage:
- Created main bucket: “tcma_covars”
- Established directory structure:
- DEM (3DEP data)
- NAIP imagery
- Sentinel-2 data
- Set up proper naming conventions for outputs
- Processing Framework:
- Configured 10-meter resolution as standard output
- Aligned with G-SERGO grid specifications
- Set up downscaling procedures for higher resolution data
13.3.4 4. Next Steps and Action Items
- Immediate Tasks:
- Complete DEM processing for remaining tiles
- Set up NAIP processing for 2021
- Configure Sentinel-2 processing
- Document process for technical documentation
- Future Processing:
- Transfer processed data to MSI Tier 2
- Develop Python scripts for:
- Data alignment
- Resolution matching
- Stack creation
- Quality control
- Documentation Requirements:
- Create progress slides for project update
- Include screenshots of:
- Google Earth Engine interface
- Cloud console
- Processing results
- Document technical procedures
13.3.5 Technical Notes
- Resolution Standards:
- Final output will be 10-meter resolution
- Matching gSSURGO grid specifications
- Downscaling required for higher resolution inputs
- Interpolation needed for alignment
- Processing Considerations:
- Cloud-free compositing for Sentinel data
- Scene classification layer (SCL) for cloud masking
- Median value computation for composites
- Spatial alignment requirements