Data Sources for EEMT Calculations

Overview

EEMT calculations require high-quality elevation and climate data. This guide covers how to access publicly available datasets suited to Critical Zone analysis.

Quick Reference

Data Type    Source           Resolution      Coverage
---------    --------------   -------------   -------------
Elevation    USGS 3DEP        1m, 10m, 30m    United States
             OpenTopography   Variable        Global
             FABDEM           30m             Global
Climate      DAYMET           1km daily       North America
             PRISM            800m monthly    United States
             GridMET          4km daily       United States
Satellite    Landsat          30m             Global
             MODIS            250m-1km        Global

API access methods for each source are covered in the sections below.

Elevation Data

USGS 3DEP (3D Elevation Program)

Best for: High-resolution analysis in the United States

Access Methods

1. National Map Downloader

# Direct download interface
https://apps.nationalmap.gov/downloader/

# Select area of interest, choose elevation products:
# - 1m DEM (lidar-derived, best quality)
# - 1/3 arc-second (~10m, good coverage) 
# - 1 arc-second (~30m, complete coverage)
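
For scripted discovery, The National Map also exposes a product-search API. A minimal sketch (the dataset name string and response fields are assumptions to verify against the TNM Access documentation):

import requests

# Query the TNM Access API for 1/3 arc-second DEM products over a bbox.
params = {
    'datasets': 'National Elevation Dataset (NED) 1/3 arc-second',  # assumed dataset label
    'bbox': '-111.5,32.0,-110.5,32.5',   # west,south,east,north
    'prodFormats': 'GeoTIFF',
}
resp = requests.get('https://tnmaccess.nationalmap.gov/api/v1/products', params=params)
resp.raise_for_status()

# Each returned item should carry a direct download link
for item in resp.json().get('items', []):
    print(item.get('downloadURL'))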

2. USGS API Access

import requests

def download_3dep(bbox, resolution='10m', api_key='YOUR_API_KEY'):
    """
    Download USGS 3DEP elevation data via OpenTopography's USGS DEM API

    Parameters:
    bbox: [west, south, east, north] in decimal degrees
    resolution: '1m', '10m', or '30m'
    api_key: free OpenTopography API key (register at opentopography.org)
    """

    # OpenTopography dataset names for the hosted 3DEP products;
    # verify against the current API documentation
    datasets = {'1m': 'USGS1m', '10m': 'USGS10m', '30m': 'USGS30m'}

    params = {
        'datasetName': datasets[resolution],
        'west': bbox[0],
        'south': bbox[1],
        'east': bbox[2],
        'north': bbox[3],
        'outputFormat': 'GTiff',
        'API_Key': api_key,
    }

    response = requests.get("https://portal.opentopography.org/API/usgsdem", params=params)
    response.raise_for_status()

    out_file = f'usgs3dep_{resolution}.tif'
    with open(out_file, 'wb') as f:
        f.write(response.content)
    return out_file

# Example usage
arizona_bbox = [-111.5, 32.0, -110.5, 32.5]
dem_file = download_3dep(arizona_bbox, '10m')

3. Cloud-Optimized Access

import rasterio
from rasterio.session import AWSSession

# Access 3DEP COGs on the AWS Open Data bucket (public, so unsigned
# requests work without credentials)
aws_session = AWSSession(aws_unsigned=True)
with rasterio.Env(session=aws_session):
    with rasterio.open('s3://prd-tnm/StagedProducts/Elevation/1m/Projects/...') as src:
        dem_data = src.read(1)
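
The advantage of COGs is ranged reads: rasterio can fetch just the study-area window instead of the whole tile. A sketch, assuming a known COG path and bounds expressed in the dataset's CRS (both are placeholders here):

from rasterio.windows import from_bounds

# 1m 3DEP tiles are typically in a projected UTM system, so the window
# bounds below are illustrative projected coordinates, not degrees
cog_url = 's3://prd-tnm/StagedProducts/Elevation/1m/Projects/...'  # placeholder path
with rasterio.Env(session=aws_session):
    with rasterio.open(cog_url) as src:
        window = from_bounds(504000, 3560000, 506000, 3562000, src.transform)
        dem_subset = src.read(1, window=window)  # reads only the window's bytes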

OpenTopography

Best for: Global coverage, lidar access, research applications

Web Interface

# OpenTopography Portal
https://portal.opentopography.org/

# Select dataset:
# - SRTM GL1 (30m global)
# - SRTM GL3 (90m global)  
# - ALOS World 3D (30m global)
# - Regional lidar datasets

API Access

import requests

def download_opentopo(bbox, dem_type='SRTMGL1', api_key='YOUR_API_KEY'):
    """
    Download DEM from the OpenTopography global DEM API

    Parameters:
    bbox: [west, south, east, north] in decimal degrees
    dem_type: 'SRTMGL1', 'SRTMGL3', 'AW3D30', 'COP30', 'COP90'
    api_key: free OpenTopography API key (register at opentopography.org)
    """

    base_url = "https://portal.opentopography.org/API/globaldem"

    params = {
        'demtype': dem_type,
        'west': bbox[0],
        'south': bbox[1],
        'east': bbox[2],
        'north': bbox[3],
        'outputFormat': 'GTiff',
        'API_Key': api_key,
    }

    response = requests.get(base_url, params=params)

    if response.status_code == 200:
        with open(f'{dem_type}_dem.tif', 'wb') as f:
            f.write(response.content)
        return f'{dem_type}_dem.tif'
    else:
        raise Exception(f"Download failed: {response.status_code}")

# Example usage
bbox = [-111.5, 32.0, -110.5, 32.5]
dem_file = download_opentopo(bbox, 'SRTMGL1')

Lidar Point Cloud Access

import pdal
import json

# Stream lidar points from an Entwine Point Tile (EPT) resource hosted
# by OpenTopography. Note: EPT bounds are expressed in the resource's
# native CRS (often a projected system, not degrees) and must fall
# inside the dataset's extent -- the values below are placeholders.
pipeline = {
    "pipeline": [
        {
            "type": "readers.ept",
            "filename": "https://cloud.sdsc.edu/v1/AUTH_opentopography/PC/CA_FullState_2019/ept.json",
            # ([xmin, xmax], [ymin, ymax]) in the resource's CRS
            "bounds": "([-111.5, -111.0], [32.0, 32.5])"
        },
        {
            "type": "writers.las",
            "filename": "output.las"
        }
    ]
}

pdal_pipeline = pdal.Pipeline(json.dumps(pipeline))
pdal_pipeline.execute()
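
EEMT work usually needs a bare-earth DEM rather than raw points. A follow-on pipeline sketch that keeps ground-classified returns (ASPRS class 2) and grids them with PDAL's writers.gdal (the resolution and interpolation choices are illustrative):

dem_pipeline = {
    "pipeline": [
        {
            "type": "readers.las",
            "filename": "output.las"
        },
        {
            # Keep only ground-classified returns (ASPRS class 2)
            "type": "filters.range",
            "limits": "Classification[2:2]"
        },
        {
            # Grid the ground points into a 1 m bare-earth DTM
            "type": "writers.gdal",
            "filename": "dtm.tif",
            "resolution": 1.0,
            "output_type": "mean",
            "gdaldriver": "GTiff"
        }
    ]
}

pdal.Pipeline(json.dumps(dem_pipeline)).execute()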

Global DEMs

FABDEM (Forest And Buildings removed Copernicus DEM)

# Access via Google Earth Engine (community-maintained asset from the
# awesome-gee-community-catalog; the asset is an ImageCollection of tiles)
import ee

ee.Initialize()

# Load FABDEM and mosaic the tiles into a single image
fabdem = ee.ImageCollection("projects/sat-io/open-datasets/FABDEM").mosaic()

# Export specific region
task = ee.batch.Export.image.toDrive(
    image=fabdem,
    description='FABDEM_export',
    folder='EarthEngine',
    scale=30,
    region=ee.Geometry.Rectangle([-111.5, 32.0, -110.5, 32.5]),
    crs='EPSG:4326'
)
task.start()

Copernicus DEM

# Download via Copernicus Data Space
# https://dataspace.copernicus.eu/

# Available resolutions:
# - COP-DEM 30m (global)
# - COP-DEM 90m (global)

# API access requires registration
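
OpenTopography also mirrors the Copernicus DEM, so the download_opentopo helper above can fetch COP30/COP90 without a separate Copernicus account:

# Reuse the OpenTopography helper defined earlier
cop_dem = download_opentopo([-111.5, 32.0, -110.5, 32.5], 'COP30')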

Climate Data

DAYMET (Daily Surface Weather Data)

Best for: High-resolution daily meteorology across North America

Variables Available

  • tmin: Daily minimum temperature (°C)
  • tmax: Daily maximum temperature (°C)
  • prcp: Daily precipitation (mm/day)
  • srad: Shortwave radiation (W/m²)
  • vp: Water vapor pressure (Pa)
  • swe: Snow water equivalent (kg/m²)
  • dayl: Day length (s/day)

API Access

import requests
import xarray as xr

def download_daymet(lat, lon, start_year, end_year, variables=['tmin', 'tmax', 'prcp']):
    """
    Download DAYMET data for a point location

    Parameters:
    lat, lon: coordinates in decimal degrees
    start_year, end_year: year range
    variables: list of variable names
    """

    base_url = "https://daymet.ornl.gov/single-pixel/api/data"

    params = {
        'lat': lat,
        'lon': lon,
        'vars': ','.join(variables),
        # The API expects full dates, not bare years
        'start': f'{start_year}-01-01',
        'end': f'{end_year}-12-31',
        'format': 'json'
    }

    response = requests.get(base_url, params=params)
    response.raise_for_status()

    return response.json()

# Example usage
tucson_data = download_daymet(32.25, -110.97, 2015, 2020,
                              ['tmin', 'tmax', 'prcp', 'vp'])
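
The JSON payload is easier to analyze as a table. A sketch assuming the documented single-pixel layout, where the 'data' field maps column labels such as 'prcp (mm/day)' to equal-length lists (inspect the response to confirm the labels):

import pandas as pd

# Tabulate the response; column labels are assumptions to verify
df = pd.DataFrame(tucson_data['data'])

# Build a date index from year and day-of-year
df['date'] = (pd.to_datetime(df['year'].astype(str), format='%Y')
              + pd.to_timedelta(df['yday'] - 1, unit='D'))

# Example aggregate: annual precipitation totals
annual_precip = df.groupby('year')['prcp (mm/day)'].sum()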

Spatial Data Access

from pyproj import Transformer

def download_daymet_spatial(bbox, year, variable='tmin'):
    """
    Download DAYMET spatial data via THREDDS

    Parameters:
    bbox: [west, south, east, north] in decimal degrees
    year: data year
    variable: 'tmin', 'tmax', 'prcp', 'srad', 'vp', 'swe', 'dayl'
    """

    # ornldaac/1840 is the Daymet v4 daily catalog; the ID changes with
    # each Daymet version, so check the ORNL DAAC THREDDS catalog
    base_url = "https://thredds.daac.ornl.gov/thredds/dodsC/ornldaac/1840"
    nc_url = f"{base_url}/daymet_v4_daily_na_{variable}_{year}.nc"

    # Open remotely with xarray (OPeNDAP)
    ds = xr.open_dataset(nc_url)

    # Daymet's x/y coordinates are meters on a Lambert Conformal Conic
    # grid, so project the lat/lon bbox before subsetting
    daymet_crs = ("+proj=lcc +lat_1=25 +lat_2=60 +lat_0=42.5 +lon_0=-100 "
                  "+x_0=0 +y_0=0 +ellps=WGS84 +units=m")
    transformer = Transformer.from_crs("EPSG:4326", daymet_crs, always_xy=True)
    xmin, ymin = transformer.transform(bbox[0], bbox[1])
    xmax, ymax = transformer.transform(bbox[2], bbox[3])

    # y runs north to south in the files, hence slice(ymax, ymin)
    subset = ds.sel(x=slice(xmin, xmax), y=slice(ymax, ymin))

    return subset

# Example usage
bbox = [-111.5, 32.0, -110.5, 32.5]
temp_data = download_daymet_spatial(bbox, 2020, 'tmin')

Bulk Download Script

import os
from concurrent.futures import ThreadPoolExecutor
import subprocess

def download_daymet_bulk(years, variables, output_dir):
    """
    Bulk download DAYMET files using wget.
    Note: each file covers all of North America; subset to the study
    area after download.
    """

    os.makedirs(output_dir, exist_ok=True)

    def download_file(url, filename):
        # List form avoids shell quoting issues
        subprocess.run(['wget', '-O', f'{output_dir}/{filename}', url], check=True)

    download_tasks = []

    for year in years:
        for var in variables:
            url = f"https://thredds.daac.ornl.gov/thredds/fileServer/ornldaac/1840/daymet_v4_daily_na_{var}_{year}.nc"
            filename = f"daymet_{var}_{year}.nc"
            download_tasks.append((url, filename))

    # Parallel downloads (keep the worker count modest to be polite to the server)
    with ThreadPoolExecutor(max_workers=4) as executor:
        for url, filename in download_tasks:
            executor.submit(download_file, url, filename)

# Example usage
years = range(2015, 2021)
variables = ['tmin', 'tmax', 'prcp', 'vp']
download_daymet_bulk(years, variables, 'daymet_data')

PRISM (Parameter-elevation Regressions on Independent Slopes Model)

Best for: High-resolution monthly climate normals for the United States

Access Methods

def download_prism(bbox, year, variable='ppt', temporal='monthly'):
    """
    Print download instructions for PRISM data

    Parameters:
    bbox: [west, south, east, north]
    year: data year or 'normals' for 30-year averages
    variable: 'ppt', 'tmin', 'tmax', 'tmean', 'tdmean', 'vpdmin', 'vpdmax'
    temporal: 'monthly', 'daily', 'annual'
    """

    if year == 'normals':
        base_url = "https://prism.oregonstate.edu/normals/"
    else:
        base_url = "https://prism.oregonstate.edu/recent_years/"

    # PRISM grids are distributed for the full CONUS; subset spatially
    # after download. Bulk files: https://ftp.prism.oregonstate.edu/

    print("PRISM data download instructions:")
    print(f"1. Visit: {base_url}")
    print(f"2. Download {temporal} {variable} data for {year}")
    print("3. Use gdal_translate to subset:")
    print(f"   gdal_translate -projwin {bbox[0]} {bbox[3]} {bbox[2]} {bbox[1]} input.bil output.tif")

# Example usage
download_prism([-111.5, 32.0, -110.5, 32.5], 2020, 'ppt')
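
PRISM also operates a simple web service that returns zipped grids per element and date. A minimal sketch based on the PRISM Web Service Guide (verify the current URL pattern and usage terms before bulk scripting):

import requests

# PRISM web service: /4km/<element>/<date>, where <date> is YYYY (annual),
# YYYYMM (monthly), or YYYYMMDD (daily). Returns a zip of BIL grids.
url = "https://services.nacse.org/prism/data/public/4km/ppt/202006"
response = requests.get(url)
response.raise_for_status()

with open('prism_ppt_202006.zip', 'wb') as f:
    f.write(response.content)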

GridMET

Best for: Daily meteorological data with broader spatial coverage than DAYMET

import xarray as xr

def download_gridmet(bbox, year, variable='pr'):
    """
    Download GridMET data via THREDDS

    Parameters:
    bbox: [west, south, east, north] in decimal degrees
    year: data year
    variable: 'pr', 'tmmn', 'tmmx', 'rmax', 'rmin', 'vs', 'th', 'pet', 'erc', 'bi', 'fm100', 'fm1000'
    """

    base_url = "http://thredds.northwestknowledge.net:8080/thredds/dodsC/MET"
    nc_url = f"{base_url}/{variable}/{variable}_{year}.nc"

    # Open dataset remotely (OPeNDAP)
    ds = xr.open_dataset(nc_url)

    # GridMET latitudes are stored north-to-south (descending), so the
    # lat slice must run from the northern edge to the southern edge
    subset = ds.sel(
        lon=slice(bbox[0], bbox[2]),
        lat=slice(bbox[3], bbox[1])
    )

    return subset

# Example usage
bbox = [-111.5, 32.0, -110.5, 32.5]
precip_data = download_gridmet(bbox, 2020, 'pr')
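
EEMT formulations typically consume monthly values, so a common next step is aggregating the daily slices. A sketch assuming gridMET's 'day' time coordinate and the 'precipitation_amount' variable name used in the pr files (inspect the dataset if yours differ):

# Aggregate daily precipitation to monthly totals ('1MS' bins by
# calendar month, labeled at the month start)
monthly_pr = precip_data['precipitation_amount'].resample(day='1MS').sum()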

Satellite Data

Landsat (Vegetation Indices)

Best for: Long-term vegetation monitoring, NDVI calculation

Google Earth Engine Access

import ee

ee.Initialize()

def get_landsat_collection(bbox, start_date, end_date, cloud_cover=20):
    """
    Get Landsat collection with cloud masking
    """

    # Define area of interest
    aoi = ee.Geometry.Rectangle(bbox)

    # Get Landsat 8 collection
    collection = ee.ImageCollection('LANDSAT/LC08/C02/T1_L2') \
        .filterBounds(aoi) \
        .filterDate(start_date, end_date) \
        .filter(ee.Filter.lt('CLOUD_COVER', cloud_cover))

    def mask_clouds(image):
        # Cloud mask using the QA_PIXEL band (bit 3 = cloud)
        qa = image.select('QA_PIXEL')
        cloud_mask = qa.bitwiseAnd(1 << 3).eq(0)
        # Apply the Collection 2 scale factors so band math such as NDVI
        # operates on surface reflectance rather than raw integer DNs
        optical = image.select('SR_B.').multiply(0.0000275).add(-0.2)
        return image.addBands(optical, None, True).updateMask(cloud_mask)

    # Apply cloud masking and scaling
    masked_collection = collection.map(mask_clouds)

    return masked_collection

def calculate_ndvi(image):
    """Calculate NDVI from Landsat image"""
    ndvi = image.normalizedDifference(['SR_B5', 'SR_B4']).rename('NDVI')
    return image.addBands(ndvi)

# Example usage
bbox = [-111.5, 32.0, -110.5, 32.5]
collection = get_landsat_collection(bbox, '2020-01-01', '2020-12-31')
ndvi_collection = collection.map(calculate_ndvi)

# Get median NDVI for year
median_ndvi = ndvi_collection.select('NDVI').median()

Direct Archive Access

import pystac_client
import rioxarray

# Access Landsat via STAC
catalog = pystac_client.Client.open("https://landsatlook.usgs.gov/stac-server")

# Search for Landsat scenes
search = catalog.search(
    collections=["landsat-c2l2-sr"],
    bbox=[-111.5, 32.0, -110.5, 32.5],
    datetime="2020-06-01/2020-08-31",
    query={"eo:cloud_cover": {"lt": 20}}
)

# Download and process
items = list(search.items())
for item in items[:5]:  # First 5 scenes
    red_asset = item.assets['red']
    nir_asset = item.assets['nir08']

    # Lazily load bands as dask-backed arrays
    red = rioxarray.open_rasterio(red_asset.href, chunks=True)
    nir = rioxarray.open_rasterio(nir_asset.href, chunks=True)

    # Convert the uint16 DNs to reflectance before band math; otherwise
    # the unsigned subtraction in (nir - red) overflows
    red = red.astype('float32') * 0.0000275 - 0.2
    nir = nir.astype('float32') * 0.0000275 - 0.2

    # Calculate NDVI
    ndvi = (nir - red) / (nir + red)

MODIS Products

Best for: Global vegetation monitoring, LAI/FPAR products

from pyhdf.SD import SD, SDC
import numpy as np

def download_modis_lai(bbox, year, product='MCD15A3H'):
    """
    Print download instructions for a MODIS LAI product

    Parameters:
    bbox: [west, south, east, north]
    year: data year
    product: 'MOD15A2H' (Terra, 8-day), 'MYD15A2H' (Aqua, 8-day),
             or 'MCD15A3H' (combined Terra+Aqua, 4-day)
    """

    # MODIS data requires a free Earthdata account:
    # https://urs.earthdata.nasa.gov/

    # Archive folders: MOLT (Terra), MOLA (Aqua), MOTA (combined)
    base_url = "https://e4ftl01.cr.usgs.gov/MOTA/"

    print(f"MODIS {product} download:")
    print("1. Register at: https://urs.earthdata.nasa.gov/")
    print("2. Use wget with credentials:")
    print(f"   wget --user=USERNAME --password=PASSWORD {base_url}{product}.061/")
    print("3. Process HDF files to extract the LAI layer")

# Process MODIS HDF file
def extract_modis_lai(hdf_file):
    """Extract LAI from MODIS HDF file"""

    # Open HDF file
    hdf = SD(hdf_file, SDC.READ)

    # Get LAI dataset
    lai_dataset = hdf.select('Lai_500m')
    lai_data = lai_dataset.get()

    # Apply scale factor and mask fill/flag values
    scale_factor = lai_dataset.attributes()['scale_factor']
    fill_value = lai_dataset.attributes()['_FillValue']

    lai_data = lai_data.astype(np.float32)
    lai_data[lai_data == fill_value] = np.nan
    lai_data[lai_data > 100] = np.nan  # values above 100 are QC flag codes; valid LAI is 0-100 before scaling
    lai_data = lai_data * scale_factor

    return lai_data
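
For scripted MODIS retrieval, NASA's earthaccess library wraps Earthdata login and granule search. A sketch assuming an Earthdata account and pip install earthaccess:

import earthaccess

# Authenticate against Earthdata (prompts for credentials or uses ~/.netrc)
earthaccess.login()

# Find MCD15A3H granules over the study area and download them
results = earthaccess.search_data(
    short_name='MCD15A3H',
    bounding_box=(-111.5, 32.0, -110.5, 32.5),  # west, south, east, north
    temporal=('2020-01-01', '2020-12-31'),
)
files = earthaccess.download(results, 'modis_data')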

Data Integration Workflows

Automated Climate Data Download

import os
import subprocess
from datetime import datetime, timedelta

class ClimateDataManager:
    """Automated climate data download and processing"""

    def __init__(self, study_area, output_dir):
        self.bbox = study_area  # [west, south, east, north]
        self.output_dir = output_dir
        os.makedirs(output_dir, exist_ok=True)

    def download_daymet_timeseries(self, start_year, end_year, variables):
        """Download DAYMET time series for study area"""

        for year in range(start_year, end_year + 1):
            for var in variables:
                url = f"https://thredds.daac.ornl.gov/thredds/fileServer/ornldaac/1328/daymet_v4_daily_na_{var}_{year}.nc"
                output_file = f"{self.output_dir}/daymet_{var}_{year}.nc"

                if not os.path.exists(output_file):
                    print(f"Downloading {var} for {year}...")
                    subprocess.run([
                        'wget', '-O', output_file, url
                    ], check=True)

    def process_to_monthly(self, variables):
        """Convert daily DAYMET to monthly averages"""

        for var in variables:
            # Use CDO (Climate Data Operators) if available
            input_pattern = f"{self.output_dir}/daymet_{var}_*.nc"
            output_file = f"{self.output_dir}/daymet_{var}_monthly.nc"

            cmd = f"cdo -monmean -mergetime {input_pattern} {output_file}"
            print(f"Processing {var} to monthly: {cmd}")
            # subprocess.run(cmd, shell=True, check=True)

# Example usage
study_area = [-111.5, 32.0, -110.5, 32.5]  # Arizona region
climate_manager = ClimateDataManager(study_area, 'climate_data')

# Download 5 years of data
climate_manager.download_daymet_timeseries(2015, 2020, ['tmin', 'tmax', 'prcp', 'vp'])
climate_manager.process_to_monthly(['tmin', 'tmax', 'prcp', 'vp'])

Data Quality Control

import os
import rasterio
import numpy as np

def validate_dataset(filepath, data_type='elevation'):
    """Validate a downloaded dataset for quality issues"""

    with rasterio.open(filepath) as src:
        data = src.read(1).astype('float32')
        profile = src.profile

        # Map the nodata sentinel to NaN so the np.nan* statistics
        # below ignore missing pixels
        nodata = profile.get('nodata')
        if nodata is not None:
            data[data == nodata] = np.nan

        results = {
            'file': filepath,
            'crs': profile['crs'],
            'shape': data.shape,
            'dtype': profile['dtype'],
            'nodata': nodata,
            'valid_pixels': int(np.sum(~np.isnan(data))),
            'coverage_pct': np.sum(~np.isnan(data)) / data.size * 100
        }

        if data_type == 'elevation':
            results.update({
                'elevation_min': np.nanmin(data),
                'elevation_max': np.nanmax(data),
                'elevation_mean': np.nanmean(data),
                'has_negative_elev': np.any(data < -500),  # Flag suspicious values
            })
        elif data_type == 'temperature':
            valid = data[~np.isnan(data)]
            results.update({
                'temp_min': np.nanmin(data),
                'temp_max': np.nanmax(data),
                'temp_mean': np.nanmean(data),
                'realistic_range': bool(np.all((valid >= -50) & (valid <= 60)))  # Celsius range
            })
        elif data_type == 'precipitation':
            valid = data[~np.isnan(data)]
            results.update({
                'precip_min': np.nanmin(data),
                'precip_max': np.nanmax(data),
                'precip_mean': np.nanmean(data),
                'no_negative': bool(np.all(valid >= 0))  # Precipitation can't be negative
            })

        return results

# Validate all datasets
datasets = [
    ('dem.tif', 'elevation'),
    ('temperature.tif', 'temperature'), 
    ('precipitation.tif', 'precipitation')
]

for filepath, data_type in datasets:
    if os.path.exists(filepath):
        validation = validate_dataset(filepath, data_type)
        print(f"\n{filepath} validation:")
        for key, value in validation.items():
            print(f"  {key}: {value}")

Best Practices

Data Selection Guidelines

  1. Match spatial resolutions - Use consistent pixel sizes across datasets
  2. Align temporal periods - Ensure climate data covers the same time period
  3. Check coordinate systems - Reproject data to a common CRS if needed (see the sketch after this list)
  4. Validate extents - Confirm all datasets cover your study area completely
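
A sketch of guideline 3 using rioxarray's reproject_match, which snaps one raster onto another's CRS, resolution, and extent in a single call (file names reuse the validation example below):

import rioxarray

# Align the climate raster to the DEM's grid (CRS, resolution, extent)
dem = rioxarray.open_rasterio('dem.tif')
temp = rioxarray.open_rasterio('temperature.tif')
temp_aligned = temp.rio.reproject_match(dem)
temp_aligned.rio.to_raster('temperature_aligned.tif')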

Storage and Organization

project/
├── data/
│   ├── elevation/
│   │   ├── raw/           # Original downloads
│   │   ├── processed/     # Reprojected, clipped
│   │   └── metadata/      # Data provenance
│   ├── climate/
│   │   ├── daymet/        # Daily meteorology  
│   │   ├── prism/         # Monthly normals
│   │   └── processed/     # Analysis-ready
│   └── satellite/
│       ├── landsat/       # Vegetation indices
│       └── modis/         # LAI products
├── scripts/               # Download automation
└── docs/                  # Data documentation
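
A few lines of Python can create this layout reproducibly:

import os

subdirs = [
    'data/elevation/raw', 'data/elevation/processed', 'data/elevation/metadata',
    'data/climate/daymet', 'data/climate/prism', 'data/climate/processed',
    'data/satellite/landsat', 'data/satellite/modis',
    'scripts', 'docs',
]
for sub in subdirs:
    os.makedirs(os.path.join('project', sub), exist_ok=True)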

Performance Tips

  • Use parallel downloads for large datasets
  • Compress intermediate files (LZW compression; see the sketch after this list)
  • Tile large rasters for memory-efficient processing
  • Cache frequently used data locally
  • Document data provenance for reproducibility
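
A sketch of the compression and tiling tips using rasterio:

import rasterio

# Rewrite a raster as an LZW-compressed, internally tiled GeoTIFF
with rasterio.open('dem.tif') as src:
    profile = src.profile
    profile.update(compress='lzw', tiled=True, blockxsize=512, blockysize=512)
    with rasterio.open('dem_tiled.tif', 'w', **profile) as dst:
        dst.write(src.read())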

Next: GRASS GIS Setup and Configuration