Skip to content

STAC API

OpenStack Virtual Machines

We are currently running two instances on CyVerse OpenStack Cloud

https://tombstone-cloud.cyverse.org/


One instance is running a Radiant Earth stac-fastapi STAC API

This vm is called stac-api and is served to https://stac.cyverse.org

It is a small instance (2 virtual CPUs, 16 GB RAM) with Ubuntu 22.04, Docker, and Docker-Compose.


The other instance is running DevSeed TiTiler

This vm is called titiler and is served at https://titiler.cyverse.org

For this we are running a xl instance (16-cores, 64 GB RAM, 200 GiB Disk ) with Ubuntu 22.04 and Docker


Launch using OpenStack

Log into OpenStack and provision each instance

After the instance is active, assign a floating IP address

Make sure that the default Security Group includes egress and ingress settings to connect the VM over :443

create and add ssh keys

Make sure that the VMs are using your public ssh key

Add your other admin keys by ssh to the VM

copy their id_rsa.pub keys to ~/.ssh/known_hosts

nano ~/.ssh/known_hosts


Deployments

Install Docker

If the image does not have Docker, install it.

sudo apt update
sudo apt install docker.io

Add the ubuntu user or your username to the Docker group

sudo groupadd docker
sudo usermod -aG docker $USER

Close your connection and reboot the instance.

Install docker-compose

sudo curl -L "https://github.com/docker/compose/releases/download/1.29.2/docker-compose-$(uname -s)-$(uname -m)" -o /usr/local/bin/docker-compose

sudo chmod +x /usr/local/bin/docker-compose

Install CaddyServer

To secure both instances over https:// we are runninng CaddyServer with a reverse proxy to the public IP addresses.

sudo apt install -y debian-keyring debian-archive-keyring apt-transport-https
curl -1sLf 'https://dl.cloudsmith.io/public/caddy/stable/gpg.key' | sudo gpg --dearmor -o /usr/share/keyrings/caddy-stable-archive-keyring.gpg
curl -1sLf 'https://dl.cloudsmith.io/public/caddy/stable/debian.deb.txt' | sudo tee /etc/apt/sources.list.d/caddy-stable.list
sudo apt update
sudo apt install caddy

Suggest using a tmux session for terminal.

usage Apple Keyboard Linux keyboard
start tmux tmux tmux
detach terminal hold ^ control + B, then D hold Ctrl + B, then D
find session tmux ls same
re-attach session tmux attach -t <session #> same
split-screen vertical hold ^ control + B, then hold shift + % hold Ctrl + B, then hold shift + %
split-screen horizontal hold ^ control + B, then hold shift + " hold Ctrl + B, then hold shift + "
Switch pane hold ^ control + B, then arrow key (left or right, up or down) hold Ctrl + B, then arrow key (left or right, up or down)
Close the session hold ^ control + B, then X hold Ctrl + B, then X

Ctrl + b + % to split the current pane vertically. Ctrl + b + " to split the current pane horizontally. Ctrl + b + x to close the current pane.


Run STAC API

stac-utils/stac-fastapi

git clone https://github.com/stac-utils/stac-fastapi.git

cd stac-fastapi

There are two configuration files which need to be updated:

docker-compose.yaml - provisions the deployment of a PostgreSQL database and SQL lchemy Python SQL toolkit and object relational mapper mapped to a STAC .json Collection and .geojson Feature Collection.

/scripts/ingest_collection.py - there is a sample .py ingestion file called "joplin" which you can make a copy of and edit.

There is one sample STAC catalog directory which can be removed or used for testing:

/stac_fastapi/testdata/joplin - is the test dataset which is deployed by default.

Edit docker-compose.yml

The stac-fastapi Docker-Compose will start up multiple containers, including a database which relies on the sample data.

Note: the GitHub repository for stac-fastapi expects containers from GitHub Container Registry, not from DockerHub - update the docker-compose.yml to use the specific containers and tag version

docker-compose.yml
version: '3'
services:
  app-sqlalchemy:
    container_name: stac-fastapi-sqlalchemy
    image: ghcr.io/stac-utils/stac-fastapi:main-sqlalchemy
    build:
      context: .
      dockerfile: docker/Dockerfile
    platform: linux/amd64
    environment:
      - APP_HOST=0.0.0.0
      - APP_PORT=8081
      - RELOAD=true
      - ENVIRONMENT=local
      - POSTGRES_USER=username
      - POSTGRES_PASS=password
      - POSTGRES_DBNAME=postgis
      - POSTGRES_HOST_READER=database
      - POSTGRES_HOST_WRITER=database
      - POSTGRES_PORT=5432
      - WEB_CONCURRENCY=10
    ports:
      - "8081:8081"
    volumes:
      - ../cyverse-stac:/app/cyverse-stac
      - ./stac_fastapi:/app/stac_fastapi
      - ./scripts:/app/scripts
    depends_on:
      - database
    command: bash -c "./scripts/wait-for-it.sh database:5432 && python -m stac_fastapi.sqlalchemy.app"

  app-pgstac:
    container_name: stac-fastapi-pgstac
    image: ghcr.io/stac-utils/stac-fastapi:main-pgstac
    platform: linux/amd64
    environment:
      - APP_HOST=0.0.0.0
      - APP_PORT=8082
      - RELOAD=true
      - ENVIRONMENT=local
      - POSTGRES_USER=username
      - POSTGRES_PASS=password
      - POSTGRES_DBNAME=postgis
      - POSTGRES_HOST_READER=database
      - POSTGRES_HOST_WRITER=database
      - POSTGRES_PORT=5432
      - WEB_CONCURRENCY=10
      - VSI_CACHE=TRUE
      - GDAL_HTTP_MERGE_CONSECUTIVE_RANGES=YES
      - GDAL_DISABLE_READDIR_ON_OPEN=EMPTY_DIR
      - DB_MIN_CONN_SIZE=1
      - DB_MAX_CONN_SIZE=1
      - USE_API_HYDRATE=${USE_API_HYDRATE:-false}
    ports:
      - "8082:8082"
    volumes:
      - ../cyverse-stac:/app/cyverse-stac
      - ./stac_fastapi:/app/stac_fastapi
      - ./scripts:/app/scripts
    depends_on:
      - database
    command: bash -c "./scripts/wait-for-it.sh database:5432 && python -m stac_fastapi.pgstac.app"

  database:
    container_name: stac-db
    image: ghcr.io/stac-utils/pgstac:v0.6.12
    environment:
      - POSTGRES_USER=username
      - POSTGRES_PASSWORD=password
      - POSTGRES_DB=postgis
      - PGUSER=username
      - PGPASSWORD=password
      - PGHOST=localhost
      - PGDATABASE=postgis
    ports:
      - "5439:5432"
    command: postgres -N 500


##################################
#
# CyVerse Data Testing Stuff
#
##################################

# Load cyverse demo dataset into the SQLAlchemy Application
  loadcyverse-sqlalchemy:
    image: ghcr.io/stac-utils/stac-fastapi:main-sqlalchemy
    environment:
      - ENVIRONMENT=development
      - POSTGRES_USER=username
      - POSTGRES_PASS=password
      - POSTGRES_DBNAME=postgis
      - POSTGRES_HOST=database
      - POSTGRES_PORT=5432
    volumes:
      - ../cyverse-stac:/app/cyverse-stac
      - ./stac_fastapi:/app/stac_fastapi
      - ./scripts:/app/scripts
    command: >
      bash -c "/app/scripts/wait-for-it.sh app-sqlalchemy:8081 -t 60 && cd /app/stac_fastapi/sqlalchemy && alembic upgrade head && python /app/scripts/ingest_cyverse.py http://app-sqlalchemy:8081"
    depends_on:
      - database
      - app-sqlalchemy

  # Load cyverse demo dataset into the PGStac Application
  loadcyverse-pgstac:
    image: ghcr.io/stac-utils/stac-fastapi:main-pgstac
    environment:
      - ENVIRONMENT=development
    volumes:
      - ../cyverse-stac:/app/cyverse-stac
      - ./stac_fastapi:/app/stac_fastapi
      - ./scripts:/app/scripts
    command:
      - "/app/scripts/wait-for-it.sh"
      - "-t"
      - "60"
      - "app-pgstac:8082"
      - "--"
      - "python"
      - "/app/scripts/ingest_cyverse.py"
      - "http://app-pgstac:8082"
    depends_on:
      - database
      - app-pgstac


########################################
## End CyVerse Testing Stuff
##
#########################################
networks:
  default:
    name: stac-fastapi-network

cyverse-gis/cyverse-stac

Clone the cyverse-stac repository to the home directory where you cloned stac-fastapi

git clone https://github.com/cyverse-gis/cyverse-stac

Inside the /cyverse-stac repo is a directory called catalogs - this is where we are maintaining the list of public STAC Collections in CyVerse.

The catalogs directory is ingested by the ingest_cyverse.py file in the stac-fastapi/scripts directory.

The docker-compose.yml is also modified to include the relative path to the cyverse-stac directory

Edit ingest_cyverse.py

ingest_cyverse.py
"""Ingest sample data during docker-compose"""
import json
import sys
from pathlib import Path
from urllib.parse import urljoin

import requests

workingdir = Path(__file__).parent.absolute()

app_host = sys.argv[1]

if not app_host:
    raise Exception("You must include full path/port to stac instance")

def post_or_put(url: str, data: dict):
    """Post or put data to url."""
    r = requests.post(url, json=data)
    if r.status_code == 409:
        new_url = url if data["type"] == "Collection" else url + f"/{data['id']}"
        # Exists, so update
        r = requests.put(new_url, json=data)
        # Unchanged may throw a 404
        if not r.status_code == 404:
            r.raise_for_status()
    else:
        r.raise_for_status()

##################################
#
# Begin Collection Ingestion
#
##################################

## Ingest Open Forest Observatory Collections

ofodata =  Path("/app/cyverse-stac/catalogs/ofo")

def ingest_ofo_data(app_host: str = app_host, data_dir: Path = ofodata):
    """ingest data."""

    with open(data_dir / "collection.json") as f:
        collection = json.load(f)

    post_or_put(urljoin(app_host, "/collections"), collection)

    with open(data_dir / "index.geojson") as f:
        index = json.load(f)

    for feat in index["features"]:
        post_or_put(urljoin(app_host, f"collections/{collection['id']}/items"), feat)


# Ingest Arizona Experiment Station Collections 

srerdata =  Path("/app/cyverse-stac/catalogs/arizona-experiment-station")

def ingest_cyverse_data(app_host: str = app_host, data_dir: Path = srerdata):
    """ingest data."""

    with open(data_dir / "collection.json") as f:
        collection = json.load(f)

    post_or_put(urljoin(app_host, "/collections"), collection)

    with open(data_dir / "index.geojson") as f:
        index = json.load(f)

    for feat in index["features"]:
        post_or_put(urljoin(app_host, f"collections/{collection['id']}/items"), feat)

## Ingest Joplin Collections

joplindata = Path("/app/cyverse-stac/catalogs/joplin")

def ingest_joplin_data(app_host: str = app_host, data_dir: Path = joplindata):
    """ingest data."""

    with open(data_dir / "collection.json") as f:
        collection = json.load(f)

    post_or_put(urljoin(app_host, "/collections"), collection)

    with open(data_dir / "index.geojson") as f:
        index = json.load(f)

    for feat in index["features"]:
        post_or_put(urljoin(app_host, f"collections/{collection['id']}/items"), feat)

if __name__ == "__main__":
    ingest_cyverse_data()
    ingest_joplin_data()
    ingest_ofo_data()



Start Docker-Compose

cd ~/stac-fastapi

In the /stac-fastapi directory there is a docker-compose.yml which is set to point at a demo dataset called joplin you will need to modify the docker-compose.yml and point it at a new collection.json and index.geojson

Note: if you do not modify the docker-compose.yml the sample dataset will be shown.


Start Docker-Compose in detached mode. The -d flag will start Docker Compose in the background

docker-compose up -d 

Start CaddyServer

Caddy privileges will need to be escalated before it can use port :443

sudo setcap CAP_NET_BIND_SERVICE=+eip $(which caddy)

Start a fresh tmux session

caddy reverse-proxy --from stac.cyverse.org --to localhost:8080 --change-host-header &

Detach the session


Last update: 2024-03-21