Dataset Card

This is a template for Hugging Face dataset cards tailored for geospatial embedding datasets. Copy the markdown below and paste it into your dataset's README.md on Hugging Face Hub. Replace the placeholders in {braces} with your own information.

Template

```yaml
---
# === Basic Information (Required) ===
language:
- {lang_0}
- {lang_1}

license: {license}
license_name: {license_name}
license_link: {license_link}
license_details: {license_details}

creator: {creator}
funder: {funder}

tags:
- {tag_0}
- {tag_1}
- {tag_2}

# === Embedding Properties (Required) ===
embedding_spatial_types:
- {embedding_spatial_type_0}
- {embedding_spatial_type_1}

embedding_temporal_type:
- {embedding_temporal_type_0}
- {embedding_temporal_type_1}

embedding_spatial_context: {embedding_spatial_context}
embedding_temporal_context: {embedding_temporal_context}
embedding_dimension: {embedding_dimension}

grid_spacing:
- {x_meters, y_meters}
- {x_meters, y_meters}

# === Source Information (Required) ===
inference_datasets:
- {source_dataset_0}
- {source_dataset_1}

model_name: {model_name}
model_link: {model_link}

# === Processing Details (Optional) ===
postprocessing: {postprocessing}
quantization: {quantization}

# === Standard HF Fields (Optional) ===
paperswithcode_id: {paperswithcode_id}

# === Dataset Info (Optional) ===
dataset_info:
  features:
    - name: {feature_name_0}
      dtype: {feature_dtype_0}
    - name: {feature_name_1}
      dtype: {feature_dtype_1}
  download_size: {download_size}
  dataset_size: {dataset_size}

# === Access Control (Optional) ===
extra_gated_fields:
- {field_name_0}: {field_type_0}
- {field_name_1}: {field_type_1}

extra_gated_prompt: {extra_gated_prompt}
---
```
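
If you fill in the template from a script rather than by hand, the placeholder replacement can be sketched as below. This is only an illustration: `fill_placeholders` and the `PLACEHOLDER` pattern are hypothetical helpers, not part of any Hugging Face tooling, and the sketch assumes every placeholder is a single word in braces. Placeholders it has no value for are left untouched, as is the two-part `{x_meters, y_meters}` placeholder, which the pattern does not match.

```python
import re

# Matches single-word placeholders such as {license} or {model_name}.
# It deliberately does not match {x_meters, y_meters} (comma and space inside).
PLACEHOLDER = re.compile(r"\{(\w+)\}")


def fill_placeholders(template: str, values: dict) -> str:
    """Replace each {name} with values[name]; leave unknown placeholders as-is."""

    def substitute(match):
        name = match.group(1)
        # Keep the original "{name}" text when no value was supplied.
        return str(values[name]) if name in values else match.group(0)

    return PLACEHOLDER.sub(substitute, template)


# A three-line excerpt of the template, filled with example values from
# the field reference. "funder" is intentionally not supplied.
snippet = "license: {license}\ncreator: {creator}\nfunder: {funder}\n"
print(fill_placeholders(snippet, {"license": "apache-2.0", "creator": "NASA"}))
```

Leaving unknown placeholders in place makes it easy to spot, by searching for `{`, which fields still need a value or should be deleted before uploading.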

Field Reference

Basic Information

| Field | Required | Description | Example |
| --- | --- | --- | --- |
| language | No | Language codes | fr, en |
| license | Yes | License identifier from HF licenses | apache-2.0 |
| license_name | No | Custom license ID (if license = other) | my-license-1.0 |
| license_link | No | Path or URL to license file (if license = other) | LICENSE.md |
| license_details | No | Legacy textual description of a custom license | |
| creator | Yes | Organization or individual that created the dataset | NASA |
| funder | No | Funding institutions | NSF, ESA |
| tags | No | Searchable tags | geospatial, embeddings, sentinel-2 |

Embedding Properties

| Field | Required | Description | Acceptable Values |
| --- | --- | --- | --- |
| embedding_spatial_types | Yes | Spatial type of embeddings | pixel, patch, scene |
| embedding_temporal_type | Yes | Temporal type of embeddings | single-date, multi-date |
| embedding_spatial_context | Yes | Spatial context scope | spatial context determined by embedding spatial type, spatial context beyond embedding spatial type |
| embedding_temporal_context | Yes | Temporal context scope | temporal context determined by embedding temporal type, temporal context beyond embedding temporal type |
| embedding_dimension | Yes | Embedding vector size (integer) | e.g. 768 |
| grid_spacing | Yes | Size in meters of the output footprint (x, y) | e.g. 10, 10 |
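
The acceptable values for the two type fields lend themselves to an automated check before upload. The following is a minimal sketch, not an official validator: `check_embedding_properties` is a hypothetical helper, it assumes the card metadata has already been parsed into a Python dict with the two type fields as lists, and the requirement that `embedding_dimension` be positive is an assumption (the table only says integer).

```python
# Acceptable values taken from the Embedding Properties table.
ALLOWED_SPATIAL_TYPES = {"pixel", "patch", "scene"}
ALLOWED_TEMPORAL_TYPES = {"single-date", "multi-date"}


def check_embedding_properties(metadata: dict) -> list:
    """Return a list of problem descriptions; an empty list means none were found."""
    problems = []

    for value in metadata.get("embedding_spatial_types", []):
        if value not in ALLOWED_SPATIAL_TYPES:
            problems.append(f"unknown embedding_spatial_types value: {value!r}")

    for value in metadata.get("embedding_temporal_type", []):
        if value not in ALLOWED_TEMPORAL_TYPES:
            problems.append(f"unknown embedding_temporal_type value: {value!r}")

    dimension = metadata.get("embedding_dimension")
    # bool is a subclass of int in Python, so it is excluded explicitly.
    # Assumption: a usable vector size is strictly positive.
    if not isinstance(dimension, int) or isinstance(dimension, bool) or dimension <= 0:
        problems.append(
            f"embedding_dimension must be a positive integer, got {dimension!r}"
        )

    return problems


example = {
    "embedding_spatial_types": ["pixel", "tile"],  # "tile" is not an accepted value
    "embedding_temporal_type": ["single-date"],
    "embedding_dimension": 768,
}
for problem in check_embedding_properties(example):
    print(problem)
```

The context fields and `grid_spacing` are left out of this sketch on purpose, since their exact serialized form is not spelled out here.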

Source Information

| Field | Required | Description | Example |
| --- | --- | --- | --- |
| inference_datasets | Yes | Source data products used to generate embeddings | sentinel-2-l2a, sentinel-1-rtc |
| model_name | Yes | Name of the model used to generate embeddings | Clay-v1 |
| model_link | Yes | URL to the model card | https://huggingface.co/... |
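
Taken together, the Required columns of the Basic Information, Embedding Properties, and Source Information tables define the minimum a card must contain. A rough sketch of a completeness check is shown below; `missing_required_fields` is a hypothetical helper, and it assumes the card's YAML front matter has already been loaded into a Python dict by whatever parser you use.

```python
# Fields marked "Yes" in the Required columns of the field reference tables.
REQUIRED_FIELDS = (
    "license",
    "creator",
    "embedding_spatial_types",
    "embedding_temporal_type",
    "embedding_spatial_context",
    "embedding_temporal_context",
    "embedding_dimension",
    "grid_spacing",
    "inference_datasets",
    "model_name",
    "model_link",
)


def missing_required_fields(metadata: dict) -> list:
    """Return the required field names that are absent, None, or empty."""
    missing = []
    for name in REQUIRED_FIELDS:
        value = metadata.get(name)
        if value is None or value == "" or value == []:
            missing.append(name)
    return missing


# A partially filled card, using example values from the tables.
card = {
    "license": "apache-2.0",
    "creator": "NASA",
    "embedding_dimension": 768,
    "model_name": "Clay-v1",
}
print(missing_required_fields(card))
```

The check only looks at presence, not at whether a value is valid or is still an unfilled `{placeholder}`.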

Processing Details (Optional)

| Field | Description | Example |
| --- | --- | --- |
| postprocessing | Any processing applied to embeddings after generation (e.g., smoothing) | |
| quantization | Any quantization applied to embedding values, or null | int8 |

Dataset Info (Optional)

| Field | Description | Example |
| --- | --- | --- |
| features | List of features with name and dtype | name: id, dtype: int32 |
| download_size | Download size in bytes | 35142551 |
| dataset_size | Dataset size in bytes | 89789763 |

Access Control (Optional)

Use these fields if you want to gate access to your dataset. See the Hugging Face documentation on gated datasets for more information.

| Field | Description | Example |
| --- | --- | --- |
| extra_gated_fields | Fields users must fill out to access | Name: text, Email: text, Affiliation: text |
| extra_gated_prompt | Message shown to users requesting access | |