How to Use Private Data Sources
Warning
The information below is incomplete and requires extra work. Use it at your own risk!
Interface and Examples
EDM supports two invocation forms:
emmi-data <SERVICE_NAME> <COMMAND>: runs a command of a specific service (each service has its own commands).
emmi-data <COMMAND>: runs a global command.
Hugging Face
emmi-data huggingface estimate EmmiAI/AB-UPT
emmi-data huggingface ext EmmiAI/NeuralDEM .th ~/data --type model --manifest-out manifest.json
The ext command downloads all .th files from the EmmiAI/NeuralDEM model repository into ~/data.
The --manifest-out option writes a manifest to manifest.json, which can later be used for integrity checks (see Verification).
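For readers who want a comparable download from Python, the following is a minimal sketch built on the huggingface_hub library's snapshot_download function. It is not how emmi-data is implemented; the helper names extension_pattern and download_by_extension are hypothetical, and no manifest is written here.

```python
from pathlib import Path


def extension_pattern(extension: str) -> str:
    """Turn '.th' or 'th' into the glob pattern '*.th'."""
    return "*." + extension.lstrip(".")


def download_by_extension(repo_id: str, extension: str, target_dir: str,
                          repo_type: str = "model") -> str:
    """Download only files with the given extension from a Hub repository.

    Hypothetical helper: an approximation of the behaviour described above,
    not the emmi-data implementation.
    """
    # Imported lazily so extension_pattern works without the dependency.
    from huggingface_hub import snapshot_download

    return snapshot_download(
        repo_id=repo_id,
        repo_type=repo_type,
        allow_patterns=[extension_pattern(extension)],
        local_dir=Path(target_dir).expanduser(),
    )
```

Calling download_by_extension("EmmiAI/NeuralDEM", ".th", "~/data") would request only the matching files; private repositories additionally require a Hugging Face access token to be configured.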
AWS
emmi-data aws estimate noaa-goes16 ABI-L1b-RadC/2023/001/00/
emmi-data aws fetch my-bucket data/prefix/ ./data --extension .parquet --manifest-out s3-manifest.json
The fetch command downloads only the .parquet files under data/prefix/ in my-bucket into ./data
and writes a manifest to s3-manifest.json.
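The extension-filtered fetch can be approximated in Python with boto3, as in the sketch below. This is an illustration under assumptions, not the emmi-data implementation: the helper names local_path_for and fetch_by_extension are hypothetical, and stripping the listing prefix from local paths is an assumed layout that may differ from what the tool produces.

```python
from pathlib import Path


def local_path_for(key: str, prefix: str, target_dir: str) -> Path:
    """Map an S3 key to a path under target_dir, dropping the listing prefix.

    Assumption: the prefix is stripped from local paths; emmi-data may differ.
    """
    relative = key[len(prefix):] if key.startswith(prefix) else key
    return Path(target_dir) / relative.lstrip("/")


def fetch_by_extension(bucket: str, prefix: str, target_dir: str,
                       extension: str) -> list:
    """Download every object under prefix whose key ends with extension."""
    # Imported lazily so local_path_for works without boto3 installed.
    import boto3

    s3 = boto3.client("s3")  # uses the default AWS credential chain
    paginator = s3.get_paginator("list_objects_v2")
    downloaded = []
    for page in paginator.paginate(Bucket=bucket, Prefix=prefix):
        # "Contents" is absent when a page holds no objects.
        for obj in page.get("Contents", []):
            key = obj["Key"]
            if not key.endswith(extension):
                continue
            destination = local_path_for(key, prefix, target_dir)
            destination.parent.mkdir(parents=True, exist_ok=True)
            s3.download_file(bucket, key, str(destination))
            downloaded.append(destination)
    return downloaded
```

Listing through a paginator matters here because a single list_objects_v2 call returns at most 1000 keys.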
Verification
Verification checks the downloaded files against a manifest to determine whether they are
complete. If manifest.json exists, corrupted or missing files can be redownloaded:
emmi-data verification check -r ./data -m manifest.json --action redownload
If no manifest exists, create one with:
emmi-data verification build -r ./data -m manifest.json
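To make the idea behind build and check concrete, here is a minimal Python sketch of manifest-based verification. The manifest layout (relative path mapped to size and SHA-256 digest) and the function names are assumptions made for illustration; the actual format written by emmi-data is not documented here and may differ.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large files do not need to fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root) -> dict:
    """Record size and digest for every file under root (assumed layout)."""
    root = Path(root)
    return {
        path.relative_to(root).as_posix(): {
            "size": path.stat().st_size,
            "sha256": sha256_of(path),
        }
        for path in sorted(root.rglob("*"))
        if path.is_file()
    }


def check_manifest(root, manifest: dict) -> dict:
    """Return the manifest entries that are missing or no longer match."""
    root = Path(root)
    missing, corrupted = [], []
    for relative, entry in manifest.items():
        path = root / relative
        if not path.is_file():
            missing.append(relative)
        elif (path.stat().st_size != entry["size"]
              or sha256_of(path) != entry["sha256"]):
            corrupted.append(relative)
    return {"missing": missing, "corrupted": corrupted}
```

The entries reported as missing or corrupted are the ones a redownload action would need to fetch again. The size comparison comes first because it is cheap and catches truncated downloads without hashing.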
To explore all options, use the --help flag.