emmi_data_management.interfaces.s3_uploader¶
Classes¶
S3FileUploader | Asynchronously uploads existing local files to an S3 path.
Module Contents¶
- class emmi_data_management.interfaces.s3_uploader.S3FileUploader(local_root, s3_path, workers=4, max_inflight=256)¶
Asynchronously uploads existing local files to an S3 path.
This class uses a thread pool to upload multiple files in parallel. It’s designed to run after a process (like AsyncWriter) has finished writing files to a local directory.
Its API mirrors AsyncWriter, but write() takes a local file path.
- local_root¶
The root directory of the local files. Used to calculate the relative path for S3 keys.
- Type:
Path
- Raises:
ValueError – If s3_path doesn’t start with ‘s3://’.
ValueError – If a local file path is not inside local_root.
ImportError – If rich is not installed and show_progress is True.
- Parameters:
local_root (pathlib.Path | str) – The local root directory (e.g., ‘./my_local_results’).
s3_path (str) – The full S3 destination path (e.g., ‘s3://my-bucket/results/’).
workers (int) – Thread-pool size for parallel uploads.
max_inflight (int) – Maximum number of uploads that may be in flight at once; scheduling further files blocks the main thread until an upload finishes.
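How workers and max_inflight interact can be pictured with a small standard-library sketch. This is not the class's actual implementation: it assumes a semaphore-based bound, and the names `BoundedSubmitter` and `fake_upload` are hypothetical stand-ins.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class BoundedSubmitter:
    """Hypothetical helper: caps queued-plus-running tasks at max_inflight."""

    def __init__(self, workers=4, max_inflight=256):
        self.pool = ThreadPoolExecutor(max_workers=workers)
        self._slots = threading.BoundedSemaphore(max_inflight)
        self.futures = []

    def submit(self, fn, *args):
        # Blocks the calling thread once max_inflight tasks are pending.
        self._slots.acquire()
        try:
            future = self.pool.submit(fn, *args)
        except BaseException:
            self._slots.release()
            raise
        # Free the slot when the task finishes, whether it succeeded or failed.
        future.add_done_callback(lambda _f: self._slots.release())
        self.futures.append(future)
        return future

    def close(self):
        # Wait for every submitted task before returning.
        self.pool.shutdown(wait=True)


def fake_upload(name):
    # Stand-in for a real upload call; returns the name it was given.
    return name


submitter = BoundedSubmitter(workers=2, max_inflight=4)
for i in range(10):
    submitter.submit(fake_upload, f"file_{i}.pt")
submitter.close()
results = sorted(f.result() for f in submitter.futures)
```

In this sketch, workers controls how many tasks run concurrently, while max_inflight only limits how far the producer can run ahead of the pool.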
- local_root¶
- bucket¶
- prefix¶
- max_inflight = 256¶
- pool¶
- s3_client¶
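The bucket and prefix attributes presumably come from splitting s3_path. A minimal sketch of such a split, assuming the prefix is stored without a trailing slash (the helper `split_s3_path` is hypothetical and not part of this module):

```python
def split_s3_path(s3_path):
    """Hypothetical helper: split 's3://bucket/prefix/' into (bucket, prefix)."""
    scheme = "s3://"
    if not s3_path.startswith(scheme):
        raise ValueError(f"s3_path must start with 's3://', got {s3_path!r}")
    bucket, _, prefix = s3_path[len(scheme):].partition("/")
    # Drop a trailing slash so keys can be joined as prefix + "/" + relative path.
    return bucket, prefix.rstrip("/")


bucket, prefix = split_s3_path("s3://my-bucket/results/")
```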
- write(local_file)¶
Schedules a single local file for upload.
The S3 key is automatically determined from the file’s path relative to the local_root given at initialization.
Example
local_root = "/tmp/results"
uploader = S3FileUploader(local_root, "s3://my-bucket/results/")
uploader.write("/tmp/results/subdir/file.pt")
# Uploads to: s3://my-bucket/results/subdir/file.pt
- Parameters:
local_file (pathlib.Path | str) – The path to the local file to upload.
- Return type:
None
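The key derivation described for write() can be approximated with pathlib. The following is a sketch under assumptions, not the method's actual code; `s3_key_for` is a hypothetical helper, and the prefix value is illustrative.

```python
from pathlib import Path, PurePosixPath


def s3_key_for(local_file, local_root, prefix):
    """Hypothetical helper: derive an S3 key from a path under local_root."""
    # relative_to raises ValueError when local_file is not under local_root.
    relative = Path(local_file).relative_to(Path(local_root))
    # S3 keys use forward slashes regardless of the local OS separator.
    return str(PurePosixPath(prefix, *relative.parts))


key = s3_key_for("/tmp/results/subdir/file.pt", "/tmp/results", "results")
```

A file outside the root, such as `/etc/hosts` here, makes `relative_to` raise ValueError, which matches the error documented for the class.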
- close()¶
Waits for all in-flight uploads to complete and shuts down the thread pool.
Propagates any exceptions raised by background upload tasks.
- Return type:
None
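One common way to surface background failures at shutdown is to call `result()` on each stored future. The sketch below assumes that pattern; whether close() works this way internally is not stated here, and `drain` and `failing_upload` are hypothetical names.

```python
from concurrent.futures import ThreadPoolExecutor


def failing_upload(name):
    # Stand-in for an upload that fails in a worker thread.
    raise RuntimeError(f"upload failed: {name}")


def drain(pool, futures):
    """Hypothetical close(): wait for every task, then re-raise the first error."""
    pool.shutdown(wait=True)
    for future in futures:
        # result() re-raises any exception the task raised in its worker thread.
        future.result()


pool = ThreadPoolExecutor(max_workers=2)
futures = [pool.submit(failing_upload, "file.pt")]
try:
    drain(pool, futures)
    error = None
except RuntimeError as exc:
    error = str(exc)
```

Because errors only appear when the futures are inspected, a caller that never calls close() could miss failed uploads in a design like this one.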
- upload_all(remove_source=False, show_progress=True)¶
A high-level helper to find all files in ‘local_root’, upload them, and optionally remove the local files.
This is a class-based alternative to the aws s3 sync command.
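The walk-upload-remove flow can be sketched with the standard library alone. This is an illustration under assumptions rather than the method's implementation: `upload_all_sketch` is hypothetical, the upload step is a caller-supplied callable instead of a real S3 call, and progress reporting is omitted.

```python
import tempfile
from pathlib import Path


def upload_all_sketch(local_root, upload, remove_source=False):
    """Hypothetical stand-in for upload_all: visit every file under local_root."""
    root = Path(local_root)
    uploaded = []
    # Materialize the file list first so deletions do not disturb the walk.
    for path in sorted(p for p in root.rglob("*") if p.is_file()):
        relative = path.relative_to(root).as_posix()
        upload(path, relative)  # `upload` is a caller-supplied callable
        uploaded.append(relative)
        if remove_source:
            path.unlink()  # delete the local copy only after the upload returned
    return uploaded


with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "subdir").mkdir()
    (Path(tmp) / "a.pt").write_text("a")
    (Path(tmp) / "subdir" / "b.pt").write_text("b")
    seen = []
    uploaded = upload_all_sketch(tmp, lambda p, key: seen.append(key), remove_source=True)
    remaining = [p for p in Path(tmp).rglob("*") if p.is_file()]
```

Removing each source file only after its upload call returns keeps a failed upload from deleting data that never reached the destination.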