emmi_data_management.interfaces.s3_uploader

Classes

S3FileUploader

Asynchronously uploads existing local files to an S3 path.

Module Contents

class emmi_data_management.interfaces.s3_uploader.S3FileUploader(local_root, s3_path, workers=4, max_inflight=256)

Asynchronously uploads existing local files to an S3 path.

This class uses a thread pool to upload multiple files in parallel. It’s designed to run after a process (like AsyncWriter) has finished writing files to a local directory.

Its API mirrors AsyncWriter, but write() takes a local file path.

local_root

The root directory of the local files. Used to calculate the relative path for S3 keys.

Type:

Path

bucket

The name of the S3 bucket.

Type:

str

prefix

The key prefix (folder) within the S3 bucket.

Type:

str

Raises:
  • ValueError – If s3_path doesn’t start with ‘s3://’.

  • ValueError – If a local file passed to write() is not inside local_root.

  • ImportError – If show_progress is True and rich is not installed.

Parameters:
  • local_root (pathlib.Path | str) – The local root directory (e.g., ‘./my_local_results’).

  • s3_path (str) – The full S3 destination path (e.g., ‘s3://my-bucket/results/’).

  • workers (int) – Thread-pool size for parallel uploads.

  • max_inflight (int) – Maximum number of uploads that may be pending (queued or running) at once; when this limit is reached, write() blocks the calling thread until a slot frees up.
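
The split of s3_path into the bucket and prefix attributes can be illustrated with a minimal sketch. The helper name split_s3_path is hypothetical and is not part of this module; the real constructor may parse the path differently. The sketch only assumes the documented behavior that a path without the 's3://' scheme raises ValueError.

```python
from __future__ import annotations


def split_s3_path(s3_path: str) -> tuple[str, str]:
    """Split 's3://bucket/some/prefix/' into ('bucket', 'some/prefix').

    Hypothetical helper for illustration only; not part of S3FileUploader.
    """
    scheme = "s3://"
    if not s3_path.startswith(scheme):
        raise ValueError(f"s3_path must start with 's3://', got {s3_path!r}")
    # Everything up to the first '/' is the bucket; the rest is the key prefix.
    bucket, _, prefix = s3_path[len(scheme):].partition("/")
    return bucket, prefix.strip("/")


bucket, prefix = split_s3_path("s3://my-bucket/results/")
print(bucket, prefix)
```

Stripping the slashes from the prefix keeps later key construction from producing double slashes.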

local_root
bucket
prefix
max_inflight = 256
pool
s3_client
write(local_file)

Schedules a single local file for upload.

The S3 key is automatically determined from the file’s path relative to the local_root given at initialization.

Example

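from emmi_data_management.interfaces.s3_uploader import S3FileUploader
uploader = S3FileUploader(local_root="/tmp/results", s3_path="s3://my-bucket/results/")
uploader.write("/tmp/results/subdir/file.pt")
# Uploads to: s3://my-bucket/results/subdir/file.pt
uploader.close()  # Wait for the upload to finish

Parameters: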

local_file (pathlib.Path | str) – The path to the local file to upload.

Return type:

None
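
The key derivation described for write() can be sketched with pathlib. This is an assumption about how the relative path is computed, not the library's actual code; s3_key_for is a hypothetical name introduced here for illustration.

```python
from pathlib import Path, PurePosixPath


def s3_key_for(local_file, local_root, prefix: str) -> str:
    """Build an S3 key from a file's path relative to local_root.

    Hypothetical helper for illustration only.
    """
    local_file = Path(local_file).resolve()
    local_root = Path(local_root).resolve()
    try:
        relative = local_file.relative_to(local_root)
    except ValueError:
        # Mirrors the documented ValueError for files outside local_root.
        raise ValueError(f"{local_file} is not inside {local_root}") from None
    # S3 keys always use forward slashes, regardless of the local OS.
    return str(PurePosixPath(prefix, *relative.parts))


print(s3_key_for("/tmp/results/subdir/file.pt", "/tmp/results", "results"))
```

Resolving both paths first means a relative local_root such as './my_local_results' and an absolute file path still compare correctly.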

close()

Waits for all in-flight uploads to complete and shuts down the thread pool.

Propagates any exceptions raised by background upload tasks.

Return type:

None
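
The interplay of workers, max_inflight, and close() can be sketched with a thread pool guarded by a semaphore. This is one common way to get the documented behavior, and it is an assumption rather than the class's actual implementation. BoundedSubmitter and fake_upload are hypothetical names; fake_upload stands in for the real S3 upload call.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class BoundedSubmitter:
    """Hypothetical sketch: thread pool with a cap on pending tasks."""

    def __init__(self, task, workers=4, max_inflight=256):
        self._task = task
        self._pool = ThreadPoolExecutor(max_workers=workers)
        self._slots = threading.BoundedSemaphore(max_inflight)
        self._futures = []

    def write(self, item):
        # Blocks the caller once max_inflight tasks are queued or running.
        self._slots.acquire()
        try:
            future = self._pool.submit(self._task, item)
        except BaseException:
            self._slots.release()
            raise
        # Free the slot when the task finishes, whether it succeeded or failed.
        future.add_done_callback(lambda _f: self._slots.release())
        self._futures.append(future)

    def close(self):
        # Wait for all tasks, then re-raise the first failure, if any.
        self._pool.shutdown(wait=True)
        for future in self._futures:
            future.result()


uploaded = []
lock = threading.Lock()


def fake_upload(name):
    # Stand-in for an S3 upload; only records the file name.
    with lock:
        uploaded.append(name)


submitter = BoundedSubmitter(fake_upload, workers=2, max_inflight=4)
for i in range(10):
    submitter.write(f"file_{i}.pt")
submitter.close()
print(len(uploaded))
```

Calling result() on each future in close() is what surfaces exceptions from background tasks; without it, failures in worker threads would pass silently.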

upload_all(remove_source=False, show_progress=True)

A high-level helper to find all files in ‘local_root’, upload them, and optionally remove the local files.

This is a class-based alternative to the aws s3 sync command.

Parameters:
  • remove_source (bool) – If True, deletes the entire local_root directory after all uploads are successful.

  • show_progress (bool) – If True, shows a progress bar. Requires rich to be installed.

Return type:

None
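
The flow described for upload_all() (find files, schedule each one, wait, then optionally delete the source) can be sketched as follows. This is a hedged illustration, not the library's code: upload_tree is a hypothetical function, and its write and close arguments stand in for the uploader's own methods. Progress reporting is omitted.

```python
import shutil
import tempfile
from pathlib import Path


def upload_tree(local_root, write, close, remove_source=False):
    """Hypothetical sketch of an upload-everything helper."""
    root = Path(local_root)
    files = sorted(p for p in root.rglob("*") if p.is_file())
    for path in files:
        write(path)
    # close() is expected to raise if any upload failed, so the
    # source directory is only removed after a fully successful run.
    close()
    if remove_source:
        shutil.rmtree(root)
    return len(files)


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp) / "results"
    (root / "subdir").mkdir(parents=True)
    (root / "a.pt").write_bytes(b"a")
    (root / "subdir" / "b.pt").write_bytes(b"b")

    seen = []
    count = upload_tree(root, write=seen.append, close=lambda: None, remove_source=True)
    removed = not root.exists()

print(count, removed)
```

Placing the rmtree call after close() is the design point: a failed upload raises before any local data is deleted.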