ksuit.data.pipeline.multistage

Attributes

`SampleProcessorType`
`BatchProcessorType`

Classes

MultiStagePipeline

A Collator that processes a list of samples into a batch in multiple stages.

Module Contents

ksuit.data.pipeline.multistage.SampleProcessorType

ksuit.data.pipeline.multistage.BatchProcessorType

class ksuit.data.pipeline.multistage.MultiStagePipeline(collators=None, sample_processors=None, batch_processors=None)

Bases: ksuit.data.pipeline.collator.Collator

A Collator that processes the list of samples into a batch in multiple stages:

sample_processors: Processing the data before collation on a per-sample level.
collators: Conversion from a list of samples into a batch (usually a dict of tensors).
batch_processors: Processing after collation on a batch-level.

Most of the work is usually done by the sample_processors. One or two collators are typically enough, and batch processors are often not needed. However, this depends on the use case.

Example:

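>>> from ksuit.data.pipeline.multistage import MultiStagePipeline
>>> # MySampleProcessor1, MyCollator1, etc. are placeholders for user-defined callables.
>>> sample_processors = [MySampleProcessor1(), MySampleProcessor2()]
>>> collators = [MyCollator1(), MyCollator2()]
>>> batch_processors = [MyBatchProcessor1(), MyBatchProcessor2()]
>>> multistage_pipeline = MultiStagePipeline(
...     sample_processors=sample_processors,
...     collators=collators,
...     batch_processors=batch_processors,
... )
>>> # samples is the list of individual samples to be collated.
>>> batch = multistage_pipeline(samples)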
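
The three stages can be pictured as plain function composition. The following is a minimal sketch of that idea, not the ksuit implementation: `run_stages` and the three toy callables are hypothetical names, samples and batches are plain dicts rather than tensors, and merging the output of each collator into one batch dict is an assumption about how multiple collators combine.

```python
from typing import Any, Callable

Sample = dict[str, Any]
Batch = dict[str, Any]


def run_stages(
    samples: list[Sample],
    sample_processors: list[Callable[[Sample], Sample]],
    collators: list[Callable[[list[Sample]], Batch]],
    batch_processors: list[Callable[[Batch], Batch]],
) -> Batch:
    # Stage 1: apply every sample processor, in order, to each sample.
    processed = []
    for sample in samples:
        for process in sample_processors:
            sample = process(sample)
        processed.append(sample)

    # Stage 2: turn the list of samples into a batch.
    # Assumption: each collator contributes keys that are merged into one dict.
    batch: Batch = {}
    for collate in collators:
        batch.update(collate(processed))

    # Stage 3: apply every batch processor, in order, to the whole batch.
    for process in batch_processors:
        batch = process(batch)
    return batch


# Toy callables (hypothetical) to exercise the three stages.
def double_x(sample: Sample) -> Sample:
    return {**sample, "x": sample["x"] * 2}


def collate_x(samples: list[Sample]) -> Batch:
    return {"x": [s["x"] for s in samples]}


def add_size(batch: Batch) -> Batch:
    return {**batch, "size": len(batch["x"])}


batch = run_stages([{"x": 1}, {"x": 2}], [double_x], [collate_x], [add_size])
print(batch)  # {'x': [2, 4], 'size': 2}
```

In this sketch the per-sample stage does the data-dependent work, which mirrors the note above that sample processors usually carry most of the load.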

Parameters:

sample_processors (dict[str, SampleProcessorType] | list[SampleProcessorType] | None) – A list or dict of callables that are applied sequentially to pre-process the data on a per-sample level (e.g., subsample a point cloud).
collators (dict[str, ksuit.data.pipeline.collator.CollatorType] | list[ksuit.data.pipeline.collator.CollatorType] | None) – A list or dict of callables that are applied sequentially to convert the list of individual samples into a batched format. If None, the default PyTorch collator is used.
batch_processors (dict[str, BatchProcessorType] | list[BatchProcessorType] | None) – A list or dict of callables that are applied sequentially to process the data on a per-batch level.

sample_processors = []

batch_processors = []

get_sample_processor(predicate)

Retrieves a sample processor by a predicate function.

Examples:

- Search by type (assumes the sample processor type occurs only once in the list of sample processors):

    pipeline.get_sample_processor(lambda p: isinstance(p, MySampleProcessorType))

- Search by type and member:

    pipeline.get_sample_processor(lambda p: isinstance(p, PointSamplingSampleProcessor) and "input_pos" in p.items)

Parameters: predicate (collections.abc.Callable[[Any], bool]) – A function that is called for each processor and returns True for the one to select.
Returns: The matching sample processor.
Return type: Any
Raises: ValueError – If no matching sample processor is found, if multiple matching sample processors are found, or if there are no sample processors.
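
The lookup contract described above (exactly one match, otherwise ValueError) can be sketched as a standalone function. This is an illustration under assumptions, not the library's code: `find_unique`, `Subsample`, and `Normalize` are hypothetical names, and the error messages are invented for the example.

```python
from collections.abc import Callable
from dataclasses import dataclass, field
from typing import Any


def find_unique(processors: list[Any], predicate: Callable[[Any], bool]) -> Any:
    # Mirrors the documented failure modes: empty list, no match, several matches.
    if not processors:
        raise ValueError("there are no sample processors")
    matches = [p for p in processors if predicate(p)]
    if not matches:
        raise ValueError("no matching sample processor found")
    if len(matches) > 1:
        raise ValueError(f"{len(matches)} matching sample processors found")
    return matches[0]


# Hypothetical processor classes used only for this illustration.
@dataclass
class Subsample:
    items: set[str] = field(default_factory=set)


@dataclass
class Normalize:
    pass


processors = [Subsample(items={"input_pos"}), Normalize()]

# Search by type and member, as in the second example above.
found = find_unique(
    processors,
    lambda p: isinstance(p, Subsample) and "input_pos" in p.items,
)
print(found is processors[0])  # True
```

Requiring a unique match makes an ambiguous predicate fail loudly instead of silently returning the first hit, which is why the type-only search assumes each processor type appears once.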