Step
BaseStep
Class representing a step in a processing pipeline.
The entry point is __call__.
Users should inherit from either LocalStep or GlobalStep.
A step is cached by default (cache_step=True) to prevent re-computation.
An individual step should disable caching only if it does not modify the dataset, because whether later steps are re-computed depends on what was cached before them.
Source code in ragfit/processing/step.py
__call__(datasets, **kwargs)
The pipeline runs each step by calling __call__.
calc_hash()
Calculate hash for a step based on its properties.
Updates the step_hash property.
Source code in ragfit/processing/step.py
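As an illustration of hashing a step from its properties, here is a minimal sketch. It does not use ragfit: the HashedStep class, its properties attribute, the choice of SHA-256 over a JSON dump, and the lazy behaviour of get_hash are all assumptions made for the example. Only the names calc_hash, get_hash, and step_hash come from the text above.

```python
import hashlib
import json


class HashedStep:
    """Hypothetical stand-in; not the ragfit implementation."""

    def __init__(self, **kwargs):
        self.properties = kwargs  # assumed: the step's configuration
        self.step_hash = None

    def calc_hash(self):
        # Sort keys so that equal properties always serialize identically.
        payload = json.dumps(
            {"class": type(self).__name__, **self.properties},
            sort_keys=True,
            default=str,
        )
        self.step_hash = hashlib.sha256(payload.encode("utf-8")).hexdigest()

    def get_hash(self):
        # Assumed behaviour: compute the hash on first use, then reuse it.
        if self.step_hash is None:
            self.calc_hash()
        return self.step_hash


a = HashedStep(inputs=["main"], k=5)
b = HashedStep(k=5, inputs=["main"])   # same properties, different order
c = HashedStep(inputs=["main"], k=6)   # one property changed
```

In this sketch a and b share a hash while c does not, which is the property a cache needs in order to decide whether a step's earlier output can be reused.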
get_hash()
process(dataset_name, datasets, **kwargs)
process_inputs(datasets, **kwargs)
Run the step's process function for each dataset in inputs.
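The call flow described above (__call__, then process_inputs, then process once per input dataset) might look roughly like the following sketch. SketchStep and Uppercase are hypothetical classes written for this example, and it assumes that datasets is a dict keyed by dataset name and that inputs is a list of those names; the ragfit source may differ.

```python
class SketchStep:
    """Hypothetical stand-in for BaseStep; not the ragfit source."""

    def __init__(self, inputs):
        self.inputs = inputs  # assumed: names of the datasets to process

    def process(self, dataset_name, datasets, **kwargs):
        raise NotImplementedError

    def process_inputs(self, datasets, **kwargs):
        # Run process once for every dataset named in inputs.
        for dataset_name in self.inputs:
            self.process(dataset_name, datasets, **kwargs)

    def __call__(self, datasets, **kwargs):
        # Entry point used by the pipeline.
        self.process_inputs(datasets, **kwargs)


class Uppercase(SketchStep):
    def process(self, dataset_name, datasets, **kwargs):
        datasets[dataset_name] = [s.upper() for s in datasets[dataset_name]]


datasets = {"train": ["a", "b"], "test": ["c"], "extra": ["d"]}
Uppercase(inputs=["train", "test"])(datasets)
```

Only the datasets listed in inputs are touched here; "extra" is left unchanged even though it is available to the step.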
GlobalStep
Bases: BaseStep
Class representing a step in a processing pipeline, processing the entire dataset.
The function to override is process_all; it accepts the dataset and, if needed, all the other datasets.
Source code in ragfit/processing/step.py
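A whole-dataset step could be sketched as follows. GlobalStepSketch is a hypothetical stand-in so the example runs without ragfit, and the process_all(dataset, datasets, **kwargs) signature is an assumption inferred from the description above. DropDuplicates and the "text" field are also invented for illustration.

```python
class GlobalStepSketch:
    """Hypothetical stand-in for GlobalStep; not the ragfit source."""

    def __init__(self, inputs):
        self.inputs = inputs

    def process(self, dataset_name, datasets, **kwargs):
        # Hand the whole dataset to process_all and store what it returns.
        datasets[dataset_name] = self.process_all(
            datasets[dataset_name], datasets, **kwargs
        )

    def process_all(self, dataset, datasets, **kwargs):
        return dataset

    def __call__(self, datasets, **kwargs):
        for dataset_name in self.inputs:
            self.process(dataset_name, datasets, **kwargs)


class DropDuplicates(GlobalStepSketch):
    """Needs to see every item at once, so a per-item step would not fit."""

    def process_all(self, dataset, datasets, **kwargs):
        seen = set()
        unique = []
        for item in dataset:
            if item["text"] not in seen:
                seen.add(item["text"])
                unique.append(item)
        return unique


datasets = {"main": [{"text": "a"}, {"text": "b"}, {"text": "a"}]}
DropDuplicates(inputs=["main"])(datasets)
```

Deduplication is a natural fit for this kind of step because the decision about one item depends on the others.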
LocalStep
Bases: BaseStep
Class representing a step in a processing pipeline, processing individual examples.
The function to override is process_item; it accepts an item, its index, and, if needed, all the other datasets.
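A per-example step could be sketched like this. LocalStepSketch is a hypothetical stand-in that loops over a plain list, and the process_item(item, index, datasets, **kwargs) signature is an assumption based on the description above. AddPosition and the dataset names "main" and "refs" exist only for the example.

```python
class LocalStepSketch:
    """Hypothetical stand-in for LocalStep; not the ragfit source."""

    def __init__(self, inputs):
        self.inputs = inputs

    def process(self, dataset_name, datasets, **kwargs):
        # Apply process_item to each example, passing its index along.
        datasets[dataset_name] = [
            self.process_item(item, index, datasets, **kwargs)
            for index, item in enumerate(datasets[dataset_name])
        ]

    def process_item(self, item, index, datasets, **kwargs):
        return item

    def __call__(self, datasets, **kwargs):
        for dataset_name in self.inputs:
            self.process(dataset_name, datasets, **kwargs)


class AddPosition(LocalStepSketch):
    def process_item(self, item, index, datasets, **kwargs):
        # The other datasets remain reachable through the datasets argument.
        return {**item, "position": index, "n_refs": len(datasets["refs"])}


datasets = {"main": [{"q": "x"}, {"q": "y"}], "refs": ["r1", "r2", "r3"]}
AddPosition(inputs=["main"])(datasets)
```

Each item is transformed independently, so the subclass only has to describe what happens to a single example.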