Output
HFHubOutput
Bases: GlobalStep
Simple class to output the dataset to Hugging Face Hub.
Caching is disabled because this step does not modify the dataset, so there is nothing to cache.
Source code in ragfit/processing/global_steps/output.py
__init__(hfhub_tag, private=True, **kwargs)
Parameters:
- hfhub_tag (str) – Tag for the Hugging Face Hub.
- private (bool, default: True) – Whether the dataset should be private or not. Default is True.
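A minimal sketch of how these two parameters might be used, assuming the step forwards them to the dataset's `push_to_hub` method from the Hugging Face `datasets` library. The page does not show the step's body, so `output_to_hub` and `RecordingDataset` are hypothetical names used only for illustration; the stand-in class replaces a real dataset so the example runs without network access.

```python
class RecordingDataset:
    """Hypothetical stand-in for a datasets.Dataset; it only records the call."""

    def __init__(self):
        self.calls = []

    def push_to_hub(self, repo_id, private=False):
        # A real dataset would upload here; the stand-in just records the arguments.
        self.calls.append((repo_id, private))


def output_to_hub(dataset, hfhub_tag, private=True):
    """Assumed behaviour: upload the dataset unchanged under the given tag."""
    dataset.push_to_hub(hfhub_tag, private=private)


if __name__ == "__main__":
    ds = RecordingDataset()
    output_to_hub(ds, "my-user/my-dataset")  # hypothetical repository tag
    print(ds.calls)
```

Because `private` defaults to True, an upload stays private unless the caller explicitly passes `private=False`.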
OutputData
Bases: GlobalStep
Simple class to output the dataset to a jsonl file.
Caching is disabled because this step does not modify the dataset, so there is nothing to cache.
__init__(prefix, filename=None, directory=None, **kwargs)
Parameters:
- prefix (str) – Prefix for the output.
- filename (str, default: None) – Name of the output file. If not provided, the output file name will be generated based on the prefix and dataset name.
- directory (str, default: None) – Directory to save the output file. If not provided, the output file will be saved in the current directory.
The output name is {prefix}-{dataset_keyname/filename}.jsonl if filename is not provided.
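The naming rule can be sketched as follows. This is an illustration under stated assumptions, not the step's actual code: `build_output_path` and `write_jsonl` are hypothetical helpers, and the sketch assumes that a provided filename takes the place of the dataset key name in the pattern and that the directory is created if it does not exist.

```python
import json
import os
import tempfile


def build_output_path(prefix, dataset_keyname, filename=None, directory=None):
    """Build the output path following the {prefix}-{name}.jsonl pattern."""
    # Assumption: a provided filename replaces the dataset key name.
    stem = filename if filename else dataset_keyname
    name = f"{prefix}-{stem}.jsonl"
    # Without a directory, the file lands in the current directory.
    return os.path.join(directory, name) if directory else name


def write_jsonl(records, path):
    """Write one JSON object per line, creating the parent directory if needed."""
    parent = os.path.dirname(path)
    if parent:
        os.makedirs(parent, exist_ok=True)
    with open(path, "w", encoding="utf-8") as handle:
        for record in records:
            handle.write(json.dumps(record) + "\n")


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        path = build_output_path("processed", "train", directory=tmp)
        write_jsonl([{"query": "q1"}, {"query": "q2"}], path)
        print(os.path.basename(path))
```

With prefix "processed" and dataset key "train", the sketch produces a file named processed-train.jsonl inside the chosen directory.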