Implementation of RAFT: Adapting Language Model to Domain Specific RAG.
For each item, this step outputs only negative documents with probability raft_p,
and positive documents followed by negative documents with probability 1 - raft_p.
Zhang, Tianjun, Shishir G. Patil, Naman Jain, Sheng Shen, Matei Zaharia,
Ion Stoica, and Joseph E. Gonzalez. 2024. “RAFT: Adapting Language Model
to Domain Specific RAG.” arXiv. http://arxiv.org/abs/2403.10131.
Source code in ragfit/processing/local_steps/raft.py
    import random

    # LocalStep is the ragfit base class this step extends; its import is not shown in this listing.


    class RAFTStep(LocalStep):
        """
        Implementation of RAFT: Adapting Language Model to Domain Specific RAG.

        This class compiles a list of negative documents with probability `raft_p`,
        and a combination of positive and negative documents with probability 1 - `raft_p`.

        Zhang, Tianjun, Shishir G. Patil, Naman Jain, Sheng Shen, Matei Zaharia,
        Ion Stoica, and Joseph E. Gonzalez. 2024. “RAFT: Adapting Language Model
        to Domain Specific RAG.” arXiv. http://arxiv.org/abs/2403.10131.
        """

        def __init__(
            self,
            k: int = 5,
            raft_p=0.5,
            neg_docs_num=5,
            positive_key="positive_passages",
            negative_key="negative_passages",
            output_key="docs",
            **kwargs,
        ):
            """
            Args:
                k (int): The number of positive passages to consider. Defaults to 5.
                raft_p (float, optional): The probability of using only negative passages. Defaults to 0.5.
                neg_docs_num (int, optional): The number of negative passages to consider. Defaults to 5.
                positive_key (str, optional): The key containing the positive passages. Defaults to "positive_passages".
                negative_key (str, optional): The key containing the negative passages. Defaults to "negative_passages".
                output_key (str, optional): The key to store the output. Defaults to "docs".
            """
            super().__init__(**kwargs)
            self.k = k
            self.raft_p = raft_p
            self.neg_docs_num = neg_docs_num
            self.positive_key = positive_key
            self.negative_key = negative_key
            self.output_key = output_key

        def process_item(self, item: dict, index, datasets, **kwargs):
            docs_pos = item[self.positive_key]
            docs_neg = item.get(self.negative_key, [])

            # Draw p uniformly from [0, 1); positives are included only when p > raft_p.
            p = random.random()  # nosec
            oracle = 0

            if p > self.raft_p:
                docs = docs_pos[: self.k] + docs_neg[: self.neg_docs_num]
            else:
                docs = docs_neg[: self.neg_docs_num]

            item[self.output_key] = docs

            return item
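The selection rule in process_item can be tried without installing ragfit. The following is a minimal sketch, not the library's API: `select_raft_docs` and its `rng` parameter are hypothetical names introduced for illustration, and the sample passages are made up. It mirrors the comparison `p > raft_p` from the listing, so positives appear with probability of roughly 1 - raft_p.

```python
import random


def select_raft_docs(item, k=5, raft_p=0.5, neg_docs_num=5, rng=random):
    """Hypothetical stand-alone mirror of the selection rule (not ragfit API)."""
    docs_pos = item["positive_passages"]
    docs_neg = item.get("negative_passages", [])

    # Draw p uniformly from [0, 1); p > raft_p holds with probability about 1 - raft_p.
    p = rng.random()
    if p > raft_p:
        # Positives first, then the negatives acting as distractors.
        return docs_pos[:k] + docs_neg[:neg_docs_num]
    # Negatives only: the item carries no positive passage.
    return docs_neg[:neg_docs_num]


# Made-up example item using the default key names from the listing.
item = {
    "positive_passages": ["pos-1", "pos-2"],
    "negative_passages": ["neg-1", "neg-2", "neg-3"],
}

# With raft_p=1.0, p is always below raft_p, so only negatives are returned.
only_negatives = select_raft_docs(item, raft_p=1.0, neg_docs_num=2)

# Empirical check with a seeded generator: with raft_p=0.3,
# about 70% of the draws should include the positive passages.
rng = random.Random(0)
trials = 10_000
with_positives = sum(
    "pos-1" in select_raft_docs(item, raft_p=0.3, rng=rng) for _ in range(trials)
)
fraction_with_positives = with_positives / trials
```

Passing a seeded `random.Random` instance keeps the check reproducible; the listing itself calls the module-level `random.random()`, so its output varies between runs unless the global seed is fixed.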