nlp_architect.pipelines package

Submodules

nlp_architect.pipelines.spacy_bist module

class nlp_architect.pipelines.spacy_bist.SpacyBISTParser(verbose=False, spacy_model='en', bist_model=None)

Bases: object

Main class which handles parsing with Spacy-BIST parser.

Parameters:
	verbose (bool, optional) – Controls output verbosity.
	spacy_model (str, optional) – Spacy model to use (see https://spacy.io/api/top-level#spacy.load).
	bist_model (str, optional) – Path to a .model file to load. Defaults to the pre-trained model.

dir = PosixPath('/home/runner/nlp-architect/cache/bist-pretrained')

parse(doc_text, show_tok=True, show_doc=True)

Parse a raw text document.

Parameters:
	doc_text (str) – Raw document text.
	show_tok (bool, optional) – Specifies whether to include token text in output.
	show_doc (bool, optional) – Specifies whether to include document text in output.
Returns:	The annotated document.
Return type:	CoreNLPDoc
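
A minimal usage sketch of the parse call, built only from the constructor and method signatures documented above. The wrapper name parse_document and its empty-input check are illustrative assumptions, not part of the library. The import is deferred so the snippet loads even where nlp_architect is not installed.

```python
def parse_document(text, bist_model=None, verbose=False):
    """Parse raw text with SpacyBISTParser and return the annotated document.

    Hypothetical helper: only SpacyBISTParser and parse() come from the library.
    """
    if not isinstance(text, str) or not text.strip():
        # Input validation is a choice of this sketch, not library behavior.
        raise ValueError("text must be a non-empty string")

    # Deferred import: nlp_architect is only needed when the helper is called.
    from nlp_architect.pipelines.spacy_bist import SpacyBISTParser

    parser = SpacyBISTParser(verbose=verbose, spacy_model="en", bist_model=bist_model)
    # show_tok / show_doc keep token text and document text in the output.
    return parser.parse(text, show_tok=True, show_doc=True)
```

Calling parse_document("The cat sat on the mat.") should return a CoreNLPDoc. When parsing many documents, it is likely cheaper to construct the parser once and reuse it rather than rebuilding it per call as this helper does.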

to_conll(doc_text)

Converts a document to CoNLL format with spacy POS tags.

Parameters:	doc_text (str) – raw document text.
Yields:	list of ConllEntry – The next sentence in the document in CoNLL format.
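
Because to_conll yields one list of entries per sentence, it can be consumed lazily like any generator. The sketch below counts the entries in each sentence. Both function names are hypothetical helpers, and the count includes whatever entries the parser emits for a sentence, which is an assumption about nothing beyond the documented "list of ConllEntry" contract.

```python
def sentence_lengths(conll_sentences):
    """Return the number of entries in each sentence of an iterable of sentences."""
    # Works on any iterable of lists, including the generator from to_conll().
    return [len(sentence) for sentence in conll_sentences]


def conll_sentence_lengths(text, bist_model=None):
    """Hypothetical wrapper: run to_conll() on raw text and count entries."""
    # Deferred import: nlp_architect is only needed when the helper is called.
    from nlp_architect.pipelines.spacy_bist import SpacyBISTParser

    parser = SpacyBISTParser(bist_model=bist_model)
    # to_conll() yields sentences one at a time; they are consumed here.
    return sentence_lengths(parser.to_conll(text))
```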

nlp_architect.pipelines.spacy_np_annotator module

class nlp_architect.pipelines.spacy_np_annotator.NPAnnotator(model, word_vocab, char_vocab, chunk_vocab, batch_size: int = 32)

Bases: object

Spacy-based noun phrase (NP) annotator that uses a models.SequenceChunker model for annotation.

Parameters:
	model (SequenceChunker) – a chunker model
	word_vocab (Vocabulary) – word-id vocabulary of the model
	char_vocab (Vocabulary) – char-id vocabulary of words of the model
	chunk_vocab (Vocabulary) – chunk tag vocabulary of the model
	batch_size (int, optional) – inference batch size

classmethod load(model_path: str, parameter_path: str, batch_size: int = 32, use_cudnn: bool = False)

Load an NPAnnotator from a trained model.

Parameters:
	model_path (str) – path to trained model
	parameter_path (str) – path to model parameters
	batch_size (int, optional) – inference batch size
	use_cudnn (bool, optional) – use GPU for inference (cuDNN cells)
Returns:	NPAnnotator instance with the loaded model
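
A short sketch of calling the load classmethod with the documented arguments. The helper name load_np_annotator and the up-front file checks are assumptions added for illustration; the library may perform its own validation.

```python
from pathlib import Path


def load_np_annotator(model_path, parameter_path, batch_size=32, use_cudnn=False):
    """Hypothetical wrapper around NPAnnotator.load with explicit path checks."""
    for label, path in (("model", model_path), ("parameter", parameter_path)):
        if not Path(path).is_file():
            # Fail early with a clear message; a choice of this sketch.
            raise FileNotFoundError(f"{label} file not found: {path}")

    # Deferred import: nlp_architect is only needed once the paths are valid.
    from nlp_architect.pipelines.spacy_np_annotator import NPAnnotator

    return NPAnnotator.load(
        model_path, parameter_path, batch_size=batch_size, use_cudnn=use_cudnn
    )
```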

class nlp_architect.pipelines.spacy_np_annotator.SpacyNPAnnotator(model_path, settings_path, spacy_model='en', batch_size=32, use_cudnn=False)

Bases: object

Simple Spacy pipe with NP extraction annotations

nlp_architect.pipelines.spacy_np_annotator.get_noun_phrases(doc: spacy.tokens.doc.Doc) → [<class 'spacy.tokens.span.Span'>]

Get noun phrase tags from a spacy annotated document.

Parameters:	doc (Doc) – a spacy type document
Returns:	a list of noun phrase Span objects

nlp_architect.pipelines.spacy_np_annotator.set_noun_phrases(doc: spacy.tokens.doc.Doc, nps: [<class 'spacy.tokens.span.Span'>]) → None

Set noun phrase tags on a spacy annotated document.

Parameters:
	doc (Doc) – a spacy type document
	nps ([Span]) – a list of Spans
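
The two functions above can be exercised together as a round trip: attach spans to a document, then read them back. The sketch below assumes spaCy and nlp_architect are installed. The helper names span_texts and round_trip_noun_phrases are hypothetical, and the spans are chosen by hand from token offsets rather than predicted by a chunker model.

```python
def span_texts(spans):
    """Return the text of each span-like object (anything with a .text attribute)."""
    return [span.text for span in spans]


def round_trip_noun_phrases(text, token_ranges):
    """Hypothetical helper: set noun phrases on a Doc, then read them back.

    token_ranges is a list of (start, end) token offsets, end exclusive.
    """
    # Deferred imports: spaCy and nlp_architect are only needed when called.
    import spacy
    from nlp_architect.pipelines.spacy_np_annotator import (
        get_noun_phrases,
        set_noun_phrases,
    )

    # A blank English pipeline is enough here: only tokenization is needed.
    nlp = spacy.blank("en")
    doc = nlp(text)

    # Slicing a Doc by token offsets produces Span objects.
    spans = [doc[start:end] for start, end in token_ranges]
    set_noun_phrases(doc, spans)
    return span_texts(get_noun_phrases(doc))
```

For example, round_trip_noun_phrases("The cat sat on the mat", [(0, 2), (4, 6)]) marks the first two and last two tokens as noun phrases and returns their texts.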
