nlp_architect.models package

Submodules

nlp_architect.models.bist_parser module

class nlp_architect.models.bist_parser.BISTModel(activation='tanh', lstm_layers=2, lstm_dims=125, pos_dims=25)[source]

Bases: object

BIST parser model class. This class handles training, prediction, loading and saving of a BIST parser model. After the model is initialized, it accepts a CoNLL formatted dataset as input, and learns to output dependencies for new input.

Parameters:
  • activation (str, optional) – Activation function to use.
  • lstm_layers (int, optional) – Number of LSTM layers to use.
  • lstm_dims (int, optional) – Number of LSTM dimensions to use.
  • pos_dims (int, optional) – Number of part-of-speech embedding dimensions to use.
model

The underlying LSTM model.

Type:MSTParserLSTM
params

Additional parameters and resources for the model.

Type:tuple
options

User model options.

Type:dict
fit(dataset, epochs=10, dev=None)[source]

Trains a BIST model on an annotated dataset in CoNLL file format.

Parameters:
  • dataset (str) – Path to input dataset for training, formatted in CoNLL/U format.
  • epochs (int, optional) – Number of learning iterations.
  • dev (str, optional) – Path to development dataset for conducting evaluations.
load(path)[source]

Loads and initializes a BIST model from file.

predict(dataset, evaluate=False)[source]

Runs inference with the BIST model on a dataset in CoNLL file format.

Parameters:
  • dataset (str) – Path to input CoNLL file.
  • evaluate (bool, optional) – Write prediction and evaluation files to dataset’s folder.
Returns:

The list of input sentences with predicted dependencies attached.

Return type:

res (list of list of ConllEntry)

predict_conll(dataset)[source]

Runs inference with the BIST model on a dataset in CoNLL object format.

Parameters:dataset (list of list of ConllEntry) – Input in the form of ConllEntry objects.
Returns:The list of input sentences with predicted dependencies attached.
Return type:res (list of list of ConllEntry)
save(path)[source]

Saves the BIST model to file.
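
A minimal usage sketch; the CoNLL file paths and hyperparameter values below are placeholders, not values taken from the source:

```python
from nlp_architect.models.bist_parser import BISTModel

# Hypothetical CoNLL-formatted dataset paths (placeholders, not from the source).
train_path = "train.conll"
dev_path = "dev.conll"

parser = BISTModel(activation="tanh", lstm_layers=2, lstm_dims=125, pos_dims=25)
parser.fit(train_path, epochs=10, dev=dev_path)

# Run inference on a new CoNLL file and persist the trained model.
parsed = parser.predict("test.conll")   # list of list of ConllEntry
parser.save("bist.model")
```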

nlp_architect.models.chunker module

class nlp_architect.models.chunker.SequenceChunker(use_cudnn=False)[source]

Bases: nlp_architect.models.chunker.SequenceTagger

A sequence chunker model written in TensorFlow (and Keras), based on the SequenceTagger model. This model uses only the chunking output of the underlying tagger.

predict(x, batch_size=1)[source]

Predict labels given x.

Parameters:
  • x – samples for inference
  • batch_size (int, optional) – forward pass batch size
Returns:

tuple of numpy arrays of chunk labels

class nlp_architect.models.chunker.SequencePOSTagger(use_cudnn=False)[source]

Bases: nlp_architect.models.chunker.SequenceTagger

A sequence POS tagger model written in TensorFlow (and Keras), based on the SequenceTagger model. This model uses only the POS tagging output of the underlying tagger.

predict(x, batch_size=1)[source]

Predict labels given x.

Parameters:
  • x – samples for inference
  • batch_size (int, optional) – forward pass batch size
Returns:

tuple of numpy arrays of POS labels

class nlp_architect.models.chunker.SequenceTagger(use_cudnn=False)[source]

Bases: object

A sequence tagging model for POS and chunk tags, written in TensorFlow (and Keras), based on the paper ‘Deep multi-task learning with low level tasks supervised at lower layers’. The model has 3 Bi-LSTM layers and outputs POS and chunk tags.

Parameters:use_cudnn (bool, optional) – use a GPU-based model (CuDNN cells)
build(vocabulary_size, num_pos_labels, num_chunk_labels, char_vocab_size=None, max_word_len=25, feature_size=100, dropout=0.5, classifier='softmax', optimizer=None)[source]

Build a chunker/POS model

Parameters:
  • vocabulary_size (int) – the size of the input vocabulary
  • num_pos_labels (int) – the number of POS labels
  • num_chunk_labels (int) – the number of chunk labels
  • char_vocab_size (int, optional) – character vocabulary size
  • max_word_len (int, optional) – max characters in a word
  • feature_size (int, optional) – feature size - determines the embedding/LSTM layer hidden state size
  • dropout (float, optional) – dropout rate
  • classifier (str, optional) – classifier layer, ‘softmax’ for softmax or ‘crf’ for conditional random fields classifier. default is ‘softmax’.
  • optimizer (tensorflow.python.training.optimizer.Optimizer, optional) – optimizer, if None will use default SGD (paper setup)
fit(x, y, batch_size=1, epochs=1, validation_data=None, callbacks=None)[source]

Fit the built model on the provided x and y samples.

Parameters:
  • x – x samples
  • y – y samples
  • batch_size (int, optional) – batch size
  • epochs (int, optional) – number of epochs to run before ending training process
  • validation_data (optional) – x and y samples to validate at the end of the epoch
  • callbacks (optional) – additional callbacks to run with fitting
load(filepath)[source]

Load model from disk

Parameters:filepath (str) – file name of model
load_embedding_weights(weights)[source]

Load word embedding weights into the model embedding layer

Parameters:weights (numpy.ndarray) – 2D matrix of word weights
predict(x, batch_size=1)[source]

Predict labels given x.

Parameters:
  • x – samples for inference
  • batch_size (int, optional) – forward pass batch size
Returns:

tuple of numpy arrays of pos and chunk labels

save(filepath)[source]

Save the model to disk

Parameters:filepath (str) – file name to save model
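
A minimal training sketch for SequenceTagger; the vocabulary/label sizes are illustrative and the data arrays are assumed to be pre-encoded elsewhere:

```python
from nlp_architect.models.chunker import SequenceTagger

# Illustrative sizes; the real values depend on the dataset vocabulary.
vocab_size, n_pos_labels, n_chunk_labels = 10000, 45, 23

tagger = SequenceTagger(use_cudnn=False)
tagger.build(vocabulary_size=vocab_size,
             num_pos_labels=n_pos_labels,
             num_chunk_labels=n_chunk_labels,
             classifier="softmax")

# x_train / y_train / x_test are assumed to be pre-encoded, padded arrays
# prepared elsewhere; they are placeholders in this sketch.
tagger.fit(x_train, y_train, batch_size=32, epochs=5)
pos_tags, chunk_tags = tagger.predict(x_test, batch_size=32)
tagger.save("chunker_model.h5")
```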

nlp_architect.models.cross_doc_sieves module

nlp_architect.models.cross_doc_sieves.run_entity_coref(topics: nlp_architect.common.cdc.topics.Topics, resources: nlp_architect.models.cross_doc_coref.system.sieves_container_init.SievesContainerInitialization) → List[nlp_architect.common.cdc.cluster.Clusters][source]

Running cross-document coreference evaluation on entity mentions.

Parameters:
  • topics (Topics) – The Topics (with mentions) to evaluate
  • resources (SievesContainerInitialization) – resources for running the evaluation

Returns:List of topics and mentions with predicted cross doc coref within each topic
Return type:Clusters
nlp_architect.models.cross_doc_sieves.run_event_coref(topics: nlp_architect.common.cdc.topics.Topics, resources: nlp_architect.models.cross_doc_coref.system.sieves_container_init.SievesContainerInitialization) → List[nlp_architect.common.cdc.cluster.Clusters][source]

Running cross-document coreference evaluation on event mentions.

Parameters:
  • topics (Topics) – The Topics (with mentions) to evaluate
  • resources (SievesContainerInitialization) – resources for running the evaluation

Returns:List of clusters and mentions with predicted cross doc coref within each topic
Return type:Clusters
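
A hedged sketch of how these two functions compose; building the Topics and SievesContainerInitialization objects depends on configuration not covered in this section:

```python
from nlp_architect.models.cross_doc_sieves import run_entity_coref, run_event_coref

# `topics` (nlp_architect.common.cdc.topics.Topics) and `resources`
# (SievesContainerInitialization) are assumed to be built elsewhere from the
# cross-document-coreference configuration; their setup is not shown here.
entity_clusters = run_entity_coref(topics, resources)
event_clusters = run_event_coref(topics, resources)

# Each returned element holds the predicted cross-document coreference
# clusters for a single topic.
for clusters in entity_clusters:
    print(clusters)
```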

nlp_architect.models.crossling_emb module

class nlp_architect.models.crossling_emb.Discriminator(input_data, Y, lr_ph)[source]

Bases: object

build_train_graph(disc_pred)[source]

Builds the training graph for the discriminator.

Parameters:disc_pred (object) – Discriminator instance

class nlp_architect.models.crossling_emb.Generator(src_ten, tgt_ten, emb_dim, batch_size, smooth_val, lr_ph, beta, vocab_size)[source]

Bases: object

build_train_graph(disc_pred)[source]

Builds the training graph for the generator.

Parameters:disc_pred (object) – Discriminator instance

class nlp_architect.models.crossling_emb.WordTranslator(hparams, src_vec, tgt_vec, vocab_size)[source]

Bases: object

Main network which performs cross-lingual embedding training

apply_procrustes(sess, final_pairs)[source]

Applies the Procrustes solution to the W matrix for a better mapping.

Parameters:
  • sess (tf.Session) – TensorFlow session
  • final_pairs (ndarray) – Array of pairs which are mutual neighbors

generate_xling_embed(sess, src_dict, tgt_dict, tgt_vec)[source]

Generates cross-lingual embeddings.

Parameters:sess (tf.Session) – TensorFlow session

static report_metrics(iters, n_words_proc, disc_cost_acc, tic)[source]

Reports metrics of how training is going

run(sess, local_lr)[source]

Runs the whole GAN.

Parameters:
  • sess (tf.Session) – TensorFlow session
  • local_lr (float) – Learning rate

run_discriminator(sess, local_lr)[source]

Runs the discriminator part of the GAN.

Parameters:
  • sess (tf.Session) – TensorFlow session
  • local_lr (float) – Learning rate

run_generator(sess, local_lr)[source]

Runs the generator part of the GAN.

Parameters:
  • sess (tf.Session) – TensorFlow session
  • local_lr (float) – Learning rate
Returns:Number of words processed
save_model(save_model, sess)[source]

Saves W in the mapper as a numpy array, based on the CSLS criterion.

Parameters:
  • save_model (bool) – Save the model if True
  • sess (tf.Session) – TensorFlow session

static set_lr(local_lr, drop_lr)[source]

Drops the learning rate based on the CSLS criterion.

Parameters:
  • local_lr (float) – Learning rate
  • drop_lr (bool) – Drop the learning rate by 2 if True
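
A hedged sketch of the training loop these methods imply; the hyperparameters, embedding inputs and iteration counts are assumptions, not values from the source:

```python
import tensorflow as tf

from nlp_architect.models.crossling_emb import WordTranslator

# `hparams`, `src_vec`, `tgt_vec` and `vocab_size` are assumed to be prepared
# beforehand (e.g. from loaded monolingual word vectors); they are
# placeholders in this sketch.
model = WordTranslator(hparams, src_vec, tgt_vec, vocab_size)
local_lr = 0.1  # illustrative starting learning rate

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    for epoch in range(5):            # illustrative epoch count
        model.run(sess, local_lr)     # adversarial step: discriminator + generator
    model.save_model(True, sess)      # save W (as a numpy array) per the CSLS criterion
```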

nlp_architect.models.intent_extraction module

class nlp_architect.models.intent_extraction.IntentExtractionModel[source]

Bases: object

Intent Extraction model base class (using tf.keras)

fit(x, y, epochs=1, batch_size=1, callbacks=None, validation=None)[source]

Train a model given input samples and target labels.

Parameters:
  • x – input samples
  • y – input sample labels
  • epochs (int, optional) – number of epochs to train
  • batch_size (int, optional) – batch size
  • callbacks (Callback, optional) – Keras compatible callbacks
  • validation (list of numpy.ndarray, optional) – optional validation data to be evaluated when training
input_shape

Get input shape

Type:tuple
load(path)[source]

Load a trained model

Parameters:path (str) – path to model file
load_embedding_weights(weights)[source]

Load word embedding weights into the model embedding layer

Parameters:weights (numpy.ndarray) – 2D matrix of word weights
predict(x, batch_size=1)[source]

Get the prediction of the model on given input

Parameters:
  • x – samples to run through the model
  • batch_size (int, optional) – batch size
Returns:

predicted values by the model

Return type:

numpy.ndarray

save(path, exclude=None)[source]

Save model to path

Parameters:
  • path (str) – path to save model
  • exclude (list, optional) – a list of object fields to exclude when saving
class nlp_architect.models.intent_extraction.MultiTaskIntentModel(use_cudnn=False)[source]

Bases: nlp_architect.models.intent_extraction.IntentExtractionModel

Multi-Task Intent and Slot tagging model (using tf.keras)

Parameters:use_cudnn (bool, optional) – use a GPU-based model (CuDNN cells)
build(word_length, num_labels, num_intent_labels, word_vocab_size, char_vocab_size, word_emb_dims=100, char_emb_dims=30, char_lstm_dims=30, tagger_lstm_dims=100, dropout=0.2)[source]

Build a model

Parameters:
  • word_length (int) – max word length (in characters)
  • num_labels (int) – number of slot labels
  • num_intent_labels (int) – number of intent classes
  • word_vocab_size (int) – word vocabulary size
  • char_vocab_size (int) – character vocabulary size
  • word_emb_dims (int, optional) – word embedding dimensions
  • char_emb_dims (int, optional) – character embedding dimensions
  • char_lstm_dims (int, optional) – character feature LSTM hidden size
  • tagger_lstm_dims (int, optional) – tagger LSTM hidden size
  • dropout (float, optional) – dropout rate
save(path)[source]

Save model to path

Parameters:path (str) – path to save model
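
A minimal sketch of building and training the multi-task model; all sizes and data arrays are illustrative assumptions:

```python
from nlp_architect.models.intent_extraction import MultiTaskIntentModel

model = MultiTaskIntentModel(use_cudnn=False)
model.build(word_length=12,        # all sizes here are illustrative
            num_labels=20,
            num_intent_labels=7,
            word_vocab_size=10000,
            char_vocab_size=80)

# x_train / y_train / x_test are assumed to be pre-encoded word/character ids
# and slot/intent labels prepared elsewhere.
model.fit(x_train, y_train, epochs=10, batch_size=32)
predictions = model.predict(x_test, batch_size=32)
model.save("intent_model.h5")
```
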
class nlp_architect.models.intent_extraction.Seq2SeqIntentModel[source]

Bases: nlp_architect.models.intent_extraction.IntentExtractionModel

Encoder Decoder Deep LSTM Tagger Model (using tf.keras)

build(vocab_size, tag_labels, token_emb_size=100, encoder_depth=1, decoder_depth=1, lstm_hidden_size=100, encoder_dropout=0.5, decoder_dropout=0.5)[source]

Build the model

Parameters:
  • vocab_size (int) – vocabulary size
  • tag_labels (int) – number of tag labels
  • token_emb_size (int, optional) – token embedding vector size
  • encoder_depth (int, optional) – number of encoder LSTM layers
  • decoder_depth (int, optional) – number of decoder LSTM layers
  • lstm_hidden_size (int, optional) – LSTM layers hidden size
  • encoder_dropout (float, optional) – encoder dropout
  • decoder_dropout (float, optional) – decoder dropout
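
A similar minimal sketch for the encoder-decoder tagger; sizes and data arrays are illustrative assumptions:

```python
from nlp_architect.models.intent_extraction import Seq2SeqIntentModel

seq2seq = Seq2SeqIntentModel()
seq2seq.build(vocab_size=10000,     # illustrative values
              tag_labels=20,
              encoder_depth=1,
              decoder_depth=1,
              lstm_hidden_size=100)

# x_train / y_train / x_test are assumed to be pre-encoded token-id and
# tag-label sequences prepared elsewhere.
seq2seq.fit(x_train, y_train, epochs=10, batch_size=32)
predicted_tags = seq2seq.predict(x_test, batch_size=32)
```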

nlp_architect.models.most_common_word_sense module

class nlp_architect.models.most_common_word_sense.MostCommonWordSense(epochs, batch_size, callback_args=None)[source]

Bases: object

build(input_dim)[source]
eval(valid_set)[source]
fit(train_set)[source]
get_outputs(valid_set)[source]
load(model_path)[source]
save(save_path)[source]
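
The methods above are listed without descriptions; the sketch below only strings them together in the obvious order, and the input dimension, dataset objects and file path are assumptions:

```python
from nlp_architect.models.most_common_word_sense import MostCommonWordSense

mcs = MostCommonWordSense(epochs=100, batch_size=32)  # illustrative hyperparameters
mcs.build(input_dim=600)  # illustrative feature-vector size

# train_set / valid_set are assumed to be prepared feature datasets.
mcs.fit(train_set)
results = mcs.eval(valid_set)
predictions = mcs.get_outputs(valid_set)
mcs.save("mcs_model.h5")  # placeholder path
```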

nlp_architect.models.ner_crf module

class nlp_architect.models.ner_crf.NERCRF(use_cudnn=False)[source]

Bases: object

Bi-LSTM NER model with CRF classification layer (tf.keras model)

Parameters:use_cudnn (bool, optional) – use cudnn LSTM cells
build(word_length, target_label_dims, word_vocab_size, char_vocab_size, word_embedding_dims=100, char_embedding_dims=16, tagger_lstm_dims=200, dropout=0.5)[source]

Build a NERCRF model

Parameters:
  • word_length (int) – max word length in characters
  • target_label_dims (int) – number of entity labels (for classification)
  • word_vocab_size (int) – word vocabulary size
  • char_vocab_size (int) – character vocabulary size
  • word_embedding_dims (int) – word embedding dimensions
  • char_embedding_dims (int) – character embedding dimensions
  • tagger_lstm_dims (int) – word tagger LSTM output dimensions
  • dropout (float) – dropout rate
fit(x, y, epochs=1, batch_size=1, callbacks=None, validation=None)[source]

Train a model given input samples and target labels.

Parameters:
  • x (numpy.ndarray or numpy.ndarray) – input samples
  • y (numpy.ndarray) – input sample labels
  • epochs (int, optional) – number of epochs to train
  • batch_size (int, optional) – batch size
  • callbacks (Callback, optional) – Keras compatible callbacks
  • validation (list of numpy.ndarray, optional) – optional validation data to be evaluated when training
load(path)[source]

Load model weights

Parameters:path (str) – path to load model from
load_embedding_weights(weights)[source]

Load word embedding weights into the model embedding layer

Parameters:weights (numpy.ndarray) – 2D matrix of word weights
predict(x, batch_size=1)[source]

Get the prediction of the model on given input

Parameters:
  • x (numpy.ndarray or numpy.ndarray) – input samples
  • batch_size (int, optional) – batch size
Returns:

predicted values by the model

Return type:

numpy.ndarray

save(path)[source]

Save model to path

Parameters:path (str) – path to save model weights
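
A minimal sketch of the NERCRF workflow; sizes, the embedding matrix and data arrays are illustrative assumptions:

```python
from nlp_architect.models.ner_crf import NERCRF

ner = NERCRF(use_cudnn=False)
ner.build(word_length=12,           # all sizes here are illustrative
          target_label_dims=9,
          word_vocab_size=20000,
          char_vocab_size=80)

# Optionally initialize the embedding layer from pre-trained vectors;
# `embedding_matrix` is assumed to be a 2D numpy array of word weights.
ner.load_embedding_weights(embedding_matrix)

# x_train / y_train / x_test are assumed to be pre-encoded word/character ids
# and entity labels prepared elsewhere.
ner.fit(x_train, y_train, epochs=5, batch_size=32)
predictions = ner.predict(x_test, batch_size=32)
ner.save("ner_model.h5")
```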

nlp_architect.models.np2vec module

class nlp_architect.models.np2vec.NP2vec(corpus, corpus_format='txt', mark_char='_', word_embedding_type='word2vec', sg=0, size=100, window=10, alpha=0.025, min_alpha=0.0001, min_count=5, sample=1e-05, workers=20, hs=0, negative=25, cbow_mean=1, iterations=15, min_n=3, max_n=6, word_ngrams=1, prune_non_np=True)[source]

Bases: object

Initialize the np2vec model, train it, save it and load it.

is_marked(s)[source]

Check if a string is marked.

Parameters:s (str) – string to check
classmethod load(np2vec_model_file, binary=False, word_ngrams=0, word2vec_format=True)[source]

Load the np2vec model.

Parameters:
  • np2vec_model_file (str) – the file containing the np2vec model to load
  • binary (bool) – whether the np2vec model to load is in binary format
  • word_ngrams (int {1,0}) – if 1, the np2vec model to load uses word vectors with subword (ngrams) information
  • word2vec_format (bool) – whether the model to load has been stored in the original word2vec format
Returns:

the loaded np2vec model

save(np2vec_model_file='np2vec.model', binary=False, word2vec_format=True)[source]

Save the np2vec model.

Parameters:
  • np2vec_model_file (str) – the file in which to save the np2vec model
  • binary (bool) – whether to save the np2vec model in binary format
  • word2vec_format (bool) – whether to save the model in the original word2vec format
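
A minimal sketch of training, saving and reloading an NP2vec model; the corpus path is a placeholder:

```python
from nlp_architect.models.np2vec import NP2vec

# The corpus is assumed to be a text file in which noun phrases have already
# been marked (their tokens joined with mark_char); the path is a placeholder.
np2vec = NP2vec("corpus_marked.txt",
                corpus_format="txt",
                mark_char="_",
                word_embedding_type="word2vec")

np2vec.save("np2vec.model", binary=False, word2vec_format=True)

# Reload the trained vectors later.
model = NP2vec.load("np2vec.model", binary=False, word2vec_format=True)
```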

nlp_architect.models.np_semantic_segmentation module

class nlp_architect.models.np_semantic_segmentation.NpSemanticSegClassifier(num_epochs, callback_args, loss='binary_crossentropy', optimizer='adam', batch_size=128)[source]

Bases: object

NP Semantic Segmentation classifier model (based on tf.Keras framework).

Parameters:
  • num_epochs (int) – number of epochs to train the model
  • **callback_args (dict) – keyword arguments used to initialize a Callback for the model
  • loss – the model’s cost function. Default is tf.keras.losses.binary_crossentropy
  • optimizer (tf.keras.optimizers) – the model’s optimizer. Default is ‘adam’
build(input_dim)[source]

Build the model’s layers.

Parameters:input_dim (int) – the first layer’s input_dim

eval(test_set)[source]

Evaluate the model’s test_set on error_rate, test_accuracy_rate and precision_recall_rate

Parameters:test_set (numpy.ndarray) – The test set
Returns:loss, binary_accuracy, precision, recall and f1 measures
Return type:tuple(float)
fit(train_set)[source]

Train and fit the model on the datasets

Parameters:
  • train_set (numpy.ndarray) – The train set
  • args – callback_args and epochs from ArgParser input
get_outputs(test_set)[source]

Classify the dataset on the model

Parameters:test_set (numpy.ndarray) – The test set
Returns:model’s predictions
Return type:list(numpy.ndarray)
load(model_path)[source]

Load pre-trained model’s .h5 file to NpSemanticSegClassifier object

Parameters:model_path (str) – local path for loading the model
save(model_path)[source]

Save the model’s prm file in model_path location

Parameters:model_path (str) – local path for saving the model
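
A minimal sketch of the classifier workflow; the feature dimension, dataset objects and file path are assumptions:

```python
from nlp_architect.models.np_semantic_segmentation import NpSemanticSegClassifier

clf = NpSemanticSegClassifier(num_epochs=200, callback_args={})  # illustrative args
clf.build(input_dim=603)  # illustrative feature-vector size

# train_set / test_set are assumed to be prepared numpy datasets.
clf.fit(train_set)
loss, accuracy, precision, recall, f1_score = clf.eval(test_set)
predictions = clf.get_outputs(test_set)
clf.save("np_semantic_seg.h5")  # placeholder path
```
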
nlp_architect.models.np_semantic_segmentation.f1(y_true, y_pred)[source]
Parameters:
  • y_true – ground-truth labels
  • y_pred – predicted labels

Returns:F1 score (the harmonic mean of batch-wise precision and recall)

nlp_architect.models.np_semantic_segmentation.precision_score(y_true, y_pred)[source]

Precision metric.

Only computes a batch-wise average of precision.

Computes the precision, a metric for multi-label classification of how many selected items are relevant.

nlp_architect.models.np_semantic_segmentation.recall_score(y_true, y_pred)[source]

Recall metric.

Only computes a batch-wise average of recall.

Computes the recall, a metric for multi-label classification of how many relevant items are selected.

nlp_architect.models.pretrained_models module

class nlp_architect.models.pretrained_models.AbsaModel[source]

Bases: nlp_architect.models.pretrained_models.PretrainedModel

Download and process (unzip) pre-trained ABSA model

files = ['rerank_model.h5']
sub_path = 'models/absa/'
class nlp_architect.models.pretrained_models.BistModel[source]

Bases: nlp_architect.models.pretrained_models.PretrainedModel

Download and process (unzip) pre-trained BIST model

files = ['bist-pretrained.zip']
sub_path = 'models/dep_parse/'
class nlp_architect.models.pretrained_models.ChunkerModel[source]

Bases: nlp_architect.models.pretrained_models.PretrainedModel

Download and process (unzip) pre-trained Chunker model

files = ['model.h5', 'model_info.dat.params']
sub_path = 'models/chunker/'
class nlp_architect.models.pretrained_models.IntentModel[source]

Bases: nlp_architect.models.pretrained_models.PretrainedModel

Download and process (unzip) pre-trained Intent model

files = ['model_info.dat', 'model.h5']
sub_path = 'models/intent/'
class nlp_architect.models.pretrained_models.MrcModel[source]

Bases: nlp_architect.models.pretrained_models.PretrainedModel

Download and process (unzip) pre-trained MRC model

files = ['mrc_data.zip', 'mrc_model.zip']
sub_path = 'models/mrc/'
class nlp_architect.models.pretrained_models.NerModel[source]

Bases: nlp_architect.models.pretrained_models.PretrainedModel

Download and process (unzip) pre-trained NER model

files = ['model_v4.h5', 'model_info_v4.dat']
sub_path = 'models/ner/'
class nlp_architect.models.pretrained_models.PretrainedModel(model_name, sub_path, files)[source]

Bases: object

Generic class to download the pre-trained models

Usage Example:

chunker = ChunkerModel.get_instance()
chunker2 = ChunkerModel.get_instance()
print(chunker, chunker2)
print("Local File path = ", chunker.get_file_path())
files_models = chunker2.get_model_files()
for idx, file_name in enumerate(files_models):
    print(str(idx) + ": " + file_name)
get_file_path()[source]

Return local file path of downloaded model files

classmethod get_instance()[source]

Static instance access method.

Parameters:cls (class) – Calling class

get_model_files()[source]

Return individual file names of downloaded models

nlp_architect.models.tagging module

class nlp_architect.models.tagging.InputFeatures(input_ids, char_ids, shape_ids, mask=None, label_id=None)[source]

Bases: object

A single set of features of data.

class nlp_architect.models.tagging.NeuralTagger(embedder_model, word_vocab: nlp_architect.utils.text.Vocabulary, labels: List[str] = None, use_crf: bool = False, device: str = 'cpu', n_gpus=0)[source]

Bases: nlp_architect.models.TrainableModel

Simple neural tagging model. Supports PyTorch embedder models, multi-GPU training, and knowledge distillation (KD) from teacher models.

Parameters:
  • embedder_model – pytorch embedder model (valid nn.Module model)
  • word_vocab (Vocabulary) – word vocabulary
  • labels (List, optional) – list of labels. Defaults to None
  • use_crf (bool, optional) – use CRF as the classifier (instead of Softmax). Defaults to False.
  • device (str, optional) – device backend. Defaults to ‘cpu’.
  • n_gpus (int, optional) – number of gpus. Defaults to 0.
static batch_mapper(batch)[source]

Map batch to correct input names

convert_to_tensors(examples: List[nlp_architect.data.sequential_tagging.TokenClsInputExample], max_seq_length: int = 128, max_word_length: int = 12, pad_id: int = 0, labels_pad_id: int = 0, include_labels: bool = True) → torch.utils.data.dataset.TensorDataset[source]

Convert examples to valid tagger dataset

Parameters:
  • examples (List[TokenClsInputExample]) – List of examples
  • max_seq_length (int, optional) – max words per sentence. Defaults to 128.
  • max_word_length (int, optional) – max characters in a word. Defaults to 12.
  • pad_id (int, optional) – padding int id. Defaults to 0.
  • labels_pad_id (int, optional) – labels padding id. Defaults to 0.
  • include_labels (bool, optional) – include labels in dataset. Defaults to True.
Returns:

TensorDataset for given examples

Return type:

TensorDataset

evaluate(data_set: torch.utils.data.dataloader.DataLoader)[source]

Run evaluation on given dataloader

Parameters:data_set (DataLoader) – a data loader to run evaluation on
Returns:logits, labels (if labels are given)
evaluate_predictions(logits, label_ids)[source]

Evaluate given logits on truth labels

Parameters:
  • logits – logits of model
  • label_ids – truth label ids
Returns:

dictionary containing P/R/F1 metrics

Return type:

dict

extract_labels(label_ids, logits)[source]
get_logits(batch)[source]

get model logits from given input

get_optimizer(opt_fn=None, lr: int = 0.001)[source]

Get default optimizer

Parameters:lr (int, optional) – learning rate. Defaults to 0.001.
Returns:optimizer
Return type:torch.optim.Optimizer
inference(examples: List[nlp_architect.data.sequential_tagging.TokenClsInputExample], batch_size: int = 64)[source]

Do inference on given examples

Parameters:
  • examples (List[TokenClsInputExample]) – examples
  • batch_size (int, optional) – batch size. Defaults to 64.
Returns:

a list of tuples of tokens, tags predicted by model

Return type:

List(tuple)

classmethod load_model(model_path: str)[source]

Load a tagger model from given path

Parameters:model_path (str) – model path
Returns:tagger model loaded from path
Return type:NeuralTagger
save_model(output_dir: str)[source]

Save model to path

Parameters:output_dir (str) – output directory
to(device='cpu', n_gpus=0)[source]

Put model on given device

Parameters:
  • device (str, optional) – device backend. Defaults to ‘cpu’.
  • n_gpus (int, optional) – number of gpus. Defaults to 0.
train(train_data_set: torch.utils.data.dataloader.DataLoader, dev_data_set: torch.utils.data.dataloader.DataLoader = None, test_data_set: torch.utils.data.dataloader.DataLoader = None, epochs: int = 3, batch_size: int = 8, optimizer=None, max_grad_norm: float = 5.0, logging_steps: int = 50, save_steps: int = 100, save_path: str = None, distiller: nlp_architect.nn.torch.distillation.TeacherStudentDistill = None, best_result_file: str = None, word_dropout: float = 0)[source]

Train a tagging model

Parameters:
  • train_data_set (DataLoader) – train examples dataloader. If a distiller object is provided, train examples should contain a tuple of student/teacher data examples.
  • dev_data_set (DataLoader, optional) – dev examples dataloader. Defaults to None.
  • test_data_set (DataLoader, optional) – test examples dataloader. Defaults to None.
  • epochs (int, optional) – num of epochs to train. Defaults to 3.
  • batch_size (int, optional) – batch size. Defaults to 8.
  • optimizer (fn, optional) – optimizer function. Defaults to default model optimizer.
  • max_grad_norm (float, optional) – max gradient norm. Defaults to 5.0.
  • logging_steps (int, optional) – number of steps between logging. Defaults to 50.
  • save_steps (int, optional) – number of steps between model saves. Defaults to 100.
  • save_path (str, optional) – model output path. Defaults to None.
  • distiller (TeacherStudentDistill, optional) – KD model for training the model using a teacher model. Defaults to None.
  • best_result_file (str, optional) – path to save best dev results when it’s updated.
  • word_dropout (float, optional) – whole-word (-> oov) dropout rate. Defaults to 0.
update_best_model(dev_data_set, test_data_set, best_dev, best_dev_test, best_result_file, loss, epoch, save_path=None)[source]
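
A hedged end-to-end sketch of NeuralTagger; the embedder module, vocabulary, label list and example lists are assumed to be built elsewhere, and paths and hyperparameters are illustrative:

```python
from torch.utils.data import DataLoader, RandomSampler

from nlp_architect.models.tagging import NeuralTagger

# `embedder`, `word_vocab`, `label_list` and the TokenClsInputExample lists
# (train_examples / dev_examples) are assumed to be prepared beforehand.
tagger = NeuralTagger(embedder, word_vocab, labels=label_list,
                      use_crf=True, device="cpu", n_gpus=0)

train_ds = tagger.convert_to_tensors(train_examples, max_seq_length=128)
dev_ds = tagger.convert_to_tensors(dev_examples, max_seq_length=128)

train_dl = DataLoader(train_ds, sampler=RandomSampler(train_ds), batch_size=8)
dev_dl = DataLoader(dev_ds, batch_size=8)

tagger.train(train_dl, dev_data_set=dev_dl, epochs=3, batch_size=8,
             save_path="tagger_out")
tagger.save_model("tagger_out")

# Later: reload the model and run inference on new examples.
loaded = NeuralTagger.load_model("tagger_out")
predictions = loaded.inference(dev_examples, batch_size=64)
```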

nlp_architect.models.temporal_convolutional_network module

class nlp_architect.models.temporal_convolutional_network.CommonLayers[source]

Bases: object

Class that contains the common layers for language modeling: word embeddings and projection layer.
define_input_layer(input_placeholder_tokens, word_embeddings, embeddings_trainable=True)[source]

Define the input word embedding layer.

Parameters:
  • input_placeholder_tokens (tf.placeholder) – input to the model
  • word_embeddings (numpy array, optional) – array to initialize the embeddings with
  • embeddings_trainable (bool) – whether or not to train the embedding table

Returns:Embeddings corresponding to the data in input placeholder
define_projection_layer(prediction, tied_weights=True)[source]

Define the output word embedding layer.

Parameters:
  • prediction (tf.tensor) – the prediction from the model
  • tied_weights (bool) – whether or not to tie weights from the input embedding layer

Returns:Probability distribution over vocabulary
class nlp_architect.models.temporal_convolutional_network.TCN(max_len, n_features_in, hidden_sizes, kernel_size=7, dropout=0.2)[source]

Bases: object

This class defines the core TCN architecture. This is only the base class; the training strategy is not implemented.

build_network_graph(x, last_timepoint=False)[source]

Given the input placeholder x, build the entire TCN graph.

Parameters:
  • x – Input placeholder
  • last_timepoint (bool) – Whether or not to select only the last timepoint to output

Returns:output of the TCN
build_train_graph(*args, **kwargs)[source]

Placeholder for defining training losses and metrics

calculate_receptive_field()[source]

Returns:The receptive field of the network

run(*args, **kwargs)[source]

Placeholder for defining training strategy
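
TCN only builds the graph, so a concrete model supplies the input placeholder and training strategy. A minimal hedged sketch of building the graph; the placeholder shape and layer sizes are assumptions:

```python
import tensorflow as tf

from nlp_architect.models.temporal_convolutional_network import TCN

max_len, n_features = 100, 300               # illustrative sequence/feature sizes
tcn = TCN(max_len, n_features,
          hidden_sizes=[64, 64, 64],          # illustrative channels per level
          kernel_size=7, dropout=0.2)

# TF1-style graph code: the placeholder shape (batch, time, features) is an
# assumption for this sketch.
x = tf.placeholder(tf.float32, [None, max_len, n_features])
output = tcn.build_network_graph(x, last_timepoint=True)

print(tcn.calculate_receptive_field())       # receptive field of the dilated stack
```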

class nlp_architect.models.temporal_convolutional_network.WeightNorm(layer, data_init=False, **kwargs)[source]

Bases: tensorflow.python.keras.layers.wrappers.Wrapper

This wrapper reparameterizes a layer by decoupling the weight’s magnitude and direction. This speeds up convergence by improving the conditioning of the optimization problem.

Weight Normalization: A Simple Reparameterization to Accelerate Training of Deep Neural Networks: https://arxiv.org/abs/1602.07868 Tim Salimans, Diederik P. Kingma (2016)

WeightNorm wrapper works for keras and tf layers.

```python
net = WeightNorm(tf.keras.layers.Conv2D(2, 2, activation='relu'),
                 input_shape=(32, 32, 3), data_init=True)(x)
net = WeightNorm(tf.keras.layers.Conv2D(16, 5, activation='relu'),
                 data_init=True)(net)
net = WeightNorm(tf.keras.layers.Dense(120, activation='relu'),
                 data_init=True)(net)
net = WeightNorm(tf.keras.layers.Dense(n_classes),
                 data_init=True)(net)
```

Parameters:
  • layer – a layer instance.
  • data_init – If True use data dependent variable initialization
Raises:
  • ValueError – If not initialized with a Layer instance.
  • ValueError – If Layer does not contain a kernel of weights
  • NotImplementedError – If data_init is True and running graph execution
build(input_shape)[source]

Build Layer

call(inputs)[source]

Call Layer

compute_output_shape(input_shape)[source]

Computes the output shape of the layer.

If the layer has not been built, this method will call build on the layer. This assumes that the layer will later be used with inputs that match the input shape provided here.

Parameters:input_shape – Shape tuple (tuple of integers) or list of shape tuples (one per output tensor of the layer). Shape tuples can include None for free dimensions, instead of an integer.
Returns:An output shape tuple.

Module contents

class nlp_architect.models.TrainableModel[source]

Bases: abc.ABC

Base class for a trainable model

convert_to_tensors(*args, **kwargs)[source]

convert any chosen input to a valid tensor format for the model

get_logits(*args, **kwargs)[source]

get model logits from given input

inference(*args, **kwargs)[source]

run inference

load_model(*args, **kwargs)[source]

load a model

save_model(*args, **kwargs)[source]

save the model

train(*args, **kwargs)[source]

train the model
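
A minimal sketch of what a subclass is expected to provide; the method bodies are illustrative stubs, not library code:

```python
from nlp_architect.models import TrainableModel


class MyTagger(TrainableModel):
    """Illustrative subclass sketching the abstract interface."""

    def convert_to_tensors(self, examples):
        ...  # turn raw examples into model-ready tensors

    def train(self, train_data, epochs=1):
        ...  # training loop

    def inference(self, examples):
        ...  # run the trained model on new examples

    def get_logits(self, batch):
        ...  # forward pass returning logits

    def save_model(self, path):
        ...  # persist weights and configuration

    def load_model(self, path):
        ...  # restore weights and configuration
```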