nlp_architect.nn.torch.modules package
Submodules
nlp_architect.nn.torch.modules.embedders module
class nlp_architect.nn.torch.modules.embedders.CNNLSTM(word_vocab_size: int, num_labels: int, word_embedding_dims: int = 100, char_embedding_dims: int = 16, cnn_kernel_size: int = 3, cnn_num_filters: int = 128, lstm_hidden_size: int = 100, lstm_layers: int = 2, bidir: bool = True, dropout: float = 0.5, padding_idx: int = 0)[source]
Bases: torch.nn.modules.module.Module
CNN-LSTM embedder (based on Ma and Hovy, 2016).
Parameters:
- word_vocab_size (int) – word vocabulary size
- num_labels (int) – number of labels (classifier)
- word_embedding_dims (int, optional) – word embedding dims
- char_embedding_dims (int, optional) – character embedding dims
- cnn_kernel_size (int, optional) – character CNN kernel size
- cnn_num_filters (int, optional) – character CNN number of filters
- lstm_hidden_size (int, optional) – LSTM embedder hidden size
- lstm_layers (int, optional) – num of LSTM layers
- bidir (bool, optional) – apply bi-directional LSTM
- dropout (float, optional) – dropout rate
- padding_idx (int, optional) – padding number for embedding layers
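A minimal construction sketch, assuming a hypothetical 10,000-word vocabulary and 9 output labels; all other arguments keep the defaults shown in the signature above:

    from nlp_architect.nn.torch.modules.embedders import CNNLSTM

    # Hypothetical sizes: 10,000-word vocabulary, 9 tag labels
    model = CNNLSTM(word_vocab_size=10000, num_labels=9,
                    lstm_hidden_size=200)  # overriding one default for illustration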
forward(words, word_chars, **kwargs)[source]
CNN-LSTM forward step.
Parameters:
- words (torch.tensor) – word ID tensor
- word_chars (torch.tensor) – word character ID tensor
Returns: logits of model
Return type: torch.tensor
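A forward-pass sketch built on the model constructed above; the tensor shapes (batch, sentence length) for word IDs and (batch, sentence length, max word length) for character IDs, and the character-ID range, are assumptions about the expected input layout rather than anything stated in this reference:

    import torch

    batch_size, seq_len, max_word_len = 2, 20, 15  # assumed dimensions
    words = torch.randint(0, 10000, (batch_size, seq_len))                  # word IDs
    word_chars = torch.randint(0, 50, (batch_size, seq_len, max_word_len))  # character IDs (range is an assumption)

    logits = model(words, word_chars)  # invokes CNNLSTM.forward
    # logits holds per-token label scores suitable for a tagging loss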
classmethod from_config(word_vocab_size: int, num_labels: int, config: str)[source]
Load a model from a configuration file. A valid configuration file is a JSON file with fields as in the class __init__.
Parameters:
- word_vocab_size (int) – word vocabulary size
- num_labels (int) – number of labels (classifier)
- config (str) – path to configuration file
Returns: CNNLSTM module pre-configured
Return type: CNNLSTM
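A from_config sketch; the file name and field values are hypothetical, but the fields mirror the __init__ arguments documented above:

    from nlp_architect.nn.torch.modules.embedders import CNNLSTM

    # cnnlstm_config.json (hypothetical contents):
    # {"word_embedding_dims": 100, "char_embedding_dims": 16,
    #  "lstm_hidden_size": 200, "lstm_layers": 2, "bidir": true, "dropout": 0.5}
    model = CNNLSTM.from_config(word_vocab_size=10000, num_labels=9,
                                config="cnnlstm_config.json")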
class nlp_architect.nn.torch.modules.embedders.IDCNN(word_vocab_size: int, num_labels: int, word_embedding_dims: int = 100, shape_vocab_size: int = 4, shape_embedding_dims: int = 5, char_embedding_dims: int = 16, char_cnn_filters: int = 128, char_cnn_kernel_size: int = 3, cnn_kernel_size: int = 3, cnn_num_filters: int = 128, input_dropout: float = 0.35, middle_dropout: float = 0, hidden_dropout: float = 0.15, blocks: int = 1, dilations: List = None, embedding_pad_idx: int = 0, use_chars: bool = False, drop_penalty: float = 0.0001)[source]
Bases: torch.nn.modules.module.Module
ID-CNN (iterated dilated CNN) tagging model (based on Strubell et al., 2017) with word character embeddings (using CNN feature extractors).
Parameters:
- word_vocab_size (int) – word vocabulary size
- num_labels (int) – number of labels (classifier)
- word_embedding_dims (int, optional) – word embedding dims
- shape_vocab_size (int, optional) – shape vocabulary size
- shape_embedding_dims (int, optional) – shape embedding dims
- char_embedding_dims (int, optional) – character embedding dims
- char_cnn_filters (int, optional) – character CNN number of filters
- char_cnn_kernel_size (int, optional) – character CNN kernel size
- cnn_kernel_size (int, optional) – CNN embedder kernel size
- cnn_num_filters (int, optional) – CNN embedder number of filters
- input_dropout (float, optional) – input layer (embedding) dropout rate
- middle_dropout (float, optional) – middle layer dropout rate
- hidden_dropout (float, optional) – hidden layer dropout rate
- blocks (int, optional) – number of blocks
- dilations (List, optional) – list of dilations per CNN layer
- embedding_pad_idx (int, optional) – padding number for embedding layers
- use_chars (bool, optional) – whether to use char embedding, defaults to False
- drop_penalty (float, optional) – penalty for dropout regularization
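A minimal construction sketch for IDCNN, assuming a hypothetical vocabulary and label count; the dilation schedule shown is an illustrative choice, not a recommended setting:

    from nlp_architect.nn.torch.modules.embedders import IDCNN

    model = IDCNN(word_vocab_size=10000, num_labels=9,
                  dilations=[1, 2, 4],  # assumed per-layer dilation rates for the block
                  use_chars=True)       # enable the character CNN feature extractor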
forward(words, word_chars, shapes, no_dropout=False, **kwargs)[source]
IDCNN forward step.
Parameters:
- words (torch.tensor) – word ID tensor
- word_chars (torch.tensor) – word character ID tensor
- shapes (torch.tensor) – word shape ID tensor
Returns: logits of model
Return type: torch.tensor
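A forward-pass sketch for the IDCNN built above; the tensor shapes and the reading of no_dropout (disable dropout for this pass, e.g. when computing the drop_penalty regularizer) are assumptions, not documented behavior:

    import torch

    batch_size, seq_len, max_word_len = 2, 20, 15  # assumed dimensions
    words = torch.randint(0, 10000, (batch_size, seq_len))                  # word IDs
    word_chars = torch.randint(0, 50, (batch_size, seq_len, max_word_len))  # character IDs (range is an assumption)
    shapes = torch.randint(0, 4, (batch_size, seq_len))  # shape IDs (shape_vocab_size defaults to 4)

    logits = model(words, word_chars, shapes)
    clean_logits = model(words, word_chars, shapes, no_dropout=True)  # assumption: dropout skipped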
classmethod from_config(word_vocab_size: int, num_labels: int, config: str)[source]
Load a model from a configuration file. A valid configuration file is a JSON file with fields as in the class __init__.
Parameters:
- word_vocab_size (int) – word vocabulary size
- num_labels (int) – number of labels (classifier)
- config (str) – path to configuration file
Returns: IDCNN module pre-configured
Return type: IDCNN
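The same from_config pattern applies to IDCNN; the path and JSON contents below are hypothetical, with fields mirroring the __init__ arguments:

    from nlp_architect.nn.torch.modules.embedders import IDCNN

    # idcnn_config.json (hypothetical contents):
    # {"cnn_num_filters": 128, "blocks": 1, "dilations": [1, 2, 4], "use_chars": true}
    model = IDCNN.from_config(word_vocab_size=10000, num_labels=9,
                              config="idcnn_config.json")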