# BoxTransformer

BoxTransformer pipe, which uses BoxTransformerModule under the hood.

!!! note

    This module is a TrainablePipe and can be used in a Pipeline, while BoxTransformerModule is a standard PyTorch module, which does not take care of the preprocessing, collating, etc., of the input documents.
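As a rough illustration, the pipe is meant to be added to a pipeline rather than instantiated by hand. The sketch below assumes a spaCy-like `add_pipe(name, config=...)` API on the Pipeline object and an `edspdf` import path; real usage may also require an upstream box-embedding sub-component that this page does not list.

```python
# A minimal sketch, not a verbatim recipe: the factory name and parameter
# values come from this page, but the import path and pipeline setup are
# assumptions, and an upstream box-embedding component may be required.
import edspdf

pipeline = edspdf.Pipeline()
pipeline.add_pipe(
    "box-transformer",
    config={
        "num_heads": 2,         # attention heads per layer
        "n_layers": 2,          # Transformer layers
        "dropout_p": 0.1,       # the default is 0.0
        "init_resweight": 0.0,  # start each layer close to the identity
    },
)
```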

## Parameters

| Parameter | Description | Type | Default |
|-----------|-------------|------|---------|
| `pipeline` | Pipeline instance | `Pipeline` | `None` |
| `name` | Name of the component | `str` | `'box-transformer'` |
| `num_heads` | Number of attention heads in the attention layers | `int` | `2` |
| `n_relative_positions` | Maximum range of embeddable relative positions between boxes (further distances are capped to ±`n_relative_positions // 2`) | `Optional[int]` | `None` |
| `dropout_p` | Dropout probability, used both in the attention layers and in the embedding projections | `float` | `0.0` |
| `head_size` | Size of each attention head | `Optional[int]` | `None` |
| `activation` | Activation function used in the linear -> activation -> linear transformations | `ActivationFunction` | `'gelu'` |
| `init_resweight` | Initial weight of the residual gates. At 0, a layer initially acts as the identity function; at 1, as a standard Transformer layer. Initializing with a value close to 0 can help training converge (see the sketch after this table). | `float` | `0.0` |
| `attention_mode` | Modes of the relative-position-infused attention layers. See the relative attention documentation for more information. | `Sequence[Literal['c2c', 'c2p', 'p2c']]` | `('c2c', 'c2p', 'p2c')` |
| `n_layers` | Number of layers in the Transformer | `int` | `2` |
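To make the role of `init_resweight` concrete, here is a minimal PyTorch sketch of a gated residual connection with the behaviour described above. It is an illustration of the idea, not the library's actual implementation, and the `GatedResidual` name is hypothetical.

```python
import torch
from torch import nn


class GatedResidual(nn.Module):
    """Illustrative gated residual block; the class name is hypothetical."""

    def __init__(self, sublayer: nn.Module, init_resweight: float = 0.0):
        super().__init__()
        self.sublayer = sublayer
        # Learnable residual gate. At 0 the block computes the identity,
        # at 1 it behaves like a standard Transformer residual connection.
        self.resweight = nn.Parameter(torch.tensor(float(init_resweight)))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return x + self.resweight * self.sublayer(x)
```

With `init_resweight=0.0`, training starts from a stack of near-identity layers, which is the convergence-friendly initialization mentioned in the table.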