# BoxTransformer

BoxTransformer pipe, which uses BoxTransformerModule under the hood.

!!! note

    This module is a TrainablePipe and can be used in a Pipeline, while BoxTransformerModule is a standard PyTorch module, which does not take care of the preprocessing, collating, etc., of the input documents.
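As a rough illustration, the pipe is meant to be added to a pipeline rather than instantiated by hand. The sketch below assumes a spaCy-like `add_pipe(name, config=...)` API on the Pipeline object and an `edspdf` import path; real usage may also require an upstream box-embedding sub-component that this page does not list.

```python
# A minimal sketch, not a verbatim recipe: the factory name and parameter
# values come from this page, but the import path and pipeline setup are
# assumptions, and an upstream box-embedding component may be required.
import edspdf

pipeline = edspdf.Pipeline()
pipeline.add_pipe(
    "box-transformer",
    config={
        "num_heads": 2,         # attention heads per layer
        "n_layers": 2,          # Transformer layers
        "dropout_p": 0.1,       # the default is 0.0
        "init_resweight": 0.0,  # start each layer close to the identity
    },
)
```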

## Parameters

| Parameter | Description | Type | Default |
|-----------|-------------|------|---------|
| `pipeline` | Pipeline instance | `Pipeline` | `None` |
| `name` | Name of the component | `str` | `'box-transformer'` |
| `num_heads` | Number of attention heads in the attention layers | `int` | `2` |
| `n_relative_positions` | Maximum range of embeddable relative positions between boxes (further distances are capped to ±`n_relative_positions // 2`) | `Optional[int]` | `None` |
| `dropout_p` | Dropout probability, used both in the attention layers and in the embedding projections | `float` | `0.0` |
| `head_size` | Size of each attention head | `Optional[int]` | `None` |
| `activation` | Activation function used in the linear -> activation -> linear transformations | `ActivationFunction` | `'gelu'` |
| `init_resweight` | Initial weight of the residual gates. At 0, a layer initially acts as the identity function; at 1, as a standard Transformer layer. Initializing with a value close to 0 can help training converge (see the sketch after this table). | `float` | `0.0` |
| `attention_mode` | Modes of the relative-position-infused attention layers. See the relative attention documentation for more information. | `Sequence[Literal['c2c', 'c2p', 'p2c']]` | `('c2c', 'c2p', 'p2c')` |
| `n_layers` | Number of layers in the Transformer | `int` | `2` |
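To make the role of `init_resweight` concrete, here is a minimal PyTorch sketch of a gated residual connection with the behaviour described above. It is an illustration of the idea, not the library's actual implementation, and the `GatedResidual` name is hypothetical.

```python
import torch
from torch import nn


class GatedResidual(nn.Module):
    """Illustrative gated residual block; the class name is hypothetical."""

    def __init__(self, sublayer: nn.Module, init_resweight: float = 0.0):
        super().__init__()
        self.sublayer = sublayer
        # Learnable residual gate. At 0 the block computes the identity,
        # at 1 it behaves like a standard Transformer residual connection.
        self.resweight = nn.Parameter(torch.tensor(float(init_resweight)))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return x + self.resweight * self.sublayer(x)
```

With `init_resweight=0.0`, training starts from a stack of near-identity layers, which is the convergence-friendly initialization mentioned in the table.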