Attention layer
This card is a wrapper around the Keras Attention class (tf.keras.layers.Attention).
Note: the backend used to build and train neural networks is Keras. The documentation of this card is adapted from the documentation of the corresponding Keras class.
Inputs
- Use scale — Boolean
  If “true”, creates a scalar variable that scales the attention scores.
- Causal — Boolean
  Set to “true” for decoder self-attention. Adds a mask such that position i cannot attend to positions j > i. This prevents the flow of information from the future to the past (see the sketch after this list).
- Dropout — Float
  Between 0 and 1. Fraction of the units to drop for the attention scores.
- Input — NeuralNetworkTensor
  Input of this layer.
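For reference, here is a minimal sketch of how these inputs map onto the underlying Keras layer. It assumes TensorFlow 2.x; in recent releases the causal behavior is requested at call time via use_causal_mask, while older releases expose it as the causal constructor argument, which is what this card's Causal input corresponds to.

```python
import tensorflow as tf

# Minimal sketch, assuming TensorFlow 2.x. The card's inputs map onto
# tf.keras.layers.Attention as follows:
#   Use scale -> use_scale
#   Dropout   -> dropout
#   Causal    -> causal (older releases) / use_causal_mask (a call
#                argument in recent releases, used below)
query = tf.keras.Input(shape=(None, 64))  # (timesteps, features)
value = tf.keras.Input(shape=(None, 64))

layer = tf.keras.layers.Attention(use_scale=True, dropout=0.1)
output = layer([query, value], use_causal_mask=True)  # decoder self-attention mask

model = tf.keras.Model(inputs=[query, value], outputs=output)
model.summary()
```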
Outputs
- Layer instance — NeuralNetworkLayer
  Instance of this layer. It can be wrapped using a Bidirectional or a TimeDistributed wrapper.
- Output — NeuralNetworkTensor
  Output of this layer.
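As a quick illustration of the causal mask (a sketch, not part of the card itself; it assumes a TensorFlow release whose Attention layer supports the return_attention_scores and use_causal_mask call arguments), the attention score matrix becomes lower-triangular, so each step attends only to itself and earlier steps:

```python
import tensorflow as tf

# Sketch: inspect the attention scores produced with the causal mask.
x = tf.random.normal((1, 4, 8))  # (batch, timesteps, features)
layer = tf.keras.layers.Attention()

_, scores = layer([x, x], return_attention_scores=True, use_causal_mask=True)
# Entries above the diagonal are zero: position i ignores positions j > i.
print(tf.round(scores[0] * 100) / 100)
```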