Attention layer
This card is a wrapper around the Keras Attention class (tf.keras.layers.Attention).
Note: the backend used to build and train neural networks is Keras. The documentation of this card is adapted from the documentation of the corresponding Keras class.
Inputs
- Use scale — Boolean
  If “true”, creates a scalar variable that scales the attention scores.
- Causal — Boolean
  Set to “true” for decoder self-attention. Adds a mask such that position i cannot attend to positions j > i. This prevents the flow of information from the future to the past (see the sketch after this list).
- Dropout — Float
  Between 0 and 1. Fraction of the units to drop for the attention scores.
- Input — NeuralNetworkTensor
  Input of this layer.
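For reference, here is a minimal sketch of how these inputs map onto the underlying Keras layer. It assumes TensorFlow 2.x; in recent releases the causal behavior is requested at call time via use_causal_mask, while older releases expose it as the causal constructor argument, which is what this card's Causal input corresponds to.

```python
import tensorflow as tf

# Minimal sketch, assuming TensorFlow 2.x. The card's inputs map onto
# tf.keras.layers.Attention as follows:
#   Use scale -> use_scale
#   Dropout   -> dropout
#   Causal    -> causal (older releases) / use_causal_mask (a call
#                argument in recent releases, used below)
query = tf.keras.Input(shape=(None, 64))  # (timesteps, features)
value = tf.keras.Input(shape=(None, 64))

layer = tf.keras.layers.Attention(use_scale=True, dropout=0.1)
output = layer([query, value], use_causal_mask=True)  # decoder self-attention mask

model = tf.keras.Model(inputs=[query, value], outputs=output)
model.summary()
```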
Outputs
- Layer instance — NeuralNetworkLayer
  Instance of this layer. It can be wrapped using a Bidirectional or a TimeDistributed wrapper.
- Output — NeuralNetworkTensor
  Output of this layer.
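As a quick illustration of the causal mask (a sketch, not part of the card itself; it assumes a TensorFlow release whose Attention layer supports the return_attention_scores and use_causal_mask call arguments), the attention score matrix becomes lower-triangular, so each step attends only to itself and earlier steps:

```python
import tensorflow as tf

# Sketch: inspect the attention scores produced with the causal mask.
x = tf.random.normal((1, 4, 8))  # (batch, timesteps, features)
layer = tf.keras.layers.Attention()

_, scores = layer([x, x], return_attention_scores=True, use_causal_mask=True)
# Entries above the diagonal are zero: position i ignores positions j > i.
print(tf.round(scores[0] * 100) / 100)
```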