Structured State Space Sequence



under construction

Definition

A structured state space sequence model (S4) is a sequence model built on structured state-space models (SSMs) that combines an RNN-like recurrent formulation with a computation that can be parallelized during training. SEE Réseau à base de séquences d'espaces d'états structurés

French

séquences d'espaces d'états structurés

English

Structured State Space Sequence

S4

S4s, also known as structured SSMs, can be functionally similar to recurrent neural networks (RNNs): they can accept one token at a time and produce a linear combination of the current token and an embedding that represents all previous tokens. Unlike RNNs and their extensions, including LSTMs, but like transformers, they can also perform an equivalent computation in parallel during training. In addition, they are more computationally efficient than transformers: an S4's computation and memory requirements rise linearly with input size, while a vanilla transformer's rise quadratically, a heavy burden with long input sequences.
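
The sketch below illustrates the idea behind that paragraph: the same discretized linear state-space model can be evaluated token by token like an RNN, or all at once as a convolution whose kernel is derived from the state matrices, which is what makes parallel training possible. The matrices A, B, C and their values here are placeholder assumptions; the actual S4 model uses a structured, HiPPO-initialized A matrix and a learned discretization step, which this minimal example does not reproduce.

<syntaxhighlight lang="python">
import numpy as np

N = 4                      # state dimension (illustrative)
rng = np.random.default_rng(0)
A = 0.9 * np.eye(N) + 0.01 * rng.standard_normal((N, N))  # state transition (toy values)
B = rng.standard_normal((N, 1))                           # input projection
C = rng.standard_normal((1, N))                           # output projection

def ssm_recurrent(u):
    """RNN-like evaluation: one token at a time, O(L) steps, O(1) state."""
    x = np.zeros((N, 1))
    ys = []
    for u_k in u:
        x = A @ x + B * u_k            # the state summarizes all previous tokens
        ys.append((C @ x).item())      # output mixes current token and that summary
    return np.array(ys)

def ssm_convolutional(u):
    """Equivalent parallel evaluation as a convolution with kernel
    K = (CB, CAB, CA^2B, ...), usable during training."""
    L = len(u)
    K = np.array([(C @ np.linalg.matrix_power(A, k) @ B).item() for k in range(L)])
    return np.convolve(u, K)[:L]

u = rng.standard_normal(8)             # a toy input sequence
print(np.allclose(ssm_recurrent(u), ssm_convolutional(u)))   # True: both views agree
</syntaxhighlight>
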


Source

Source: arXiv

Contributors: Claude Coulombe, wiki