NLP Highlights podcast

129 - Transformers and Hierarchical Structure, with Shunyu Yao

02/07/2021
Duration: 35:43
In this episode, we talk to Shunyu Yao about recent insights into how transformers can represent hierarchical structure in language. Bounded-depth hierarchical structure is thought to be a key feature of natural languages, motivating Shunyu and his coauthors to show that transformers can efficiently represent bounded-depth Dyck languages, which can be thought of as a formal model of the structure of natural languages. We went on to discuss some of the intuitive ideas that emerge from the proofs, connections to RNNs, and insights about positional encodings that may have practical implications. More broadly, we also touched on the role of formal languages and other theoretical tools in modern NLP.

Papers discussed in this episode:
- Self-Attention Networks Can Process Bounded Hierarchical Languages (https://arxiv.org/abs/2105.11115)
- Theoretical Limitations of Self-Attention in Neural Sequence Models (https://arxiv.org/abs/1906.06755)
- RNNs can generate bounded hierarchical languages with optimal memory (https://arxiv.org/abs/2010.07515)
- On the Practical Computational Power of Finite Precision RNNs for Language Recognition (https://arxiv.org/abs/1805.04908)

Shunyu Yao's webpage: https://ysymyth.github.io/

The hosts for this episode are William Merrill and Matt Gardner.
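To make the central object concrete: a bounded-depth Dyck language consists of balanced bracket strings over k bracket types whose nesting depth never exceeds some bound D. A minimal membership checker can be sketched as follows (the function name and bracket encoding here are illustrative choices, not from the paper):

```python
def is_dyck_k_d(s, pairs, max_depth):
    """Return True if s is a balanced bracket string over the given
    (open, close) pairs whose nesting depth never exceeds max_depth."""
    opens = {o: c for o, c in pairs}   # open bracket -> required close
    closes = {c for _, c in pairs}
    stack = []
    for ch in s:
        if ch in opens:
            stack.append(opens[ch])        # remember the matching close
            if len(stack) > max_depth:     # depth bound exceeded
                return False
        elif ch in closes:
            if not stack or stack.pop() != ch:
                return False               # mismatched or stray close
        else:
            return False                   # symbol outside the alphabet
    return not stack                       # every bracket was closed
```

For example, with pairs `[("(", ")"), ("[", "]")]` and depth bound 2, the string `"([])"` is accepted while `"(([]))"` is rejected because it nests three levels deep. The depth bound is what makes the language tractable for fixed-size models: a finite-memory recognizer suffices, which is the setting the episode's results address.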
