6 posts tagged with "NLP"
How Exactly Does Masking in the Transformer Work
Masking is one of those concepts that is easy to wave your hands at, but quite important if you want to implement the Transformer from…
BERT: A Detailed Guide to Clear Up All Your Confusions
Prerequisites: Understanding of the Transformer, especially how masking works. (I strongly recommend Jay Alammar's article The Illustrated…
Byte-pair Encoding Algorithm
BPE (byte-pair encoding) is a good way, as an alternative to schemes such as one-hot encoding and pre-trained Word2Vec embeddings, to encode words…
Masking in the Transformer Explained
The Transformer is a landmark breakthrough in NLP that is explained quite well by Jay Alammar's article The Illustrated Transformer. This…
ML Paper Notes: Scheduled Sampling for Sequence Prediction with Recurrent Neural Networks
Title: Scheduled Sampling for Sequence Prediction with Recurrent Neural Networks (2015) Bengio et al. Main Ideas: For language generation…
ML Paper Notes: Attention is All You Need
Title: Attention is All You Need (2017) Vaswani et al. Main Ideas: General Experiment Setup: First experiment trained on WMT 2014 English…