6 posts tagged with "NLP"
How Exactly Does Masking in the Transformer Work
Masking is one of those concepts that is easy to wave your hands at, but quite important if you want to implement the Transformer from…
BERT: A Detailed Guide to Clear Up All Your Confusions
Prerequisites: Understanding of the Transformer, especially how masking works. (I strongly recommend Jay Alammar's article The Illustrated…
Byte-pair Encoding Algorithm
BPE (byte-pair encoding) is a good way, as an alternative to schemes such as one-hot encoding and pre-trained Word2Vec embeddings, to encode words…
Masking in the Transformer Explained
The Transformer is a landmark breakthrough in NLP that is explained quite well by Jay Alammar's article The Illustrated Transformer. This…
ML Paper Notes: Scheduled Sampling for Sequence Prediction with Recurrent Neural Networks
Title: Scheduled Sampling for Sequence Prediction with Recurrent Neural Networks (2015) Bengio et al. Main Ideas: For language generation…
ML Paper Notes: Attention is All You Need
Title: Attention is All You Need (2017) Vaswani et al. Main Ideas: General Experiment Setup: First experiment trained on WMT 2014 English…