19 posts tagged with "Machinelearning"
How Exactly Does Masking in Transformer Work
Masking is one of those concepts that is easy to wave your hands at but quite important if you want to implement the Transformer from…
BERT: A Detailed Guide to Clear Up All Your Confusions
Pre-requisites: Understanding of the Transformer, especially how masking works. (I strongly recommend Jay Alammar's article The Illustrated…
What does __getitems__ mean? Functions with Double Underscores in Python
While reading the PyTorch source code, I started to notice that many classes have functions named __xxx__ such as __init__ , __len__, or…
Byte-pair Encoding Algorithm
BPE (byte-pair encoding) is a good way (alternative to schemes such as one-hot encoding and Word2Vec pre-trained embeddings) to encode words…
Masking in the Transformer Explained
The Transformer is a landmark breakthrough in NLP that is explained quite well by Jay Alammar's article The Illustrated Transformer. This…
PyTorch scatter_ Function Explained
In PyTorch, scatter_ is a function you can use to write the values of a source tensor into the tensor it is called on. The best way I found to think about this function is…
Label Smoothing Explained
Label smoothing is a very straightforward regularization technique which is explained extremely well on this page. The basic idea is that…
ML Paper Notes: Progressive Neural Networks
Title: Progressive Neural Networks (2016) Rusu et al. Main Ideas: The novel progressive network proposed in the paper is a more…
ML Paper Notes: Scheduled Sampling for Sequence Prediction with Recurrent Neural Networks
Title: Scheduled Sampling for Sequence Prediction with Recurrent Neural Networks (2016) Bengio et al. Main Ideas: For language generation…
ML Paper Notes: Matching Networks for One-Shot Learning
Title: Matching Networks for One-Shot Learning (2017), Vinyals et al. Main Ideas: One-shot learning means learning to classify a class from…
ML Paper Notes: Generative Adversarial Nets
Title: Generative Adversarial Nets (2014), Goodfellow et al. Main Ideas: You have two neural nets - a generative and a discriminative one…
ML Paper Notes: CNN Features off-the-shelf: an Astounding Baseline for Recognition
Title: CNN Features off-the-shelf: an Astounding Baseline for Recognition (2014) Razavian et al. Main Ideas: The paper shows that just by…
ML Paper Notes: Attention is All You Need
Title: Attention is All You Need (2017) Vaswani et al. Main Ideas: General Experiment Setup: First experiment trained on WMT 2014 English…
ML Paper Notes: Unsupervised Domain Adaptation by Backpropagation
Title: Unsupervised Domain Adaptation by Backpropagation (2015), Ganin et al. Main Ideas: If we want to train a classifier, we would usually…
ML Paper Notes: Distilling the Knowledge in a Neural Network
Title: Distilling the Knowledge in a Neural Network (2015), G. Hinton et al. Main Ideas: This classic paper by Hinton et al. describes a…
Statistics: P-value Explained
Informal Definition: The probability of observing data at least as extreme as what you actually observed, if the null hypothesis were true. Example: you have a web page, and…
DeepDream Explained Clearly
DeepDream is one of the coolest applications of machine learning - it started out at Google as an effort to gain more insights into the…
Making Sense of AI on Blockchain: Part 2
In the first article of this series, I introduced the lifecycle of AI (which consists of training and inference) and discussed blockchain…
Making Sense of AI on Blockchain: Part 1
Any project that involves both AI and Blockchain is bound to raise some eyebrows. After all, putting together the two hottest buzzwords of the…