Gary's Notebook

BERT: A Detailed Guide to Clear Up All Your Confusions

Prerequisites: an understanding of the Transformer, especially how masking works. (I strongly recommend Jay Alammar's article The Illustrated…

What does getitems mean? Functions with Double Underscores in Python

While reading the PyTorch source code, I started to notice that many classes have functions named __xxx__ such as __init__ , __len__, or…
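As a rough illustration of the double-underscore methods mentioned above, here is a minimal sketch. The `Playlist` class and its contents are hypothetical and are not taken from the PyTorch source; the sketch only shows which built-in operations call `__init__`, `__len__`, and `__getitem__`.

```python
class Playlist:
    """Hypothetical container used only to illustrate double-underscore methods."""

    def __init__(self, songs):
        # Called when an instance is created: Playlist([...])
        self.songs = list(songs)

    def __len__(self):
        # Called by len(playlist)
        return len(self.songs)

    def __getitem__(self, index):
        # Called by playlist[index]
        return self.songs[index]


playlist = Playlist(["intro", "verse", "outro"])
print(len(playlist))  # 3
print(playlist[1])    # verse
```

Python looks these methods up by name, so any class that defines them can be used with `len()` and square-bracket indexing.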

Byte-pair Encoding Algorithm

BPE (byte-pair encoding) is a good way (alternative to schemes such as one-hot encoding and Word2Vec pre-trained embeddings) to encode words…

Masking in the Transformer Explained

The Transformer is a landmark breakthrough in NLP that is explained quite well by Jay Alammar's article The Illustrated Transformer. This…

PyTorch scatter_ Function Explained

In PyTorch, scatter_ is a function you can use to write the values in a source tensor into a target tensor. The best way I found to think about this function is…

Label Smoothing Explained

Label smoothing is a very straightforward regularization technique which is explained extremely well on this page. The basic idea is that…

Re Tutorial: A Quick Python Start

Re, or Regex, stands for regular expression, which means "a sequence of characters that define a search pattern." It is particularly useful…

MachineLearning

ML Paper Notes: Progressive Neural Networks

Title: Progressive Neural Networks (2016) Rusu et al. Main Ideas: The novel progressive network proposed in the paper is a more…

ML Paper Notes: Scheduled Sampling for Sequence Prediction with Recurrent Neural Networks

Title: Scheduled Sampling for Sequence Prediction with Recurrent Neural Networks (2016) Bengio et al. Main Ideas: For language generation…

RNN

ML Paper Notes: Matching Networks for One-Shot Learning

Title: Matching Networks for One-Shot Learning (2017), Vinyals et al. Main Ideas: One-shot learning means learning to classify a class from…

ML Paper Notes: Generative Adversarial Nets

Title: Generative Adversarial Nets (2014), Goodfellow et al. Main Ideas: You have two neural nets - a generative and a discriminative one…

GAN

ML Paper Notes: CNN Features off-the-shelf: an Astounding Baseline for Recognition

Title: CNN Features off-the-shelf: an Astounding Baseline for Recognition (2014) Razavian et al. Main Ideas: The paper shows that just by…

CNN

ML Paper Notes: Attention is All You Need

Title: Attention is All You Need (2017) Vaswani et al. Main Ideas: General Experiment Setup: First experiment trained on WMT 2014 English…

ML Paper Notes: Unsupervised Domain Adaptation by Backpropagation

Title: Unsupervised Domain Adaptation by Backpropagation (2015), Ganin et al. Main Ideas: If we want to train a classifier, we would usually…

ML Paper Notes: Distilling the Knowledge in a Neural Network

Title: Distilling the Knowledge in a Neural Network (2015), G. Hinton et al. Main Ideas: This classic paper by Hinton et al. describes a…

Statistics: P-value Explained

Informal Definition: The probability of getting results at least as extreme as the ones observed if the null hypothesis were true. Example: you have a web page, and…

Statistics

DeepDream Explained Clearly

DeepDream is one of the coolest applications of machine learning - it started out at Google as an effort to gain more insights into the…

Deeplearning

Adding Images to Gatsby Blog in Markdown

Ok I spent the past 3 hours trying to find out this annoying little thing. Spoiler alert: You DON'T NEED any Gatsby plugins. Stop reading…

Gatsby

Markdown

Making Sense of AI on Blockchain: Part 2

In the first article of this series, I introduced the lifecycle of AI (which consists of training and inference) and discussed blockchain…

Blockchain

Making Sense of AI on Blockchain: Part 1

Any project that involves both AI and Blockchain is bound to raise some eyebrows; after all, putting together the two hottest buzzwords of the…

Blockchain

Binary Tree Level Order Traversal Python Solution

This problem is better solved iteratively than recursively. The essential idea is that you can get all the next level nodes from…

Algorithms
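One common way to do an iterative level order traversal is sketched below, assuming a queue-based approach in which each pass over the current level collects the nodes of the next one. The `TreeNode` class and the `level_order` name are illustrative assumptions, not necessarily the code from the post.

```python
from collections import deque


class TreeNode:
    """Minimal binary tree node (hypothetical helper for this sketch)."""

    def __init__(self, val, left=None, right=None):
        self.val = val
        self.left = left
        self.right = right


def level_order(root):
    """Return node values grouped by depth, top to bottom."""
    if root is None:
        return []
    levels = []
    queue = deque([root])
    while queue:
        level = []
        # len(queue) is fixed here, so this loop consumes exactly one level
        for _ in range(len(queue)):
            node = queue.popleft()
            level.append(node.val)
            # Children are appended for the next pass
            if node.left:
                queue.append(node.left)
            if node.right:
                queue.append(node.right)
        levels.append(level)
    return levels


tree = TreeNode(3, TreeNode(9), TreeNode(20, TreeNode(15), TreeNode(7)))
print(level_order(tree))  # [[3], [9, 20], [15, 7]]
```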

Maximum Subarray - Kadane's Algorithm Explained

I was inspired to write this post after completing the Leetcode challenge and coming across Kadane's algorithm. It's a very interesting…

Algorithms
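For reference, a minimal sketch of Kadane's algorithm for the maximum subarray problem, assuming a non-empty list of numbers. The function name `max_subarray` is an illustrative choice and may differ from the post's own code.

```python
def max_subarray(nums):
    """Return the largest sum of any contiguous, non-empty subarray."""
    # best: largest sum seen so far; current: largest sum ending at this element
    best = current = nums[0]
    for x in nums[1:]:
        # Either extend the running subarray or start a new one at x
        current = max(x, current + x)
        best = max(best, current)
    return best


print(max_subarray([-2, 1, -3, 4, -1, 2, 1, -5, 4]))  # 6
```

The running sum is restarted whenever extending it would be worse than starting fresh at the current element, which is why a single pass is enough.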

The Case for Investing in Cryptocurrencies: A More Rational Analysis

Two great investors, Charlie Munger and Warren Buffett, have completely dismissed the value of investing in cryptocurrencies. “I think the…

Cryptocurrency

How to Run a Python Script in Your Node Backend

While JavaScript is awesome, you're not obligated to use it throughout your entire stack. There will come a time when it is more…