BLOG:
MATH COURSES (7 day free trial)
Mathematics for Machine Learning:
Calculus:
Statistics for Data Science:
Bayesian Statistics:
Linear Algebra:
Probability:
OTHER RELATED COURSES (7 day free trial)
Deep Learning Specialization:
Python for Everybody:
MLOps Course:
Natural Language Processing (NLP):
Machine Learning in Production:
Data Science Specialization:
TensorFlow:
REFERENCES
[1] The main Paper:
[2] Tensor2Tensor has some code with a tutorial:
[3] Transformer very intuitively explained – Amazing:
[4] Medium Blog on intuitive explanation:
[5] Pretrained word embeddings:
[6] Intuitive explanation of Layer normalization:
[7] Paper that gives even better results than transformers (Pervasive Attention):
[8] BERT uses transformers to pretrain neural nets for common NLP tasks:
[9] Stanford Lecture on RNN:
[10] Colah's Blog:
[11] Wiki for time series of events: (machine_learning_model)