List of my machine learning slides
Published:
This is a collection of slides I made over the past few years for group study or paper presentation sessions.
Paper Presentation Slides
Rethinking Attention with Performers
topics : architecture
One aspect I don't like about this network is its heavy reliance on random process theory (I have no beef with RP): the kernel needs to be re-initialized on each forward pass (in theory), and the initialization includes an SVD calculation, which makes it pretty slow on a consumer workstation. For this reason, I haven't had any success using this design in any of my projects.
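To make the cost concrete, here is a minimal NumPy sketch of the random-feature idea, not the paper's or any library's actual code. It assumes positive random features for the softmax kernel and uses a QR decomposition for the orthogonalization step (the SVD mentioned above would be another way to get an orthogonal basis). The function names `orthogonal_gaussian`, `positive_features` and `linear_attention` are made up for illustration.

```python
import numpy as np

def orthogonal_gaussian(m, d, rng):
    """Draw an (m, d) projection whose rows are orthogonal within each block of d rows."""
    blocks = []
    for _ in range(-(-m // d)):           # ceil(m / d) blocks
        g = rng.standard_normal((d, d))
        q, _ = np.linalg.qr(g)            # the decomposition that has to be redone on every redraw
        blocks.append(q.T)
    w = np.vstack(blocks)[:m]
    # Rescale rows so their lengths match those of i.i.d. Gaussian vectors.
    norms = np.linalg.norm(rng.standard_normal((m, d)), axis=1, keepdims=True)
    return w * norms

def positive_features(x, w):
    """phi(x) = exp(w.x - |x|^2 / 2) / sqrt(m), so that phi(q).phi(k) approximates exp(q.k)."""
    m = w.shape[0]
    proj = x @ w.T
    half_sq_norm = 0.5 * np.sum(x ** 2, axis=-1, keepdims=True)
    return np.exp(proj - half_sq_norm) / np.sqrt(m)

def linear_attention(q, k, v, w):
    """Non-causal attention approximated without building the n x n score matrix."""
    scale = q.shape[-1] ** -0.25          # splits the usual 1/sqrt(d) between q and k
    q_f = positive_features(q * scale, w)
    k_f = positive_features(k * scale, w)
    kv = k_f.T @ v                        # (m, d_v), shared by all queries
    normalizer = q_f @ k_f.sum(axis=0)    # (n,), plays the role of the softmax denominator
    return (q_f @ kv) / normalizer[:, None]

def softmax_attention(q, k, v):
    """Exact attention, kept only as a reference to compare against."""
    scores = q @ k.T / np.sqrt(q.shape[-1])
    scores -= scores.max(axis=-1, keepdims=True)
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ v

rng = np.random.default_rng(0)
w = orthogonal_gaussian(m=64, d=8, rng=rng)   # in theory redrawn on each forward pass
q, k, v = (rng.standard_normal((10, 8)) for _ in range(3))
out = linear_attention(q, k, v, w)
print(out.shape)  # (10, 8)
```

In this sketch the slow part is the `orthogonal_gaussian` call: if it runs on every forward pass, the decomposition is paid each time, which is the overhead I complained about above.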
MuZero: Mastering Atari, Go, Chess and Shogi by Planning with a Learned Model
topics : reinforcement-learning
I chose this topic merely to introduce AlphaZero 2.0 to my peers, because the idea is very clean and simple yet very powerful.
topics : self-supervised-learning
Thanks to Yann LeCun's cake image, self-supervised learning was a hot topic at that time, so I chose this not-so-intuitive learning method to present. A little background: self-supervised learning before this mostly revolved around siamese network designs. Such methods have some stability issues when trained on only raw inputs (so you need a large batch size, crazy learning rate scheduling and some weird augmentations to make them work). Before the SimCLR paper, this kind of approach was actually a good trick for small datasets, since it is an effective regularization method to prevent overfitting. It is one of the very few tricks up my sleeve for getting a neural network to beat a sklearn baseline on small datasets.
ELECTRA: Pretraining Text Encoders as Discriminators rather than Generators
topics : self-supervised-learning
Consistency regularization for GAN
topics : regularization, generative adversarial networks
I was participating in some GAN research at that time, and when I saw this paper it was so easy to implement that I couldn't resist it. So I figured I might as well make a presentation out of it ¯_(ツ)_/¯
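To show how little code the idea needs, here is a minimal NumPy sketch of the consistency term: penalize the discriminator when its output changes under a label-preserving augmentation. The toy discriminators, the flip augmentation and the default `weight` are assumptions made for illustration, not the paper's setup.

```python
import numpy as np

def horizontal_flip(images):
    """Flip a batch of (N, H, W) images along the width axis."""
    return images[:, :, ::-1]

def consistency_penalty(discriminator, images, augment, weight=10.0):
    """weight * mean squared gap between D(x) and D(T(x)); added to the discriminator loss."""
    gap = discriminator(images) - discriminator(augment(images))
    return weight * np.mean(gap ** 2)

# Two toy "discriminators" that map (N, H, W) images to one score per image.
def mean_score(images):
    return images.mean(axis=(1, 2))            # unchanged by a horizontal flip

def left_half_score(images):
    half = images.shape[2] // 2
    return images[:, :, :half].mean(axis=(1, 2))  # depends on which side is left

rng = np.random.default_rng(0)
batch = rng.standard_normal((4, 8, 8))
print(consistency_penalty(mean_score, batch, horizontal_flip))       # essentially zero
print(consistency_penalty(left_half_score, batch, horizontal_flip))  # positive
```

In a real training loop the same term would be computed on the discriminator's output for real images and added to its usual loss, so the only extra cost is one more forward pass on the augmented batch.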
Your classifier is secretly an energy based model and you should treat it like one
topics : learning
This is one of the very few papers I really enjoyed reading, in terms of its flow, its name and its easy-to-run code.
Reformer: The Efficient Transformer
One of the very few papers I presented that I ended up revisiting after a long time. When lucidrains first finished the implementation, I tried it out on a Chinese MLM learning task and had very little success. Recently, however, I compared it again with a vanilla transformer on the wiki-en8 task and found that Reformer actually converges faster than its original version. It is one of the very few autoregressive transformers that actually provide a significant speedup compared to recent proposals such as longformer and reformer.
Graph Neural Networks Study Group
I was lucky enough to join a study group sharing about graph neural networks. These are some of the slides I prepared for the group study.
In these slides, I go through 4 graph generation papers, each representing a specific problem domain in graph generation.
gPool and SAGPool
SortPool and EigenPool
It's 1:26 am in Taiwan, so I will update the descriptions and add more slides later.