Mike Erlihson

Rethinking Attention with Performers

The article proposes a method for reducing the Transformer's complexity to a linear order and proves its claims rigorously. It is not an easy read, but fortunately the first 5-6 pages are more than enough to grasp the main idea.
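To make the linear-order claim concrete, here is a minimal NumPy sketch of the kernelized-attention idea the paper builds on: softmax(QK^T)V is approximated by φ(Q)(φ(K)^T V), so the N x N attention matrix is never materialized. The feature map below is a simplified ReLU stand-in, not the paper's FAVOR+ positive random features; all names and parameters are illustrative.

```python
import numpy as np

def relu_features(x, projection):
    # Hypothetical feature map: a ReLU random-feature kernel.
    # The paper's FAVOR+ mechanism uses positive softmax-kernel features instead.
    return np.maximum(x @ projection, 0.0)

def linear_attention(Q, K, V, num_features=64, seed=0):
    """O(N) attention: phi(Q) @ (phi(K)^T V), avoiding the N x N matrix."""
    d = Q.shape[-1]
    rng = np.random.default_rng(seed)
    W = rng.normal(size=(d, num_features)) / np.sqrt(num_features)
    Qf = relu_features(Q, W)              # (N, m)
    Kf = relu_features(K, W)              # (N, m)
    KV = Kf.T @ V                         # (m, d_v) -- cost linear in N
    normalizer = Qf @ Kf.sum(axis=0)      # (N,) row-wise softmax denominator analogue
    return (Qf @ KV) / (normalizer[:, None] + 1e-6)

# Example: sequence of length 1024, model dimension 32
N, d = 1024, 32
Q, K, V = (np.random.randn(N, d) for _ in range(3))
print(linear_attention(Q, K, V).shape)  # (1024, 32)
```

Note the order of operations: computing Kf.T @ V first keeps every intermediate at size (m, d) or (N, m), which is where the linear complexity in sequence length comes from.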

A causal view of compositional zero-shot recognition

One of the main challenges in zero-shot learning is enabling compositional generalization in the model. This review is part of a series of Machine & Deep Learning reviews, originally published in Hebrew under the name #DeepNightLearners, that aims to make the material accessible in plain language.

Supermasks in Superposition

The article proposes a brilliant training method, called SupSup, that allows a single large neural network to perform several different tasks.
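For intuition, a minimal sketch of the supermask idea: the network's weights stay fixed at their random initialization, and each task is served by its own binary mask over those weights. The masks below are random placeholders rather than learned ones, and all names are illustrative.

```python
import numpy as np

def masked_forward(x, W, mask):
    """Forward pass through fixed weights W gated by a binary supermask."""
    return np.maximum(x @ (W * mask), 0.0)

rng = np.random.default_rng(0)

# Fixed, randomly initialized weights shared by all tasks.
W = rng.normal(size=(16, 10))

# One binary mask per task (learned in the paper; random here for illustration).
masks = {t: (rng.random(W.shape) > 0.5).astype(W.dtype) for t in range(3)}

x = rng.normal(size=(1, 16))
for t, mask in masks.items():
    print(f"task {t}:", masked_forward(x, W, mask).shape)  # (1, 10)
```

Because only the masks differ between tasks, adding a task costs one bit per weight rather than a full copy of the network.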