SlowFast – Dual-mode CNN for Video Understanding

Detecting objects in images and categorizing them is one of the best-known Computer Vision tasks, popularized by the 2010 ImageNet dataset and challenge. While much progress has been achieved on ImageNet, a still-vexing task is video understanding – analyzing a video segment and explaining what's happening inside it. Despite some recent progress […]

Kuzushiji-MNIST – Japanese Literature Alternative Dataset for Deep Learning Tasks

MNIST, a dataset with 70,000 labeled images of handwritten digits, has been one of the most popular datasets for image processing and classification for over twenty years. Despite its popularity, contemporary deep learning algorithms handle it easily, with accuracy often surpassing 99.5%. A new paper introduces Kuzushiji-MNIST, an alternative dataset which is more […]

Struct2Depth – Predicting object depth in dynamic environments

While recent advances in computer vision are helping robots and autonomous vehicles navigate complex environments effectively, significant challenges remain. One major challenge is depth prediction, i.e., the ability of a moving robot to estimate the depth of objects around it, a requirement for navigating a real-life environment safely. Historically, the most effective […]

Go-Explore – RL Algorithm for Exploration Problems – Solving Montezuma’s Revenge

Uber has released a blog post describing Go-Explore, a new Reinforcement Learning (RL) algorithm for hard exploration problems. These problems are characterized by the lack (or sparsity) of external feedback, which makes it difficult for the algorithm to learn how to operate. One of the most popular testbeds for RL algorithms is Atari games, […]
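The core loop Go-Explore is built around – remember states you have reached, return to a promising one, then explore from there – can be sketched in a toy setting. Everything below is illustrative, not Uber's code: the environment is a 1-D chain with a sparse reward far from the start, and "always pick the farthest cell" is a crude stand-in for Go-Explore's cell-selection heuristics.

```python
import random

random.seed(0)
GOAL = 50   # sparse reward sits here, far beyond what short random walks from 0 reach

def explore_from(state, steps=10):
    """Random +/-1 walk from `state`; returns every state visited."""
    visited = []
    for _ in range(steps):
        state = max(0, state + random.choice([-1, 1]))
        visited.append(state)
    return visited

# Archive of cells we know how to reach; a deterministic simulator lets
# Go-Explore reset directly to any of them ("first return, then explore").
archive = {0}
for _ in range(200):
    start = max(archive)                 # crude "promising cell" heuristic: the frontier
    archive.update(explore_from(start))  # explore from it, remember everything new

print("frontier reached:", max(archive), "| goal found:", max(archive) >= GOAL)
```

Because each iteration restarts from the frontier instead of from the initial state, the reachable region grows steadily, which is exactly what pure random exploration from the start fails to do under sparse rewards.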

HMTL – Multi-task Learning for solving NLP Tasks

The field of Natural Language Processing includes dozens of tasks, among them machine translation, named-entity recognition, and entity detection. While the different NLP tasks are often trained and evaluated separately, there is a potential advantage in combining them into one model: learning one task might help with learning another and improve its […]
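The "one model, several tasks" idea usually takes the form of hard parameter sharing: one shared encoder feeds several task-specific heads, so gradients from every task shape the same features. The sketch below shows only that generic pattern (HMTL's actual model is hierarchical and more elaborate); the head names and dimensions are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
d_in, d_hidden = 16, 8

W_shared = rng.normal(size=(d_in, d_hidden))      # shared encoder weights
heads = {
    "ner":       rng.normal(size=(d_hidden, 5)),  # e.g. 5 entity tags (illustrative)
    "relations": rng.normal(size=(d_hidden, 3)),  # e.g. 3 relation types (illustrative)
}

def forward(x, task):
    h = np.tanh(x @ W_shared)   # shared representation, reused by every task
    return h @ heads[task]      # task-specific projection

x = rng.normal(size=(4, d_in))          # a batch of 4 token vectors
print(forward(x, "ner").shape)          # (4, 5)
print(forward(x, "relations").shape)    # (4, 3)
```

During training, losses from all tasks backpropagate into `W_shared`, which is how learning one task can improve the representation used by another.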

GPipe – Training Giant Neural Nets using Pipeline Parallelism

In recent years the size of machine learning datasets and models has been constantly increasing, allowing for improved results on a wide range of tasks. At the same time, hardware accelerators (GPUs, TPUs) have also been improving, but at a significantly slower pace. The gap between model growth and hardware improvement has increased the importance […]
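Pipeline parallelism, the technique in GPipe's title, partitions a sequential model into stages (one per device) and splits each mini-batch into micro-batches that flow through the stages in a staggered schedule, keeping every device busy. The simulation below is a minimal sketch of that schedule, not the GPipe library; the stage functions are placeholders for model partitions.

```python
stages = [lambda x: x + 1, lambda x: x * 2, lambda x: x - 3]  # 3 "devices"
micro_batches = [1.0, 2.0, 3.0, 4.0]                          # one mini-batch, split up

# With K stages and M micro-batches, micro-batch m occupies stage k at clock
# tick t = m + k, so the forward pass finishes in M + K - 1 ticks rather than
# the M * K stage-executions of naive one-device-at-a-time execution.
K, M = len(stages), len(micro_batches)
inflight = {}           # (micro_batch_index, stage_index) -> activation
outputs = [None] * M

for t in range(M + K - 1):       # clock ticks of the pipeline
    for k in range(K):           # each stage works in parallel (simulated serially)
        m = t - k                # which micro-batch sits at stage k right now
        if 0 <= m < M:
            x = micro_batches[m] if k == 0 else inflight.pop((m, k - 1))
            y = stages[k](x)
            if k == K - 1:
                outputs[m] = y
            else:
                inflight[(m, k)] = y

print(outputs)   # → [1.0, 3.0, 5.0, 7.0], identical to running each micro-batch straight through
```

The results match sequential execution exactly; only the schedule changes, which is why GPipe can scale model size without altering the model's mathematics (gradients are accumulated across micro-batches before the weight update).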

Curiosity-Driven Learning – Exploration By Random Network Distillation

In recent years, Reinforcement Learning has proven itself to be a powerful technique for solving well-defined tasks with frequent rewards, most commonly games. A major challenge in the field remains training a model when external feedback (reward) for actions is sparse or nonexistent. Recent models have tried to overcome this challenge by creating an intrinsic […]
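Random Network Distillation, the method in this post's title, generates such an intrinsic reward from prediction error: a fixed, randomly initialized "target" network embeds states, a "predictor" network is trained to match it on visited states, and the residual error is the reward – low on familiar states, high on novel ones. The sketch below uses linear maps instead of the paper's deep networks, and the state layout is contrived so the effect is visible; it is an illustration of the idea, not the published method.

```python
import numpy as np

rng = np.random.default_rng(0)
d_state, d_embed = 8, 4

W_target = rng.normal(size=(d_state, d_embed))  # fixed random target network
W_pred = np.zeros((d_state, d_embed))           # predictor, trained on visited states

def intrinsic_reward(s):
    """Prediction error of the predictor against the frozen target network."""
    return float(np.sum((s @ W_pred - s @ W_target) ** 2))

# Visited states occupy one region of state space (only the first four
# coordinates vary); a novel state lies outside that region.
familiar = np.zeros((32, d_state))
familiar[:, :4] = rng.normal(size=(32, 4))
novel = np.zeros(d_state)
novel[4:] = rng.normal(size=4)

for _ in range(2000):   # distill the target into the predictor on visited states
    errs = familiar @ W_pred - familiar @ W_target
    W_pred -= 0.05 * familiar.T @ errs / len(familiar)

r_familiar = np.mean([intrinsic_reward(s) for s in familiar])
r_novel = intrinsic_reward(novel)
print(r_novel > r_familiar)   # → True: novel states earn a larger intrinsic reward
```

An agent maximizing this reward is pushed toward states its predictor has not yet learned, i.e., toward novelty, which is what drives exploration when the external reward is silent.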

Advancing to 3D Deep Neural Networks in Medical Image Analysis

For several decades computer scientists have been attempting to build medical software to help physicians analyze medical images. Until 2012, when deep neural networks first proved their effectiveness, most attempts relied on extensive feature engineering tailored to specific types of medical images, and usually produced results too poor to help doctors in practice. In recent […]

BERT – State of the Art Language Model for NLP

BERT (Bidirectional Encoder Representations from Transformers) is a recent paper published by researchers at Google AI Language. It has caused a stir in the Machine Learning community by presenting state-of-the-art results in a wide variety of NLP tasks, including Question Answering (SQuAD v1.1), Natural Language Inference (MNLI), and others. BERT’s key technical innovation is applying […]
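The pre-training objective behind those results is masked language modeling. BERT selects 15% of input tokens for prediction; of those, 80% are replaced with [MASK], 10% with a random token, and 10% are left unchanged. The sketch below implements only this data-preparation step (the Transformer itself is omitted), with a toy vocabulary; function and variable names are illustrative.

```python
import random

random.seed(0)
VOCAB = ["the", "model", "reads", "text", "in", "both", "directions"]

def mask_tokens(tokens, mask_rate=0.15):
    """Apply BERT's 15% / 80-10-10 masking scheme to a token list."""
    inputs, targets = [], []
    for tok in tokens:
        if random.random() < mask_rate:
            targets.append(tok)          # the model must predict this token
            r = random.random()
            if r < 0.8:
                inputs.append("[MASK]")              # 80%: mask it out
            elif r < 0.9:
                inputs.append(random.choice(VOCAB))  # 10%: random replacement
            else:
                inputs.append(tok)                   # 10%: keep the original
        else:
            targets.append(None)         # no loss computed at this position
            inputs.append(tok)
    return inputs, targets

inputs, targets = mask_tokens("the model reads text in both directions".split())
print(inputs)
print(targets)
```

Because the model must reconstruct the selected tokens from both the left and the right context, the encoder is trained bidirectionally – the property the "B" in BERT refers to.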