Multimodal Model-Agnostic Meta-Learning via Task-Aware Modulation


  • Key idea: adapt the pre-trained (meta-learned) parameters to a task drawn from a multimodal distribution using a modulation method.


  • Model-Agnostic Meta-Learning (MAML): flexible in the choice of model, model-agnostic meta-learners aim to acquire meta-learned parameters from similar tasks so they can adapt to novel tasks from the same distribution with only a few gradient updates.
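As a concrete illustration, here is a minimal first-order MAML (FOMAML) sketch on a toy one-parameter regression family. The scalar model, task family, and learning rates are illustrative choices, not the paper's setup:

```python
import numpy as np

rng = np.random.default_rng(0)

def task_loss_grad(w, a, x):
    # Squared error for model y = w*x against task target y = a*x, plus its gradient.
    err = (w - a) * x
    return np.mean(err ** 2), np.mean(2 * err * x)

w = 0.5                  # meta-learned initialization (a single scalar for clarity)
alpha, beta = 0.1, 0.01  # inner / outer learning rates (illustrative)

for _ in range(2000):
    meta_grad = 0.0
    for _ in range(4):                        # meta-batch of tasks
        a = rng.uniform(-2.0, 2.0)            # a task = a slope for y = a*x
        x = rng.uniform(-1.0, 1.0, size=10)
        _, g = task_loss_grad(w, a, x)
        w_adapted = w - alpha * g             # one inner gradient step (fast adaptation)
        _, g_adapted = task_loss_grad(w_adapted, a, x)
        meta_grad += g_adapted                # first-order (FOMAML) approximation
    w -= beta * meta_grad / 4                 # outer update of the initialization
```

Since the slopes are drawn symmetrically around zero, the learned initialization settles near the "center" of the task distribution, which is exactly the behavior that breaks down when that distribution is multimodal.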


  • MAML relies on a common single initialization shared across the entire task distribution.
  • Different tasks sampled from a complex task distribution can require substantially different parameters,
    • making it difficult to find a single initialization that is close to all target parameters,
    • limiting the diversity of the task distributions that MAML is able to learn from.


  • Goal: to develop a framework that quickly masters a novel task from a multimodal task distribution.
  • By augmenting MAML, this paper proposes a multimodal MAML (MMAML) framework, which modulates its meta-learned prior parameters according to the identified task mode, allowing more efficient fast adaptation.

  • Aim: to develop a meta-learner that acquires mode-specific prior parameters and adapts quickly to tasks sampled from a multimodal task distribution.

Modulation and Task Network


In Algorithm 1, N is the number of blocks in the task network. Note that the task-specific modulation parameters τi are kept fixed during adaptation; only the meta-learned prior parameters of the task network are updated.
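The structure of that inner loop can be sketched as follows. The modulation network, block shapes, and finite-difference gradients here are stand-ins for the paper's neural networks and backpropagation, chosen so the sketch is self-contained:

```python
import numpy as np

rng = np.random.default_rng(1)

N = 3                                                      # number of blocks in the task network
theta = [0.3 * rng.normal(size=(4, 4)) for _ in range(N)]  # meta-learned prior parameters

def modulation_network(task_embedding):
    # Stand-in for the modulation network: one vector tau_i per task-network block.
    return [np.tanh(task_embedding) for _ in range(N)]

def forward(x, theta, tau):
    h = x
    for W, t in zip(theta, tau):
        h = np.tanh(W @ h) * t                             # block output scaled by its tau_i
    return h

def loss(theta, tau, x, y):
    return np.mean((forward(x, theta, tau) - y) ** 2)

x, y = rng.normal(size=4), rng.normal(size=4)
tau = modulation_network(rng.normal(size=4))               # tau stays FIXED during adaptation
loss0 = loss(theta, tau, x, y)

alpha = 0.01
for _ in range(5):                                         # inner loop: update theta only
    grads = []
    for i, W in enumerate(theta):
        g = np.zeros_like(W)
        for idx in np.ndindex(*W.shape):                   # forward-difference gradient
            theta_p = [w.copy() for w in theta]
            theta_p[i][idx] += 1e-5
            g[idx] = (loss(theta_p, tau, x, y) - loss(theta, tau, x, y)) / 1e-5
        grads.append(g)
    theta = [W - alpha * g for W, g in zip(theta, grads)]
```

The key point mirrored from the algorithm: `tau` identifies the task mode and shifts the prior, while the gradient steps adapt only `theta`.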

For the general modulation operation in the corresponding equation, the authors empirically observed that Feature-wise Linear Modulation (FiLM) performs better than attention-based (softmax) modulation.
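A minimal sketch of the two modulation operations being compared; the feature values here are illustrative:

```python
import numpy as np

def film(h, gamma, beta):
    # Feature-wise Linear Modulation: per-feature scale and shift, tau = (gamma, beta).
    return gamma * h + beta

def attention_softmax(h, scores):
    # Attention-based modulation: softmax weights gate the features multiplicatively.
    w = np.exp(scores - scores.max())
    return (w / w.sum()) * h

h = np.array([1.0, -2.0, 0.5])
print(film(h, np.array([2.0, 0.5, 1.0]), np.array([0.1, 0.0, -0.2])))  # -> [2.1, -1.0, 0.3]
```

One intuition for FiLM's advantage: the additive shift lets modulation move features outside the span of the input, whereas softmax gating can only reweight (and never sign-flip) what is already there.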


  • Baseline
    • MAML: the same architecture as the task network in MMAML.
    • Multi-MAML
      • consists of M (the number of modes) MAML models, each trained specifically on tasks sampled from a single mode.
      • If it outperforms MAML, this indicates that MAML's performance degrades due to the multimodality of the task distribution.
  • Tasks
    • regression: sinusoidal, linear, quadratic, transformed L1 norm, and hyperbolic tangent functions as discrete task modes.
    • image classification: Omniglot, Mini-ImageNet, FC100, CUB, Aircraft
    • reinforcement learning
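The multimodal regression setup can be sketched as a task sampler over the five discrete modes. The exact functional forms and parameter ranges below are assumptions for illustration, not the paper's specification:

```python
import numpy as np

rng = np.random.default_rng(0)

# Discrete task modes from the regression experiments (forms assumed for illustration).
MODES = {
    "sinusoidal": lambda a, b: lambda x: a * np.sin(x + b),
    "linear":     lambda a, b: lambda x: a * x + b,
    "quadratic":  lambda a, b: lambda x: a * x ** 2 + b,
    "l1":         lambda a, b: lambda x: a * np.abs(x - b),   # transformed L1 norm
    "tanh":       lambda a, b: lambda x: a * np.tanh(x + b),
}

def sample_task():
    # Sample a mode, then task parameters within that mode (ranges are illustrative).
    mode = rng.choice(list(MODES))
    a, b = rng.uniform(0.5, 2.0), rng.uniform(-1.0, 1.0)
    return mode, MODES[mode](a, b)

mode, f = sample_task()
x = np.linspace(-5, 5, 10)
y = f(x)   # few-shot regression targets for this sampled task
```

Multi-MAML gets the `mode` label at training time (one model per mode), while MMAML must infer it from the few-shot data alone.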





Image Classification





  • If we apply MMAML to NMT + ASR, would modulation alone be enough to adapt to different input modalities and different input-length distributions?
  • What would be the optimal general modulation operation for seq2seq modeling?