Cross-Domain Few-Shot Classification via Learned Feature-Wise Transformation
Summary
- Few-shot classification of novel categories drawn from unseen domains.
Motivation
- Despite the success of recognizing novel classes sampled from the same domain as in the training stage, existing metric-based approaches often do not generalize well to categories from different domains.
- Previous methods that address the domain shift issue aim at recognizing instances of the same categories seen in the training stage, not novel categories.
Proposal
- Tackle the domain generalization problem of recognizing novel categories in the few-shot classification setting across different domains.
- Integrate feature-wise transformation layers into the feature encoder to modulate the feature activations with affine transformations.
- Use a learning-to-learn algorithm to optimize the proposed feature-wise transformation layers.
Preliminaries
A metric-based algorithm generally contains a feature encoder E and a metric function M. A task T consists of a support set S = {(Xs, Ys)} and a query set Q = {(Xq, Yq)}.
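The roles of E and M can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: `encode` is a hypothetical identity stand-in for the feature encoder E, and `metric_predict` uses a nearest-prototype rule with Euclidean distance as one possible metric function M. It is not the specific metric-based model studied in the paper.

```python
import math


def encode(x):
    # Hypothetical stand-in for the feature encoder E (identity on feature vectors).
    return list(x)


def class_prototypes(support_x, support_y):
    # Average the encoded support features of each class in the support set S.
    sums, counts = {}, {}
    for x, y in zip(support_x, support_y):
        f = encode(x)
        if y not in sums:
            sums[y] = [0.0] * len(f)
            counts[y] = 0
        sums[y] = [s + v for s, v in zip(sums[y], f)]
        counts[y] += 1
    return {y: [s / counts[y] for s in sums[y]] for y in sums}


def metric_predict(support_x, support_y, query_x):
    # Assumed metric function M: assign each query to the nearest class prototype.
    protos = class_prototypes(support_x, support_y)
    preds = []
    for x in query_x:
        f = encode(x)
        preds.append(min(protos, key=lambda y: math.dist(f, protos[y])))
    return preds


# A 2-way 2-shot task T with two query samples.
support_x = [[0.0, 0.0], [0.0, 1.0], [5.0, 5.0], [6.0, 5.0]]
support_y = [0, 0, 1, 1]
query_x = [[0.2, 0.4], [5.5, 4.9]]
predictions = metric_predict(support_x, support_y, query_x)
```

In a real model, E would be a trained convolutional network and M could be any learned or fixed comparison module; only the data flow from S and Q to predictions is shown here.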
Methods
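- Feature-Wise Transformation Layer: Given an intermediate feature activation map z in the feature encoder with dimension C × H × W, we first sample the scaling term γ and bias term β from Gaussian distributions,
  $$\gamma \sim \mathcal{N}(1, \operatorname{softplus}(\theta_\gamma)), \qquad \beta \sim \mathcal{N}(0, \operatorname{softplus}(\theta_\beta)),$$
  where the hyper-parameters θγ and θβ, each of dimension C × 1 × 1, set the standard deviations of the two distributions. We then compute the modulated activation ẑ as
  $$\hat{z}_{c,h,w} = \gamma_c \times z_{c,h,w} + \beta_c.$$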
- Learning-to-learn
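The feature-wise transformation layer described in the list above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `feature_wise_transform`, the nested-list C × H × W layout, the softplus parameterization of the standard deviations via `theta_gamma` and `theta_beta`, the `training` switch that disables the sampling at evaluation time, and the hyper-parameter values are all chosen for this example.

```python
import math
import random


def softplus(x):
    # Numerically stable log(1 + exp(x)); keeps the standard deviation positive.
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def feature_wise_transform(z, theta_gamma, theta_beta, rng, training=True):
    # z: nested list of shape C x H x W.
    # theta_gamma, theta_beta: one hyper-parameter per channel (assumed layout).
    if not training:
        # Assumption: the layer acts as an identity when not training.
        return [[row[:] for row in channel] for channel in z]
    out = []
    for channel, t_g, t_b in zip(z, theta_gamma, theta_beta):
        gamma = rng.gauss(1.0, softplus(t_g))  # per-channel scaling term
        beta = rng.gauss(0.0, softplus(t_b))   # per-channel bias term
        out.append([[gamma * v + beta for v in row] for row in channel])
    return out


rng = random.Random(0)
# A feature map with C=2, H=2, W=2; the second channel is all zeros.
z = [[[1.0, 2.0], [3.0, 4.0]], [[0.0, 0.0], [0.0, 0.0]]]
theta_gamma = [0.3, 0.3]  # illustrative values only
theta_beta = [0.5, 0.5]   # illustrative values only
z_hat = feature_wise_transform(z, theta_gamma, theta_beta, rng)
z_eval = feature_wise_transform(z, theta_gamma, theta_beta, rng, training=False)
```

Because γ and β are drawn once per channel and shared across all spatial positions, every position in a channel is shifted and scaled identically. In a trainable version, θγ and θβ would be the quantities optimized by the learning-to-learn procedure.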
Results
- The distance between features extracted from different domains becomes smaller with the help of feature-wise transformation layers.
- The proposed learning-to-learn scheme closes the domain gap and improves the generalization ability of metric-based models.
Related Work
- modulation with meta-learning
- Authors : Hung-Yu Tseng, Hsin-Ying Lee, Jia-Bin Huang, Ming-Hsuan Yang
- Affiliations : University of California, Merced
- Published : ICLR 2020 spotlight, arXiv
- Code : git
- Material : video
- Blog/Project Page : http://vllab.ucmerced.edu/ym41608/projects/CrossDomainFewShot/
Discussion
- Can we apply affine transformations to handle sequences of different lengths from different modalities (NMT, ST)?
- Check with t-SNE plots for different task domains.