Benefits of Multimodal Training

Over winter break I learned that, for linear regression, training with multiple modalities helps even when the test task is unimodal. Since then, I have learned that this setting is better termed cross-modal learning. Unfortunately, none of the theoretical papers that tackle cross-modal learning analyze the generalization error of model architectures with a joint representation across the modalities, even though such architectures are what we see most often in practice.

Today I determined experimentally that cross-modal learning also helps in nonlinear regression: there is a two-layer network that benefits from it. My next step is to determine whether this effect still holds in deeper architectures. Stay tuned!
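For anyone who wants to poke at this kind of question, below is a minimal sketch of how a comparison like the one above could be set up. It is an illustration under assumptions, not the exact experiment behind the result: the data model (two noisy views `x1` and `x2` of a shared latent `z`), the target `sin(sum(z))`, the way the second modality is mixed into training, and every hyperparameter and function name (`make_data`, `train`, `run_experiment`) are hypothetical choices. The network is a two-layer tanh model whose hidden layer acts as the joint representation, and the test set only ever contains the first modality.

```python
import numpy as np

def make_data(n, d=5, noise=0.5, seed=0):
    """Assumed data model: two noisy views of a shared latent z, nonlinear target."""
    rng = np.random.default_rng(seed)
    z = rng.normal(size=(n, d))
    x1 = z + noise * rng.normal(size=(n, d))   # modality 1
    x2 = z + noise * rng.normal(size=(n, d))   # modality 2
    y = np.sin(z.sum(axis=1, keepdims=True))   # target depends only on the latent
    return x1, x2, y

def forward(params, x):
    """Two-layer network: the tanh hidden layer is the joint representation."""
    h = np.tanh(x @ params["W1"] + params["b1"])
    return h, h @ params["W2"] + params["b2"]

def mse(params, x, y):
    _, pred = forward(params, x)
    return float(np.mean((pred - y) ** 2))

def train(x, y, hidden=16, lr=0.01, steps=1000, seed=0):
    """Full-batch gradient descent on mean squared error with manual backprop."""
    rng = np.random.default_rng(seed)
    d_in, n = x.shape[1], x.shape[0]
    params = {
        "W1": rng.normal(scale=1 / np.sqrt(d_in), size=(d_in, hidden)),
        "b1": np.zeros(hidden),
        "W2": rng.normal(scale=1 / np.sqrt(hidden), size=(hidden, 1)),
        "b2": np.zeros(1),
    }
    for _ in range(steps):
        h, pred = forward(params, x)
        g_pred = 2.0 * (pred - y) / n                 # dL/dpred
        g_pre = (g_pred @ params["W2"].T) * (1.0 - h ** 2)  # back through tanh
        grads = {
            "W1": x.T @ g_pre,
            "b1": g_pre.sum(axis=0),
            "W2": h.T @ g_pred,
            "b2": g_pred.sum(axis=0),
        }
        for name in params:
            params[name] -= lr * grads[name]
    return params

def run_experiment(n_train=200, n_test=1000, seed=0):
    """Return (unimodal-trained test MSE, cross-modal-trained test MSE)."""
    x1, x2, y = make_data(n_train, seed=seed)
    x1_te, _, y_te = make_data(n_test, seed=seed + 1)

    # Unimodal training: the second modality's input slot is always zero.
    uni_x = np.concatenate([x1, np.zeros_like(x1)], axis=1)

    # Cross-modal training (one assumed protocol): each sample appears once
    # with both modalities and once with the second modality zeroed out.
    both_x = np.concatenate([x1, x2], axis=1)
    cross_x = np.concatenate([both_x, uni_x], axis=0)
    cross_y = np.concatenate([y, y], axis=0)

    # The test task is unimodal: only modality 1 is available.
    test_x = np.concatenate([x1_te, np.zeros_like(x1_te)], axis=1)

    uni_params = train(uni_x, y, seed=seed)
    cross_params = train(cross_x, cross_y, seed=seed)
    return mse(uni_params, test_x, y_te), mse(cross_params, test_x, y_te)

if __name__ == "__main__":
    uni_err, cross_err = run_experiment()
    print(f"unimodal training,    unimodal test MSE: {uni_err:.4f}")
    print(f"cross-modal training, unimodal test MSE: {cross_err:.4f}")
```

The sketch only sets up the comparison; which training regime ends up with the lower test error depends on the noise level, sample size, and training details chosen here, so it should be read as a template to experiment with rather than a reproduction of the result.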