Learning Correspondence From Images, Videos And Texts