A cross-modal crowd counting method combining CNN and cross-modal transformer
Yanshan Univ, Key Lab Comp Virtual Technol & Syst Integrat Hebei;
Yanshan Univ;
Cross-modal crowd counting; CNN; Transformer; Cross layer connection structure; Cross-modal attention module;