Abstract: Recently, video recognition is emerging with the help of multi-modal learning, which focuses on integrating distinct modalities to improve the performance or robustness of the model.
Abstract: The aim of the violent recognition task is to determine whether a video contains violent behaviors. Given that violent behavior often comes with visual and audio anomalies, multimodal ...
一些您可能无法访问的结果已被隐去。
显示无法访问的结果