MaskFreeVIS: High-performing video instance segmentation without using any video masks, or even image mask labels.

Lei Ke, Martin Danelljan, Henghui Ding, Yu-Wing Tai, Chi-Keung Tang, Fisher Yu

This is the official PyTorch implementation of MaskFreeVIS, built on the open-source detectron2. We aim to remove the necessity for expensive video masks, and even image masks, when training VIS models.

Highlights:

- Novelty: a new parameter-free Temporal KNN-patch Loss (TK-Loss) that enforces temporal mask consistency through unsupervised one-to-k patch correspondences.
- Simple: TK-Loss has no trainable parameters and can be flexibly integrated into state-of-the-art transformer-based VIS models.
- Results: with a ResNet-101 backbone, MaskFreeVIS achieves 49.1 AP without using any video masks, and 47.3 AP using only a COCO-mask-initialized model. With a Swin-L backbone, built on Mask2Former, MaskFreeVIS achieves 56.0 AP on YTVIS without any video mask labels.

Our project website contains more information, including visual video comparisons: vis.xyz/pub/maskfreevis.
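To make the one-to-k patch correspondence idea concrete, below is a minimal sketch of a TK-style consistency loss, not the repository's actual implementation. The function name `tk_loss`, the plain L2 patch distance, the wrap-around window search via `torch.roll`, and the window/threshold/k values are all illustrative assumptions; see the code in this repo for the paper's exact formulation.

```python
import torch
import torch.nn.functional as F


def tk_loss(img_t, img_t1, mask_t, mask_t1, patch_size=3, radius=2, k=5, thresh=0.1):
    """Sketch of a temporal KNN-patch consistency loss (illustrative, not the repo code).

    img_t, img_t1:   (B, 3, H, W) consecutive RGB frames.
    mask_t, mask_t1: (B, 1, H, W) predicted mask probabilities in [0, 1].
    For every location in frame t, the k best-matching locations inside a
    (2*radius+1)^2 search window of frame t+1 are found by patch distance;
    the matched mask probabilities are then encouraged to agree.
    """
    B, _, H, W = img_t.shape
    pad = patch_size // 2
    # Flatten local patches so each spatial position holds its patch vector.
    p_t = F.unfold(img_t, patch_size, padding=pad).view(B, -1, H, W)
    p_t1 = F.unfold(img_t1, patch_size, padding=pad).view(B, -1, H, W)

    dists, cand_masks = [], []
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            # torch.roll wraps at borders; a real implementation would mask these out.
            shifted_p = torch.roll(p_t1, shifts=(dy, dx), dims=(2, 3))
            shifted_m = torch.roll(mask_t1, shifts=(dy, dx), dims=(2, 3))
            dists.append((p_t - shifted_p).pow(2).mean(dim=1))  # (B, H, W)
            cand_masks.append(shifted_m.squeeze(1))             # (B, H, W)
    dists = torch.stack(dists, dim=1)           # (B, N, H, W), N = window size
    cand_masks = torch.stack(cand_masks, dim=1)  # (B, N, H, W)

    # One-to-k correspondence: keep the k closest candidates per location ...
    topk_d, idx = dists.topk(k, dim=1, largest=False)
    topk_m = cand_masks.gather(1, idx)
    # ... and drop low-confidence matches whose distance exceeds a threshold.
    valid = (topk_d < thresh).float()

    m0 = mask_t.expand(-1, k, -1, -1)  # broadcast the frame-t mask over the k matches
    # Consistency objective: matched points should be both foreground or both background.
    agree = m0 * topk_m + (1.0 - m0) * (1.0 - topk_m)
    loss = -(torch.log(agree.clamp(min=1e-6)) * valid).sum() / valid.sum().clamp(min=1.0)
    return loss
```

Because the loss is built only from tensor operations on the predicted masks, it adds no trainable parameters, which is what lets it drop into any transformer-based VIS model that outputs per-frame mask probabilities.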