Hi.
I have a question about using the vidor_eval.ipynb script to compute mAP. Does the script only support the single-label case? If a ground-truth human-object pair has multiple interactions, for example gt <human1, (watch, next to), obj2>, only <human1, watch, obj2> can be matched to a prediction. The gt pair <human1, obj2> is then added to gt_bbox_pair_matched and cannot be matched to any other prediction.
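To make the question concrete, here is a minimal sketch of the behaviour I think I am seeing, next to what I would expect for the multi-label case. This is not the code from vidor_eval.ipynb: the function names, the `gt` / `preds` layout, and the assumption that box matching (IoU) has already been resolved to a pair id are all hypothetical and only for illustration.

```python
# Hypothetical sketch. Names and data layout are assumptions for illustration,
# not taken from vidor_eval.ipynb. Box matching (IoU) is assumed to be done
# already, so each prediction directly carries the gt pair it overlaps.

def match_pair_level(gt, preds):
    """Mark a gt pair as matched once, regardless of the interaction label."""
    matched_pairs = set()  # plays the role of gt_bbox_pair_matched
    true_positives = []
    for pair, label in preds:
        if pair in gt and label in gt[pair] and pair not in matched_pairs:
            matched_pairs.add(pair)
            true_positives.append((pair, label))
    return true_positives


def match_label_level(gt, preds):
    """Mark each (gt pair, interaction label) combination as matched once."""
    matched = set()
    true_positives = []
    for pair, label in preds:
        key = (pair, label)
        if pair in gt and label in gt[pair] and key not in matched:
            matched.add(key)
            true_positives.append(key)
    return true_positives


# One gt pair with two interactions, and one prediction for each interaction.
gt = {("human1", "obj2"): {"watch", "next to"}}
preds = [
    (("human1", "obj2"), "watch"),
    (("human1", "obj2"), "next to"),
]

pair_level = match_pair_level(gt, preds)    # only the "watch" prediction matches
label_level = match_label_level(gt, preds)  # both predictions match
```

With pair-level bookkeeping the second prediction finds its gt pair already used, so it is not matched even though the interaction is correct. Keying the matched set on (pair, label) instead would let both predictions match.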
Thank you