English Abstract |
In this paper, we address data augmentation for referring expression segmentation (RES), the task of segmenting an object that is described by a natural language sentence. Despite the expressive power of improved model architectures and training techniques, the lack of referring expression segmentation data seems to limit performance on the task. To make matters worse, building referring expression segmentation data is demanding and expensive, since it requires high-quality images that contain multiple objects of the same category, along with various expressions for each object. One way to mitigate this problem is to augment an existing dataset. We propose a data augmentation method for the referring expression segmentation task that generates new referring expressions based on a natural language encoder model. We use LAVT as our baseline referring expression segmentation model and verify the effectiveness of our data augmentation method on the RefCOCO dataset. |