Abstract: Visual Dialog is a typical AI-agent task on images, in which the agent interprets information from heterogeneous modalities and provides the correct answer. In this area, most approaches are ...
Abstract: RGB-D semantic segmentation can be advanced with convolutional neural networks due to the availability of Depth data. Although objects cannot be easily discriminated by just the 2D ...