

Image Captioning and Visual Question

Published: 2016-12-23

Lecture title: Image Captioning and Visual Question

Speaker: Prof. Qi Wu

Time: 09:00

Date: 2016-12-23

Venue: Academic Seminar Room 522, Section 3, Wenjin Building, Chang'an Campus

Organizer: School of Computer Science, Intelligent Visual Computing Research Team

Abstract: The fields of natural language processing (NLP) and computer vision (CV) have seen great advances in their respective goals of analysing and generating text, and of understanding images and videos. While both fields share a similar set of methods rooted in artificial intelligence and machine learning, they have historically developed separately. Recent years, however, have seen an upsurge of interest in problems that require a combination of linguistic and visual information. For example, Image Captioning and Visual Question Answering (VQA) are two important research topics in this area.

In this talk I will first outline some of the most recent progress, present some theories and techniques for these two Vision-to-Language tasks, and then discuss our recent work. In this work, we first propose a method of incorporating high-level concepts into the successful CNN-RNN approach, and show that it achieves a significant improvement over the state of the art in both image captioning and visual question answering. We further show that the same mechanism can be used to incorporate external knowledge, which is critically important for answering high-level visual questions. Our final model achieves the best reported results on both image captioning and visual question answering on several benchmark datasets.