publications
2021
2020
- MIKE
GlosysIC Framework: Transformer for Image Captioning with Sequential Attention.
In Lecture Notes in Computer Science, Springer 2020.
Image captioning has attracted intensive research interest over the past decade. This paper proposes “GlosysIC Framework: Transformer for Image Captioning with Sequential Attention”, a novel framework that combines a Convolutional Neural Network (CNN) to encode the image with a transformer to generate sentences. In contrast to existing image captioning approaches, the GlosysIC framework serializes the multi-head attention modules with the image representations. Furthermore, we present the GlosysIC architectural framework, which encompasses multiple CNN architectures and an attention-based transformer for generating effective descriptions of images. The proposed system was exhaustively trained on the benchmark MSCOCO image captioning dataset using an RTX 2060 GPU and a V100 GPU from Google Cloud Platform, with the PyTorch deep learning library. Experimental results show that GlosysIC significantly outperforms previous state-of-the-art models.
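The general pattern named in the abstract, a CNN that encodes the image and a transformer that generates the caption, can be illustrated with a minimal PyTorch sketch. This is not the GlosysIC architecture: the abstract does not spell out how the sequential attention is wired, so the code below uses a stock `nn.TransformerDecoder`. The class name `CaptionSketch`, the tiny stand-in CNN, and every hyperparameter are assumptions chosen only to keep the example small.

```python
import torch
import torch.nn as nn


class CaptionSketch(nn.Module):
    """Hypothetical CNN-encoder / transformer-decoder captioner (illustration only)."""

    def __init__(self, vocab_size=1000, d_model=64, nhead=4, num_layers=2, max_len=32):
        super().__init__()
        # Stand-in CNN encoder (assumption): a real system would use a
        # pretrained backbone rather than a single convolution.
        self.cnn = nn.Sequential(
            nn.Conv2d(3, d_model, kernel_size=3, stride=2, padding=1),
            nn.ReLU(),
            nn.AdaptiveAvgPool2d((4, 4)),
        )
        self.embed = nn.Embedding(vocab_size, d_model)
        self.pos = nn.Embedding(max_len, d_model)  # learned positions (assumption)
        layer = nn.TransformerDecoderLayer(
            d_model=d_model, nhead=nhead, dim_feedforward=128, batch_first=True
        )
        self.decoder = nn.TransformerDecoder(layer, num_layers=num_layers)
        self.out = nn.Linear(d_model, vocab_size)

    def forward(self, images, tokens):
        # Encode the image into a 4x4 grid of region features: (B, 16, d_model).
        feats = self.cnn(images)
        memory = feats.flatten(2).transpose(1, 2)

        # Embed the caption prefix and add position embeddings.
        seq_len = tokens.size(1)
        positions = torch.arange(seq_len, device=tokens.device)
        tgt = self.embed(tokens) + self.pos(positions)

        # Causal mask: each position attends only to itself and earlier tokens.
        causal = torch.triu(
            torch.full((seq_len, seq_len), float("-inf"), device=tokens.device),
            diagonal=1,
        )

        # Masked self-attention over the tokens, cross-attention over image regions.
        hidden = self.decoder(tgt, memory, tgt_mask=causal)
        return self.out(hidden)  # (B, seq_len, vocab_size) next-token logits


model = CaptionSketch()
images = torch.randn(2, 3, 32, 32)          # two random RGB "images"
tokens = torch.randint(0, 1000, (2, 5))     # two caption prefixes of 5 token ids
logits = model(images, tokens)
```

In this sketch each decoder layer applies masked self-attention over the caption tokens and then cross-attention over the image region features, which is where the image representation enters the generation step. Training on MSCOCO would additionally require a tokenizer, a data loader, and a cross-entropy loss over the shifted caption tokens, none of which are shown here.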