Image captioning on a mobile terminal using a machine learning model
- Development and implementation of a neural network that takes an image as input and generates a sentence summarising the contents of the image.
- A web server that runs the above model.
- An Android application that acts as a client to the server.
Block diagram
The decoder is connected to the penultimate layer of VGG16 through a dense layer that reduces the feature size from 4096 to 512, which is also the number of internal states of our GRU architecture. The output layer has 10000 units, the size of our vocabulary.
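The dimensions above can be sketched in plain numpy. This is an illustration of the layer sizes only, with random weights and a single hand-written GRU step; it is not the trained model:

```python
import numpy as np

rng = np.random.default_rng(0)

VGG_DIM, HIDDEN, VOCAB = 4096, 512, 10000

# Illustrative random weights (the real model learns these).
W_reduce = rng.standard_normal((VGG_DIM, HIDDEN)) * 0.01  # 4096 -> 512
W_z = rng.standard_normal((HIDDEN * 2, HIDDEN)) * 0.01    # GRU update gate
W_r = rng.standard_normal((HIDDEN * 2, HIDDEN)) * 0.01    # GRU reset gate
W_h = rng.standard_normal((HIDDEN * 2, HIDDEN)) * 0.01    # GRU candidate
W_out = rng.standard_normal((HIDDEN, VOCAB)) * 0.01       # 512 -> vocabulary

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gru_step(h, x):
    """One GRU step with 512 internal states."""
    hx = np.concatenate([h, x])
    z = sigmoid(hx @ W_z)                                  # update gate
    r = sigmoid(hx @ W_r)                                  # reset gate
    h_tilde = np.tanh(np.concatenate([r * h, x]) @ W_h)    # candidate state
    return (1 - z) * h + z * h_tilde

# Image features from VGG16's penultimate layer initialise the decoder.
features = rng.standard_normal(VGG_DIM)
h = np.tanh(features @ W_reduce)              # 4096 -> 512
word_embedding = rng.standard_normal(HIDDEN)  # stand-in for a token embedding
h = gru_step(h, word_embedding)
logits = h @ W_out                            # 512 -> 10000 vocabulary scores
print(logits.shape)  # (10000,)
```

At each decoding step the word with the highest score (or a beam-search candidate) is fed back in until an end-of-sentence token is produced.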
The recurrent model was trained with the following hyperparameters:
- Images were represented as 4096-dimensional feature vectors from the penultimate layer of VGG16.
- Optimizer: RMSprop
- 20 epochs, batch size of 3000 images.
One epoch took 5 hours on an NVIDIA GTX 1060.
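For reference, a single RMSprop update can be sketched per parameter; the learning rate and decay here are common defaults, not necessarily the values used in training:

```python
import math

def rmsprop_step(param, grad, cache, lr=0.001, decay=0.9, eps=1e-8):
    """One RMSprop update: scale each gradient by a running average of
    its recent squared magnitudes, so steps stay stable across parameters.
    (lr and decay are common defaults, assumed for illustration.)"""
    new_param, new_cache = [], []
    for p, g, c in zip(param, grad, cache):
        c = decay * c + (1 - decay) * g ** 2
        new_cache.append(c)
        new_param.append(p - lr * g / (math.sqrt(c) + eps))
    return new_param, new_cache

w, cache = [1.0, -2.0], [0.0, 0.0]
grad = [0.5, -0.5]
w, cache = rmsprop_step(w, grad, cache)
# Each weight moves against the sign of its gradient.
```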
Accuracy
BLEU (2002) = 34.08%
METEOR (2005) = 34.08%
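BLEU scores caption quality by n-gram overlap with reference sentences. A toy BLEU-2 (unigram and bigram precision with a brevity penalty) can be sketched as follows; real evaluations use multiple references and smoothing, e.g. via nltk:

```python
import math
from collections import Counter

def ngrams(tokens, n):
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def bleu2(candidate, reference):
    """Toy BLEU-2: geometric mean of clipped 1- and 2-gram precision,
    multiplied by a brevity penalty for short candidates."""
    cand, ref = candidate.split(), reference.split()
    precisions = []
    for n in (1, 2):
        c, r = ngrams(cand, n), ngrams(ref, n)
        overlap = sum(min(c[g], r[g]) for g in c)  # clipped matches
        precisions.append(overlap / max(sum(c.values()), 1))
    if min(precisions) == 0:
        return 0.0
    bp = 1.0 if len(cand) > len(ref) else math.exp(1 - len(ref) / len(cand))
    return bp * math.exp(sum(math.log(p) for p in precisions) / 2)

print(bleu2("a dog runs in the park", "a dog runs in the park"))  # 1.0
```

METEOR additionally rewards stem and synonym matches and word order, which is why the two metrics can diverge in practice.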
Cost function evolution on validation set
Results
On the validation set
Firebase authentication
Technologies used in server implementation
Flask (REST API)
HTML/CSS (UI)
Docker (Scalability)
Firebase (Authentication, Scalability)
Google Cloud (Hosting)
Scalability and load balancing
Docker instances
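A minimal Dockerfile for one such instance might look like this; the file names, port, and gunicorn entry point are assumptions for illustration, not taken from the project:

```dockerfile
FROM python:3.9-slim
WORKDIR /app
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
COPY . .
EXPOSE 5000
# Serve the Flask app (assumed to live in app.py as `app`).
CMD ["gunicorn", "--bind", "0.0.0.0:5000", "app:app"]
```

Running several such containers behind a load balancer lets the service scale horizontally.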
Path to REST API for image description generation
POST http://127.0.0.1:5000/api/predict with the image as the payload
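A client call can be sketched with Python's standard library. Sending raw bytes with an octet-stream content type is an assumption for illustration; the actual server may expect multipart/form-data, and the Android client builds an equivalent HTTP request:

```python
import urllib.request

def build_predict_request(image_bytes, host="http://127.0.0.1:5000"):
    """Build the POST request that sends an image to the caption API.
    (Raw bytes as payload is a simplification of the real protocol.)"""
    return urllib.request.Request(
        f"{host}/api/predict",
        data=image_bytes,
        method="POST",
        headers={"Content-Type": "application/octet-stream"},
    )

req = build_predict_request(b"\xff\xd8\xff")  # placeholder JPEG bytes
print(req.get_method(), req.full_url)  # POST http://127.0.0.1:5000/api/predict
# urllib.request.urlopen(req) would return the generated caption
# once the Flask server is running.
```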
Structure
Introduction page
Login/Register Pages
Image selection page
Description generator