This folder contains APIs for accessing the data and writing test results to disk. Please make sure the data has been downloaded before running the code; see the download section in the main README.md.
In both the Python and MATLAB APIs, two utility classes (`nlvd_dataset` and `nlvd_test`) are provided. `nlvd_dataset` is for reading annotations, and `nlvd_test` is for dumping test results. The interfaces in Python and MATLAB are largely consistent.
A Python demo is available at `[toolbox_folder]/development/python/demo.py`.
Manual
- `nlvd_dataset.py`: This file defines the class `NLVDDataset` (alias: `nlvd_dataset`), which can be used to retrieve data.
  - Constructor `__init__(dataset_name, subset_name)`: Requires the name of the dataset (`vg_v1` is the only choice in the current version) and the subset to create a dataset object. `subset_name` can be `train`, `test` or `val`.
  - `image_ids_in_subset()`: Return the ids of all images in the subset.
  - `annotation(image_id)`: Return the annotation for an image as a `dict`. Coordinates in the 'region' fields start from 0.
  - `text_id_to_phrase(text_ids)`: Return a list of text phrases specified by `text_ids`. `text_ids` should be iterable.
  - `test_text_ids(image_id, level_id)`: Return a list of text ids for the test query on the given difficulty level `level_id` and `image_id`. For more details about test difficulty levels, see here.
  - `create_test(test_title, level_id)`: Create an `NLVDTest` (alias: `nlvd_test`) object under the given difficulty level `level_id` (note that the localization task is performed on `level-0`) for storing the test results. The result file will be stored in the folder `[toolbox_folder]/results/[dataset_name]/[test_title]`. See below for details about the `NLVDTest` class.
- `nlvd_test.py`: This file defines the class `NLVDTest` (alias: `nlvd_test`). It can be used to get the test queries and write test results to a file in a defined format.
  - Constructor `__init__(dataset, test_dir, level_id)`: Requires an `NLVDDataset` object, the folder path `test_dir` for writing the result file, and the test difficulty level `level_id`. The path of the result file is `[test_dir]/level_[level_id].txt`, which is opened during construction.
    It is highly recommended to use the `create_test` function of the `NLVDDataset` object to create the corresponding `NLVDTest` object.
  - `text_ids(image_id)`: Return a list of text ids for the test query on the given difficulty level and `image_id`.
  - `set_results(image_id, boxes_and_scores)`: Write the detection/localization result of the image specified by `image_id` to the result file. `boxes_and_scores` is a `dict` with text ids as the keys. For each key, the value is a list of detected bounding boxes and their scores (e.g., `[box_score_1, box_score_2, ..., box_score_N]`). Each bounding box is a 5-element tuple/list (e.g. `box_score=[y1, x1, y2, x2, score]`). The coordinates are 0-based.
  - `finish()`: Tell the object that testing is finished. It flushes and closes the result file.
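The workflow that these two classes describe can be sketched as follows. This is a minimal sketch, not the toolbox's own demo: `run_test` and `detect` are hypothetical names introduced here for illustration, and the function relies only on the methods documented above, so `dataset` can be any object that behaves like `NLVDDataset`.

```python
def run_test(dataset, detect, test_title, level_id=0):
    """Run `detect` on every image of the subset and store the results.

    `dataset` is expected to behave like the NLVDDataset described above.
    `detect` is a hypothetical user-supplied callable (not part of the
    toolbox) that takes (annotation, phrase) and returns a list of
    [y1, x1, y2, x2, score] entries with 0-based coordinates.
    """
    # create_test returns the object used for writing the result file.
    test = dataset.create_test(test_title, level_id)

    for image_id in dataset.image_ids_in_subset():
        # Query text ids for this image at the chosen difficulty level.
        text_ids = test.text_ids(image_id)
        phrases = dataset.text_id_to_phrase(text_ids)
        annotation = dataset.annotation(image_id)

        # set_results expects a dict keyed by text id.
        boxes_and_scores = {
            text_id: detect(annotation, phrase)
            for text_id, phrase in zip(text_ids, phrases)
        }
        test.set_results(image_id, boxes_and_scores)

    # Flush and close the result file.
    test.finish()
    return test
```

Assuming the folder containing `nlvd_dataset.py` is on the Python path, a call might look like `run_test(NLVDDataset('vg_v1', 'test'), my_detector, 'my_method')`, where `my_detector` and `'my_method'` are placeholders.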
A MATLAB demo is available at `[toolbox_folder]/development/matlab/demo.m`.
Manual
- `nlvd_dataset.m`: This file defines the class `nlvd_dataset`, which can be used to retrieve data.
  - Constructor `nlvd_dataset(dataset_name, subset_name)`: Requires the name of the dataset (`vg_v1` is the only choice in the current version) and the subset to create a dataset object. `subset_name` can be 'train', 'test' or 'val'.
  - `image_ids_in_subset()`: Return the ids of all images in the subset.
  - `annotation(image_id)`: Return the annotation for an image as a struct with fields 'image_id', 'image_path' and 'regions'. 'regions' is a struct array containing the 'x', 'y', 'height', 'width', 'phrase_id' and 'phrase' of each region. Coordinates in the 'regions' fields start from 1.
  - `text_id_to_phrase(text_ids)`: Return a cell array of text phrases specified by `text_ids`. `text_ids` should be iterable.
  - `test_text_ids(image_id, level_id)`: Return a list of text ids for the test query on the given difficulty level `level_id` and `image_id`. For more details about test difficulty levels, see here.
  - `create_test(test_title, level_id)`: Create an `nlvd_test` object under the given difficulty level `level_id` (note that the localization task is performed on `level-0`) for storing the test results. The result file will be stored in the folder `[toolbox_folder]/results/[dataset_name]/[test_title]`. See below for details about the `nlvd_test` class.
- `nlvd_test.m`: This file defines the class `nlvd_test`. It can be used to get the test queries and write test results to a file in a defined format.
  - Constructor `nlvd_test(dataset, test_dir, level_id)`: Requires an `nlvd_dataset` object, the folder path `test_dir` for writing the result file, and the test difficulty level `level_id`. The path of the result file is `[test_dir]/level_[level_id].txt`, which is opened during construction.
    It is highly recommended to use the `create_test` function of the `nlvd_dataset` object to create the corresponding `nlvd_test` object.
  - `text_ids(image_id)`: Return a list of text ids for the test query on the given difficulty level and `image_id`.
  - `set_results(image_id, text_ids, boxes_and_scores)`: Write the detection/localization result of the image specified by `image_id` to the result file. `text_ids` is a list. `boxes_and_scores` is a cell array in which each cell corresponds to one text id in `text_ids`. The value of each cell is a matrix containing the detected bounding boxes and their scores. Each row of the matrix has five elements (e.g. `[y1, x1, y2, x2, score]`). The coordinates are 1-based.
  - `finish()`: Tell the object that testing is finished. It flushes and closes the result file.
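As documented, the Python and MATLAB versions of `set_results` differ in two ways: Python takes a `dict` keyed by text id with 0-based coordinates, while MATLAB takes a list of text ids plus a parallel cell array of matrices with 1-based coordinates. The sketch below illustrates that mapping in Python, assuming these are the only differences. `to_matlab_layout` is a hypothetical helper and not part of the toolbox.

```python
def to_matlab_layout(boxes_and_scores):
    """Convert a Python-style result dict to the MATLAB-style layout.

    Input:  {text_id: [[y1, x1, y2, x2, score], ...]} with 0-based coordinates.
    Output: (text_ids, matrices), where matrices[i] holds the rows for
            text_ids[i] with the four coordinates shifted to 1-based.
            Scores are left unchanged.
    """
    text_ids = sorted(boxes_and_scores)
    matrices = []
    for text_id in text_ids:
        rows = [
            [y1 + 1, x1 + 1, y2 + 1, x2 + 1, score]
            for y1, x1, y2, x2, score in boxes_and_scores[text_id]
        ]
        matrices.append(rows)
    return text_ids, matrices
```

For example, `to_matlab_layout({7: [[0, 0, 9, 19, 0.5]]})` yields the text id list `[7]` and one matrix with the single row `[1, 1, 10, 20, 0.5]`.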
The result text file written by the development API (`nlvd_test` class) has the following format.

    IMAGE_ID:
    Phrase_Id: [y1, x1, y2, x2, score] [y1, x1, y2, x2, score]
    Phrase_Id: [y1, x1, y2, x2, score] [y1, x1, y2, x2, score]
    ...
    Phrase_Id: [y1, x1, y2, x2, score] [y1, x1, y2, x2, score]
    IMAGE_ID:
    ...

The coordinates in the result file are 1-based.
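A single `Phrase_Id: [y1, x1, y2, x2, score] ...` entry of this format could be read back as sketched below. This is an illustration rather than the evaluation API's own parser: `parse_phrase_entry` is a hypothetical helper, and it assumes the numbers are written as plain decimals separated by commas inside each pair of brackets.

```python
import re

# Matches the contents of each "[...]" group in an entry.
_BOX = re.compile(r"\[([^\]]*)\]")


def parse_phrase_entry(entry):
    """Split 'Phrase_Id: [y1, x1, y2, x2, score] ...' into (phrase_id, boxes).

    The phrase id is returned as a string. Each box is a list of five
    floats; the 1-based coordinates of the result file are left unchanged.
    """
    phrase_id, _, rest = entry.partition(":")
    boxes = []
    for body in _BOX.findall(rest):
        values = [float(v) for v in body.split(",")]
        if len(values) != 5:
            raise ValueError("expected 5 values per box, got %d" % len(values))
        boxes.append(values)
    return phrase_id.strip(), boxes
```

Subtracting 1 from the four coordinates of each parsed box would give values comparable with the 0-based boxes passed to the Python `set_results`.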
To evaluate the results, please refer to the evaluation API.