Datasets – EECS Support Pages

To avoid duplication (and save bandwidth), many of the standard public datasets are available under /import for use on managed desktops and compute servers. If you would like another dataset hosted here as a shared resource, rather than using up your own quota, please mail systems@eecs.qmul.ac.uk with a link to the dataset and we’ll do the rest.

AVD (Active Vision Dataset)
COCO (Common Objects in COntext)
CUHK
Flickr30K
Flickr Logos 27
DukeMTMC (Multi-Target, Multi-Camera)
Market-1501
Pascal

COCO (Common Objects in COntext). The COCO set is available at /import/datasets_public_1/coco/ and includes the following splits:
- 2014 Train/Val: Detection 2015, Captioning 2015, Detection 2016, Keypoints 2016
- 2014 Testing: Captioning 2015
- 2015 Testing: Detection 2015, Detection 2016, Keypoints 2016
- 2017 Train/Val: Detection 2017, Keypoints 2017, Stuff 2017
- 2017 Testing: Detection 2017, Keypoints 2017, Stuff 2017
- 2017 Unlabeled: optional data for any competition
DukeMTMC (Multi-Target, Multi-Camera Tracking Project). Available at /import/datasets_public_1/DukeMTMC . This is the full dataset plus the following extensions:
- DukeMTMC4REID
- DukeMTMC-reID
Market-1501+500k. Available at /import/datasets_public_1/Market-1501-v15.09.15 . The available data set includes the 500k distractor bounding boxes in the bbox_images subdirectory.
Flickr Logos 27. Available at /import/datasets_public_1/flickr_logos_27_dataset . 810 annotated images, corresponding to 27 logo classes/brands (30 images for each class). All images are annotated with bounding boxes of the logo instances in the image, and an image may contain multiple instances of its class logo. The training set is randomly split into six subsets, each one containing five images per class.
AVD – Active Vision Dataset. Initially developed for simulating motion for object instance recognition in real-world environments, the AVD dataset contains more than 30,000 RGB-D images from 15 scenes, with over 70,000 bounding boxes. It is available at /import/datasets_public_1/AVD
Flickr30K – a standard benchmark for sentence-based image description. Over 30,000 (what a surprise) images plus XML annotations. Available at /import/datasets_public_1/Flickr30K
CUHK – A collection of datasets collated by Dr Xiaogang Wang at the Department of Electronic Engineering, The Chinese University of Hong Kong. All of these datasets are available under /import/datasets_public_1/CUHK .
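To use the shared copies from a script rather than copying data into your own quota, the locations listed above can be collected in a small lookup. The following is only a minimal sketch: the dictionary keys and the `dataset_path` helper are names made up for this example, and nothing is assumed about the layout inside each directory.

```python
from pathlib import Path

# Locations as listed on this page. The dictionary keys are labels chosen
# for this example, not official identifiers.
DATASET_ROOT = Path("/import/datasets_public_1")
DATASETS = {
    "coco": DATASET_ROOT / "coco",
    "dukemtmc": DATASET_ROOT / "DukeMTMC",
    "market1501": DATASET_ROOT / "Market-1501-v15.09.15",
    "flickr_logos_27": DATASET_ROOT / "flickr_logos_27_dataset",
    "avd": DATASET_ROOT / "AVD",
    "flickr30k": DATASET_ROOT / "Flickr30K",
    "cuhk": DATASET_ROOT / "CUHK",
}


def dataset_path(name, datasets=DATASETS):
    """Return the shared directory for `name`, or raise if it is not usable."""
    try:
        path = datasets[name]
    except KeyError:
        raise KeyError(f"unknown dataset {name!r}; known: {sorted(datasets)}") from None
    if not path.is_dir():
        # Typically means /import is not mounted on this machine.
        raise FileNotFoundError(f"{path} is not visible on this machine")
    return path


if __name__ == "__main__":
    # Report which shared datasets this machine can see.
    for name, path in DATASETS.items():
        status = "ok" if path.is_dir() else "missing"
        print(f"{name:16} {path}  [{status}]")
```

Pointing your code at the path returned by, for example, `dataset_path("coco")` means the data is read straight from the shared copy.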
Pascal Visual Object Classes. Provided by Oxford University Information Engineering. The PASCAL VOC project provides standardised image data sets for object class recognition and a common set of tools for accessing the data sets and annotations. We currently host the following data sets:
- Fully Annotated Databases
- Partially Annotated Databases
- Unannotated Databases
  1. 101 Object Categories