UCBM-ELT dataset

Human action classification in videos

Article: Download

Dataset Request:

Download the pdf, fill it out and send it to:
gaia dot dobici at gmail dot com
p dot soda at unicampus dot it
cosbi dot dev at gmail dot com


SUMMARY OF THE DATASET

The UCBM-ELT dataset is a collection of 648 videos recorded in indoor and outdoor scenarios at a frame rate of 30 fps and a resolution of 1920 x 1080 px, designed for the task of human action classification. The videos were acquired between May 2022 and July 2022 at the Università Campus Bio-Medico di Roma (https://www.unicampus.it/), in collaboration with ELT Elettronica Group (https://www.eltgroup.net/), within the research unit of Computer Systems and Bioinformatics, Department of Engineering (http://www.cosbi-lab.it/). The dataset contains 9 single-person daily actions (Point, Wave, Jump, Crouch, Sneeze, SitDown, StandUp, Walk, PersonRun) performed by 36 subjects; each actor performs every action twice, once in a well-lit environment and once in a dark or low-light setting that replicates nighttime conditions. The actors were free to act naturally and spontaneously, according to their personal interpretation. This resulted in great variability in camera angle, distance from the lens, and field of view, which contributes to more realistic and challenging scenarios.


DATASET INFORMATION

The UCBM-ELT dataset will be made available to specific researchers for research purposes only (commercial use is excluded, including commercial use of any results derived from the dataset) once a definitive version of the corpus has been established and the development work of the UCBM-ELT dataset group has been published.

Below is some information about the actions in the dataset:

  • Point, defined as “Someone points”. ID used: A01;
  • Wave, defined as “Someone waves hand to catch peoples’ attention”. ID used: A02;
  • Jump, defined as “Someone jumps”. ID used: A03;
  • Crouch, defined as “Someone crouches then stands up”. ID used: A04;
  • Sneeze, defined as “Someone sneezes”. ID used: A05;
  • SitDown, defined as “Someone sits down on a chair”. ID used: A06;
  • StandUp, defined as “Someone stands up from a chair”. ID used: A07;
  • Walk, defined as “Someone walks normally”. ID used: A08;
  • PersonRun, defined as “Someone runs”. ID used: A09.
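For convenience, the ID-to-action mapping above can be transcribed as a small lookup table. This is only an illustrative sketch in Python; the dataset itself encodes labels in the filenames as described below.

```python
# Mapping from class IDs to action names, transcribed from the list above.
ACTION_LABELS = {
    "A01": "Point",
    "A02": "Wave",
    "A03": "Jump",
    "A04": "Crouch",
    "A05": "Sneeze",
    "A06": "SitDown",
    "A07": "StandUp",
    "A08": "Walk",
    "A09": "PersonRun",
}
```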

Other useful details:

  • For each of the 9 actions we have two videos per subject, resulting in 72 videos in total per action.
  • For each of the 36 subjects we have 18 videos, 2 per each action.
  • Each file is named with a single string containing:
    • the ID of the subject, in the format ID00XX.
    • a letter specifying the type of repetition according to lighting conditions: D for daylight, N for night.
    • a string specifying the camera used (“Phone” for all the videos in our dataset).
    • the class label, in the format A0X.
    • the repetition number per action and lighting condition, always one in our dataset (S01).
  • Experiments must be performed in 10-fold cross-validation, as described in the material available with the dataset.
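The naming convention above can be decoded with a short parser. This is a sketch only: the exact field delimiter is not specified here (the regex below accepts an optional underscore between fields), and the example filename is hypothetical; refer to the material shipped with the dataset for the authoritative format.

```python
import re

# Hypothetical pattern for UCBM-ELT video filenames, built from the naming
# convention described above. The "_?" between fields is an assumption:
# it accepts both underscore-separated and concatenated field layouts.
FILENAME_RE = re.compile(
    r"ID(?P<subject>\d{4})_?"   # subject ID, format ID00XX
    r"(?P<light>[DN])_?"        # repetition type: D (daylight) or N (night)
    r"(?P<camera>Phone)_?"      # camera used ("Phone" for all videos)
    r"(?P<action>A\d{2})_?"     # class label, format A0X
    r"(?P<rep>S\d{2})"          # repetition number (always S01)
)

def parse_video_name(name: str) -> dict:
    """Return the metadata fields encoded in a UCBM-ELT video filename."""
    m = FILENAME_RE.search(name)
    if m is None:
        raise ValueError(f"unrecognized filename: {name!r}")
    return m.groupdict()
```

For example, `parse_video_name("ID0007_D_Phone_A03_S01.mp4")` (a made-up filename) would recover subject 0007, daylight condition, camera Phone, action A03, repetition S01.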

Once you have completed the appropriate form and sent it to the email addresses at the top of the page, we will make the UCBM-ELT dataset available. You will have access to all videos, labels, and cross-validation splits.