Scenarios 10-12




Scenarios 10-12 emulate a pedestrian-to-infrastructure mmWave communication setup. The adopted testbed comprises two units. Unit 1 is a stationary base station equipped with an RGB camera and a mmWave phased array; it adopts a 16-element 60 GHz-band phased array and receives the transmitted signal using an over-sampled codebook of 64 pre-defined beams. The second unit (Unit 2) is a person carrying a mmWave transmitter, which consists of a quasi-omni antenna transmitting continuously (omnidirectionally) in the 60 GHz band. Please refer to the detailed description of the testbed presented here.

Downtown Tempe: The scenarios were collected at three different locations on the Arizona State University (ASU) Tempe campus. Scenarios 10 and 11 were collected near the Memorial Union, whereas Scenario 12 was collected near the Goldwater Center. In particular, locations with high footfall were selected to generate diverse visual and communication datasets, and the data was collected at different times of the day to further increase the variance in the dataset.

Collected Data


Number of Data Collection Units: 1 (using DeepSense Testbed #1)

Number of Data Samples:  1528

Data Modalities: RGB images, 64-dimensional received power vector

Sensors at Unit 1: (Stationary Receiver)

  • Wireless Sensor [Phased Array]: A 16-element antenna array operating in the 60 GHz frequency band; it receives the transmitted signal using an over-sampled codebook of 64 pre-defined beams
  • Visual Sensor [Camera]: The main visual perception element in the testbed is an RGB-D camera. The camera is used to capture RGB images of 960×540 resolution at a base frame rate of 30 frames per second (fps)
Number of Units: 2
Total Data Modalities: RGB images, 64-dimensional received power vector

Unit 1:
  • Hardware Elements: RGB camera, mmWave phased array receiver
  • Data Modalities: RGB images, 64-dimensional received power vector

Unit 2:
  • Hardware Elements: mmWave omni-directional transmitter
  • Data Modalities: None

Data Visualization



How to Access Scenarios 10-12 Data?

Step 1. Download Scenario Data

Step 2. Extract the file

Scenario X folder consists of two sub-folders:

  • unit1: Includes the data captured by unit 1
  • resources: Includes the scenario-specific annotated dataset, data labels and other additional information. For more details, refer to the resources section below. 

Scenario X folder also includes the “scenarioX.csv” file with the paths to all the collected data. For each coherence time, we provide the corresponding visual and wireless data.
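A minimal sketch of reading this CSV with pandas. Since the actual file lives inside the extracted scenario folder, the example below parses an in-memory copy of a few rows (taken from the example table in this page); in practice you would pass the path to the real “scenarioX.csv” instead:

```python
import io
import pandas as pd

# A few illustrative rows mimicking scenarioX.csv; in practice, use
# pd.read_csv("scenarioX.csv") from inside the extracted scenario folder.
csv_text = """index,unit1_rgb,unit1_pwr_60ghz,unit1_bbox,unit1_beam_index
1,./unit1/camera_data/image_1.jpg,./unit1/mmWave_data/power_1.txt,./resources/bbox/bbox_1.txt,45
2,./unit1/camera_data/image_3.jpg,./unit1/mmWave_data/power_3.txt,./resources/bbox/bbox_3.txt,36
"""
df = pd.read_csv(io.StringIO(csv_text))

# Each row pairs the visual and wireless data captured in one coherence time.
print(df["unit1_beam_index"].tolist())  # -> [45, 36]
```

The relative paths in the CSV resolve against the Scenario X folder, so loading a sample is a matter of joining the scenario root with each column's path.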


What are the Additional Resources?

Resources consist of the following information:

  • visual data annotations: For the visual data, we provide the coordinates of the 2D bounding box and attributes for each frame
  • data labels: The labels consist of the ground-truth beam indices computed from the mmWave received power vectors

Visual Data Annotations

After performing the post-processing steps presented here, we generate the annotations for the visual data. Using state-of-the-art machine learning algorithms and multiple validation steps, we achieve highly accurate annotations. In this particular scenario, we provide the coordinates of the 2D bounding box and attributes for each frame. We also provide the ground-truth labels for 2 object classes, “Tx” and “Distractor”. “Tx” refers to the transmitting pedestrian (Unit 2) in the scene, and “Distractor” to any other object, such as other people, vehicles, etc. We follow the YOLO format for the bounding-box information: each bounding box is described by the center coordinates of the box and its width and height, each scaled by the corresponding image dimension so that all values range between 0 and 1. Instead of category names, we provide the corresponding integer categories, with the following assignment: (i) “Tx” as “0”, and (ii) “Distractor” as “1”.
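To make the YOLO convention concrete, the sketch below converts one annotation line into pixel coordinates, assuming the 960×540 frame size used in this scenario (the helper name and the sample line are illustrative, not part of the dataset):

```python
def yolo_to_pixels(line, img_w=960, img_h=540):
    """Convert one YOLO-format line to (class, pixel box).

    YOLO stores: class_id, box center x/y, box width/height,
    all normalized by the image dimensions.
    """
    cls, xc, yc, w, h = line.split()
    cls = int(cls)  # 0 = "Tx", 1 = "Distractor"
    xc, yc, w, h = (float(v) for v in (xc, yc, w, h))
    x_min = (xc - w / 2) * img_w
    y_min = (yc - h / 2) * img_h
    x_max = (xc + w / 2) * img_w
    y_max = (yc + h / 2) * img_h
    return cls, (x_min, y_min, x_max, y_max)

# Illustrative annotation line: class "Tx", box centered in the frame,
# a quarter of the image wide and half of it tall.
cls, box = yolo_to_pixels("0 0.5 0.5 0.25 0.5")
print(cls, box)  # -> 0 (360.0, 135.0, 600.0, 405.0)
```

Any YOLO-compatible training pipeline can consume the bbox files directly; the conversion above is only needed when drawing or cropping boxes on the RGB frames.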

Data Labels

The labels comprise the ground-truth beam indices computed from the mmWave received power vectors, the direction of travel (Unit 2), and the sequence index.

  • Ground-Truth Beam: The phased array of unit 1 utilizes an over-sampled beamforming codebook of N = 64 vectors, which are designed to cover the field of view. It captures the received power by applying the beamforming codebook elements as a combiner. For each received power vector of dimension [64 x 1], the index with the maximum received power value is selected as the optimal beam index. 
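The beam-selection rule above reduces to an argmax over the 64-dimensional power vector. A minimal sketch with a synthetic vector (the planted peak is illustrative; real vectors come from the power_*.txt files, and whether stored indices are 0- or 1-based should be checked against the dataset):

```python
import numpy as np

# Synthetic 64-dimensional received power vector, for illustration only.
rng = np.random.default_rng(0)
power = rng.random(64)
power[45] = 2.0  # plant a clear peak at codebook index 45

# The optimal beam is the codebook index with maximum received power.
best_beam = int(np.argmax(power))
print(best_beam)  # -> 45
```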
An example table comprising the data labels and the additional information is shown below.
index | unit1_rgb                       | unit1_pwr_60ghz                 | unit1_bbox                  | unit1_beam_index
1     | ./unit1/camera_data/image_1.jpg | ./unit1/mmWave_data/power_1.txt | ./resources/bbox/bbox_1.txt | 45
2     | ./unit1/camera_data/image_3.jpg | ./unit1/mmWave_data/power_3.txt | ./resources/bbox/bbox_3.txt | 36
3     | ./unit1/camera_data/image_5.jpg | ./unit1/mmWave_data/power_5.txt | ./resources/bbox/bbox_5.txt | 22
4     | ./unit1/camera_data/image_6.jpg | ./unit1/mmWave_data/power_6.txt | ./resources/bbox/bbox_6.txt | 5
5     | ./unit1/camera_data/image_8.jpg | ./unit1/mmWave_data/power_8.txt | ./resources/bbox/bbox_8.txt | 62