Higher-order ambisonic sound scene repository (HOA-SSR) dataset
A dataset of 28 audio-visual scenes, each pairing 360° video with higher-order ambisonic (HOA) audio. The dataset is publicly available upon request.
Recording Tools and Formats
The audio-visual scenes were captured simultaneously with an Insta360 Pro2 VR camera and an em32 Eigenmike. Audio and video are time-synchronized, and each scene is 20 s long.
The audio is provided as 4th-order ambisonics in AmbiX format (25 channels), PCM, 48 kHz, 24-bit. The video is provided as 360° video in equirectangular projection (ERP) format at 8K resolution (7680x3840), 30 fps, with YUV 4:2:2 chroma subsampling. Prospective users are encouraged to consult the technical details described in the paper to obtain reproducible results.
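As a quick sanity check on the audio format above, the following Python sketch derives the channel count from the ambisonic order and estimates the uncompressed PCM size of one 20 s scene. The function names are illustrative only and are not part of the dataset; the ACN formula is the channel ordering that AmbiX uses.

```python
# Sketch: derive channel count and raw PCM size from the stated audio format.
# All names here are hypothetical helpers, not dataset tooling.
AMBISONIC_ORDER = 4
SAMPLE_RATE_HZ = 48_000
BIT_DEPTH = 24
DURATION_S = 20


def ambisonic_channels(order: int) -> int:
    """Number of channels of a full-sphere ambisonic signal: (N + 1)^2."""
    return (order + 1) ** 2


def acn_index(degree: int, index: int) -> int:
    """ACN channel number for degree l and index m, with -l <= m <= l."""
    if abs(index) > degree:
        raise ValueError("index must satisfy -degree <= index <= degree")
    return degree * degree + degree + index


def raw_pcm_bytes(channels: int, sample_rate_hz: int,
                  bit_depth: int, duration_s: int) -> int:
    """Size of interleaved, uncompressed PCM audio without any file header."""
    return channels * sample_rate_hz * (bit_depth // 8) * duration_s


channels = ambisonic_channels(AMBISONIC_ORDER)
size_bytes = raw_pcm_bytes(channels, SAMPLE_RATE_HZ, BIT_DEPTH, DURATION_S)

print(channels)         # 25
print(acn_index(4, 4))  # 24, the last channel of a 4th-order signal
print(size_bytes)       # 72000000 bytes, about 72 MB per scene
```

The actual files may be slightly larger because of container headers and metadata.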
Full-reference objective quality metrics were computed between the distorted and reference signals for each audio-visual scene. Audio was encoded with an AAC-LC encoder at 16, 32, and 64 kbps per channel and evaluated with three objective audio quality metrics: PEAQ, ViSQOL, and AMBIQUAL. Video sources were encoded with H.265/HEVC using FFmpeg (libx265) at 3 resolutions (1920x1080, 3840x1920, 6144x3072) and 4 quantization parameters (QP: 0, 22, 28, 34).
The objective video quality metrics are PSNR and its variants, SSIM, MS-SSIM, VMAF2K, and VMAF4K.
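The encoding conditions described above can be enumerated as a small grid. The sketch below is only an illustration: the input file name, the output naming scheme, and the exact FFmpeg arguments are assumptions, since the settings actually used are documented in the paper rather than here. It passes the QP to libx265 through `-x265-params` and rescales with the `scale` filter.

```python
# Sketch: enumerate the stated encoding conditions.
# File names and the exact FFmpeg arguments are assumptions for illustration.
from itertools import product
from pathlib import Path

AUDIO_CHANNELS = 25                       # 4th-order ambisonics
AUDIO_KBPS_PER_CHANNEL = (16, 32, 64)
VIDEO_RESOLUTIONS = ((1920, 1080), (3840, 1920), (6144, 3072))
VIDEO_QPS = (0, 22, 28, 34)


def total_audio_kbps(kbps_per_channel: int,
                     channels: int = AUDIO_CHANNELS) -> int:
    """Aggregate bitrate when every channel is coded at the same rate."""
    return kbps_per_channel * channels


def x265_command(src: str, width: int, height: int, qp: int) -> list[str]:
    """Build one hypothetical FFmpeg call for a resolution/QP pair."""
    out = f"{Path(src).stem}_{width}x{height}_qp{qp}.mp4"
    return [
        "ffmpeg", "-i", src,
        "-vf", f"scale={width}:{height}",
        "-c:v", "libx265",
        "-x265-params", f"qp={qp}",
        out,
    ]


audio_totals = [total_audio_kbps(rate) for rate in AUDIO_KBPS_PER_CHANNEL]
commands = [
    x265_command("scene01.mp4", w, h, qp)   # "scene01.mp4" is a placeholder
    for (w, h), qp in product(VIDEO_RESOLUTIONS, VIDEO_QPS)
]

print(audio_totals)   # [400, 800, 1600] kbps in total
print(len(commands))  # 12 distorted versions per video source
print(" ".join(commands[0]))
```

Each command list can be passed to `subprocess.run` once the placeholder paths are replaced with real files.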
Perceptual Quality Experiment
Three subjective experiments (a listening test, a viewing test, and an audio-visual test) were conducted to evaluate the perceptual quality of 16 audio-visual scenes. Twenty trained assessors took part in the experiments, which were held in the listening test and VR facilities at SenseLab, FORCE Technology.
The Multiple Stimulus with Hidden Reference methodology was used in all experiments. Auditory cues were decoded from 4th-order ambisonics to a 26-channel loudspeaker setup compliant with EBU 3276 and ITU-R BS.1116-3, whereas visual cues were presented on a head-mounted display (Samsung Odyssey+ Mixed Reality Headset).
Consent and Ethics
All participants provided informed consent and all experiments were approved by the Danish Committee System on Health Research Ethics for the Capital Region of Denmark (Journal-nr H-20031815).
This research was supported by the European Union’s Horizon 2020 research and innovation program under the Marie Skłodowska-Curie grant agreement No. 765911 RealVision.
We thank FORCE Technology and the industrial partners who created the 360° audio-visual datasets under the HOA-SSR joint project, and the XRHub Team for their great help with technicalities and field recording.
Request for access here
To access and download the dataset, submit a request using the form below.