IEEE ICME 2024 Grand Challenge

Semi-supervised Acoustic Scene Classification under Domain Shift

1Northwestern Polytechnical University, China
3Nanyang Technological University, Singapore
4Institute of Acoustics, Chinese Academy of Sciences, China
5University of Surrey, UK
*Corresponding coordinators: susantorahardja@ieee.org, baijs@mail.nwpu.edu.cn

Fig. 1 Overview of the Semi-supervised Acoustic Scene Classification under Domain Shift challenge

News

2024-FEB-05 The challenge has started and links to the development dataset and registration are available.

2024-MAR-10 The baseline code has been updated to correct an issue with the evaluation metric.

2024-MAR-15 The link to the evaluation dataset is available.

2024-MAR-19 The link to the final result submission is available. See the updates below.

2024-MAR-22 The challenge has ended. We received submissions from 13 teams. The final results of the challenge will be announced soon.

2024-MAR-29 The challenge results have been announced. Congratulations to the winners!

Challenge results

Rank | Team Name | Score (macro-accuracy) | Technical Report | Bus | Airport | Metro | Restaurant | Shopping mall | Public square | Urban park | Traffic street | Construction site | Bar
1 | NERCSLIP-USTC | 0.758 | Report | 0.820 | 0.727 | 0.930 | 0.640 | 0.610 | 0.610 | 0.687 | 0.720 | 0.920 | 0.920
2 | Aural Pioneers | 0.752 | Report | 0.760 | 0.940 | 0.990 | 0.590 | 0.680 | 0.510 | 0.607 | 0.760 | 0.690 | 0.990
3 | Audio Warriors | 0.700 | Report | 0.550 | 0.840 | 0.960 | 0.740 | 0.530 | 0.340 | 0.593 | 0.750 | 0.750 | 0.950
4 | whuaudio | 0.699 | Report | 0.690 | 0.733 | 0.870 | 0.640 | 0.560 | 0.620 | 0.633 | 0.760 | 0.560 | 0.920
5 | RM3Team | 0.631 | Report | 0.420 | 0.800 | 0.980 | 0.600 | 0.690 | 0.280 | 0.587 | 0.740 | 0.610 | 0.600
6 | CoolWorld | 0.615 | Report | 0.440 | 0.793 | 0.920 | 0.680 | 0.480 | 0.100 | 0.667 | 0.680 | 0.540 | 0.850
7 | Sunshine | 0.615 | Report | 0.380 | 0.833 | 0.920 | 0.540 | 0.440 | 0.340 | 0.587 | 0.740 | 0.500 | 0.870
* | Baseline | 0.600 | Report | 0.400 | 0.547 | 0.900 | 0.690 | 0.510 | 0.290 | 0.460 | 0.650 | 0.680 | 0.870
8 | Alchemy AI | 0.585 | Report | 0.730 | 0.847 | 0.720 | 0.410 | 0.370 | 0.080 | 0.667 | 0.590 | 0.480 | 0.960
9 | YMZXYYY | 0.555 | Report | 0.230 | 0.820 | 0.790 | 0.750 | 0.530 | 0.210 | 0.527 | 0.690 | 0.500 | 0.500
10 | SoundBytes | 0.474 | Report | 0.320 | 0.440 | 0.780 | 0.570 | 0.510 | 0.280 | 0.440 | 0.600 | 0.460 | 0.340
11 | TCGQ | 0.473 | Report | 0.330 | 0.387 | 0.960 | 0.610 | 0.490 | 0.170 | 0.420 | 0.530 | 0.500 | 0.330
12 | SAL@NCUT | 0.192 | Report | 0.190 | 0.027 | 0.380 | 0.080 | 0.330 | 0.000 | 0.467 | 0.080 | 0.120 | 0.250
13 | AIC-CYQ | 0.089 | Report | 0.060 | 0.153 | 0.140 | 0.180 | 0.060 | 0.050 | 0.053 | 0.080 | 0.070 | 0.040

Paper Submission Guidelines

The top 5 teams are invited to submit a paper to the ICME 2024 Workshop. Papers should be submitted via the ICME 2024 CMT; please select the GC-ASC track. Workshop papers follow the same format as regular papers, as specified in the ICME 2024 guidelines.

Timeline

Date (AoE time) | Activity
FEB 5, 2024 | Challenge launch & Team registration
MAR 15, 2024 | Evaluation dataset release
MAR 22, 2024 | Final submission deadline
MAR 29, 2024 | Results announcement
APR 5, 2024 | GC paper submission deadline
APR 12, 2024 | GC paper acceptance notification

Abstract

Acoustic scene classification (ASC) is a crucial research problem in computational auditory scene analysis that aims to recognize the unique acoustic characteristics of an environment. One of the challenges of the ASC task is domain shift, caused by a distribution gap between training and testing data. Since 2018, ASC challenges have focused on the generalization of ASC models across different recording devices. Although substantial progress has been made on device generalization in recent years, domain shift between different regions, which involves characteristics such as time, space, culture, and language, remains insufficiently explored. In addition, given the abundance of unlabeled acoustic scene data in the real world, it is important to study ways of utilizing these unlabeled data. We therefore introduce the "Semi-supervised Acoustic Scene Classification under Domain Shift" task in the ICME 2024 Grand Challenge to address these problems. We encourage participants to innovate with semi-supervised learning techniques in order to develop ASC models that are more robust under domain shift.

Dataset description

The CAS 2023 dataset is a large-scale dataset that serves as a foundation for research related to environmental acoustic scenes. The dataset includes 10 common acoustic scenes, with a total duration of over 130 hours. Each audio clip is 10 seconds long with metadata about the recording location and timestamp. The dataset was collected by members of the Joint Laboratory of Environmental Sound Sensing at the School of Marine Science and Technology, Northwestern Polytechnical University. The data collection spanned from April 2023 to September 2023, covering 22 different cities across China.

The ASC challenge dataset consists of a development dataset and an evaluation dataset, both derived from the CAS 2023 dataset. The development dataset contains about 24 hours of recordings from 8 cities. Scene labels are provided for 20% of the data in the development dataset so that participants can develop effective semi-supervised methods. In the evaluation dataset, data are selected from 12 cities, with 5 unseen cities specifically chosen to provide a more comprehensive evaluation of submissions under domain shift.
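With only 20% of the development data labeled, one common family of semi-supervised methods is pseudo-labeling: a model trained on the labeled clips predicts on the unlabeled ones, and only confident predictions are kept as extra training labels. The following is a minimal sketch of that selection step, not the baseline's actual framework. The function name, the clip identifiers, the 0.9 threshold, and the shortened class list are all hypothetical.

```python
def select_pseudo_labels(probabilities, class_names, threshold=0.9):
    """Keep unlabeled clips whose top predicted probability reaches the threshold.

    probabilities: dict mapping a clip id to a list of class probabilities
                   (hypothetical model output, one value per entry in class_names).
    Returns a dict mapping the selected clip ids to their pseudo-labels.
    """
    selected = {}
    for clip_id, probs in probabilities.items():
        # Index of the most probable class for this clip.
        best = max(range(len(probs)), key=probs.__getitem__)
        if probs[best] >= threshold:
            selected[clip_id] = class_names[best]
    return selected


# Hypothetical predictions on two unlabeled clips (three classes for brevity).
class_names = ["Bus", "Airport", "Metro"]
predictions = {
    "clip_a": [0.95, 0.03, 0.02],  # confident: kept as "Bus"
    "clip_b": [0.50, 0.30, 0.20],  # not confident: left unlabeled
}
pseudo_labels = select_pseudo_labels(predictions, class_names)
```

The threshold trades label quantity against label noise, and it may need retuning when the unlabeled data come from cities the model has not seen.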

Acoustic scenes (10): Bus, Airport, Metro, Restaurant, Shopping mall, Public square, Urban park, Traffic street, Construction site, Bar

Cities (22): Xi'an, Xianyang, Changchun, Jinan, Hefei, Sanya, Nanning, Haikou, Guilin, Guangzhou, Chongqing, Shenyang, Beijing, Baishan, Taiyuan, Tianjin, Nanchang, Shanghai, Luoyang, Liupanshui, Shangrao, Dandong.

More details can be found at https://arxiv.org/abs/2402.02694.

Dataset link

The development dataset is released at https://zenodo.org/records/10616533.

The evaluation dataset is released at https://zenodo.org/records/10820626.

Team registration

The participants can register by filling out the Google form.

If you have any questions, please join Google Groups for discussion.

Evaluation

The baseline system for the ICME 2024 "Semi-supervised Acoustic Scene Classification under Domain Shift" challenge is based on a semi-supervised framework with a Squeeze-and-Excitation and Transformer (SE-Trans) model pre-trained on the TAU Urban Acoustic Scenes (UAS) 2020 Mobile development dataset. The baseline code is released on GitHub. Systems will be ranked by macro-average accuracy (the average of the class-wise accuracies). If two teams obtain the same score on the evaluation dataset, the team with the smaller model size will be ranked higher.
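The ranking rule can be illustrated with a short sketch. It computes macro-average accuracy as the mean of the per-class accuracies and breaks ties by model size. This is not the organizers' evaluation script. The function names, the example teams, and the use of a parameter count as "model size" are assumptions made for illustration.

```python
from collections import defaultdict


def macro_accuracy(y_true, y_pred):
    """Average of the class-wise accuracies over the classes present in y_true."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for truth, pred in zip(y_true, y_pred):
        total[truth] += 1
        if truth == pred:
            correct[truth] += 1
    return sum(correct[c] / total[c] for c in total) / len(total)


def rank_systems(systems):
    """Sort by score (descending); on equal scores the smaller model comes first."""
    return sorted(systems, key=lambda s: (-s["score"], s["model_size"]))


# Toy example with two classes: Bus accuracy is 3/4, Bar accuracy is 1/2.
y_true = ["Bus", "Bus", "Bus", "Bus", "Bar", "Bar"]
y_pred = ["Bus", "Bus", "Bus", "Bar", "Bar", "Bus"]
score = macro_accuracy(y_true, y_pred)  # (0.75 + 0.5) / 2 = 0.625

# Hypothetical teams; model_size is assumed to be a parameter count.
ranking = rank_systems([
    {"team": "A", "score": 0.615, "model_size": 5_000_000},
    {"team": "B", "score": 0.615, "model_size": 2_000_000},
    {"team": "C", "score": 0.700, "model_size": 9_000_000},
])
```

Unlike plain accuracy, which is 4/6 in the toy example, the macro average weights every scene equally regardless of how many clips it has.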

Rules

  • Only the TAU UAS 2020 Mobile development dataset and CochlScene dataset are allowed for model pre-training and to facilitate the training process in this challenge.
  • Model ensembles are NOT allowed in this challenge.

Awards

Authors of the top 3 ranked solutions will be awarded prizes funded by Xi'an Lianfeng Acoustic Technologies Co., Ltd.

Top 1: 600 USD

Top 2: 500 USD

Top 3: 400 USD

Note: All winning teams will receive the corresponding challenge bonuses. Personal income tax or other forms of taxes on bonuses will be borne by the winners and paid by the challenge organization. The prize money of each winning team will be distributed to the team captain, who is responsible for allocating and distributing the prize money and prizes among the team members.

Results Submission

Participants must adhere to the following terms and conditions. Failure to comply will result in disqualification from the Challenge.

The model should not be trained on any data from the evaluation dataset.

Participants are required to submit the results by the challenge submission deadline.
To ensure the results can be reproduced by the organizers, the submission should include:
1) A .csv file containing the classification result for each audio clip in the evaluation set, using the following template:

filename scene_label
49f03553cf0a4744ac2af9fa55703bb321_2 Bus

2) Inference codes and trained models for evaluation by the organizers.

3) Comprehensive documentation, instructions, or other relevant information to facilitate the execution of the codes by the organizers.

4) A technical report (*.pdf) explaining the method in sufficient detail (2-6 pages including references). This report will be publicly available on the challenge website. The ICME paper template is recommended.
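For item 1, a results file with the `filename` and `scene_label` columns could be produced as in the sketch below. The helper name `write_submission` is hypothetical, and the comma delimiter is an assumption based on the .csv extension; the template above does not show the separator explicitly, so the delimiter is left as a parameter.

```python
import csv
import io

# The 10 scene labels listed in the dataset description.
SCENES = {
    "Bus", "Airport", "Metro", "Restaurant", "Shopping mall",
    "Public square", "Urban park", "Traffic street",
    "Construction site", "Bar",
}


def write_submission(predictions, fileobj, delimiter=","):
    """Write (filename, scene_label) pairs under a header row.

    predictions: iterable of (filename, scene_label) tuples.
    delimiter:   assumed to be a comma; change it if the organizers
                 expect a different separator.
    """
    writer = csv.writer(fileobj, delimiter=delimiter, lineterminator="\n")
    writer.writerow(["filename", "scene_label"])
    for filename, label in predictions:
        if label not in SCENES:
            raise ValueError(f"unknown scene label: {label!r}")
        writer.writerow([filename, label])


# Usage with an in-memory buffer; pass open("results.csv", "w", newline="")
# instead to write a file (the file name is hypothetical).
buffer = io.StringIO()
write_submission([("49f03553cf0a4744ac2af9fa55703bb321_2", "Bus")], buffer)
csv_text = buffer.getvalue()
```

Validating labels against the fixed scene list catches spelling or capitalization mismatches before the file is submitted.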

All files should be packaged into a single zip file for submission. You can submit your final results through the Google Form. Each team may submit only ONE system. If multiple submissions are made, only the last submission before the deadline will be considered. Please take care to provide the correct information: Team Name, Institute, Team Contact Name (Last name, First name), and Institutional E-mail.