There is no team registration requirement for this year's challenge.
Participants are required to submit their code and model checkpoints for reproducing the results via the submission link. For more information, please refer to "Results Submission".
A maximum of 6 papers from the winners will be included in the conference proceedings.
2025-July-22 The metadata of the evaluation dataset is available at https://github.com/JishengBai/APSIPA2025GC-ASC/tree/main/metadata.
2025-June-15 The challenge has started and the metadata of the development dataset is available at https://github.com/JishengBai/APSIPA2025GC-ASC/tree/main/metadata.
Participants must adhere to the following terms and conditions. Failure to comply will result in disqualification from the Challenge.
The model should not be trained on any data from the evaluation dataset.
Participants are required to submit the results by the challenge submission deadline.
To ensure the results can be reproduced by the organizers, the submission should include:
1) A .csv file containing the classification result for each audio clip in the evaluation set, using the following template:
filename | scene_label |
---|---|
49f03553cf0a4744ac2af9fa55703bb321_2 | Bus |
2) Inference code and trained models for evaluation by the organizers.
3) Comprehensive documentation, instructions, or other relevant information to help the organizers run the code.
4) A technical report (*.pdf) explaining the method in sufficient detail (2-6 pages including references). This report will be publicly available on the challenge website. The APSIPA ASC paper template is recommended.
All files should be packaged into a single zip file for submission. You can submit your final results through the Google Form. Each team may submit only ONE system. If multiple submissions are made, only the last submission before the deadline will be considered. Please provide the following information carefully: Team Name, Institute, Team Leader and Members (Last name, First name), and E-mail.
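As a rough illustration of item 1), the results file could be produced along the following lines. This is a minimal sketch, not an official tool: it assumes the file is comma-separated with a `filename,scene_label` header row (the template above only shows the two column names), and the helper `write_submission` is a hypothetical name introduced here.

```python
import csv
import io

# The 10 scene labels listed in the challenge description.
SCENE_LABELS = {
    "Bus", "Airport", "Metro", "Restaurant", "Shopping mall",
    "Public square", "Urban park", "Traffic street",
    "Construction site", "Bar",
}


def write_submission(predictions, out_file):
    """Write (filename, scene_label) pairs as CSV with a header row.

    Hypothetical helper: the comma delimiter and header names are
    assumptions based on the two-column template.
    """
    writer = csv.writer(out_file)
    writer.writerow(["filename", "scene_label"])
    seen = set()
    for filename, label in predictions:
        if label not in SCENE_LABELS:
            raise ValueError(f"unknown scene label: {label!r}")
        if filename in seen:
            raise ValueError(f"duplicate filename: {filename!r}")
        seen.add(filename)
        writer.writerow([filename, label])


# Usage: write to an in-memory buffer; a real run would open a .csv file
# with open(path, "w", newline="") instead.
buffer = io.StringIO()
write_submission([("49f03553cf0a4744ac2af9fa55703bb321_2", "Bus")], buffer)
submission_text = buffer.getvalue()
```

Rejecting unknown labels and duplicate filenames before upload is a cheap way to catch formatting mistakes, since only one system per team is considered.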
Date (AoE Time) | Event |
---|---|
June 15, 2025 | Challenge launch; baseline system & development metadata release |
July 22, 2025 | Evaluation metadata release |
August 1, 2025 | Final result & Code submission |
August 8, 2025 | Results announcement |
August 15, 2025 | Special session submission deadline for GC papers |
August 22, 2025 | Acceptance notification for GC papers |
The APSIPA ASC 2025 GC (City and Time-Aware Semi-supervised Acoustic Scene Classification) extends the work of the ICME 2024 GC (Semi-supervised Acoustic Scene Classification under Domain Shift), which addressed the challenge of generalizing across different cities. This year's challenge explicitly incorporates city-level location and timestamp metadata for each audio sample, encouraging participants to design models that leverage both geographic and temporal context. It maintains the semi-supervised learning setting, reflecting real-world scenarios where large amounts of unlabeled data coexist with limited labeled examples. Participants are invited to develop innovative methods that combine audio content with contextual information to enhance classification performance and robustness.
For the APSIPA ASC 2025 grand challenge "City and Time-Aware Semi-supervised Acoustic Scene Classification", we provide a development dataset comprising approximately 24 hours of audio recordings from the Chinese Acoustic Scene (CAS) 2023 dataset. This challenge introduces previously unutilized contextual metadata that accompanies each recording:
City information: Identification of the recording location among 22 diverse Chinese cities (e.g., Xi'an, Beijing, Shanghai)
Timestamp information: Precise recording time accurate to year, month, day, hour, minute, and second
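One possible way to turn these two metadata fields into numeric features that a model can consume alongside the audio is sketched below. It is an illustration under stated assumptions, not the baseline's method: the timestamp is assumed to be already parsed into a `datetime` object (the storage format of the metadata files is not described here), the city list is a three-city subset for brevity, and `encode_time` and `encode_city` are hypothetical helper names.

```python
import math
from datetime import datetime


def encode_time(ts: datetime) -> list[float]:
    """Cyclical (sine/cosine) encoding of time of day and month.

    The cyclical form keeps 23:59 close to 00:00 and December close
    to January, which a raw integer encoding would not.
    """
    seconds = ts.hour * 3600 + ts.minute * 60 + ts.second
    day_angle = 2 * math.pi * seconds / 86400
    month_angle = 2 * math.pi * (ts.month - 1) / 12
    return [
        math.sin(day_angle), math.cos(day_angle),
        math.sin(month_angle), math.cos(month_angle),
    ]


def encode_city(city: str, cities: list[str]) -> list[float]:
    """One-hot encoding of the city over a fixed, ordered city list."""
    vec = [0.0] * len(cities)
    vec[cities.index(city)] = 1.0  # raises ValueError for an unknown city
    return vec


# Usage with an illustrative subset of the 22 cities and a made-up timestamp.
cities = ["Xi'an", "Beijing", "Shanghai"]
features = encode_city("Beijing", cities) + encode_time(
    datetime(2023, 4, 1, 6, 0, 0)
)
```

The resulting vector could be concatenated with an audio embedding or fed to a small separate branch; which fusion works best is left to participants.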
The CAS 2023 dataset is a large-scale dataset that serves as a foundation for research related to environmental acoustic scenes. The dataset includes 10 common acoustic scenes, with a total duration of over 130 hours. Each audio clip is 10 seconds long with metadata about the recording location and timestamp. The data collection spanned from April 2023 to September 2023, covering 22 different cities across China.
Acoustic scenes (10): Bus, Airport, Metro, Restaurant, Shopping mall, Public square, Urban park, Traffic street, Construction site, Bar
More details can be found at https://arxiv.org/abs/2402.02694.
The audio recordings of the development dataset can be found at https://zenodo.org/records/10616533.
The audio recordings of the evaluation dataset can be found at https://zenodo.org/records/10820626.
Metadata for the development and evaluation datasets can be found at https://github.com/JishengBai/APSIPA2025GC-ASC/tree/main/metadata.
The baseline system for the APSIPA ASC 2025 GC "City and Time-Aware Semi-supervised Acoustic Scene Classification" challenge is based on a multimodal semi-supervised framework with a pre-trained SE-Trans model. The baseline code is released at https://github.com/JishengBai/APSIPA2025GC-ASC. Systems will be ranked by macro-average accuracy (the average of the class-wise accuracies). If two teams achieve the same score on the evaluation dataset, the team with the smaller model size will be ranked higher.
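The ranking rule can be sketched as follows. This is a minimal illustration rather than the organizers' evaluation script: the function names, the toy labels, and the team entries are hypothetical, and model size is assumed to be a parameter count since the unit is not stated.

```python
from collections import defaultdict


def macro_average_accuracy(y_true, y_pred):
    """Average of the per-class accuracies, each class weighted equally."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    correct = defaultdict(int)
    total = defaultdict(int)
    for truth, pred in zip(y_true, y_pred):
        total[truth] += 1
        if truth == pred:
            correct[truth] += 1
    return sum(correct[c] / total[c] for c in total) / len(total)


def rank_systems(systems):
    """Sort by score (descending), then by model size (ascending) on ties."""
    return sorted(systems, key=lambda s: (-s["score"], s["model_size"]))


# Toy example: Bus is 2/3 correct, Bar is 1/1 correct.
y_true = ["Bus", "Bus", "Bus", "Bar"]
y_pred = ["Bus", "Bus", "Metro", "Bar"]
score = macro_average_accuracy(y_true, y_pred)  # (2/3 + 1) / 2

# Hypothetical teams; model_size is assumed to be a parameter count.
ranking = rank_systems([
    {"team": "A", "score": 0.8, "model_size": 5_000_000},
    {"team": "B", "score": 0.8, "model_size": 3_000_000},
    {"team": "C", "score": 0.7, "model_size": 1_000_000},
])
```

In the toy example the macro-average accuracy (about 0.833) differs from the plain accuracy (0.75) because the rare class counts as much as the frequent one, so a system cannot score well by favoring the most common scenes.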