I’m a Ph.D. student at the Chinese University of Hong Kong, Shenzhen (CUHK-Shenzhen), supervised by Professor Zhizheng Wu and Professor Haizhou Li.
I received my bachelor’s degree from the University of Electronic Science and Technology of China (电子科技大学) and my master’s degree from the Department of Electronic and Computer Engineering, Peking University (北京大学), where I was advised by Yuexian Zou (邹月娴).
My research interests include deepfakes, spoken keyword spotting, and speaker verification.
🔥 News
- 2024.08: Our paper SpMis was accepted to SLT 2024!
- 2023.12: Two papers were accepted to ICASSP 2024!
📝 Publications
📚 Spoken Misinformation Detection
An Investigation of Synthetic Spoken Misinformation Detection
Peizhuo Liu, Li Wang, Renqiang He*, Haorui He, Lei Wang, Huadi Zheng, Jie Shi, Tong Xiao, Zhizheng Wu.
- Recent advancements in speech generation, driven by generative models and large-scale training, have enabled high-quality synthetic speech but also raised concerns about its misuse for generating misinformation. While much research focuses on distinguishing machine-generated speech from human speech, the pressing challenge is detecting misinformation within spoken content, requiring analysis of factors like speaker identity, topic, and synthesis. In response, we introduce SpMis, an open-source dataset for detecting synthetic spoken misinformation. SpMis includes speech from over 1,000 speakers across five topics using state-of-the-art text-to-speech systems. Our findings highlight both promising detection capabilities and significant practical challenges, emphasizing the need for continued research in this field.
📚 Attack and Defense of Speaker Verification
AdvSV: An Over-the-Air Adversarial Attack Dataset for Speaker Verification
Li Wang, Jiaqi Li, Yuhao Luo, Jiahao Zheng, Lei Wang, Hao Li, Ke Xu, Chengfang Fang, Jie Shi, Zhizheng Wu
- Deep neural networks, including Automatic Speaker Verification (ASV) systems, are vulnerable to adversarial attacks. This study introduces an open-source adversarial attack dataset, AdvSV, for ASV research, focusing initially on over-the-air attacks, which involve perturbation generation, loudspeakers, microphones, and varying acoustic environments. Based on the VoxCeleb1 verification test set, AdvSV simulates over-the-air attacks using representative ASV models, aiming to standardize and facilitate reproducible research in this field.
An Initial Investigation of Neural Replay Simulator for Over-the-Air Adversarial Perturbations to Automatic Speaker Verification
Jiaqi Li, Li Wang, Liumeng Xue, Lei Wang, Zhizheng Wu
- Deep learning has advanced Automatic Speaker Verification (ASV), but physical-access adversarial attacks, particularly over-the-air attacks involving loudspeakers, microphones, and replay environments, are less studied. This research explores using a neural replay simulator to enhance over-the-air attack robustness in ASV. By simulating the replay process with a neural waveform synthesizer, the study on the ASVspoof 2019 dataset shows increased success rates of these attacks, highlighting security concerns for ASV in physical-access scenarios.
📚 Speaker Verification and Keyword Spotting Multi-task
Learning Decoupling Features Through Orthogonality Regularization
Li Wang, Rongzhi Gu, Weiji Zhuang, Peng Gao, Yujun Wang, Yuexian Zou
- This paper aims to improve performance on personalized keyword spotting, which requires identifying both the keyword and the speaker. It argues that the key to this task is to effectively extract the features shared by the two tasks while decoupling the task-specific features. To this end, it uses orthogonality regularization to constrain the model to decouple keyword information from speaker information.
🎙 Spoken Keyword Spotting
Text Anchor Based Metric Learning for Small-Footprint Keyword Spotting
Li Wang, Rongzhi Gu, Nuo Chen, Yuexian Zou
- This paper proposes a metric learning method based on text anchors and uses BERT to generate embeddings rich in semantic information, so that the model can capture the semantics of keywords.
📖 Education
- 2023.06 - Present, Ph.D., The Chinese University of Hong Kong, Shenzhen, Shenzhen.
- 2019.06 - 2022.07, Master, Peking University, Shenzhen.
- 2015.09 - 2019.06, Undergraduate, University of Electronic Science and Technology of China, Chengdu.
💻 Internships
- 2021.06 - 2022.03, ByteDance, AI Lab, Shenzhen.
- 2021.04 - 2021.06, Huawei MTI, Audio Engineering Department, Shenzhen.
- 2020.05 - 2020.11, Tencent AI Lab, Speech Processing Group, Beijing.
🏀 Services
- Reviewer
- Student Volunteer
- Teaching Assistant
  - 2023 Fall, CSC1001 Introduction to Computer Science: Programming Methodology, CUHK-Shenzhen.
  - 2024 Spring, CSC3160/AIR6063 Fundamentals of Speech and Language Processing/Spoken Language Processing, CUHK-Shenzhen.
  - 2024 Fall, DDA3020 Machine Learning, CUHK-Shenzhen.
👏 Template of This Page
Thanks to Yi Ren (任意) for his open-source contribution. Template link: AcadHomepage.