
Associate Dean (tenure-track appointment), Beijing Zhongguancun Academy
Research Interests
Deep learning, reinforcement learning, AI for science, and large language models.
Monograph
Tao Qin. Dual Learning. Springer, 2020.
Major Achievements and Honors
- In 2017, recognized as a computer scientist with the "Young Role Model of Craftsman Spirit of the Year" award from Beijing Youth Weekly;
- Proposed dual learning, which helped Microsoft reach human-expert parity on the Chinese-to-English news translation task in 2018;
- Led the team that won first place in 8 tracks of the WMT 2019 machine translation competition;
- In 2019, designed FastSpeech, the most efficient speech synthesis model at the time, achieving a 100x speedup; it became a foundational model component of the Microsoft Azure cloud service, supporting more than 100 languages and more than 200 voices;
- In 2019, developed Suphx, the strongest Mahjong AI to date and the first AI to reach 10 dan on the Tenhou platform, with a stable rank significantly higher than that of top human players;
- In 2020, published the monograph Dual Learning with the internationally renowned academic publisher Springer Nature;
- In 2022, released the BioGPT model, which substantially outperformed other large language models in the life sciences and was the first to reach human-expert performance on the PubMed question-answering task;
- Received the Best Student Paper Runner-up Award at ICDM 2022.
Representative Publications
[1] NatureLM: Deciphering the Language of Nature for Scientific Discovery. arXiv 2025.
[2] TamGen: drug design with target-aware molecule generation through a chemical language model. Nature Communications 2024.
[3] HybriDNA: A Hybrid Transformer-Mamba2 Long-Range DNA Language Model. arXiv 2025.
[4] E2Former: A Linear-time Efficient and Equivariant Transformer for Scalable Molecular Modeling. arXiv 2025.
[5] Accelerating protein engineering with fitness landscape modeling and reinforcement learning. bioRxiv 2023.
[6] BioGPT: generative pre-trained transformer for biomedical text generation and mining. Briefings in Bioinformatics 2022.
[7] The Impact of Large Language Models on Scientific Discovery: a Preliminary Study using GPT-4. arXiv 2023.
[8] FABind: Fast and Accurate Protein-Ligand Binding. NeurIPS 2023.
[9] SMT-DTA: Improving Drug-Target Affinity Prediction with Semi-supervised Multi-task Training. Briefings in Bioinformatics 2023.
[10] Pre-training Antibody Language Models for Antigen-Specific Computational Antibody Design. KDD 2023.
[11] Dual-view Molecular Pre-training. KDD 2023.
[12] Retrosynthetic Planning with Dual Value Networks. ICML 2023.
[13] De Novo Molecular Generation via Connection-aware Motif Mining. ICLR 2023.
[14] O-GNN: incorporating ring priors into molecular modeling. ICLR 2023.
[15] R2-DDI: Relation-aware Feature Refinement for Drug-Drug Interaction Prediction. Briefings in Bioinformatics 2022.
[16] Direct Molecular Conformation Generation. TMLR 2022.
[17] NaturalSpeech 3: Zero-shot speech synthesis with factorized codec and diffusion models. arXiv 2024.
[18] Learning to rank: from pairwise approach to listwise approach. ICML 2007.
[19] FastSpeech 2: Fast and high-quality end-to-end text to speech. ICLR 2021.
[20] MPNet: Masked and permuted pre-training for language understanding. NeurIPS 2020.
[21] FastSpeech: Fast, robust and controllable text to speech. NeurIPS 2019.
[22] Generalizing to unseen domains: A survey on domain generalization. IEEE Transactions on Knowledge and Data Engineering (TKDE) 2022.
[23] MASS: Masked sequence to sequence pre-training for language generation. ICML 2019.
[24] Dual learning for machine translation. NeurIPS 2016.
[25] Neural architecture optimization. NeurIPS 2018.
[26] Achieving human parity on automatic Chinese to English news translation. arXiv 2018.
[27] LETOR: A benchmark collection for research on learning to rank for information retrieval. Information Retrieval Journal 2010.
[28] R-drop: Regularized dropout for neural networks. NeurIPS 2021.
[29] Incorporating BERT into neural machine translation. ICLR 2020.
[30] A survey on neural speech synthesis. arXiv 2021.
[31] Introducing LETOR 4.0 datasets. arXiv 2013.
[32] Can generalist foundation models outcompete special-purpose tuning? Case study in medicine. arXiv 2023.
[33] An empirical study on learning to rank of tweets. COLING 2010.
[34] Image-to-image translation: Methods and applications. IEEE Transactions on Multimedia (TMM) 2021.
[35] Feature selection for ranking. ACM SIGIR 2007.
[36] Representation degeneration problem in training natural language generation models. ICLR 2019.
[37] NaturalSpeech 2: Latent diffusion models are natural and zero-shot speech and singing synthesizers. ICLR 2024.
[38] Multilingual neural machine translation with knowledge distillation. ICLR 2019.
[39] NaturalSpeech: End-to-End Text-to-Speech Synthesis With Human-Level Quality. NeurIPS 2022.
[40] FRank: a ranking method with fidelity loss. ACM SIGIR 2007.
[41] AdaSpeech: Adaptive text to speech for custom voice. ICLR 2021.
[42] Deliberation networks: Sequence generation beyond one-pass decoding. NeurIPS 2017.
[43] Understanding and improving transformer from a multi-particle dynamic system point of view. arXiv 2019.
[44] Learning to teach. ICLR 2018.
[45] A study of reinforcement learning for neural machine translation. EMNLP 2018.
[46] Supervised rank aggregation. WWW 2007.
[47] Query dependent ranking using k-nearest neighbor. ACM SIGIR 2008.
[48] Fully parameterized quantile function for distributional reinforcement learning. NeurIPS 2019.