David Silver (computer scientist)

David Silver
Nationality	British
Alma mater	University of Cambridge (BSc); University of Alberta (PhD)
Known for	AlphaGo; AlphaZero; AlphaStar
Awards	ACM Prize in Computing (2019); Fellow of the Royal Society (2021)
	Scientific career
Fields	Computer science
Institutions	DeepMind

David Silver FRS (born 1976) is a British computer scientist and businessman. He leads the reinforcement learning research group at DeepMind, was the lead researcher on AlphaGo and AlphaZero, and was co-lead on AlphaStar.

Education

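Silver graduated from the University of Cambridge in 1997 with the Addison-Wesley prize, and befriended Demis Hassabis while there^[1]. He returned to academia in 2004 to study for a PhD in reinforcement learning at the University of Alberta, where he co-introduced the algorithms used in the first master-level 9×9 Go programs, and graduated in 2009^[2]^[3]. His version of the program MoGo was one of the strongest Go programs as of 2009^[4].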

Career

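After graduating from university, Silver co-founded the video game company Elixir Studios, where he served as chief technology officer and lead programmer and received several awards for technology and innovation^[1]^[5].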

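Silver was awarded a Royal Society University Research Fellowship in 2011 and subsequently became a lecturer at University College London, where he is now a professor^[6]. His lectures on reinforcement learning are available on YouTube^[7]. Silver consulted for DeepMind from its inception and joined full-time in 2013.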

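Silver's recent research has focused on combining reinforcement learning with deep learning, including a program that learns to play Atari games directly from pixels^[8]. He led the AlphaGo project, which culminated in the first program to defeat a top professional player in the full-size game of Go^[9]. AlphaGo subsequently received an honorary 9-dan professional certification and won the Cannes Lion award for innovation^[10]. He then led the development of AlphaZero, which used the same AI to learn Go from scratch (by playing only against itself rather than learning from human games), and then learned chess and shogi in the same way, reaching a higher level than any other computer program.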

Silver is among the most published members of staff at DeepMind, with more than 130,000 citations and an h-index of 78^[11].

He was awarded the 2019 ACM Prize in Computing for breakthrough advances in computer game-playing^[12].

In 2021, Silver was elected a Fellow of the Royal Society for his contributions to deep Q-learning and AlphaGo^[13].

References

[1] Shead, Sam. David Silver: The unsung hero and intellectual powerhouse at Google DeepMind. Business Insider. Retrieved 2020-09-26. Archived from the original on 2022-11-16.
[2] Silver, David. Reinforcement Learning and Simulation-Based Search in Computer Go. ERA. 2009. doi:10.7939/R39D8T.
[3] Gelly, Sylvain; Silver, David. Achieving Master Level Play in 9 × 9 Computer Go (PDF). Proceedings of the Twenty-Third AAAI Conference on Artificial Intelligence. 2008. Retrieved 2023-02-23. Archived (PDF) from the original on 2022-04-03.
[4] Russell, Stuart J.; Norvig, Peter. Artificial Intelligence: A Modern Approach, 3rd ed. Prentice Hall. 2009.
[5] What the AI Behind AlphaGo Can Teach Us About Being Human. Wired.com. Retrieved 2016-05-17. Archived from the original on 2016-05-29.
[6] CSML | David Silver. www.csml.ucl.ac.uk. Retrieved 2017-05-27. Archived from the original on 2021-04-24.
[7] RL Course by David Silver - Lecture 1: Introduction to Reinforcement Learning. 2015-05-13. Retrieved 2023-02-23. Archived from the original on 2023-02-25. Via YouTube.
[8] Mnih, Volodymyr; Kavukcuoglu, Koray; Silver, David; Rusu, Andrei A.; Veness, Joel; Bellemare, Marc G.; Graves, Alex; Riedmiller, Martin; Fidjeland, Andreas K. Human-level control through deep reinforcement learning. Nature. 2015-02-26, 518 (7540): 529–533. Bibcode:2015Natur.518..529M. ISSN 0028-0836. PMID 25719670. S2CID 205242740. doi:10.1038/nature14236.
[9] Silver, David; Huang, Aja; Maddison, Chris J.; Guez, Arthur; Sifre, Laurent; Driessche, George van den; Schrittwieser, Julian; Antonoglou, Ioannis; Panneershelvam, Veda; Lanctot, Marc; Dieleman, Sander; Grewe, Dominik; Nham, John; Kalchbrenner, Nal; Sutskever, Ilya; Lillicrap, Timothy; Leach, Madeleine; Kavukcuoglu, Koray; Graepel, Thore; Hassabis, Demis. Mastering the game of Go with deep neural networks and tree search. Nature. 2016-01-28, 529 (7587): 484–489. Bibcode:2016Natur.529..484S. ISSN 0028-0836. PMID 26819042. S2CID 515925. doi:10.1038/nature16961.
[10] Google DeepMind AlphaGo in U.K. Wins Innovation Grand Prix. Retrieved 2017-05-27. Archived from the original on 2016-07-31.
[11] David Silver – Google Scholar Citations. Retrieved 2022-02-01. Archived from the original on 2023-03-25.
[12] Ormond, Jim. ACM Prize in Computing Awarded to AlphaGo Developer: David Silver Recognized for Breakthrough Advances in Computer Game-Playing. acm.org. Retrieved 2020-04-02. Archived from the original on 2023-03-07.
[13] Royal Society elects outstanding new Fellows and Foreign Members. royalsociety.org. Retrieved 2021-06-08. Archived from the original on 2021-05-06.

