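What?! Someone Can Mimic Your Face, and Now Your Whole Body Too?

Liu Haiming · Posted 2019-9-29 21:28:40

This developing branch of synthetic media technology has commercial applications, but it also has the potential to disrupt elections and spread disinformation.

Translator | Wang Jie

Author | DJ PANGBURN

Source | Fast Company

Editor | Pupu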

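In Russian novelist Victor Pelevin’s cyberpunk novel Homo Zapiens, a poet named Babylen Tatarsky is hired as an advertising copywriter by an old university friend in Moscow, just as the Soviet Union has dissolved and Russia’s economy is on the brink of collapse.

With a gift for clever wordplay, Tatarsky rises quickly, and he gradually discovers that politicians such as then-Russian president Boris Yeltsin, along with the major political events of the day, are in fact virtual simulations. Today, with the arrival of increasingly sophisticated deepfakes, Pelevin’s vision seems to be slowly coming true.

(Shujuguan editor’s note on cyberpunk fiction: Cyberpunk is a genre of science fiction that emerged in the United States in the 1970s. Its stories are full of depictions of emerging information technology and biotechnology and often involve multinational conglomerates monopolizing advanced technology. The protagonists are typically outsiders on the margins of mainstream society. They live in the dark underside of a future world, like to modify computer hardware and software, embrace body modification, refuse to join the mainstream system, and take desperate risks by legal or illegal technical means, sometimes going up against giant corporations. This contrast between high and low produces a distinctive aesthetic, summed up in the phrase “high tech, low life.”)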

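Within the field of deepfakes, or “synthetic media” as researchers call it, most of the attention has focused on fake faces that could wreak havoc on political reality, as well as on deep learning algorithms that deliberately mimic a person’s writing style and voice.

But another branch of synthetic media technology is now evolving fast: full body deepfakes.

In August 2018, researchers at the University of California, Berkeley released a paper and video titled “Everybody Dance Now,” demonstrating how deep learning algorithms can transfer a professional dancer’s moves onto the bodies of amateurs. The work is still rough, but it shows that machine learning researchers are taking on a harder task: full body deepfakes.

That same year, a research team led by Dr. Björn Ommer of Heidelberg University in Germany published a paper on teaching machines to realistically render human movements.

In April of this year, Japanese AI company Data Grid developed an AI that can automatically generate whole body models of nonexistent people, and identified practical applications in the fashion and apparel industries.

Full body deepfakes clearly have interesting commercial applications, such as deepfake dancing apps, or uses in athletics and biomedical research, but malicious use cases are a growing concern in today’s political climate, which is awash in disinformation and fake news.

For now, full body deepfakes cannot completely fool the eye, but like any deep learning technology they will improve. It is only a matter of time before full body deepfakes can pass for the real thing.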

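Synthesizing Entire Human Bodies

To create deepfakes, computer scientists use generative adversarial networks (GANs), which consist of two neural networks: a synthesizer, or generative network, and a detector, or discriminative network. The two networks run in a feedback loop of refinement to produce realistic synthetic images and video. The synthesizer creates an image from one database, while the detector, working from another database, judges whether the synthesizer’s image is accurate and believable.

The first malicious use of deepfakes appeared on Reddit (a social news site), where the faces of actresses such as Scarlett Johansson were mapped onto the bodies of porn performers.

Rachel Thomas, cofounder of Fast.AI, says that 95% of the deepfakes in existence are fake pornographic material meant to harass individuals. “Some of these deepfakes videos aren’t necessarily using very sophisticated techniques,” says Thomas.

But that is starting to change.

Hany Farid, a computer science professor at Dartmouth College in Hanover, New Hampshire, points to the Chinese deepfake app Zao as an illustration of how quickly the technology has evolved in less than two years.

“The ones that I saw [from Zao] looked really, really good, and got around a lot of the artifacts, like in the movie versions where the face flickered,” says Farid. “It’s improving. Getting this as an app working at scale, downloading to millions of people, is hard. It’s a sign of the maturity of the deepfake technology.”

“With deepfake images and videos, we’ve essentially democratized CGI [computer-generated imagery, the computer graphics used to create visual effects in film] technology,” he says. “We’ve taken it out of the hands of Hollywood studios and put it in the hands of YouTube video creators.”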

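Björn Ommer, professor of computer vision at the Heidelberg Collaboratory for Image Processing (HCI) and the Interdisciplinary Center for Scientific Computing (IWR) at Heidelberg University, leads a team that researches and develops full body synthetic media. Like most researchers in the field, the group’s overall goal is to understand images and to teach machines how to understand images and video. Ultimately, he hopes the team will gain a better understanding of how humans understand images.

“We’ve seen synthetic avatars that have been created not just in the gaming industry but a lot of other fields that are creating revenue,” says Ommer. “For my group, in particular, it’s entirely different fields that we are considering, like biomedical research. We want to get a more detailed understanding of human or even animal posture over time, relating to disabilities and the like.”

There are critical differences between synthesizing faces and synthesizing entire bodies. Ommer says that more research has gone into face synthesis, for a few reasons.

First, any digital camera or smartphone has built-in face detection, technology that can be used for tasks like smile detection or to identify the person a viewer is looking at. Such applications generate revenue, which drives more research. But as Ommer says, they have also led to “a lot of data set assembly, data curation, and obtaining face images, the substrate upon which deep learning research is built.”

Second, and more interesting to Ommer, while every human face looks different, faces vary little compared with entire human bodies. “That is why the research on faces has come to a stage where I would say it is creating really decent results compared to entire human bodies with much more variability being there, much more complicated to handle, and much more to learn if you head in that direction,” says Ommer.

Ommer isn’t sure when full body synthesis will reach the quality that he and other researchers want. Looking at how malicious deepfakes have matured, however, Ommer notes that humans can already be tricked quite easily, even without fakes created by deep learning, computer vision, artificial intelligence, or other technologies.

“But, if you want to make it appealing to larger society, it will take a few more years,” says Ommer, who expects full body and other deepfakes to become cheaper and more prevalent. “The research community itself has moved in a direction (and this is very much appreciated by much of the community that is responsible for a lot of this steady progress that we see) where the algorithms are easily available, like on GitHub and so on. So, you can just download the most recent code from some paper, and then, without much knowledge of what’s under the hood, just apply it.”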

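Feeling “Powerless and Paralyzed”

Not everyone will be able to create a “blockbuster deepfake.” But given more time, Ommer says, money will no longer be an obstacle to obtaining computational resources, and the software will become much easier to apply. Farid says that with full body deepfakes, malicious creators will be able to get around deepfake technology’s typical stationary figure talking directly into the camera, and make targets do and say things they never would.

Tom Van de Weghe, an investigative journalist and foreign correspondent for VRT (the Flemish Broadcasting Corporation), worries that journalists, as well as human rights activists and dissidents, could have footage of themselves weaponized through full body deepfakes.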


Tom Van de Weghe (photo courtesy of Van de Weghe)

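The explosion of fake news during the 2016 election and the rise of deepfakes in 2017 prompted Van de Weghe to research synthetic media. In the summer of 2018, he began a research fellowship at Stanford University to study ways of countering the malicious use of deepfakes.

“It’s not the big shots, the big politicians, and the big famous guys who are the most threatened,” says Van de Weghe. “It’s the normal people: people like you, me, female journalists, and sort of marginalized groups that could become or are already becoming the victims of deepfakes.”

Two weeks ago, Dutch news anchor Dionne Stax discovered that her face had been deepfaked onto a porn actress’s body, in a video uploaded to PornHub (one of the world’s largest pornographic video sharing sites) and spread widely across the internet. Although PornHub quickly removed the video, Van de Weghe says the damage to her reputation had already been done.

To imagine how a full body deepfake might work, Van de Weghe points to 2018 footage of Jim Acosta, CNN’s chief White House correspondent. In a video uploaded by Paul Joseph Watson, an editor at the conspiracy theory site Infowars, Acosta appears to aggressively push a White House staffer who is trying to take his microphone.

The clip differs markedly from the original broadcast by C-SPAN (a nonprofit American public service media company). The Infowars editor claimed he had not doctored the video and attributed any differences to “video compression.”

But as The Independent showed by analyzing the videos side by side in an editing timeline, Watson’s video is indeed missing several frames from the original. Like removing frames when editing video, a full body deepfake could alter the reality of an event.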

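Deeptrace Labs, founded in 2018, is a cybersecurity company that is building tools based on computer vision and deep learning to analyze and understand videos, particularly those that could be manipulated or synthesized by any kind of AI.

Company founder Giorgio Patrini was previously a postdoctoral researcher in deep learning at the DELTA Lab at the University of Amsterdam. He says that a few years ago he began investigating how technology could prevent or defend against future misuse of synthetic media.

Patrini believes that malicious deepfakes combining synthetic full bodies, faces, and audio will soon be used to target journalists and politicians.

He points to a deepfake porn video in which Indian journalist Rana Ayyub’s face was swapped onto a porn actress’s body, part of a disinformation campaign intended to discredit her investigative reporting.

It followed her public call for the prosecution of the rape and murder of an 8-year-old Kashmiri girl. In March of this year, Deeptrace Labs investigated an alleged deepfake video of Gabon’s president, Ali Bongo.

Although many in the African country, including the Gabonese military, which launched an unsuccessful coup on that belief, thought Bongo’s immobile face, eyes, and body pointed to a deepfake, Patrini told Mother Jones that he did not believe the video of the president had been synthesized.

“We couldn’t find any reasons to believe it was a deepfake, and I think that was later confirmed that the president is still alive but that he’d had a stroke,” says Patrini. “The main point I want to make here is that it doesn’t matter if a video is a deepfake or not yet; it’s that people know that it can spark doubt in public opinion and potentially violence in some places.”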


(Image courtesy of Deeptrace Labs)

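Recently, Van de Weghe learned that a political party operative had approached one of the most popular deepfake creators and requested a deepfake to damage a certain individual. Such made-to-order deepfakes could become big business.

“There is money to be earned with deepfakes,” says Van de Weghe. “People will order it. So, a government doesn’t have to create a deepfake; they just have to contact a person who is specialized in deepfakes to create one.”

The Wall Street Journal recently reported that the CEO of a UK energy company was tricked into transferring $243,000 to the account of a Hungarian supplier. The executive said he believed he was talking to his boss, who appeared to have approved the transaction.

The CEO now believes he was the victim of an audio deepfake scam known as vishing (voice phishing). Farid believes other fraudulent deepfake financial schemes, which might include full body deepfakes, are only a matter of time.

“I could create a deepfake video of Jeff Bezos where he says that Amazon stock is going down,” says Farid. “Think of all of the money that could be made shorting Amazon stock. By the time you rein it in, the damage has already been done. . . . Now imagine a video of a Democratic party nominee saying illegal or insensitive things. You don’t think you can swing the vote of hundreds of thousands of voters the night before an election?”

Farid thinks the combination of social media and deepfake videos, whether of faces or full bodies, could easily wreak havoc. Social media companies are largely unable or unwilling to moderate their platforms and content, so deepfakes can spread like wildfire.

“When you pair the ability to create deepfake content with the ability to distribute and consume it globally, it’s problematic,” he says. “We live in a highly polarized society, for a number of reasons, and people are going to think the worst of the people they disagree with.”

But for Rachel Thomas, cofounder of Fast.AI, deepfakes are almost unnecessary for negatively influencing the political process in the new cyber skirmishes, since governments and industry are already struggling with disinformation in written form.

She says the risks are not just about technology but about human factors. Society is polarized, and vast swaths of the United States and other countries no longer have shared sources of truth that they can trust.

This mistrust can play into the hands of politically motivated deepfake creators. As privacy scholar Danielle Citron has noted, when a deepfake is debunked, it can suggest to those who believed the lie that there was some truth to it. Citron calls this “the liar’s dividend.”

Farid thinks advances in full body deepfake technology will make the overall problem of this kind of malicious deepfakery worse. The technology is evolving fast, spurred by university research like “Everybody Dance Now” and by private sector efforts such as Zao to monetize deepfakes.

“Once you can do full body, it’s not just talking heads anymore: you can simulate people having sex or killing someone,” Farid says. “Is it just around the corner? Probably not. But eventually it’s not unreasonable that in a year or two that people will be able to do full body deepfakes, and it will be incredibly powerful.”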

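Industry Response

Currently, the tech industry has no consensus approach to rooting out deepfakes, and a number of different techniques are being researched and tested.

Van de Weghe’s research team, for instance, set up a variety of internal challenges that explored different approaches. One team investigated digital watermarking of footage to identify deepfakes. Another used blockchain technology to establish trust, which is one of blockchain’s strengths. And yet another identified fakes by using the very same deep learning techniques that created deepfakes in the first place.

“Some Stanford dropouts created Sherlock AI, an automatic deepfake detection tool,” says Van de Weghe. “So, they sampled some convolutional models and then they look for anomalies in a video. It’s a procedure being used by other deepfake detectors, like Deeptrace Labs. They use the data sets called FaceForensics++, and then they test it. They’ve got like 97% accuracy and work well with faces.”

Deeptrace Labs’ API-based (application programming interface) monitoring system can detect the creation, upload, and sharing of deepfake videos. Since its founding in 2018, the company has found more than 14,000 fake videos on the internet.

Insights gathered by Deeptrace Labs’ system can tell the company and its clients what deepfake creators are making, where the fakes come from, what algorithms they are using, and how accessible these tools are.

Patrini says his team found that 95% of deepfakes are face swaps in the fake porn category, most of them featuring a narrow set of celebrities. So far, Deeptrace Labs has not seen any full body synthesis technology being used in the wild.

“You cannot really summarize a solution for these problems in a single algorithm or idea,” says Patrini. “It’s about building several tools that can tell you different things about synthetic media overall.”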

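Van de Weghe thinks the next big thing in anti-deepfake technology will be soft biometric signatures. Every person has unique facial tics, such as raised brows, lip movements, and hand movements, that function as a kind of personal signature.

Shruti Agarwal, a researcher at UC Berkeley, has used soft biometric models to determine whether such facial tics in a video were artificially created. (Agarwal’s thesis adviser is fake video expert and Dartmouth professor Hany Farid.)

“The basic idea is we can build these soft biometric models of various world leaders, such as 2020 presidential candidates, and then as the videos start to break, for example, we can analyze them and try to determine if we think they are real or not,” Agarwal told Berkeley News in June of this year.

Although Agarwal’s models are not foolproof, since people may show different facial tics in different circumstances, Van de Weghe thinks companies could eventually offer soft biometric signatures for identity verification. Such a signature could be something as familiar as an eye scan or a full body scan.

“I think that’s the way forward: create bigger data sets in cooperation with academics and big tech companies,” Van de Weghe says. “And we as newsrooms should try and train people and build media literacy about deepfakes.”

Recently, Facebook and Microsoft teamed up with universities to launch the Deepfake Detection Challenge. Another notable effort is the Defense Advanced Research Projects Agency’s (DARPA) goal of tackling deepfakes with semantic forensics, which looks for algorithmic errors such as mismatched earrings on a person in a deepfake video.

And in September 2018, the AI Foundation raised $10 million to create a tool that identifies deepfakes and other malicious content through both machine learning and human moderators.

But Thomas remains skeptical that technology can fully solve the problem of deepfakes, whatever form they take. She sees value in building better systems for identifying deepfakes, but she reiterates that other types of misinformation are already rampant.

Thomas says stakeholders should also explore the social and psychological factors that feed deepfakes and other misinformation.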

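Why It’s Tough to Regulate Deepfakes

Thomas, Van de Weghe, and Farid all agree that governments will have to step in and regulate deepfake technology, because the social media platforms that amplify such incendiary content are either unable or unwilling to police their own content.

In June, Rep. Adam Schiff (D-CA), chair of the House Intelligence Committee, held the first hearing on the misinformation and disinformation threats posed by deepfakes. In his opening remarks, Schiff noted that tech companies had responded differently to the fake Pelosi video.

YouTube immediately deleted the slowed-down video, while Facebook labeled it false and throttled the speed at which it spread across the platform. These disparate reactions led Schiff to demand that social media companies establish policies to address the upload and spread of deepfakes.

“In the short-term, promoting disinformation and other toxic, incendiary content is profitable for the major platforms, so we have a total misalignment of incentives,” says Thomas. “I don’t think that the platforms should be held liable for content that they host, but I do think they should be held liable for content they actively promote (e.g. YouTube recommended Alex Jones’ videos 16 billion times to people who weren’t even looking for him).”

“And, in general, I think it can be helpful to consider how we’ve [legislatively] dealt with other industries that externalize large costs to society while privately claiming the profits (such as industrial pollution, big tobacco, and fast food/junk food),” Thomas adds.

Patrini says regulating synthetic media could prove complicated. But he believes some existing laws, such as those covering defamation, libel, and copyright, could be used to police malicious deepfakes.

A blanket law banning deepfakes would be misguided, says Patrini. Instead, he advocates government support for synthetic media applications that benefit society, along with funding for research into tools that detect deepfakes and encouragement for startups and other companies to do the same.

“[Government] can also educate citizens that this technology is already here and that we need to retrain our ears and eyes to not believe everything we see and hear on the internet,” says Patrini. “We need to inoculate people and society instead of repairing things in maybe two years when something very catastrophic or controversial might happen because of misuse of this technology.”

Ommer says computer vision researchers are well aware of the malicious applications of deepfakes. He sees a role for government in creating accountability for how deepfakes are used.

“We all see applications of image understanding and the benefits that it can potentially have,” says Ommer. “A very important part of this is responsibility and who will take a share in this responsibility? Government agencies and so on who have interviewed me obviously see their share in this responsibility. Companies say and probably, in the interest of their stockholders, have to say that they see their responsibility; but, we all know how they have handled this responsibility up until now.”

“It’s a tricky thing,” Ommer says. “Just hoping that this will all go away . . . it won’t.”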

You’ve been warned: Full body deepfakes are the next step in AI-based human mimicry

This developing branch of synthetic media technology has commercial applications—but also has the potential to disrupt elections and spread disinformation.

In Russian novelist Victor Pelevin’s cyberpunk novel, Homo Zapiens, a poet named Babylen Tatarsky is recruited by an old college buddy to be an advertising copywriter in Moscow amid post-Soviet Russia’s economic collapse. With a talent for clever wordplay, Tatarsky quickly climbs the corporate ladder, where he discovers that politicians like then-Russian president Boris Yeltsin and major political events are, in fact, virtual simulations. With the advent of ever-more sophisticated deepfakes, it feels as if something like Pelevin’s vision is slowly coming true.

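Within the field of deepfakes, or “synthetic media” as researchers call it, much of the attention has been focused on fake faces that could wreak havoc on political reality, as well as on deep learning algorithms that mimic a person’s writing style and voice. But another branch of synthetic media technology is fast evolving: full body deepfakes.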

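In August 2018, University of California Berkeley researchers released a paper and video titled “Everybody Dance Now,” demonstrating how deep learning algorithms can transfer a professional dancer’s moves onto the bodies of amateurs. The work is still rough, but it shows that machine learning researchers are tackling the more difficult task of creating full body deepfakes. That same year, a research team led by Dr. Björn Ommer of Heidelberg University in Germany published a paper on teaching machines to realistically render human movements. And in April of this year, Japanese AI company Data Grid developed an AI that can automatically generate whole body models of nonexistent persons, identifying practical applications in the fashion and apparel industries.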

While it’s clear that full body deepfakes have interesting commercial applications, like deepfake dancing apps or in fields like athletics and biomedical research, malicious use cases are an increasing concern amid today’s polarized political climate riven by disinformation and fake news. For now, full body deepfakes aren’t capable of completely fooling the eye, but like any deep learning technology, advances will be made. It’s only a question of how soon full body deepfakes will become indistinguishable from the real.

SYNTHESIZING ENTIRE HUMAN BODIES

To create deepfakes, computer scientists use Generative Adversarial Networks, or GANs. A GAN comprises two neural networks, a synthesizer (or generative network) and a detector (or discriminative network), that work in a feedback loop of refinement to create realistic synthetic images and video. The synthesizer creates an image from a database, while the detector, working from another database, determines whether the synthesizer’s image is accurate and believable.
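The feedback loop between synthesizer and detector can be sketched with a deliberately tiny toy. Everything concrete here is an assumption made for illustration, not how any deepfake system is built: the “real data” are plain numbers drawn around 4.0, the synthesizer is a two-parameter linear function of random noise, and the detector is a one-variable logistic classifier. Real GANs use deep networks over images, but the alternating update has the same shape.

```python
import math
import random

def sigmoid(t):
    # Numerically safe logistic function.
    if t >= 0:
        return 1.0 / (1.0 + math.exp(-t))
    e = math.exp(t)
    return e / (1.0 + e)

def generate(gen, noise):
    """Synthesizer: turn noise values into fake samples (toy linear model)."""
    mu, sigma = gen
    return [mu + sigma * z for z in noise]

def detect(disc, x):
    """Detector: estimated probability that sample x is real."""
    w, b = disc
    return sigmoid(w * x + b)

def detector_loss(disc, real, fake):
    # Low when real samples score near 1 and fake samples score near 0.
    eps = 1e-12
    on_real = -sum(math.log(detect(disc, x) + eps) for x in real) / len(real)
    on_fake = -sum(math.log(1.0 - detect(disc, x) + eps) for x in fake) / len(fake)
    return on_real + on_fake

def synthesizer_loss(gen, disc, noise):
    # Low when the detector scores the synthesizer's fakes as real.
    eps = 1e-12
    fake = generate(gen, noise)
    return -sum(math.log(detect(disc, x) + eps) for x in fake) / len(fake)

def detector_step(disc, real, fake, lr):
    """One gradient-descent step on detector_loss."""
    w, b = disc
    gw = gb = 0.0
    for x in real:
        p = detect(disc, x)
        gw += -(1.0 - p) * x / len(real)
        gb += -(1.0 - p) / len(real)
    for x in fake:
        p = detect(disc, x)
        gw += p * x / len(fake)
        gb += p / len(fake)
    return (w - lr * gw, b - lr * gb)

def synthesizer_step(gen, disc, noise, lr):
    """One gradient-descent step on synthesizer_loss, detector held fixed."""
    mu, sigma = gen
    w, _ = disc
    gmu = gsigma = 0.0
    for z in noise:
        p = detect(disc, mu + sigma * z)
        dx = -(1.0 - p) * w  # derivative of -log D(x) with respect to x
        gmu += dx / len(noise)
        gsigma += dx * z / len(noise)
    return (mu - lr * gmu, sigma - lr * gsigma)

def train(steps=200, batch=64, lr=0.05, seed=0):
    """Alternate detector and synthesizer updates: the feedback loop."""
    rng = random.Random(seed)
    gen, disc = (0.0, 1.0), (0.0, 0.0)
    for _ in range(steps):
        real = [rng.gauss(4.0, 0.5) for _ in range(batch)]  # assumed toy data
        noise = [rng.gauss(0.0, 1.0) for _ in range(batch)]
        disc = detector_step(disc, real, generate(gen, noise), lr)
        gen = synthesizer_step(gen, disc, noise, lr)
    return gen, disc
```

Each pass first nudges the detector toward telling real from fake, then nudges the synthesizer toward output the updated detector rates as real. GAN training is known to oscillate, so this sketch makes no claim about where the parameters end up.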

The first malicious use of deepfakes appeared on Reddit, where the faces of actresses like Scarlett Johansson were mapped onto the bodies of porn performers. Rachel Thomas, cofounder of Fast.AI, says that 95% of the deepfakes in existence are pornographic material meant to harass certain individuals with fake sexual acts. “Some of these deepfakes videos aren’t necessarily using very sophisticated techniques,” says Thomas. But, that is starting to change.

Hany Farid, a computer science professor at Dartmouth College in Hanover, New Hampshire, points to the Chinese deepfake app Zao as being illustrative of how quickly the technology has evolved in less than two years.

“The ones that I saw [from Zao] looked really, really good, and got around a lot of the artifacts, like in the movie versions where the face flickered,” says Farid. “It’s improving. Getting this as an app working at scale, downloading to millions of people, is hard. It’s a sign of the maturity of the deepfake technology.”

“With deepfake images and videos, we’ve essentially democratized CGI technology,” he says. “We’ve taken it out of the hands of Hollywood studios and put it in the hands of YouTube video creators.”

Björn Ommer, professor for computer vision at the Heidelberg University Collaboratory for Image Processing (HCI) & Interdisciplinary Center for Scientific Computing (IWR), leads a team that is researching and developing full body synthetic media. Like most researchers in the field, the group’s overall goal is to understand images and to teach machines how to understand images and video. Ultimately, he hopes the team gains a better understanding of how human beings understand images.

“We’ve seen synthetic avatars that have been created not just in the gaming industry but a lot of other fields that are creating revenue,” says Ommer. “For my group, in particular, it’s entirely different fields that we are considering, like biomedical research. We want to get a more detailed understanding of human or even animal posture over time, relating to disabilities and the like.”

There are critical differences between the processes of synthesizing faces and entire bodies. Ommer says that more research into face synthesis has been carried out. And there are a few reasons for this. First, any digital camera or smartphone has built-in face detection, technology that can be used for tasks like smile detection or to identify the person a viewer is looking at. Such applications can generate revenue, leading to more research. But they have also led to, as Ommer says, “a lot of data set assembly, data curation, and obtaining face images—the substrate upon which deep learning research is built.”

Secondly, and more interesting to Ommer, is that while each human face looks different, there isn’t much variability when the face is compared to an entire human body. “That is why the research on faces has come to a stage where I would say it is creating really decent results compared to entire human bodies with much more variability being there, much more complicated to handle, and much more to learn if you head in that direction,” says Ommer.

Ommer isn’t sure when full synthesized bodies will be of the quality that he and researchers want. Looking at the maturation of malicious deepfakes, however, Ommer notes that humans can already be tricked quite easily without fakes created by deep learning computer vision intelligence, artificial intelligence, or other technologies.

“But, if you want to make it appealing to larger society, it will take a few more years,” says Ommer, who says full body and other deepfakes will become cheaper and more prevalent. “The research community itself has moved in a direction—and this is very much appreciated by much of the community that is responsible for a lot of this steady progress that we see—where the algorithms are easily available, like on Github and so on. So, you can just download the most recent code from some paper, and then, without much knowledge of what’s under the hood, just apply it.”

FEELING “POWERLESS AND PARALYZED”

Not every person will be able to create a “blockbuster deepfake.” But, given more time, Ommer says money will no longer be an issue in terms of computational resources, and the applicability of software will also become much easier. Farid says that with full body deepfakes, malicious creators will be able to work around deepfake technology’s typically stationary figure talking directly into the camera, making targets do and say things they never would.

Tom Van de Weghe, an investigative journalist and foreign correspondent for VRT (the Flemish Broadcasting Corporation), worries that journalists, but also human rights activists and dissidents, could have footage of them weaponized by full body deepfakes.

The explosion of fake news during the 2016 election, and the rise of deepfakes in 2017 inspired Van de Weghe to research synthetic media. In the summer of 2018, he began a research fellowship at Stanford University to study ways of battling the malicious use of deepfakes.

“It’s not the big shots, the big politicians, and the big famous guys who are the most threatened,” says Van de Weghe. “It’s the normal people—people like you, me, female journalists, and sort of marginalized groups that could become or are already becoming the victims of deepfakes.”

Two weeks ago, Dutch news anchor Dionne Stax discovered her face “deepfaked” onto a porn actress’s body, after the video was uploaded to PornHub and distributed on the internet. Although PornHub quickly removed the video, Van de Weghe says that the damage to her reputation had already been done. He also points to China’s AI public broadcasters as proof that the Chinese government has the capability to pull off realistic deepfakes.

To imagine how a full body deepfake might work, Van de Weghe points to 2018 footage of Jim Acosta, CNN’s chief White House correspondent. In a video clip uploaded by Paul Joseph Watson, an editor at conspiracy theory site Infowars, Acosta seems to aggressively push a White House staffer trying to take his microphone. The original clip, broadcast by C-SPAN, differs markedly from Watson’s. The Infowars editor claimed he didn’t doctor the footage and attributed any differences to “video compression” artifacts. But, as The Independent demonstrated in a side-by-side analysis of the videos in an editing timeline, Watson’s video is missing several frames from the original. A full body deepfake could, like editing video frames, alter the reality of an event.
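The idea of spotting frames that are present in an original clip but absent from an edited one can be illustrated with a short sketch. This is not the analysis The Independent performed; it assumes, for simplicity, that each frame is available as raw bytes and that untouched frames are bit-identical in both clips. Recompressed video would break that assumption and would need a perceptual comparison instead of an exact hash. The function name `missing_frames` is hypothetical.

```python
import hashlib
from difflib import SequenceMatcher

def frame_digest(frame: bytes) -> str:
    """Fingerprint one frame; identical bytes give identical digests."""
    return hashlib.sha256(frame).hexdigest()

def missing_frames(original, edited):
    """Return indexes of original frames that do not appear in the edited clip.

    Both arguments are sequences of frames given as bytes. The two digest
    sequences are aligned, and any original frame that was deleted or
    replaced in the edited sequence is reported.
    """
    a = [frame_digest(f) for f in original]
    b = [frame_digest(f) for f in edited]
    matcher = SequenceMatcher(None, a, b, autojunk=False)
    missing = []
    for tag, i1, i2, _j1, _j2 in matcher.get_opcodes():
        if tag in ("delete", "replace"):
            missing.extend(range(i1, i2))
    return missing

# Usage with stand-in frames: the edited clip drops frames 2 and 3.
original_clip = [b"f0", b"f1", b"f2", b"f3", b"f4", b"f5"]
edited_clip = [b"f0", b"f1", b"f4", b"f5"]
print(missing_frames(original_clip, edited_clip))
```

Aligning the sequences, rather than comparing frame by frame at the same index, is what lets the check survive the shift that every dropped frame introduces.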

Deeptrace Labs, founded in 2018, is a cybersecurity company that is building tools based on computer vision and deep learning to analyze and understand videos, particularly those that could be manipulated or synthesized by any sort of AI. Company founder Giorgio Patrini, previously a postdoc researcher on deep learning at the DELTA Lab, University of Amsterdam, says that a few years ago he started investigating how technology could prevent or defend against future misuse of synthetic media.

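Patrini believes that malicious deepfakes, made up of a combination of synthetic full bodies, faces, and audio, will soon be used to target journalists and politicians. He pointed to a deepfake porn video that featured Indian journalist Rana Ayyub’s face swapped onto a porn actress’s body, part of a disinformation campaign intended to discredit her investigative reporting. It followed her public call for the prosecution of the rape and murder of an 8-year-old Kashmiri girl. In March, Deeptrace Labs looked into an alleged deepfake of Gabon’s president, Ali Bongo. Although many in the African country, including the Gabonese military, which launched an unsuccessful coup on that belief, thought Bongo’s immobile face, eyes, and body suggested a deepfake, Patrini told Mother Jones that he did not believe the video of the president had been synthesized.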

“We couldn’t find any reasons to believe it was a deepfake, and I think that was later confirmed that the president is still alive but that he’d had a stroke,” says Patrini. “The main point I want to make here is that it doesn’t matter if a video is a deepfake or not yet—it’s that people know that it can spark doubt in public opinion and potentially violence in some places.”

Recently, Van de Weghe learned that a political party operative approached one of the most popular deepfake creators, requesting a deepfake to damage a certain individual. Such custom, made-to-order deepfakes could become big business.

“There is money to be earned with deepfakes,” says Van de Weghe. “People will order it. So, a government doesn’t have to create a deepfake—they just have to contact a person who is specialized in deepfakes to create one.”

The Wall Street Journal recently reported that a UK energy company CEO was fooled into transferring $243,000 to the account of a Hungarian supplier. The executive said he believed he was talking to his boss, who had seemingly approved the transaction. Now, the CEO believes he was the victim of an audio deepfake scam known as vishing. Farid believes other fraudulent deepfake financial schemes, which might include full body deepfakes, are only a matter of time.

“I could create a deepfake video of Jeff Bezos where he says that Amazon stock is going down,” says Farid. “Think of all of the money that could be made shorting Amazon stock. By the time you rein it in, the damage has already been done. . . . Now imagine a video of a Democratic party nominee saying illegal or insensitive things. You don’t think you can swing the vote of hundreds of thousands of voters the night before an election?”

Farid thinks a combination of social media and deepfake videos, whether of faces or full bodies, could easily wreak havoc. Social media companies are largely unable or unwilling to moderate their platforms and content, so deepfakes can spread like wildfire.

“When you pair the ability to create deepfake content with the ability to distribute and consume it globally, it’s problematic,” he says. “We live in a highly polarized society, for a number of reasons, and people are going to think the worst of the people they disagree with.”

But for Fast.AI’s Thomas, deepfakes are almost unnecessary in the new cyber skirmishes to negatively influence the political process, as governments and industry already struggle with fake information in the written form. She says the risks aren’t just about technology but human factors. Society is polarized, and vast swaths of the United States (and other countries) no longer have shared sources of truth that they can trust.

This mistrust can play into the hands of politically motivated deepfake creators. When a deepfake is debunked, as privacy scholar Danielle Citron noted, it can suggest to those who bought the lie that there is some truth to it. Citron calls this “the liar’s dividend.” Farid thinks advancements in full body deepfake technology will make the overall problem of this type of nefarious deepfakery worse. The technology is evolving fast, spurred by university research like “Everybody Dance Now” and private sector initiatives such as Zao to monetize deepfakes.

“Once you can do full body, it’s not just talking heads anymore: you can simulate people having sex or killing someone,” Farid says. “Is it just around the corner? Probably not. But eventually it’s not unreasonable that in a year or two that people will be able to do full body deepfakes, and it will be incredibly powerful.”

INDUSTRY RESPONSE

Currently, no consensus approach to rooting out deepfakes exists within the tech industry. A number of different techniques are being researched and tested.

Van de Weghe’s research team, for instance, created a variety of internal challenges that explored different approaches. One team investigated digital watermarking of footage to identify deepfakes. Another team used blockchain technology to establish trust, which is one of its strengths. And yet another team identified deepfakes by using the very same deep learning techniques that created them in the first place.

“Some Stanford dropouts created Sherlock AI, an automatic deepfake detection tool,” says Van de Weghe. “So, they sampled some convolutional models and then they look for anomalies in a video. It’s a procedure being used by other deepfake detectors, like Deeptrace Labs. They use the data sets called FaceForensics++, and then they test it. They’ve got like 97% accuracy and work well with faces.”

Deeptrace Labs’ API-based monitoring system can see the creation, upload, and sharing of deepfake videos. Since being founded in 2018, the company has found over 14,000 fake videos on the internet. Insights gleaned by Deeptrace Labs’ system can inform the company and its clients about what deepfake creators are making, where the fakes came from, what algorithms they are using, and how accessible these tools are. Patrini says his team found that 95% of deepfakes are face swaps in the fake porn category, with most of them being a narrow subset of celebrities. So far, Deeptrace Labs hasn’t seen any full body synthesis technology being used out in the wild.

“You cannot really summarize a solution for these problems in a single algorithm or idea,” says Patrini. “It’s about building several tools that can tell you different things about synthetic media overall.”

Van de Weghe thinks the next big thing in anti-deepfake technology will be soft biometric signatures. Every person has their own unique facial tics—raised brows, lip movements, hand movements—that function as personal signatures of sorts. Shruti Agarwal, a researcher at UC-Berkeley, used soft biometric models to determine if such facial tics have been artificially created for videos. (Agarwal’s thesis adviser is fake video expert and Dartmouth professor Hany Farid.)

“The basic idea is we can build these soft biometric models of various world leaders, such as 2020 presidential candidates, and then as the videos start to break, for example, we can analyze them and try to determine if we think they are real or not,” Agarwal told Berkeley News in June of this year.

Although Agarwal’s models aren’t foolproof, since people in different circumstances might use different facial tics, Van de Weghe thinks companies could offer soft biometric signatures for identity verification purposes in the future. Such a signature could be something as well-known as eye scans or a full body scan.
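A rough sense of how a soft biometric signature might be compared against a new clip can be given in a few lines. The article does not describe Agarwal’s actual models, so this is only an illustrative stand-in: it assumes each authentic clip has already been reduced to a short list of numbers (hypothetical measurements such as brow-raise frequency and lip-movement intensity) and scores a new clip by how far it sits from the person’s usual range.

```python
from statistics import mean, stdev

def build_profile(clips):
    """Summarize authentic clips as a (mean, standard deviation) pair per feature.

    clips: list of equal-length feature vectors, one per authentic clip.
    At least two clips are needed to estimate a standard deviation.
    """
    columns = list(zip(*clips))
    return [(mean(col), stdev(col)) for col in columns]

def anomaly_score(profile, clip):
    """Mean absolute z-score of a clip's features against the profile.

    Values near 0 mean the clip looks typical for this person; large
    values mean its mannerisms are far from the recorded habits.
    """
    return mean(
        abs(x - m) / max(s, 1e-9)  # floor avoids division by zero
        for (m, s), x in zip(profile, clip)
    )

# Usage with made-up numbers for two hypothetical features.
authentic = [[0.2, 1.0], [0.4, 1.2], [0.3, 1.1]]
profile = build_profile(authentic)
print(anomaly_score(profile, [0.3, 1.1]))  # typical clip, low score
print(anomaly_score(profile, [0.9, 0.2]))  # unusual clip, high score
```

Any threshold for flagging a clip would have to be chosen per person and per setting, which is one way to read the caveat that people use different facial tics in different circumstances.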

“I think that’s the way forward: create bigger data sets in cooperation with academics and big tech companies,” Van de Weghe says. “And we as newsrooms should try and train people and build media literacy about deepfakes.”

Recently, Facebook and Microsoft teamed up with universities to launch the Deepfake Detection Challenge. Another notable effort is the Defense Advanced Research Projects Agency’s (DARPA) goal of tackling deepfakes with semantic forensics, which looks for algorithmic errors that create, for instance, mismatched earrings worn by a person in a deepfake video. And in September 2018, the AI Foundation raised $10 million to create a tool that identifies deepfakes and other malicious content through both machine learning and human moderators.

But, Fast.AI’s Thomas remains skeptical that technology can fully solve the problem of deepfakes, whatever form they might take. She sees value in creating better systems for identifying deepfakes but reiterates that other types of misinformation are already rampant. Thomas says stakeholders should explore the social and psychological factors that play into deepfakes and other misinformation as well.

WHY IT’S TOUGH TO REGULATE DEEPFAKES

Thomas, Van de Weghe, and Farid all agree that governments will have to step in and regulate deepfake technology because social media platforms, which amplify such incendiary content, are either unable or unwilling to police their own content.

In June, Rep. Adam Schiff (D-CA), chair of the House Intelligence Committee, held the first hearing on the misinformation and disinformation threats posed by deepfakes. In his opening remarks, Schiff made note of how tech companies responded differently to the fake Pelosi video. YouTube immediately deleted the slowed-down video, while Facebook labeled it false and throttled back the speed at which it spread across the platform. These disparate reactions led Schiff to demand social media companies establish policies to remedy the upload and spread of deepfakes.

“In the short-term, promoting disinformation and other toxic, incendiary content is profitable for the major platforms, so we have a total misalignment of incentives,” says Fast.AI’s Thomas. “I don’t think that the platforms should be held liable for content that they host, but I do think they should be held liable for content they actively promote (e.g. YouTube recommended Alex Jones’ videos 16 billion times to people who weren’t even looking for him).”

“And, in general, I think it can be helpful to consider how we’ve [legislatively] dealt with other industries that externalize large costs to society while privately claiming the profits (such as industrial pollution, big tobacco, and fast food/junk food),” Thomas adds.

Deeptrace Labs’ Patrini says regulation of synthetic media could prove complicated. But, he believes some current laws, like those covering defamation, libel, and copyright, could be used to police malicious deepfakes. A blanket law to stop deepfakes would be misguided, says Patrini. Instead, he advocates government support for synthetic media applications that benefit society, while funding research into creating tools to detect deepfakes and encouraging startups and other companies to do the same.

“[Government] can also educate citizens that this technology is already here and that we need to retrain our ears and eyes to not believe everything we see and hear on the internet,” says Patrini. “We need to inoculate people and society instead of repairing things in maybe two years when something very catastrophic or controversial might happen because of misuse of this technology.”

Ommer says computer vision researchers are well aware of the malicious applications of deepfakes. And he sees a role for government to play in creating accountability for how deepfakes are used.

“We all see applications of image understanding and the benefits that it can potentially have,” says Ommer. “A very important part of this is responsibility and who will take a share in this responsibility? Government agencies and so on who have interviewed me obviously see their share in this responsibility. Companies say and probably—in the interest of their stockholders—have to say that they see their responsibility; but, we all know how they have handled this responsibility up until now.”

“It’s a tricky thing,” Ommer says. “Just hoping that this will all go away . . . it won’t.”

Source: Shujuguan (数据观)

Author: Wang Jie

Original link: https://mp.weixin.qq.com/s/xHeZanjuDPw-OFPyBzBXsw

Editor: Gao Jie
