Artificial Intelligence | ShowMeAI News Daily #2022.06.29


The ShowMeAI Daily series has been fully upgraded! It covers AI topics across Tools & Frameworks | Projects & Code | Blogs & Shares | Data & Resources | Research & Papers. Click 历史文章列表 (article archive) to browse past issues, subscribe to the topic #ShowMeAI资讯日报 in the official account to receive daily updates, and click 专题合辑&电子月刊 (topic collections & e-monthly) to browse full collections by topic.

1. Tools & Frameworks


Pretrained model: YaLM 100B – a pretrained language model with 100B parameters

tags: [pretraining, language model, large model]

An open-source, GPT-like pretrained neural network for generating and processing text

‘YaLM 100B – Pretrained language model with 100B parameters, a GPT-like neural network for generating and processing text’ by Yandex

GitHub: github.com/yandex/YaLM…


Framework: NFShmServer – a shared-memory plugin development framework written in C++

tags: [shared memory, plugins]

Lightweight, agile, and elastic, with distributed support, so you can build server-side applications faster and more simply

GitHub: github.com/yigao/NFShm…


Tool: Nocturne – a fast, data-driven 2D multi-agent driving simulator

tags: [multi-agent, driving simulation, simulator]

‘Nocturne – A data-driven, fast driving simulator for multi-agent coordination under partial observability.’ by Meta Research

GitHub: github.com/facebookres…


Framework: FidelityFX Super Resolution (FSR) – the FidelityFX super-resolution framework

tags: [super-resolution, tool]

‘FidelityFX Super Resolution 2.0.1 (FSR 2.0) – FidelityFX Super Resolution 2’ by GPUOpen Effects

GitHub: github.com/GPUOpen-Eff…


Library: NNCF – Neural Network Compression Framework

tags: [neural network, model compression]

‘Neural Network Compression Framework (NNCF) – Neural Network Compression Framework for enhanced OpenVINO inference’ by OpenVINO Toolkit

GitHub: github.com/openvinotoo…
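NNCF compresses networks with techniques such as 8-bit quantization and pruning to speed up OpenVINO inference. As a rough illustration of the core idea behind quantization (a generic sketch, not NNCF's actual API), here is symmetric per-tensor fake quantization, where weights are rounded to an integer grid and then dequantized:

```python
import numpy as np

def fake_quantize(w: np.ndarray, num_bits: int = 8) -> np.ndarray:
    """Symmetric per-tensor fake quantization: round to an int grid, then dequantize."""
    qmax = 2 ** (num_bits - 1) - 1          # e.g. 127 for 8 bits
    scale = np.abs(w).max() / qmax          # map the largest magnitude onto qmax
    if scale == 0:
        return w
    q = np.clip(np.round(w / scale), -qmax, qmax)
    return q * scale                        # dequantized weights simulate int8 inference

rng = np.random.default_rng(0)
w = rng.standard_normal((64, 64)).astype(np.float32)
w_q = fake_quantize(w)
step = np.abs(w).max() / 127                # one quantization step
# Rounding guarantees the error is at most half a step per element
print(float(np.max(np.abs(w - w_q))) <= step / 2 + 1e-6)
```

Production frameworks go further: per-channel scales, calibration over activation statistics, and fine-tuning to recover accuracy.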


2. Projects & Code


Code: Recommend-System-TF2.0 – implementations of recommendation algorithms

‘A record of knowledge produced while studying recommender systems, mainly explanations of the principles of classic recommendation algorithms and their code implementations’ by jc_Lee

GitHub: github.com/jc-LeeHub/R…


3. Blogs & Shares


Blog post: How fast can a Transformer forward pass be?

《How fast can we perform a forward pass?》by Jacob Steinhardt

Link: bounded-regret.ghost.io/how-fast-ca…


4. Data & Resources


Resource list: Rolling-shutter-Effects – a curated list of literature on image distortion correction (rolling shutter, radial distortion, text distortion, etc.)

tags: [image distortion, image correction, resource list]

‘Rolling-shutter-Effects – A curated list of resources on handling Rolling Shutter effects and Radial Distortions’ by Subeesh Vasu

GitHub: github.com/subeeshvasu…


Resource list: ADGC – a large curated list of papers and resources on deep graph clustering

‘ADGC: Awesome Deep Graph Clustering – Awesome Deep Graph Clustering is a collection of SOTA, novel deep graph clustering methods (papers, codes, and datasets).’ by yueliu1999

GitHub: github.com/yueliu1999/…


5. Research & Papers


Reply with the keyword 日报 in the official account to get the curated June paper collection for free.

Paper: Scaling Autoregressive Models for Content-Rich Text-to-Image Generation

Title: Scaling Autoregressive Models for Content-Rich Text-to-Image Generation

Date: 22 Jun 2022

Fields: computer vision, natural language processing

Tasks: image generation, machine translation, text-to-image generation

Paper link: arxiv.org/abs/2206.10…

Code: github.com/lucidrains/…

Authors: Jiahui Yu, Yuanzhong Xu, Jing Yu Koh, Thang Luong, Gunjan Baid, ZiRui Wang, Vijay Vasudevan, Alexander Ku, Yinfei Yang, Burcu Karagol Ayan, Ben Hutchinson, Wei Han, Zarana Parekh, Xin Li, Han Zhang, Jason Baldridge, Yonghui Wu

Summary: We present the Pathways Autoregressive Text-to-Image (Parti) model, which generates high-fidelity photorealistic images and supports content-rich synthesis involving complex compositions and world knowledge.

Abstract: We present the Pathways Autoregressive Text-to-Image (Parti) model, which generates high-fidelity photorealistic images and supports content-rich synthesis involving complex compositions and world knowledge. Parti treats text-to-image generation as a sequence-to-sequence modeling problem, akin to machine translation, with sequences of image tokens as the target outputs rather than text tokens in another language. This strategy can naturally tap into the rich body of prior work on large language models, which have seen continued advances in capabilities and performance through scaling data and model sizes. Our approach is simple: First, Parti uses a Transformer-based image tokenizer, ViT-VQGAN, to encode images as sequences of discrete tokens. Second, we achieve consistent quality improvements by scaling the encoder-decoder Transformer model up to 20B parameters, with a new state-of-the-art zero-shot FID score of 7.23 and finetuned FID score of 3.22 on MS-COCO. Our detailed analysis on Localized Narratives as well as PartiPrompts (P2), a new holistic benchmark of over 1600 English prompts, demonstrate the effectiveness of Parti across a wide variety of categories and difficulty aspects. We also explore and highlight limitations of our models in order to define and exemplify key areas of focus for further improvements. See parti.research.google/ for high-resolution images.
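The FID scores quoted above are Fréchet distances between Gaussian fits of real and generated image features (in practice, Inception-v3 activations). A minimal sketch of the distance computation itself, assuming the feature statistics (means and covariances) have already been extracted:

```python
import numpy as np
from scipy.linalg import sqrtm

def frechet_distance(mu1, cov1, mu2, cov2):
    """Frechet distance between Gaussians N(mu1, cov1) and N(mu2, cov2)."""
    diff = mu1 - mu2
    covmean = sqrtm(cov1 @ cov2)
    if np.iscomplexobj(covmean):            # numerical noise can add tiny imaginary parts
        covmean = covmean.real
    return float(diff @ diff + np.trace(cov1 + cov2 - 2.0 * covmean))

rng = np.random.default_rng(0)
a = rng.standard_normal((1000, 4))
b = a + 5.0                                  # same shape, shifted mean
mu_a, cov_a = a.mean(0), np.cov(a, rowvar=False)
mu_b, cov_b = b.mean(0), np.cov(b, rowvar=False)
print(round(frechet_distance(mu_a, cov_a, mu_a, cov_a), 6))  # identical stats -> 0.0
print(frechet_distance(mu_a, cov_a, mu_b, cov_b))            # pure mean shift -> 4 * 5^2 = 100
```

Identical distributions give a distance of 0; with equal covariances the distance reduces to the squared mean difference, which is why the shifted example above lands at 100.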


Paper: The ArtBench Dataset: Benchmarking Generative Models with Artworks

Title: The ArtBench Dataset: Benchmarking Generative Models with Artworks

Date: 22 Jun 2022

Field: computer vision

Tasks: conditional image generation, image generation, unconditional image generation

Paper link: arxiv.org/abs/2206.11…

Code: github.com/liaopeiyuan…

Authors: Peiyuan Liao, Xiuyu Li, Xihui Liu, Kurt Keutzer

Summary: We introduce ArtBench-10, the first class-balanced, high-quality, cleanly annotated, and standardized dataset for benchmarking artwork generation.

Abstract: We introduce ArtBench-10, the first class-balanced, high-quality, cleanly annotated, and standardized dataset for benchmarking artwork generation. It comprises 60,000 images of artwork from 10 distinctive artistic styles, with 5,000 training images and 1,000 testing images per style. ArtBench-10 has several advantages over previous artwork datasets. Firstly, it is class-balanced while most previous artwork datasets suffer from the long tail class distributions. Secondly, the images are of high quality with clean annotations. Thirdly, ArtBench-10 is created with standardized data collection, annotation, filtering, and preprocessing procedures. We provide three versions of the dataset with different resolutions (32×32, 256×256, and original image size), formatted in a way that is easy to be incorporated by popular machine learning frameworks. We also conduct extensive benchmarking experiments using representative image synthesis models with ArtBench-10 and present in-depth analysis. The dataset is available at github.com/liaopeiyuan… under a Fair Use license.


Paper: Remote Sensing Change Detection (Segmentation) using Denoising Diffusion Probabilistic Models

Title: Remote Sensing Change Detection (Segmentation) using Denoising Diffusion Probabilistic Models

Date: 23 Jun 2022

Field: computer vision

Tasks: change detection, denoising

Paper link: arxiv.org/abs/2206.11…

Code: github.com/wgcban/ddpm…

Authors: Wele Gedara Chaminda Bandara, Nithin Gopalakrishnan Nair, Vishal M. Patel

Summary: Human civilization has an increasingly powerful influence on the earth system, and earth observations are an invaluable tool for assessing and mitigating the negative impacts.

Abstract: Human civilization has an increasingly powerful influence on the earth system, and earth observations are an invaluable tool for assessing and mitigating the negative impacts. To this end, observing precisely defined changes on Earth’s surface is essential, and we propose an effective way to achieve this goal. Notably, our change detection (CD)/ segmentation method proposes a novel way to incorporate the millions of off-the-shelf, unlabeled, remote sensing images available through different earth observation programs into the training process through denoising diffusion probabilistic models. We first leverage the information from these off-the-shelf, uncurated, and unlabeled remote sensing images by using a pre-trained denoising diffusion probabilistic model and then employ the multi-scale feature representations from the diffusion model decoder to train a lightweight CD classifier to detect precise changes. The experiments performed on four publicly available CD datasets show that the proposed approach achieves remarkably better results than the state-of-the-art methods in F1, IoU, and overall accuracy. Code and pre-trained models are available at: github.com/wgcban/ddpm…
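The two-stage recipe in the abstract (frozen pretrained features, then a lightweight change classifier) can be illustrated end to end on synthetic data. The arrays below are random stand-ins for the multi-scale diffusion-decoder features the paper uses, and the "lightweight classifier" is plain per-pixel logistic regression:

```python
import numpy as np

rng = np.random.default_rng(0)
D, N = 16, 2000
# Stand-ins for per-pixel features from a frozen, pretrained decoder at two dates
f_t1 = rng.standard_normal((N, D))
change = rng.random(N) < 0.3                 # ground-truth change mask (flattened)
f_t2 = f_t1 + np.where(change[:, None],
                       rng.standard_normal((N, D)) * 2.0,   # changed pixels move a lot
                       rng.standard_normal((N, D)) * 0.05)  # unchanged pixels barely move

x = np.abs(f_t1 - f_t2)                      # simple difference features
x = (x - x.mean(0)) / (x.std(0) + 1e-8)      # standardize per dimension
y = change.astype(np.float64)

w, b = np.zeros(D), 0.0                      # lightweight classifier: logistic regression
for _ in range(500):
    p = 1.0 / (1.0 + np.exp(-(x @ w + b)))
    g = p - y                                # gradient of binary cross-entropy
    w -= 0.1 * (x.T @ g) / N
    b -= 0.1 * g.mean()

pred = (1.0 / (1.0 + np.exp(-(x @ w + b)))) > 0.5
accuracy = (pred == change).mean()
print(accuracy > 0.9)
```

The point of the design is that all representational heavy lifting stays in the frozen pretrained model; only the small classifier on top is trained on labeled change data.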


Paper: Gender Classification and Bias Mitigation in Facial Images

Title: Gender Classification and Bias Mitigation in Facial Images

Date: 13 Jul 2020

Field: computer vision

Tasks: classification, general classification

Paper link: arxiv.org/abs/2007.06…

Code: github.com/Developer-Y…

Authors: Wenying Wu, Pavlos Protopapas, Zheng Yang, Panagiotis Michalatos

Summary: We worked to increase classification accuracy and mitigate algorithmic biases on our baseline model trained on the augmented benchmark database.

Abstract: Gender classification algorithms have important applications in many domains today such as demographic research, law enforcement, as well as human-computer interaction. Recent research showed that algorithms trained on biased benchmark databases could result in algorithmic bias. However, to date, little research has been carried out on gender classification algorithms’ bias towards gender minorities subgroups, such as the LGBTQ and the non-binary population, who have distinct characteristics in gender expression. In this paper, we began by conducting surveys on existing benchmark databases for facial recognition and gender classification tasks. We discovered that the current benchmark databases lack representation of gender minority subgroups. We worked on extending the current binary gender classifier to include a non-binary gender class. We did that by assembling two new facial image databases: 1) a racially balanced inclusive database with a subset of LGBTQ population 2) an inclusive-gender database that consists of people with non-binary gender. We worked to increase classification accuracy and mitigate algorithmic biases on our baseline model trained on the augmented benchmark database. Our ensemble model has achieved an overall accuracy score of 90.39%, which is a 38.72% increase from the baseline binary gender classifier trained on Adience. While this is an initial attempt towards mitigating bias in gender classification, more work is needed in modeling gender as a continuum by assembling more inclusive databases.


Paper: I M Avatar: Implicit Morphable Head Avatars from Videos

Title: I M Avatar: Implicit Morphable Head Avatars from Videos

Venue: CVPR 2022

Field: computer vision

Paper link: arxiv.org/abs/2112.07…

Code: github.com/zhengyuf/im…

Authors: Yufeng Zheng, Victoria Fernández Abrevaya, Marcel C. Bühler, Xu Chen, Michael J. Black, Otmar Hilliges

Summary: Traditional 3D morphable face models (3DMMs) provide fine-grained control over expression but cannot easily capture geometric and appearance details.

Abstract: Traditional 3D morphable face models (3DMMs) provide fine-grained control over expression but cannot easily capture geometric and appearance details. Neural volumetric representations approach photorealism but are hard to animate and do not generalize well to unseen expressions. To tackle this problem, we propose IMavatar (Implicit Morphable avatar), a novel method for learning implicit head avatars from monocular videos. Inspired by the fine-grained control mechanisms afforded by conventional 3DMMs, we represent the expression- and pose- related deformations via learned blendshapes and skinning fields. These attributes are pose-independent and can be used to morph the canonical geometry and texture fields given novel expression and pose parameters. We employ ray marching and iterative root-finding to locate the canonical surface intersection for each pixel. A key contribution is our novel analytical gradient formulation that enables end-to-end training of IMavatars from videos. We show quantitatively and qualitatively that our method improves geometry and covers a more complete expression space compared to state-of-the-art methods.
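The "ray marching and iterative root-finding" step in the abstract locates, for each pixel, where its camera ray first crosses the implicit surface. A minimal sketch of that idea, with a sphere signed-distance function standing in for the learned canonical geometry:

```python
import numpy as np

def sphere_sdf(p, center=np.array([0.0, 0.0, 3.0]), radius=1.0):
    """Signed distance to a sphere: negative inside, positive outside."""
    return np.linalg.norm(p - center) - radius

def ray_surface_intersection(origin, direction, sdf, t_max=10.0, steps=64, refine=32):
    """Coarse march to bracket the first sign change, then bisect to refine it."""
    ts = np.linspace(0.0, t_max, steps)
    vals = np.array([sdf(origin + t * direction) for t in ts])
    sign_change = np.nonzero((vals[:-1] > 0) & (vals[1:] <= 0))[0]
    if len(sign_change) == 0:
        return None                          # the ray misses the surface
    lo, hi = ts[sign_change[0]], ts[sign_change[0] + 1]
    for _ in range(refine):                  # bisection halves the bracket each step
        mid = 0.5 * (lo + hi)
        if sdf(origin + mid * direction) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

origin = np.zeros(3)
t_hit = ray_surface_intersection(origin, np.array([0.0, 0.0, 1.0]), sphere_sdf)
print(t_hit)   # ≈ 2.0: the sphere's front face (center z = 3, radius 1) sits at z = 2
```

IMavatar's actual formulation differs in important ways (the surface is a learned occupancy/SDF network, and the paper's key contribution is an analytical gradient through this root so it can be trained end to end), but the intersection search follows this bracket-then-refine pattern.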


Paper: A Conversational Paradigm for Program Synthesis

Title: A Conversational Paradigm for Program Synthesis

Date: 25 Mar 2022

Field: natural language processing

Tasks: language modelling, program synthesis

Paper link: arxiv.org/abs/2203.13…

Code: github.com/salesforce/…

Authors: Erik Nijkamp, Bo Pang, Hiroaki Hayashi, Lifu Tu, Huan Wang, Yingbo Zhou, Silvio Savarese, Caiming Xiong

Summary: We train a family of large language models, called CodeGen, on natural language and programming language data.

Abstract: Program synthesis strives to generate a computer program as a solution to a given problem specification. We propose a conversational program synthesis approach via large language models, which addresses the challenges of searching over a vast program space and user intent specification faced in prior approaches. Our new approach casts the process of writing a specification and program as a multi-turn conversation between a user and a system. It treats program synthesis as a sequence prediction problem, in which the specification is expressed in natural language and the desired program is conditionally sampled. We train a family of large language models, called CodeGen, on natural language and programming language data. With weak supervision in the data and the scaling up of data size and model size, conversational capacities emerge from the simple autoregressive language modeling. To study the model behavior on conversational program synthesis, we develop a multi-turn programming benchmark (MTPB), where solving each problem requires multi-step synthesis via multi-turn conversation between the user and the model. Our findings show the emergence of conversational capabilities and the effectiveness of the proposed conversational program synthesis paradigm. In addition, our model CodeGen (with up to 16B parameters trained on TPU-v4) outperforms OpenAI’s Codex on the HumanEval benchmark. We make the training library JaxFormer including checkpoints available as open source contribution: github.com/salesforce/…
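The multi-turn loop the paper describes, where the user supplies a natural-language spec, the model appends code, and the concatenated history conditions the next turn, reduces to a simple prompt-accumulation pattern. `toy_generate` below is a hypothetical stub standing in for a real CodeGen checkpoint:

```python
def synthesize_multi_turn(turns, generate):
    """Accumulate (spec, completion) pairs into one growing prompt, one turn at a time."""
    prompt, outputs = "", []
    for spec in turns:
        prompt += f"# {spec}\n"        # user intent, expressed in natural language
        code = generate(prompt)        # the model conditions on the full history
        prompt += code + "\n"
        outputs.append(code)
    return outputs

def toy_generate(prompt):
    """Hypothetical stand-in for a real model: echoes the most recent spec."""
    last_spec = [l for l in prompt.splitlines() if l.startswith("# ")][-1]
    return f"pass  # completion for: {last_spec[2:]}"

result = synthesize_multi_turn(
    ["define a greeting function", "call it with the name Ada"], toy_generate
)
print(result[1])   # pass  # completion for: call it with the name Ada
```

With a real checkpoint, `generate` would be a sampling call to the language model; the key design point is that each turn sees every earlier spec and completion, which is what lets intent be refined incrementally.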


Paper: Elucidating the Design Space of Diffusion-Based Generative Models

Title: Elucidating the Design Space of Diffusion-Based Generative Models

Date: 1 Jun 2022

Field: computer vision

Tasks: image generation

Paper link: arxiv.org/abs/2206.00…

Code: github.com/crowsonkb/k…

Authors: Tero Karras, Miika Aittala, Timo Aila, Samuli Laine

Summary: We argue that the theory and practice of diffusion-based generative models are currently unnecessarily convoluted and seek to remedy the situation by presenting a design space that clearly separates the concrete design choices.

Abstract: We argue that the theory and practice of diffusion-based generative models are currently unnecessarily convoluted and seek to remedy the situation by presenting a design space that clearly separates the concrete design choices. This lets us identify several changes to both the sampling and training processes, as well as preconditioning of the score networks. Together, our improvements yield new state-of-the-art FID of 1.79 for CIFAR-10 in a class-conditional setting and 1.97 in an unconditional setting, with much faster sampling (35 network evaluations per image) than prior designs. To further demonstrate their modular nature, we show that our design changes dramatically improve both the efficiency and quality obtainable with pre-trained score networks from previous work, including improving the FID of an existing ImageNet-64 model from 2.07 to near-SOTA 1.55.
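One concrete design choice the paper isolates is the sampling noise schedule. Using the paper's reported defaults (σ_min = 0.002, σ_max = 80, ρ = 7, and n = 35 matching the "35 network evaluations" above), noise levels are spaced uniformly in σ^(1/ρ); a sketch:

```python
import numpy as np

def karras_sigmas(n=35, sigma_min=0.002, sigma_max=80.0, rho=7.0):
    """Noise levels spaced uniformly in sigma^(1/rho), decreasing, ending at 0."""
    ramp = np.linspace(0.0, 1.0, n)
    lo, hi = sigma_min ** (1 / rho), sigma_max ** (1 / rho)
    sigmas = (hi + ramp * (lo - hi)) ** rho
    return np.append(sigmas, 0.0)     # sampling finishes exactly at sigma = 0

s = karras_sigmas()
print(len(s), bool(np.all(np.diff(s) < 0)))   # 36 True
```

Large ρ concentrates steps near σ_min, spending most of the sampling budget where fine detail is resolved; the linked k-diffusion repo ships an equivalent schedule (as `get_sigmas_karras`, to the best of my knowledge).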


Paper: TAVA: Template-free Animatable Volumetric Actors

Title: TAVA: Template-free Animatable Volumetric Actors

Date: 17 Jun 2022

Field: computer vision

Paper link: arxiv.org/abs/2206.08…

Code: github.com/facebookres…

Authors: Ruilong Li, Julian Tanke, Minh Vo, Michael Zollhöfer, Jürgen Gall, Angjoo Kanazawa, Christoph Lassner

Summary: Since TAVA does not require a body template, it is applicable to humans as well as other creatures such as animals.

Abstract: Coordinate-based volumetric representations have the potential to generate photo-realistic virtual avatars from images. However, virtual avatars also need to be controllable even to a novel pose that may not have been observed. Traditional techniques, such as LBS, provide such a function; yet it usually requires a hand-designed body template, 3D scan data, and limited appearance models. On the other hand, neural representation has been shown to be powerful in representing visual details, but are under explored on deforming dynamic articulated actors. In this paper, we propose TAVA, a method to create Template-free Animatable Volumetric Actors, based on neural representations. We rely solely on multi-view data and a tracked skeleton to create a volumetric model of an actor, which can be animated at the test time given novel pose. Since TAVA does not require a body template, it is applicable to humans as well as other creatures such as animals. Furthermore, TAVA is designed such that it can recover accurate dense correspondences, making it amenable to content-creation and editing tasks. Through extensive experiments, we demonstrate that the proposed method generalizes well to novel poses as well as unseen views and showcase basic editing capabilities.


We are ShowMeAI, dedicated to spreading quality AI content, sharing industry solutions, and using knowledge to accelerate every step of technical growth! Click 历史文章列表 (article archive) to browse past issues, subscribe to the topic #ShowMeAI资讯日报 in the official account to receive daily updates, and click 专题合辑&电子月刊 (topic collections & e-monthly) to browse full collections by topic.


  • Author: 韩信子@ShowMeAI
  • 历史文章列表 (article archive)
  • 专题合辑&电子月刊 (topic collections & e-monthly)
  • Notice: all rights reserved; to repost, please contact the platform and the author and credit the source
  • Comments and likes are welcome; recommend valuable articles, tools, or suggestions in the comments and we will reply as soon as we can~