
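The ShowMeAI daily digest series has been fully upgraded! It covers AI tools & frameworks | projects & code | blog posts & sharing | data & resources | research & papers. See the list of past articles, and subscribe to the topic #ShowMeAI资讯日报 in the official account to receive each day's latest push. The topic collections & monthly e-magazine offer a quick way to browse the full set for each topic.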

1. Tools & Frameworks

Library: PyTorch Lightning + Hydra deep learning project template

tags: [pytorch, lightning, deep learning, template]

‘PyTorch Lightning + Hydra Template – PyTorch Lightning + Hydra + Optuna + Weights&Biases. A very general, feature-rich template for rapid and scalable Machine Learning experimentation process’ by Łukasz Zalewski

GitHub: github.com/hobogalaxy/…

Tool: MockingBird, AI voice cloning for Chinese: clone your voice within 5 seconds and generate arbitrary speech

tags: [speech, voice cloning]

‘Clone a voice in 5 seconds to generate arbitrary speech in real-time’ by Vega

GitHub: github.com/babysor/Moc…

Platform: square-core, an online platform for question answering research

tags: [question answering systems, online platform]

‘square-core – SQuARE: Software for question answering research.’ by UKP-SQuARE

GitHub: github.com/UKP-SQuARE/…

Library: PosePipe, an open-source human pose estimation pipeline for clinical research

tags: [pose estimation, pose detection]

‘PosePipe: Open-Source Human Pose Estimation Pipeline for Clinical Research’ by peabody124

GitHub: github.com/peabody124/…

Tool: PicoShare, a minimalist, easy-to-host image/file sharing service

tags: [file hosting]

‘PicoShare – A minimalist, easy-to-host service for sharing images and other files’ by Michael Lynch

GitHub: github.com/mtlynch/pic…

2. Blog Posts & Sharing

Sharing: Algorithms and data structures, illustrated

tags: [data structures, illustrated]

GitHub: github.com/girliemac/a…

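Sharing: wtfpython, some interesting and little-known Python features

tags: [python, features]

This project collects Python examples that are hard to understand or counterintuitive, along with little-known features, and tries to discuss the actual principles behind them!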

GitHub: github.com/satwikkansa…
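To give a flavour of the kind of behaviour such a collection discusses, here is a small sketch of two well-known Python surprises. The snippets are illustrative and are not claimed to be copied from the wtfpython repository; the function names are made up for this example.

```python
# Illustrative only: two classic Python surprises, with hypothetical names.

def append_item(item, bucket=[]):
    # The default list is created once, when the function is defined,
    # so every call that omits `bucket` shares the same list object.
    bucket.append(item)
    return bucket

first = append_item(1)
second = append_item(2)
shared = first is second          # both names refer to the one default list

def append_item_safe(item, bucket=None):
    # Using None as a sentinel gives each call a fresh list.
    if bucket is None:
        bucket = []
    bucket.append(item)
    return bucket

# Closures look up `i` when they are called, not when they are created,
# so all three lambdas see the final value of the loop variable.
late = [f() for f in [lambda: i for i in range(3)]]

# Binding the current value as a default argument freezes it per lambda.
early = [f() for f in [lambda i=i: i for i in range(3)]]

print(second, shared, late, early)
# prints: [1, 2] True [2, 2, 2] [0, 1, 2]
```

Both effects follow from when Python evaluates things: default values at definition time, closure variables at call time.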

3. Data & Resources

Book: Practical Python Projects

tags: [python, projects]

Link: practicalpython.yasoob.me/

Book: Chinese translation of Spark: The Definitive Guide

tags: [spark, big data, guide]

Link: snaildove.github.io/2020/02/10/…

4. Research & Papers

Reply with the keyword 日报 in the official account to get the curated collection of June papers for free.

Paper: LAVENDER: Unifying Video-Language Understanding as Masked Language Modeling

Date: 14 Jun 2022

Field: Natural language processing

Tasks: Language Modelling, Masked Language Modeling, Question Answering, Text to Video Retrieval, Video Captioning, Video Question Answering, Video Retrieval

Paper link: arxiv.org/abs/2206.07…

Code: github.com/microsoft/l…

Authors: Linjie Li, Zhe Gan, Kevin Lin, Chung-Ching Lin, Zicheng Liu, Ce Liu, Lijuan Wang

Summary: In this work, we explore a unified VidL framework LAVENDER, where Masked Language Modeling (MLM) is used as the common interface for all pre-training and downstream tasks.

Abstract: Unified vision-language frameworks have greatly advanced in recent years, most of which adopt an encoder-decoder architecture to unify image-text tasks as sequence-to-sequence generation. However, existing video-language (VidL) models still require task-specific designs in model architecture and training objectives for each task. In this work, we explore a unified VidL framework LAVENDER, where Masked Language Modeling (MLM) is used as the common interface for all pre-training and downstream tasks. Such unification leads to a simplified model architecture, where only a lightweight MLM head, instead of a decoder with much more parameters, is needed on top of the multimodal encoder. Surprisingly, experimental results show that this unified framework achieves competitive performance on 14 VidL benchmarks, covering video question answering, text-to-video retrieval and video captioning. Extensive analyses further demonstrate the advantage of LAVENDER over existing VidL methods in: (i) supporting all downstream tasks with just a single set of parameter values when multi-task finetuned; (ii) few-shot generalization on various downstream tasks; and (iii) enabling zero-shot evaluation on video question answering tasks. Code is available at github.com/microsoft/L….
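The abstract does not spell out how each task is phrased as masked language modeling, so the sketch below only illustrates the general idea of a shared interface: every task is turned into a token sequence with a `[MASK]` slot that one MLM head would fill. The templates, the function name `as_mlm_input`, and the token strings are assumptions made for illustration and are not taken from the LAVENDER code base.

```python
# Hypothetical sketch: templates and names are assumptions for illustration.
MASK = "[MASK]"

def as_mlm_input(task, text="", caption_prefix=()):
    """Return the token list a shared MLM head would be asked to complete."""
    if task == "video_qa":
        # The answer is predicted at a single masked position after the question.
        return text.split() + [MASK]
    if task == "retrieval":
        # Matching is phrased as predicting a word such as "true" or "false".
        return text.split() + [MASK]
    if task == "captioning":
        # Captions are produced one token at a time by masking the next slot.
        return list(caption_prefix) + [MASK]
    raise ValueError(f"unknown task: {task}")

qa_tokens = as_mlm_input("video_qa", "what is the man holding ?")
ret_tokens = as_mlm_input("retrieval", "a dog catches a frisbee")
cap_tokens = as_mlm_input("captioning", caption_prefix=("a", "dog"))
```

Because every task ends in the same "fill the mask" step, a single lightweight prediction head can in principle serve all of them, which is the simplification the abstract describes.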

Paper: OntoMerger: An Ontology Integration Library for Deduplicating and Connecting Knowledge Graph Nodes

Date: 5 Jun 2022

Field: Knowledge base

Tasks: Knowledge Graphs

Paper link: arxiv.org/abs/2206.02…

Code: github.com/astrazeneca…

Authors: David Geleta, Andriy Nikolov, Mark ODonoghue, Benedek Rozemberczki, Anna Gogleva, Valentina Tamma, Terry R. Payne

Summary: Duplication of nodes is a common problem encountered when building knowledge graphs (KGs) from heterogeneous datasets, where it is crucial to be able to merge nodes having the same meaning.

Abstract: Duplication of nodes is a common problem encountered when building knowledge graphs (KGs) from heterogeneous datasets, where it is crucial to be able to merge nodes having the same meaning. OntoMerger is a Python ontology integration library whose functionality is to deduplicate KG nodes. Our approach takes a set of KG nodes, mappings and disconnected hierarchies and generates a set of merged nodes together with a connected hierarchy. In addition, the library provides analytic and data testing functionalities that can be used to fine-tune the inputs, further reducing duplication, and to increase connectivity of the output graph. OntoMerger can be applied to a wide variety of ontologies and KGs. In this paper we introduce OntoMerger and illustrate its functionality on a real-world biomedical KG.
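The input/output contract described in the abstract (nodes, mappings and disconnected hierarchies in; merged nodes and a connected hierarchy out) can be illustrated with a small union-find sketch. This is not OntoMerger's API or algorithm: the function names, the node IDs, and the rule of keeping the lexicographically smallest ID as canonical are all assumptions for illustration.

```python
# Hypothetical sketch of mapping-based node deduplication with union-find.
# Node IDs, mappings, and function names are invented for illustration.

def merge_nodes(nodes, mappings):
    """Group nodes linked by equivalence mappings; return node -> canonical ID."""
    parent = {n: n for n in nodes}

    def find(n):
        while parent[n] != n:
            parent[n] = parent[parent[n]]  # path halving keeps lookups fast
            n = parent[n]
        return n

    for a, b in mappings:
        ra, rb = find(a), find(b)
        if ra != rb:
            # Keep the lexicographically smaller ID as the canonical node.
            lo, hi = sorted((ra, rb))
            parent[hi] = lo
    return {n: find(n) for n in nodes}

def rewrite_hierarchy(edges, canonical):
    """Re-point child -> parent edges at canonical nodes, dropping self-loops."""
    rewritten = {(canonical[c], canonical[p]) for c, p in edges}
    return sorted(e for e in rewritten if e[0] != e[1])

nodes = ["DOID:1", "MONDO:1", "MESH:1", "MONDO:2", "DOID:2"]
mappings = [("DOID:1", "MONDO:1"), ("MONDO:1", "MESH:1")]
edges = [("MONDO:1", "MONDO:2"), ("DOID:1", "DOID:2")]

canonical = merge_nodes(nodes, mappings)
merged_edges = rewrite_hierarchy(edges, canonical)
```

In this toy input the two hierarchies were disconnected; after merging, both edges start from the same canonical node, so the fragments become one connected hierarchy.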

Paper: Recipe for a General, Powerful, Scalable Graph Transformer

Date: 25 May 2022

Field: Graph algorithms

Tasks: Graph Classification, Graph Property Prediction, Graph Regression, Graph Representation Learning, Node Classification, Representation Learning

Paper link: arxiv.org/abs/2205.12…

Code: github.com/rampasek/Gr…

Authors: Ladislav Rampášek, Mikhail Galkin, Vijay Prakash Dwivedi, Anh Tuan Luu, Guy Wolf, Dominique Beaini

Summary: We propose a recipe on how to build a general, powerful, scalable (GPS) graph Transformer with linear complexity and state-of-the-art results on a diverse set of benchmarks.

Abstract: We propose a recipe on how to build a general, powerful, scalable (GPS) graph Transformer with linear complexity and state-of-the-art results on a diverse set of benchmarks. Graph Transformers (GTs) have gained popularity in the field of graph representation learning with a variety of recent publications but they lack a common foundation about what constitutes a good positional or structural encoding, and what differentiates them. In this paper, we summarize the different types of encodings with a clearer definition and categorize them as being local, global or relative. Further, GTs remain constrained to small graphs with few hundred nodes, and we propose the first architecture with a complexity linear to the number of nodes and edges O(N+E) by decoupling the local real-edge aggregation from the fully-connected Transformer. We argue that this decoupling does not negatively affect the expressivity, with our architecture being a universal function approximator for graphs. Our GPS recipe consists of choosing 3 main ingredients: (i) positional/structural encoding, (ii) local message-passing mechanism, and (iii) global attention mechanism. We build and open-source a modular framework GraphGPS that supports multiple types of encodings and that provides efficiency and scalability both in small and large graphs. We test our architecture on 11 benchmarks and show very competitive results on all of them, show-casing the empirical benefits gained by the modularity and the combination of different strategies.
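The decoupling described in the abstract can be sketched as a layer with two branches: a local branch that touches each edge once, and a global branch whose cost is linear in the number of nodes. The sketch below is a toy with no learned weights. Mean aggregation stands in for the message-passing layer and a simple kernelized linear attention stands in for the global attention module; neither choice, nor the function names, is claimed to match GraphGPS's actual layers.

```python
# Toy sketch of a "local + global" layer; all names are hypothetical.
import math

def local_mean(x, edges):
    """O(E) neighbour averaging over a directed edge list (src -> dst)."""
    n, d = len(x), len(x[0])
    sums = [[0.0] * d for _ in range(n)]
    deg = [0] * n
    for src, dst in edges:
        deg[dst] += 1
        for k in range(d):
            sums[dst][k] += x[src][k]
    return [[s / deg[i] if deg[i] else 0.0 for s in sums[i]] for i in range(n)]

def phi(v):
    """Positive feature map elu(v) + 1 used by kernelized attention."""
    return [t + 1.0 if t > 0 else math.exp(t) for t in v]

def linear_attention(x):
    """Global attention in O(N * d^2): keys and values are summarised once."""
    d = len(x[0])
    feats = [phi(row) for row in x]           # queries = keys = phi(x) here
    kv = [[0.0] * d for _ in range(d)]        # sum_j phi(k_j) v_j^T
    ksum = [0.0] * d                          # sum_j phi(k_j)
    for f, v in zip(feats, x):
        for a in range(d):
            ksum[a] += f[a]
            for b in range(d):
                kv[a][b] += f[a] * v[b]
    out = []
    for q in feats:
        norm = sum(q[a] * ksum[a] for a in range(d))
        out.append([sum(q[a] * kv[a][b] for a in range(d)) / norm for b in range(d)])
    return out

def gps_like_layer(x, edges):
    """Residual sum of the input, the local branch and the global branch."""
    loc, glob = local_mean(x, edges), linear_attention(x)
    return [[xi + li + gi for xi, li, gi in zip(*rows)] for rows in zip(x, loc, glob)]

x = [[1.0, 0.0], [3.0, 2.0], [5.0, 4.0]]
edges = [(0, 1), (2, 1), (1, 0)]
out = gps_like_layer(x, edges)
```

The local branch loops over edges and the global branch loops over nodes, so for a fixed feature width the total work grows as O(N + E) rather than the O(N^2) of full pairwise attention.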

Paper: LiDAR Snowfall Simulation for Robust 3D Object Detection

Date: CVPR 2022

Field: Computer vision

Tasks: 3D Object Detection, Autonomous Driving, Object Detection, Physical Simulations

Paper link: arxiv.org/abs/2203.15…

Code: github.com/syscv/lidar…

Authors: Martin Hahner, Christos Sakaridis, Mario Bijelic, Felix Heide, Fisher Yu, Dengxin Dai, Luc van Gool

Summary: Due to the difficulty of collecting and annotating training data in this setting, we propose a physically based method to simulate the effect of snowfall on real clear-weather LiDAR point clouds.

Abstract: 3D object detection is a central task for applications such as autonomous driving, in which the system needs to localize and classify surrounding traffic agents, even in the presence of adverse weather. In this paper, we address the problem of LiDAR-based 3D object detection under snowfall. Due to the difficulty of collecting and annotating training data in this setting, we propose a physically based method to simulate the effect of snowfall on real clear-weather LiDAR point clouds. Our method samples snow particles in 2D space for each LiDAR line and uses the induced geometry to modify the measurement for each LiDAR beam accordingly. Moreover, as snowfall often causes wetness on the ground, we also simulate ground wetness on LiDAR point clouds. We use our simulation to generate partially synthetic snowy LiDAR data and leverage these data for training 3D object detection models that are robust to snowfall. We conduct an extensive evaluation using several state-of-the-art 3D object detection methods and show that our simulation consistently yields significant performance gains on the real snowy STF dataset compared to clear-weather baselines and competing simulation approaches, while not sacrificing performance in clear weather. Our code is available at www.github.com/SysCV/LiDAR….
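The idea of sampling particles in 2D and using the resulting geometry to change each beam's measurement can be sketched very roughly as follows. This is a purely geometric toy under strong assumptions: particles are discs sampled uniformly in an annulus around the sensor, a beam is an ideal ray, and a hit simply shortens the measured range. The paper's physically based treatment of the received signal and of ground wetness is not reproduced, and all names and parameter values are hypothetical.

```python
# Simplified geometric sketch; names and parameter values are hypothetical.
import math
import random

def sample_particles(count, r_min, r_max, radius, rng):
    """Sample (x, y, radius) snow particles uniformly by area in an annulus."""
    particles = []
    for _ in range(count):
        r = math.sqrt(rng.uniform(r_min ** 2, r_max ** 2))
        theta = rng.uniform(0.0, 2.0 * math.pi)
        particles.append((r * math.cos(theta), r * math.sin(theta), radius))
    return particles

def first_hit(angle, particles):
    """Distance along the ray at `angle` to the nearest intersected particle, or None."""
    dx, dy = math.cos(angle), math.sin(angle)
    best = None
    for px, py, pr in particles:
        along = px * dx + py * dy              # projection onto the ray
        if along <= 0:
            continue                           # particle is behind the sensor
        off = abs(px * dy - py * dx)           # perpendicular distance to the ray
        if off <= pr:
            entry = along - math.sqrt(pr ** 2 - off ** 2)
            if best is None or entry < best:
                best = entry
    return best

def snowy_range(clear_range, angle, particles):
    """Return the particle distance if it occludes the original return."""
    hit = first_hit(angle, particles)
    return hit if hit is not None and hit < clear_range else clear_range

rng = random.Random(0)
particles = sample_particles(200, 1.0, 30.0, 0.05, rng)
ranges = [snowy_range(25.0, math.radians(a), particles) for a in range(360)]
```

Applying `snowy_range` to every beam of a clear-weather scan yields a partially synthetic scan in which some returns come from snow instead of the scene, which is the kind of data the abstract describes using for training.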

Paper: SelfRecon: Self Reconstruction Your Digital Avatar from Monocular Video

Date: CVPR 2022

Field: Computer vision

Tasks: Neural Rendering

Paper link: arxiv.org/abs/2201.12…

Code: github.com/jby1993/sel…

Authors: Boyi Jiang, Yang Hong, Hujun Bao, Juyong Zhang

Summary: Meanwhile, the explicit mesh is updated periodically to adjust its topology changes, and a consistency loss is designed to match both representations.

Abstract: We propose SelfRecon, a clothed human body reconstruction method that combines implicit and explicit representations to recover space-time coherent geometries from a monocular self-rotating human video. Explicit methods require a predefined template mesh for a given sequence, while the template is hard to acquire for a specific subject. Meanwhile, the fixed topology limits the reconstruction accuracy and clothing types. Implicit representation supports arbitrary topology and can represent high-fidelity geometry shapes due to its continuous nature. However, it is difficult to integrate multi-frame information to produce a consistent registration sequence for downstream applications. We propose to combine the advantages of both representations. We utilize differential mask loss of the explicit mesh to obtain the coherent overall shape, while the details on the implicit surface are refined with the differentiable neural rendering. Meanwhile, the explicit mesh is updated periodically to adjust its topology changes, and a consistency loss is designed to match both representations. Compared with existing methods, SelfRecon can produce high-fidelity surfaces for arbitrary clothed humans with self-supervised optimization. Extensive experimental results demonstrate its effectiveness on real captured monocular videos. The source code is available at github.com/jby1993/Sel…
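The abstract does not give the form of the consistency loss, so the sketch below shows only one plausible way to measure agreement between an explicit mesh and an implicit surface: evaluate a signed distance function (SDF) at the mesh vertices and penalise nonzero values. A unit-sphere SDF stands in for the learned implicit surface, and the function names and loss form are assumptions, not SelfRecon's formulation.

```python
# Hypothetical sketch: the loss form and all names are assumptions.
import math

def sphere_sdf(point, radius=1.0):
    """Signed distance to a sphere centred at the origin (negative inside)."""
    return math.sqrt(sum(c * c for c in point)) - radius

def consistency_loss(vertices, sdf):
    """Mean absolute SDF value at the explicit mesh vertices.

    The loss is zero exactly when every vertex lies on the implicit surface.
    """
    return sum(abs(sdf(v)) for v in vertices) / len(vertices)

on_surface = [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, -1.0)]
drifted = [(1.2, 0.0, 0.0), (0.0, 0.9, 0.0), (0.0, 0.0, -1.0)]

loss_matched = consistency_loss(on_surface, sphere_sdf)
loss_drifted = consistency_loss(drifted, sphere_sdf)
```

A term of this kind grows as the mesh drifts away from the zero level set, so minimising it pulls the two representations back together.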

Paper: Neural Basis Models for Interpretability

Date: 27 May 2022

Field: Machine learning

Tasks: Additive models, Interpretable Machine Learning

Paper link: arxiv.org/abs/2205.14…

Code: github.com/facebookres…

Authors: Filip Radenovic, Abhimanyu Dubey, Dhruv Mahajan

Summary: However, these models are typically black-box deep neural networks, explained post-hoc via methods with known faithfulness limitations.

Abstract: Due to the widespread use of complex machine learning models in real-world applications, it is becoming critical to explain model predictions. However, these models are typically black-box deep neural networks, explained post-hoc via methods with known faithfulness limitations. Generalized Additive Models (GAMs) are an inherently interpretable class of models that address this limitation by learning a non-linear shape function for each feature separately, followed by a linear model on top. However, these models are typically difficult to train, require numerous parameters, and are difficult to scale. We propose an entirely new subfamily of GAMs that utilizes basis decomposition of shape functions. A small number of basis functions are shared among all features, and are learned jointly for a given task, thus making our model scale much better to large-scale data with high-dimensional features, especially when features are sparse. We propose an architecture denoted as the Neural Basis Model (NBM) which uses a single neural network to learn these bases. On a variety of tabular and image datasets, we demonstrate that for interpretable machine learning, NBMs are the state-of-the-art in accuracy, model size, and throughput and can easily model all higher-order feature interactions. Source code is available at github.com/facebookres…
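The basis decomposition of shape functions can be written as f_i(x_i) = sum_k a[i][k] * h_k(x_i), where the bases h_k are shared by all features and only the coefficients a[i][k] are feature-specific. The sketch below illustrates that structure with fixed analytic bases and hand-picked coefficients; in the paper the bases are learned by a single neural network, so the bases, coefficients, and function names here are assumptions for illustration.

```python
# Minimal sketch of f_i(x_i) = sum_k a[i][k] * h_k(x_i); names are hypothetical.
import math

BASES = [lambda t: t, lambda t: t * t, math.sin]   # shared by every feature

def shape_function(i, x_i, coeffs):
    """Per-feature shape function built from the shared bases."""
    return sum(a * h(x_i) for a, h in zip(coeffs[i], BASES))

def nbm_predict(x, coeffs, bias=0.0):
    """Additive prediction: bias plus one shape-function term per feature."""
    contributions = [shape_function(i, x_i, coeffs) for i, x_i in enumerate(x)]
    return bias + sum(contributions), contributions

coeffs = [
    [2.0, 0.0, 0.0],   # feature 0: f_0(t) = 2t
    [0.0, 1.0, 0.0],   # feature 1: f_1(t) = t^2
    [0.0, 0.0, 3.0],   # feature 2: f_2(t) = 3 sin(t)
]
prediction, parts = nbm_predict([1.0, 2.0, 0.0], coeffs, bias=0.5)
```

The per-feature `parts` are what make the model interpretable: each one is the exact contribution of a single feature to the prediction. Sharing the bases means that adding a feature adds only one row of coefficients instead of a whole new network.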

Paper: NomMer: Nominate Synergistic Context in Vision Transformer for Visual Recognition

Date: CVPR 2022

Field: Computer vision

Tasks: Object Detection, Semantic Segmentation

Paper link: arxiv.org/abs/2111.12…

Code: github.com/tencentyout…

Authors: Hao Liu, Xinghua Jiang, Xin Li, Zhimin Bao, Deqiang Jiang, Bo Ren

Summary: For the sake of trade-off between efficiency and performance, a group of works merely perform SA operation within local patches, whereas the global contextual information is abandoned, which would be indispensable for visual recognition tasks.

Abstract: Recently, Vision Transformers (ViT), with the self-attention (SA) as the de facto ingredients, have demonstrated great potential in the computer vision community. For the sake of trade-off between efficiency and performance, a group of works merely perform SA operation within local patches, whereas the global contextual information is abandoned, which would be indispensable for visual recognition tasks. To solve the issue, the subsequent global-local ViTs take a stab at marrying local SA with global one in parallel or alternative way in the model. Nevertheless, the exhaustively combined local and global context may exist redundancy for various visual data, and the receptive field within each layer is fixed. Alternatively, a more graceful way is that global and local context can adaptively contribute per se to accommodate different visual data. To achieve this goal, we in this paper propose a novel ViT architecture, termed NomMer, which can dynamically Nominate the synergistic global-local context in vision transforMer. By investigating the working pattern of our proposed NomMer, we further explore what context information is focused. Beneficial from this “dynamic nomination” mechanism, without bells and whistles, the NomMer can not only achieve 84.5% Top-1 classification accuracy on ImageNet with only 73M parameters, but also show promising performance on dense prediction tasks, i.e., object detection and semantic segmentation. The code and models will be made publicly available at github.com/TencentYout…

Paper: Distance-Sensitive Offline Reinforcement Learning

Date: 23 May 2022

Field: Reinforcement learning

Tasks: Offline RL, Reinforcement Learning

Paper link: arxiv.org/abs/2205.11…

Code: github.com/Facebear-lj…

Authors: Jianxiong Li, Xianyuan Zhan, Haoran Xu, Xiangyu Zhu, Jingjing Liu, Ya-Qin Zhang

Summary: In offline reinforcement learning (RL), one detrimental issue to policy learning is the error accumulation of deep Q function in out-of-distribution (OOD) areas.

Abstract: In offline reinforcement learning (RL), one detrimental issue to policy learning is the error accumulation of deep Q function in out-of-distribution (OOD) areas. Unfortunately, existing offline RL methods are often over-conservative, inevitably hurting generalization performance outside data distribution. In our study, one interesting observation is that deep Q functions approximate well inside the convex hull of training data. Inspired by this, we propose a new method, DOGE (Distance-sensitive Offline RL with better GEneralization). DOGE marries dataset geometry with deep function approximators in offline RL, and enables exploitation in generalizable OOD areas rather than strictly constraining policy within data distribution. Specifically, DOGE trains a state-conditioned distance function that can be readily plugged into standard actor-critic methods as a policy constraint. Simple yet elegant, our algorithm enjoys better generalization compared to state-of-the-art methods on D4RL benchmarks. Theoretical analysis demonstrates the superiority of our approach to existing methods that are solely based on data distribution or support constraints.
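How a state-conditioned distance function might act as a policy constraint in an actor-critic update can be sketched on one-dimensional toy actions. Everything here is an assumption for illustration: the distance is computed exactly as the gap to the nearest dataset action for the same state (DOGE learns it with a network), the critic is a hand-written function, and a fixed penalty weight stands in for a Lagrange multiplier.

```python
# Hypothetical sketch on 1-D toy actions; data, names, and values are invented.

DATASET = {"s0": [0.1, 0.3], "s1": [-0.5]}      # state -> actions seen in the data

def distance(state, action):
    """Stand-in for the state-conditioned distance function g(s, a)."""
    return min(abs(action - a) for a in DATASET[state])

def q_value(state, action):
    """Toy critic that prefers larger actions."""
    return 2.0 * action

def actor_loss(state, action, threshold=0.2, multiplier=5.0):
    """-Q(s, a) plus a penalty when g(s, a) exceeds the allowed distance."""
    violation = max(0.0, distance(state, action) - threshold)
    return -q_value(state, action) + multiplier * violation

candidates = [i / 10 for i in range(-10, 11)]
best_action = min(candidates, key=lambda a: actor_loss("s0", a))
unconstrained = max(candidates, key=lambda a: q_value("s0", a))
```

In this toy the critic alone would pick the largest action, 1.0, while the constrained loss picks 0.5: slightly outside the data (the nearest dataset action is 0.3) but within the allowed distance. That mirrors the abstract's point about exploiting nearby OOD areas instead of staying strictly inside the data distribution.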

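We are ShowMeAI, dedicated to spreading high-quality AI content and sharing industry solutions, using knowledge to accelerate every step of technical growth!

  • Author: 韩信子@ShowMeAI
  • List of past articles
  • Topic collections & monthly e-magazine
  • Notice: all rights reserved; to repost, please contact the platform and the author and credit the source
  • Replies and likes are welcome. Leave a comment to recommend valuable articles, tools, or suggestions, and we will reply as soon as we can.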