The ShowMeAI Daily series has been fully upgraded! It covers AI across Tools & Frameworks | Projects & Code | Posts & Sharing | Data & Resources | Research & Papers. Click to view the article archive, and subscribe to the topic #ShowMeAI Daily in the official account to receive the latest updates every day. Click Collections & Monthly Digest to quickly browse the full topic collections.

1. Tools & Frameworks

Framework: flair – a simple framework that bundles state-of-the-art NLP techniques (Python)

tags: [NLP techniques, NLP applications]

‘flair – A very simple framework for state-of-the-art NLP’ by Zalando Research

GitHub: github.com/flairNLP/fl…
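
flair's API centers on wrapping text in a Sentence and running a pretrained tagger over it. A minimal sketch of named-entity tagging with the stock English NER model:

```python
# Minimal flair NER sketch: load a pretrained tagger and tag one sentence.
from flair.data import Sentence
from flair.models import SequenceTagger

tagger = SequenceTagger.load("ner")  # downloads the English NER model

sentence = Sentence("George Washington went to Washington.")
tagger.predict(sentence)

# Print the predicted entity spans with their labels and scores.
for entity in sentence.get_spans("ner"):
    print(entity)
```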

Library: cleanlab – a toolkit for automatically finding and fixing label errors in machine learning datasets

tags: [machine learning, dataset errors, error correction]

‘cleanlab – The standard data-centric AI package for data quality and machine learning with messy, real-world data and labels.’

GitHub: github.com/cleanlab/cl…
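
cleanlab works from out-of-sample predicted probabilities (for example, from cross-validation) and flags examples whose given label disagrees with the model's belief. A minimal sketch with cleanlab 2.x:

```python
# Minimal cleanlab sketch: rank examples by how likely their label is wrong.
import numpy as np
from cleanlab.filter import find_label_issues

labels = np.array([0, 1, 1, 0, 2])     # given (possibly noisy) labels
pred_probs = np.array([                # out-of-sample predicted probabilities
    [0.90, 0.05, 0.05],
    [0.10, 0.80, 0.10],
    [0.80, 0.10, 0.10],                # labeled 1 but looks like class 0
    [0.85, 0.10, 0.05],
    [0.05, 0.05, 0.90],
])

issue_indices = find_label_issues(
    labels=labels,
    pred_probs=pred_probs,
    return_indices_ranked_by="self_confidence",
)
print(issue_indices)  # indices of the most suspect examples first
```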

Library: OpenFold – an open-source, trainable PyTorch reproduction of AlphaFold 2

tags: [AlphaFold 2, PyTorch]

‘OpenFold – Trainable PyTorch reproduction of AlphaFold 2’ by AQ Laboratory

GitHub: github.com/aqlaborator…

Library: darts – a Python library for time series processing and forecasting

tags: [time series]

‘darts – A python library for easy manipulation and forecasting of time series.’ by Unit8 SA

GitHub: github.com/unit8co/dar…
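
darts wraps series in a TimeSeries object and gives every model the same fit/predict interface. A minimal sketch on the bundled AirPassengers dataset:

```python
# Minimal darts sketch: fit a classical model and forecast a held-out horizon.
from darts.datasets import AirPassengersDataset
from darts.models import ExponentialSmoothing

series = AirPassengersDataset().load()      # a darts TimeSeries
train, val = series[:-36], series[-36:]     # hold out the last 36 months

model = ExponentialSmoothing()
model.fit(train)
forecast = model.predict(len(val))          # TimeSeries over the same horizon

print(forecast.values()[:5])
```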

Library: RapidOCR – a cross-platform OCR library based on PaddleOCR & OnnxRuntime

tags: [OCR, cross-platform]

‘RapidOCR (捷智OCR) – A cross platform OCR Library based on PaddleOCR & OnnxRuntime’

GitHub: github.com/RapidAI/Rap…
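
A hedged usage sketch, assuming the rapidocr_onnxruntime package layout (the package name and the result tuple format may differ between RapidOCR versions):

```python
# Hedged RapidOCR sketch: run OCR on one image and print recognized lines.
from rapidocr_onnxruntime import RapidOCR

engine = RapidOCR()
result, elapse = engine("test.png")  # result: per-line (box, text, score)

if result:
    for box, text, score in result:
        print(text, score)
```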

2. Posts & Sharing

Sharing: a PhD application guide

‘Tutorial on PhD Application’ by Lijin Zhang

GitHub: github.com/zhanglj37/T…

Course: a Go language course covering fundamentals through advanced topics

‘Go Course – Master the fundamentals and advanced features of the Go programming language’ by Karan Pratap Singh

GitHub: github.com/karanpratap…

3. Data & Resources

Resource list: a comprehensive list of AI-for-time-series resources

‘AI for Time Series (AI4TS) Papers, Tutorials, and Surveys – A professional list of Papers, Tutorials, and Surveys on AI for Time Series in top AI conferences and journals.’ by Qingsong Wen

GitHub: github.com/qingsongedu…

Resource list: a roundup of weakly supervised semantic segmentation papers

‘Awesome Weakly Supervised Semantic Segmentation – A comprehensive list of weakly supervised semantic segmentation (WSSS) works from 2014 to 2022.’ by Xiaojian Zhong

GitHub: github.com/xiaojianzho…

4. Research & Papers

Reply with the keyword 日报 ("daily") in the official-account backend to get the curated collection of June papers for free.

Paper: POGEMA: Partially Observable Grid Environment for Multiple Agents

Title: POGEMA: Partially Observable Grid Environment for Multiple Agents

Date: 22 Jun 2022

Paper link: arxiv.org/abs/2206.10…

Code: github.com/airi-instit…

Authors: Alexey Skrynnik, Anton Andreychuk, Konstantin Yakovlev, Aleksandr I. Panov

Summary: We introduce POGEMA (https://github.com/AIRI-Institute/pogema), a sandbox for challenging partially observable multi-agent pathfinding (PO-MAPF) problems.

Abstract: We introduce POGEMA (github.com/AIRI-Instit…), a sandbox for challenging partially observable multi-agent pathfinding (PO-MAPF) problems. This is a grid-based environment that was specifically designed to be a flexible, tunable and scalable benchmark. It can be tailored to a variety of PO-MAPF, which can serve as an excellent testing ground for planning and learning methods, and their combination, which will allow us to move towards filling the gap between AI planning and learning.

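POGEMA exposes a gym-style interface, so a random rollout takes only a few lines. A hedged sketch assuming the GridConfig fields and the "Pogema-v0" environment id from the project README (exact names may vary by version):

```python
# Hedged POGEMA sketch: random actions in a partially observable grid.
import gym
from pogema import GridConfig

config = GridConfig(num_agents=4, size=8, density=0.3, seed=1, obs_radius=5)
env = gym.make("Pogema-v0", grid_config=config)

obs = env.reset()
done = [False] * config.num_agents
while not all(done):
    # One action per agent; each agent only sees its local observation window.
    actions = [env.action_space.sample() for _ in range(config.num_agents)]
    obs, reward, done, info = env.step(actions)
```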

Paper: Plotly-Resampler: Effective Visual Analytics for Large Time Series

Title: Plotly-Resampler: Effective Visual Analytics for Large Time Series

Date: 17 Jun 2022

Field: time series

Tasks: Data Visualization, Time Series, Time Series Analysis

Paper link: arxiv.org/abs/2206.08…

Code: github.com/predict-idl…

Authors: Jonas Van Der Donckt, Jeroen Van Der Donckt, Emiel Deprost, Sofie Van Hoecke

Summary: We observe that open source Python visualization toolkits empower data scientists in most visual analytics tasks, but lack the combination of scalability and interactivity to realize effective time series visualization.

Abstract: Visual analytics is arguably the most important step in getting acquainted with your data. This is especially the case for time series, as this data type is hard to describe and cannot be fully understood when using for example summary statistics. To realize effective time series visualization, four requirements have to be met; a tool should be (1) interactive, (2) scalable to millions of data points, (3) integrable in conventional data science environments, and (4) highly configurable. We observe that open source Python visualization toolkits empower data scientists in most visual analytics tasks, but lack the combination of scalability and interactivity to realize effective time series visualization. As a means to facilitate these requirements, we created Plotly-Resampler, an open source Python library. Plotly-Resampler is an add-on for Plotly’s Python bindings, enhancing line chart scalability on top of an interactive toolkit by aggregating the underlying data depending on the current graph view. Plotly-Resampler is built to be snappy, as the reactivity of a tool qualitatively affects how analysts visually explore and analyze data. A benchmark task highlights how our toolkit scales better than alternatives in terms of number of samples and time series. Additionally, Plotly-Resampler’s flexible data aggregation functionality paves the path towards researching novel aggregation techniques. Plotly-Resampler’s integrability, together with its configurability, convenience, and high scalability, allows to effectively analyze high-frequency data in your day-to-day Python environment.

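The library wraps an ordinary Plotly figure and re-aggregates the underlying data whenever the view changes. A minimal sketch following the README's FigureResampler usage:

```python
# Minimal Plotly-Resampler sketch: a 2M-point line chart that stays snappy.
import numpy as np
import plotly.graph_objects as go
from plotly_resampler import FigureResampler

x = np.arange(2_000_000)
y = np.sin(x / 300) + np.random.randn(len(x)) / 10

fig = FigureResampler(go.Figure())
# Pass the high-frequency data via hf_x/hf_y so the wrapper manages it.
fig.add_trace(go.Scattergl(name="noisy sine"), hf_x=x, hf_y=y)
fig.show_dash(mode="inline")  # re-aggregates on every zoom/pan event
```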

Paper: Sequencer: Deep LSTM for Image Classification

Title: Sequencer: Deep LSTM for Image Classification

Date: 4 May 2022

Field: computer vision

Tasks: Classification, Domain Generalization, Image Classification, Inductive Bias, Natural Language Processing

Paper link: arxiv.org/abs/2205.01…

Code: github.com/rwightman/p… , github.com/okojoalg/se… , github.com/timeseriesA… , github.com/liuruiyang9…

Authors: Yuki Tatsunami, Masato Taki

Summary: Here we propose Sequencer, a novel and competitive architecture alternative to ViT that provides a new perspective on these issues.

Abstract: In recent computer vision research, the advent of the Vision Transformer (ViT) has rapidly revolutionized various architectural design efforts: ViT achieved state-of-the-art image classification performance using self-attention found in natural language processing, and MLP-Mixer achieved competitive performance using simple multi-layer perceptrons. In contrast, several studies have also suggested that carefully redesigned convolutional neural networks (CNNs) can achieve advanced performance comparable to ViT without resorting to these new ideas. Against this background, there is growing interest in what inductive bias is suitable for computer vision. Here we propose Sequencer, a novel and competitive architecture alternative to ViT that provides a new perspective on these issues. Unlike ViTs, Sequencer models long-range dependencies using LSTMs rather than self-attention layers. We also propose a two-dimensional version of Sequencer module, where an LSTM is decomposed into vertical and horizontal LSTMs to enhance performance. Despite its simplicity, several experiments demonstrate that Sequencer performs impressively well: Sequencer2D-L, with 54M parameters, realizes 84.6% top-1 accuracy on only ImageNet-1K. Not only that, we show that it has good transferability and the robust resolution adaptability on double resolution-band.

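The core operation is easy to sketch: spatial mixing is done not by self-attention but by bidirectional LSTMs run over each column (vertical) and each row (horizontal) of the token grid. A hedged PyTorch sketch of that idea, not the authors' implementation (the fusion layer and sizes are illustrative):

```python
# Hedged sketch of a 2D Sequencer-style block: vertical + horizontal BiLSTMs.
import torch
import torch.nn as nn

class BiLSTM2D(nn.Module):
    def __init__(self, dim: int, hidden: int):
        super().__init__()
        self.v_lstm = nn.LSTM(dim, hidden, batch_first=True, bidirectional=True)
        self.h_lstm = nn.LSTM(dim, hidden, batch_first=True, bidirectional=True)
        self.fuse = nn.Linear(4 * hidden, dim)  # concat both directions' outputs

    def forward(self, x: torch.Tensor) -> torch.Tensor:  # x: (B, H, W, C)
        B, H, W, C = x.shape
        # Vertical pass: every column is a sequence of length H.
        v = x.permute(0, 2, 1, 3).reshape(B * W, H, C)
        v, _ = self.v_lstm(v)
        v = v.reshape(B, W, H, -1).permute(0, 2, 1, 3)    # (B, H, W, 2*hidden)
        # Horizontal pass: every row is a sequence of length W.
        h = x.reshape(B * H, W, C)
        h, _ = self.h_lstm(h)
        h = h.reshape(B, H, W, -1)                        # (B, H, W, 2*hidden)
        return self.fuse(torch.cat([v, h], dim=-1))       # back to (B, H, W, C)

y = BiLSTM2D(dim=192, hidden=48)(torch.randn(2, 14, 14, 192))
print(y.shape)  # torch.Size([2, 14, 14, 192])
```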

Paper: Re-parameterizing Your Optimizers rather than Architectures

Title: Re-parameterizing Your Optimizers rather than Architectures

Date: 30 May 2022

Field: machine learning

Tasks: optimization algorithms

Paper link: arxiv.org/abs/2205.15…

Code: github.com/dingxiaoh/r…

Authors: Xiaohan Ding, Honghao Chen, Xiangyu Zhang, Kaiqi Huang, Jungong Han, Guiguang Ding

Summary: In this paper, we propose a novel paradigm of incorporating model-specific prior knowledge into optimizers and using them to train generic (simple) models.

Abstract: The well-designed structures in neural networks reflect the prior knowledge incorporated into the models. However, though different models have various priors, we are used to training them with model-agnostic optimizers (e.g., SGD). In this paper, we propose a novel paradigm of incorporating model-specific prior knowledge into optimizers and using them to train generic (simple) models. As an implementation, we propose a novel methodology to add prior knowledge by modifying the gradients according to a set of model-specific hyper-parameters, which is referred to as Gradient Re-parameterization, and the optimizers are named RepOptimizers. For the extreme simplicity of model structure, we focus on a VGG-style plain model and showcase that such a simple model trained with a RepOptimizer, which is referred to as RepOpt-VGG, performs on par with the recent well-designed models. From a practical perspective, RepOpt-VGG is a favorable base model because of its simple structure, high inference speed and training efficiency. Compared to Structural Re-parameterization, which adds priors into models via constructing extra training-time structures, RepOptimizers require no extra forward/backward computations and solve the problem of quantization. The code and models are publicly available at github.com/dingxiaoh/r…

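Gradient Re-parameterization changes the gradients, not the architecture. As a hedged illustration of the general idea (not the released RepOptimizer), here is an SGD variant that rescales each parameter's gradient by a fixed, prior-derived multiplier; the scale values below are hypothetical:

```python
# Hedged sketch: SGD whose gradients carry a fixed, model-specific prior.
import torch

class GradScaledSGD(torch.optim.SGD):
    """SGD that multiplies each parameter's gradient by a constant scale."""

    def __init__(self, params_and_scales, lr=0.1, momentum=0.9):
        params = [p for p, _ in params_and_scales]
        super().__init__(params, lr=lr, momentum=momentum)
        self._scales = {id(p): s for p, s in params_and_scales}

    @torch.no_grad()
    def step(self, closure=None):
        for group in self.param_groups:
            for p in group["params"]:
                if p.grad is not None:
                    p.grad.mul_(self._scales[id(p)])  # inject the prior here
        return super().step(closure)

# Hypothetical usage: weight one tensor's updates more heavily than another's.
w = torch.nn.Parameter(torch.randn(8, 8))
b = torch.nn.Parameter(torch.zeros(8))
opt = GradScaledSGD([(w, 2.0), (b, 1.0)], lr=0.01)
((w.sum() + b.sum()) ** 2).backward()
opt.step()
```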

Paper: WALT: Watch and Learn 2D Amodal Representation From Time-Lapse Imagery

Title: WALT: Watch and Learn 2D Amodal Representation From Time-Lapse Imagery

Venue: CVPR 2022

Field: computer vision

Tasks: Amodal Instance Segmentation, Object Detection

Paper link: openaccess.thecvf.com/content/CVP…

Code: github.com/dineshreddy…

Authors: N. Dinesh Reddy, Robert Tamburo, Srinivasa G. Narasimhan

Summary: Labeled real data of occlusions is scarce (even in large datasets) and synthetic data leaves a domain gap, making it hard to explicitly model and learn occlusions.

Abstract: Current methods for object detection, segmentation, and tracking fail in the presence of severe occlusions in busy urban environments. Labeled real data of occlusions is scarce (even in large datasets) and synthetic data leaves a domain gap, making it hard to explicitly model and learn occlusions. In this work, we present the best of both the real and synthetic worlds for automatic occlusion supervision using a large readily available source of data: time-lapse imagery from stationary webcams observing street intersections over weeks, months, or even years. We introduce a new dataset, Watch and Learn Time-lapse (WALT), consisting of 12 (4K and 1080p) cameras capturing urban environments over a year. We exploit this real data in a novel way to automatically mine a large set of unoccluded objects and then composite them in the same views to generate occlusions. This longitudinal self-supervision is strong enough for an amodal network to learn object-occluder-occluded layer representations. We show how to speed up the discovery of unoccluded objects and relate the confidence in this discovery to the rate and accuracy of training occluded objects. After watching and automatically learning for several days, this approach shows significant performance improvement in detecting and segmenting occluded people and vehicles, over human-supervised amodal approaches.

Paper: AiTLAS: Artificial Intelligence Toolbox for Earth Observation

Title: AiTLAS: Artificial Intelligence Toolbox for Earth Observation

Date: 21 Jan 2022

Field: computer vision

Tasks: Semantic Segmentation, Type Prediction

Paper link: arxiv.org/abs/2201.08…

Code: github.com/biasvarianc…

Authors: Ivica Dimitrovski, Ivan Kitanovski, Panče Panov, Nikola Simidjievski, Dragi Kocev

Summary: The AiTLAS toolbox (Artificial Intelligence Toolbox for Earth Observation) includes state-of-the-art machine learning methods for exploratory and predictive analysis of satellite imagery as well as a repository of AI-ready Earth Observation (EO) datasets.

Abstract: The AiTLAS toolbox (Artificial Intelligence Toolbox for Earth Observation) includes state-of-the-art machine learning methods for exploratory and predictive analysis of satellite imagery as well as a repository of AI-ready Earth Observation (EO) datasets. It can be easily applied for a variety of Earth Observation tasks, such as land use and cover classification, crop type prediction, localization of specific objects (semantic segmentation), etc. The main goal of AiTLAS is to facilitate better usability and adoption of novel AI methods (and models) by EO experts, while offering easy access and standardized format of EO datasets to AI experts which further allows benchmarking of various existing and novel AI methods tailored for EO data.

Paper: MST++: Multi-stage Spectral-wise Transformer for Efficient Spectral Reconstruction

Title: MST++: Multi-stage Spectral-wise Transformer for Efficient Spectral Reconstruction

Date: 17 Apr 2022

Field: computer vision

Tasks: Image Restoration, Spectral Reconstruction, Spectral Super-Resolution

Paper link: arxiv.org/abs/2204.07…

Code: github.com/caiyuanhao1…

Authors: Yuanhao Cai, Jing Lin, Zudi Lin, Haoqian Wang, Yulun Zhang, Hanspeter Pfister, Radu Timofte, Luc van Gool

Summary: Existing leading methods for spectral reconstruction (SR) focus on designing deeper or wider convolutional neural networks (CNNs) to learn the end-to-end mapping from the RGB image to its hyperspectral image (HSI).

Abstract: Existing leading methods for spectral reconstruction (SR) focus on designing deeper or wider convolutional neural networks (CNNs) to learn the end-to-end mapping from the RGB image to its hyperspectral image (HSI). These CNN-based methods achieve impressive restoration performance while showing limitations in capturing the long-range dependencies and self-similarity prior. To cope with this problem, we propose a novel Transformer-based method, Multi-stage Spectral-wise Transformer (MST++), for efficient spectral reconstruction. In particular, we employ Spectral-wise Multi-head Self-attention (S-MSA) that is based on the HSI spatially sparse while spectrally self-similar nature to compose the basic unit, Spectral-wise Attention Block (SAB). Then SABs build up Single-stage Spectral-wise Transformer (SST) that exploits a U-shaped structure to extract multi-resolution contextual information. Finally, our MST++, cascaded by several SSTs, progressively improves the reconstruction quality from coarse to fine. Comprehensive experiments show that our MST++ significantly outperforms other state-of-the-art methods. In the NTIRE 2022 Spectral Reconstruction Challenge, our approach won the First place. Code and pre-trained models are publicly available at github.com/caiyuanhao1…

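The distinctive component is Spectral-wise Multi-head Self-attention (S-MSA), which computes attention between spectral channels rather than spatial positions. A hedged PyTorch sketch of that attention pattern (an illustration of the idea, not the authors' code):

```python
# Hedged S-MSA-style sketch: attention across channels, not pixels.
import torch
import torch.nn as nn
import torch.nn.functional as F

class SpectralWiseAttention(nn.Module):
    def __init__(self, dim: int, heads: int = 4):
        super().__init__()
        assert dim % heads == 0
        self.heads = heads
        self.to_qkv = nn.Linear(dim, dim * 3, bias=False)
        self.proj = nn.Linear(dim, dim)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (B, N, C) with N = H*W spatial positions, C = spectral channels
        B, N, C = x.shape
        q, k, v = self.to_qkv(x).chunk(3, dim=-1)

        def split(t):  # treat channels as tokens: (B, heads, C/heads, N)
            return t.reshape(B, N, self.heads, C // self.heads).permute(0, 2, 3, 1)

        q, k, v = split(q), split(k), split(v)
        q, k = F.normalize(q, dim=-1), F.normalize(k, dim=-1)
        attn = (q @ k.transpose(-2, -1)).softmax(dim=-1)  # channel-to-channel
        out = (attn @ v).permute(0, 3, 1, 2).reshape(B, N, C)
        return self.proj(out)

y = SpectralWiseAttention(dim=32, heads=4)(torch.randn(2, 64, 32))
print(y.shape)  # torch.Size([2, 64, 32])
```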

Paper: Ensembling Off-the-shelf Models for GAN Training

Title: Ensembling Off-the-shelf Models for GAN Training

Venue: CVPR 2022

Field: computer vision

Tasks: Image Generation

Paper link: arxiv.org/abs/2112.09…

Code: github.com/nupurkmr9/v…

Authors: Nupur Kumari, Richard Zhang, Eli Shechtman, Jun-Yan Zhu

Summary: Can the collective “knowledge” from a large bank of pretrained vision models be leveraged to improve GAN training?

Abstract: The advent of large-scale training has produced a cornucopia of powerful visual recognition models. However, generative models, such as GANs, have traditionally been trained from scratch in an unsupervised manner. Can the collective “knowledge” from a large bank of pretrained vision models be leveraged to improve GAN training? If so, with so many models to choose from, which one(s) should be selected, and in what manner are they most effective? We find that pretrained computer vision models can significantly improve performance when used in an ensemble of discriminators. Notably, the particular subset of selected models greatly affects performance. We propose an effective selection mechanism, by probing the linear separability between real and fake samples in pretrained model embeddings, choosing the most accurate model, and progressively adding it to the discriminator ensemble. Interestingly, our method can improve GAN training in both limited data and large-scale settings. Given only 10k training samples, our FID on LSUN Cat matches the StyleGAN2 trained on 1.6M images. On the full dataset, our method improves FID by 1.5x to 2x on cat, church, and horse categories of LSUN.

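The selection mechanism is straightforward to sketch: for each candidate pretrained backbone, fit a linear probe that separates real from fake samples in its embedding space, then greedily add the backbone with the most accurate probe to the discriminator ensemble. A hedged scikit-learn sketch (feature extraction is stubbed out and the names are illustrative):

```python
# Hedged sketch of linear-separability probing for discriminator selection.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

def probe_separability(real_feats: np.ndarray, fake_feats: np.ndarray) -> float:
    """Cross-validated accuracy of a linear real-vs-fake classifier."""
    X = np.concatenate([real_feats, fake_feats])
    y = np.concatenate([np.ones(len(real_feats)), np.zeros(len(fake_feats))])
    return cross_val_score(LogisticRegression(max_iter=1000), X, y, cv=3).mean()

def select_backbone(feature_banks: dict) -> str:
    """feature_banks maps a model name to (real_feats, fake_feats) arrays
    extracted elsewhere; the most separable backbone joins the ensemble."""
    scores = {name: probe_separability(r, f)
              for name, (r, f) in feature_banks.items()}
    return max(scores, key=scores.get)
```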

We are ShowMeAI, dedicated to spreading high-quality AI content and sharing industry solutions, using knowledge to accelerate every step of growth! Click to view the article archive, and subscribe to the topic #ShowMeAI Daily in the official account to receive the latest updates every day. Click Collections & Monthly Digest to quickly browse the full topic collections.

  • Author: 韩信子@ShowMeAI
  • Article archive
  • Collections & Monthly Digest
  • Notice: all rights reserved; for reprints, please contact the platform and the author, and credit the source
  • Comments are welcome; please like the post and recommend valuable articles, tools, or suggestions in the replies, and we will respond as soon as we can~