IEEE GEOSCIENCE AND REMOTE SENSING MAGAZINE, IN PRESS. 1
Deep Learning in Remote Sensing: A Review
Xiao Xiang Zhu, Devis Tuia, Lichao Mou, Gui-Song Xia, Liangpei Zhang, Feng
Xu, Friedrich Fraundorfer
Abstract
This is the pre-acceptance version, to read the final version please go to IEEE Geoscience and
Remote Sensing Magazine on IEEE XPlore.
Standing at the paradigm shift towards data-intensive science, machine learning techniques are
becoming increasingly important. In particular, as a major breakthrough in the field, deep learning
has proven to be an extremely powerful tool in many fields. Shall we embrace deep learning as the key to
all? Or, should we resist a “black-box” solution? There are controversial opinions in the remote sensing
community. In this article, we analyze the challenges of using deep learning for remote sensing data
analysis, review the recent advances, and provide resources to make deep learning in remote sensing
ridiculously simple to start with. More importantly, we advocate that remote sensing scientists bring their
expertise into deep learning and use it as an implicit general model to tackle unprecedented, large-scale,
influential challenges, such as climate change and urbanization.
X. Zhu and L. Mou are with the Remote Sensing Technology Institute (IMF), German Aerospace Center (DLR), Germany
and with Signal Processing in Earth Observation (SiPEO), Technical University of Munich (TUM), Germany, E-mails:
xiao.zhu@dlr.de; lichao.mou@dlr.de.
D. Tuia was with the Department of Geography, University of Zurich, Switzerland. He is now with the Laboratory of
GeoInformation Science and Remote Sensing, Wageningen University & Research, the Netherlands. E-mail: devis.tuia@wur.nl.
G.-S Xia and L. Zhang are with the State Key Laboratory of Information Engineering in Surveying, Mapping and Remote
Sensing (LIESMARS), Wuhan University. E-mail:guisong.xia@whu.edu.cn; zlp62@whu.edu.cn.
F. Xu is with the Key Laboratory for Information Science of Electromagnetic Waves (MoE), Fudan University. E-mail:
fengxu@fudan.edu.cn.
F. Fraundorfer is with the Institute of Computer Graphics and Vision, TU Graz, Austria and with the Remote Sensing
Technology Institute (IMF), German Aerospace Center (DLR), Germany. E-mail: fraundorfer@icg.tugraz.at.
The work of X. Zhu and L. Mou is supported by the European Research Council (ERC) under the European Union's
Horizon 2020 research and innovation programme (grant agreement No. [ERC-2016-StG-714087], Acronym: So2Sat), by the Helmholtz
Association under the framework of the Young Investigators Group “SiPEO” (VH-NG-1018, www.sipeo.bgu.tum.de), and by the China
Scholarship Council. The work of D. Tuia is supported by the Swiss National Science Foundation (SNSF) under the project
No. PP0P2 150593. The work of G.-S. Xia and L. Zhang is supported by the National Natural Science Foundation of China
(NSFC) projects with grant No. 41501462 and No. 41431175. The work of F. Xu is supported by the National Natural Science
Foundation of China (NSFC) project with grant No. 61571134.
October 12, 2017 DRAFT
arXiv:1710.03959v1 [cs.CV] 11 Oct 2017
Index Terms
Deep learning, remote sensing, machine learning, big data, Earth observation
I. MOTIVATION
Deep learning is the fastest-growing trend in big data analysis and has been deemed one
of the 10 breakthrough technologies of 2013 [1]. It is characterized by neural networks (NNs)
usually involving more than two layers (for this reason, they are called deep). Like their shallow
counterparts, deep neural networks exploit feature representations learned exclusively from data,
instead of hand-crafted features that are mostly designed based on domain-specific knowledge.
Deep learning research has been extensively pushed by Internet companies, such as Google,
Baidu, Microsoft, and Facebook for several image analysis tasks, including image indexing,
segmentation, and object detection. Recent advances in the field have proven deep learning a
very successful set of tools, sometimes even able to surpass human ability to solve highly
computational tasks (see, for instance, the highly mediatized Go match between Google’s AlphaGo
AI and the world Go champion Lee Sedol). Motivated by those exciting advances, deep learning
is becoming the model of choice in many fields of application. For instance, convolutional neural
networks (CNNs) have proven to be good at extracting mid- and high-level abstract features from
raw images by interleaving convolutional and pooling layers (i.e., spatially shrinking the feature
maps layer by layer). Recent studies indicate that the feature representations learned by CNNs
are greatly effective in large-scale image recognition [2–4], object detection [5, 6], and semantic
segmentation [7, 8]. Furthermore, as an important branch of the deep learning family, recurrent
neural networks (RNNs) have been shown to be very successful on a variety of tasks involved
in sequential data analysis, such as action recognition [9, 10] and image captioning [11].
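To make the interleaving of convolutional and pooling layers concrete, the following minimal NumPy sketch (an illustration only; the image size, filter values, and pooling window are arbitrary choices, not taken from any network reviewed here) applies one hand-set filter followed by a ReLU and a 2×2 max pooling step, showing how the feature map shrinks spatially:

```python
import numpy as np

def conv2d(image, kernel):
    """Valid 2-D cross-correlation of a single-channel image with one filter."""
    kh, kw = kernel.shape
    h, w = image.shape
    out = np.zeros((h - kh + 1, w - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

def max_pool(fmap, size=2):
    """Non-overlapping max pooling: spatially shrinks the feature map."""
    h, w = fmap.shape
    h, w = h - h % size, w - w % size
    return fmap[:h, :w].reshape(h // size, size, w // size, size).max(axis=(1, 3))

img = np.random.default_rng(0).random((28, 28))
edge = np.array([[1.0, 0.0, -1.0]] * 3)     # a hand-set 3x3 filter; in a CNN it is learned
fmap = np.maximum(conv2d(img, edge), 0.0)   # convolution followed by a ReLU activation
pooled = max_pool(fmap)
print(img.shape, fmap.shape, pooled.shape)  # (28, 28) -> (26, 26) -> (13, 13)
```

In a real CNN, many such filters are learned per layer and stacked, so the feature maps become smaller but deeper as one moves up the network.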
Following this wave of success, and thanks to the increased availability of data and computa-
tional resources, the use of deep learning is finally taking off in remote sensing as well. Remote
sensing data bring some new challenges for deep learning, since satellite image analysis raises
unique questions that translate into challenging new scientific problems:
• Remote sensing data are often multi-modal, e.g. from optical (multi- and hyperspectral)
and synthetic aperture radar (SAR) sensors, where both the imaging geometries and the
content are completely different. Data and information fusion uses these complementary
data sources in a synergistic way. Already prior to a joint information extraction, a crucial
step is to develop novel architectures for the matching of images taken from different
perspectives and even different imaging modalities, preferably without requiring an existing
3D model. Also, besides conventional decision fusion, an alternative is to investigate the
transferability of trained networks to other imaging modalities.
• Remote sensing data are geo-located, i.e., they are naturally located in the geographical
space. Each pixel corresponds to a spatial coordinate, which facilitates the fusion of pixel
information with other sources of data, such as GIS layers, geo-tagged images from social
media, or simply other sensors (as above). On the one hand, this fact allows tackling data
fusion with non-traditional data modalities while, on the other hand, it opens the field to new
applications, such as picture localization, location-based services, or augmented reality.
• Remote Sensing data are geodetic measurements with controlled quality. This enables us
to retrieve geo-parameters with confidence estimates. However, differently from purely
data-driven approaches, the role of prior knowledge about sensor adequacy and data
quality becomes even more crucial. For example, to retrieve topographic information, even
at the same spatial resolution, interferograms acquired using a single-pass SAR system are
considered to be more reliable than those acquired in repeat-pass mode.
The time variable is becoming increasingly important in the field. The Copernicus programme
guarantees continuous data acquisition for decades. For instance, Sentinel-1 images the entire Earth
every six days. This capability is triggering a shift from individual image analysis to time-
series processing. Novel network architectures must be developed for optimally exploiting
the temporal information jointly with the spatial and spectral information of these data.
• Remote sensing also faces the big data challenge. In the Copernicus era, we are dealing
with very large and ever-growing data volumes, often on a global scale. For example,
even though they were launched only in 2014, the Sentinel satellites have already acquired
about 25 petabytes of data. The Copernicus concept calls for global applications, i.e., algorithms must
be fast enough and sufficiently transferable to be applied to the whole Earth surface. On
the other hand, these data are well annotated and contain plenty of metadata. Hence, in
some cases, large training data sets might be generated (semi-)automatically.
• In many cases remote sensing aims at retrieving geo-physical or bio-chemical quantities
rather than detecting or classifying objects. These quantities include mass movement rates,
mineral composition of soils, water constituents, atmospheric trace gas concentrations, and
terrain elevation or biomass. Often, process models and expert knowledge exist that are
traditionally used as priors for the estimates. This particularity suggests that the so-far dogma
of expert-free, fully automated deep learning should be questioned for remote sensing, and
that physical models should be re-introduced into the concept, as, for example, in the concept
of emulators [12].
Remote sensing scientists have exploited the power of deep learning to tackle these different
challenges and started a new wave of promising research. In this paper, we review these advances.
After the introductory Section II detailing deep learning models (with emphasis put on convolu-
tional neural networks), we enter sections dedicated to advances in hyperspectral image analysis
(Section III-A), synthetic aperture radar (Section III-B), very high resolution data (Section III-C), data
fusion (Section III-D), and 3D reconstruction (Section III-E). Section IV then provides the tools
of the trade for scientists willing to explore deep learning in their research, including open codes
and data repositories. Section V concludes the paper by giving an overview of the challenges
ahead.
II. FROM PERCEPTRON TO DEEP LEARNING
The perceptron is the basis of the earliest NNs [13]. It is a bio-inspired model for binary
classification that aims to mathematically formalize how a biological neuron works. In contrast, deep
learning has provided more sophisticated methodologies to train deep NN architectures. In this
section, we recall the classic deep learning architectures used in visual data processing.
A. Autoencoder models
1) Autoencoder and Stacked Autoencoder (SAE): An autoencoder [14] takes an input $x \in \mathbb{R}^D$
and, first, maps it to a latent representation $h \in \mathbb{R}^M$ via a nonlinear mapping:

$$h = f(\Theta x + \beta)\,, \qquad (1)$$
where Θ is a weight matrix to be estimated during training, β is a bias vector, and f stands for
a nonlinear function, such as the logistic sigmoid function or a hyperbolic tangent function. The
encoded feature representation h is then used to reconstruct the input x by a reverse mapping
leading to the reconstructed input y:
$$y = f(\Theta' h + \beta')\,, \qquad (2)$$

where $\Theta'$ is usually constrained to be of the form $\Theta' = \Theta^{T}$, i.e., the same weight is used for
encoding the input and decoding the latent representation. The reconstruction error is defined
as the Euclidean distance between $x$ and $y$, which is constrained to approximate the input data $x$
(i.e., making $\|x - y\|_2^2 \to 0$). The parameters of the autoencoder are generally optimized by
stochastic gradient descent (SGD).
An SAE is a neural network consisting of multiple layers of autoencoders in which the outputs
of each layer are wired to the inputs of the following one.
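Equations (1)–(2) with the tied-weight constraint can be sketched in a few lines of NumPy. The toy below (a single sample, sigmoid nonlinearity, hand-derived gradients, and arbitrary dimensions; not the implementation of any work reviewed here) runs plain SGD on the squared reconstruction error:

```python
import numpy as np

rng = np.random.default_rng(0)
D, M = 8, 3                           # input and latent dimensions (M < D: compression)
Theta = rng.normal(0.0, 0.1, (M, D))  # encoder weights; decoder is tied: Theta' = Theta^T
beta, beta_p = np.zeros(M), np.zeros(D)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def forward(x):
    h = sigmoid(Theta @ x + beta)       # Eq. (1): h = f(Theta x + beta)
    y = sigmoid(Theta.T @ h + beta_p)   # Eq. (2) with Theta' = Theta^T
    return h, y

x = rng.random(D)
_, y = forward(x)
loss0 = np.sum((x - y) ** 2)            # initial reconstruction error ||x - y||^2
lr = 0.5
for _ in range(5000):                   # plain SGD on the reconstruction error
    h, y = forward(x)
    grad_y = (y - x) * y * (1 - y)            # gradient through the output sigmoid
    grad_h = (Theta @ grad_y) * h * (1 - h)   # back-propagated to the hidden layer
    # tied weights: Theta collects gradients from both the encoder and decoder paths
    Theta -= lr * (np.outer(grad_h, x) + np.outer(h, grad_y))
    beta -= lr * grad_h
    beta_p -= lr * grad_y

_, y = forward(x)
loss1 = np.sum((x - y) ** 2)
print(loss1 < loss0)                    # the reconstruction error decreases
```

Note the single weight update line: because the decoder reuses $\Theta^T$, the chain rule yields two gradient contributions to the same matrix.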
2) Sparse Autoencoder: The conventional autoencoder relies on the dimension of the latent
representation h being smaller than that of input x, i.e., M < D, which means that it tends
to learn a low-dimensional, compressed representation. However, when M > D, one can still
discover interesting structures by enforcing a sparsity constraint on the hidden units. Formally,
given a set of unlabeled data $X = \{x^1, x^2, \cdots, x^N\}$, training a sparse autoencoder [15] boils
down to finding the optimal parameters by minimizing the following loss function:

$$E = \frac{1}{N}\sum_{i=1}^{N}\left( J(x^i, y^i; \Theta, \beta) + \lambda \sum_{j=1}^{M} \mathrm{KL}(\rho \,\|\, \hat{\rho}_j) \right)\,, \qquad (3)$$
where $J(x^i, y^i; \Theta, \beta)$ is an average sum-of-squares error term, which represents the reconstruc-
tion error between the input $x^i$ and its reconstruction $y^i$. $\mathrm{KL}(\rho \,\|\, \hat{\rho}_j)$ is the Kullback-Leibler (KL)
divergence between a Bernoulli random variable with mean $\rho$ and a Bernoulli random variable
with mean $\hat{\rho}_j$. The KL divergence is a standard function for measuring how much two distributions
differ:

$$\mathrm{KL}(\rho \,\|\, \hat{\rho}_j) = \rho \log \frac{\rho}{\hat{\rho}_j} + (1 - \rho) \log \frac{1 - \rho}{1 - \hat{\rho}_j}\,. \qquad (4)$$
In the sparse autoencoder model, the KL divergence is a sparsity penalty term, and $\lambda$ controls
its importance. $\rho$ is a free parameter corresponding to a desired average activation¹ value, and $\hat{\rho}_j$
indicates the average activation value of hidden neuron $h_j$ over the training samples. Similar to
the autoencoder, the optimization of a sparse autoencoder can be achieved via back-propagation
and SGD.
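The sparsity penalty of Eqs. (3)–(4) is straightforward to evaluate directly. The sketch below (using hypothetical random activation values, not data from any cited experiment) computes the KL term for a batch of hidden activations:

```python
import numpy as np

def kl_penalty(rho, rho_hat):
    """Sparsity penalty of Eq. (3): sum over hidden units of KL(rho || rho_hat_j), Eq. (4)."""
    rho_hat = np.clip(rho_hat, 1e-8, 1 - 1e-8)  # guard against log(0)
    return np.sum(rho * np.log(rho / rho_hat)
                  + (1 - rho) * np.log((1 - rho) / (1 - rho_hat)))

rho = 0.05                                     # desired average activation of each hidden unit
H = np.random.default_rng(1).random((100, 6))  # hypothetical activations: 100 samples, 6 units
rho_hat = H.mean(axis=0)                       # average activation of each hidden neuron

penalty = kl_penalty(rho, rho_hat)
print(penalty > 0)                             # uniform activations are far from rho = 0.05
print(kl_penalty(rho, np.full(6, rho)))        # -> 0.0 when activations match the target
```

The penalty is zero exactly when every hidden unit's average activation equals the target $\rho$, and grows as the units become more active, which is what pushes the learned representation toward sparsity.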
3) Restricted Boltzmann Machine (RBM) & Deep Belief Network (DBN): Unlike the deter-
ministic network architectures, such as autoencoders or sparse autoencoders, an RBM (cf. Fig. 1)
is a stochastic undirected graphical model consisting of a visible layer and a hidden layer, and

¹An activation corresponds to how much a region of the image reacts when convolved with a filter. In the first layer, for
example, each location in the image receives a value that corresponds to a linear combination of the original bands and the filter
applied. The higher this value, the more ‘activated’ this filter is on that region. When convolved over the whole image, a filter
produces an activation map, which is the activation at each location where the filter has been applied.