216 S.-Z. Yu / Artificial Intelligence 174 (2010) 215–243
The first approach to the hidden semi-Markov model was proposed by Ferguson [60] and is partially covered in the
survey paper by Rabiner [150]. This approach is called the explicit duration HMM, in contrast to the implicit duration of
the HMM. It assumes that the state duration is generally distributed, depending on the current state of the underlying
semi-Markov process. It also assumes the “conditional independence” of outputs. Levinson [107] replaced the probability
mass functions of duration with continuous probability density functions to form a continuously variable duration HMM. As
Ferguson [60] pointed out, an HSMM can be realized in the HMM framework in which both the state and its sojourn time
since entering the state are taken as a complex HMM state. This idea was exploited in 1991 by a 2-vector HMM [93] and a
duration-dependent state transition model [179]. Since then, similar approaches have been proposed in many applications. They
are known by different names, such as the inhomogeneous HMM [151], the non-stationary HMM [164], and, more recently, triplet Markov
chains [144]. These approaches, however, share a common problem of computational complexity in some applications.
A more efficient algorithm was proposed in 2003 by Yu and Kobayashi [199], in which the forward–backward variables are
defined using the notion of a state together with its remaining sojourn (or residual life) time. This makes the algorithm
practical in many applications.
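The realization of an HSMM inside the ordinary HMM framework, where the "complex" HMM state is the pair (state, remaining sojourn time), can be sketched as follows. This is only an illustrative construction under assumed toy parameters: the names `A`, `p_dur`, `A_exp` and the dimensions are ours, not the survey's notation.

```python
import numpy as np

# Sketch: realizing an explicit-duration HSMM as an ordinary HMM whose
# complex state is the pair (state i, remaining sojourn time r).
M, D = 2, 3                                  # number of states, maximum duration
A = np.array([[0.0, 1.0], [1.0, 0.0]])       # semi-Markov transitions (no self-loops)
p_dur = np.array([[0.5, 0.3, 0.2],           # p_dur[i, d-1] = P(duration = d | state i)
                  [0.2, 0.5, 0.3]])

# Expanded HMM over pairs (i, r), with r = 1..D the remaining sojourn time.
idx = {(i, r): i * D + (r - 1) for i in range(M) for r in range(1, D + 1)}
A_exp = np.zeros((M * D, M * D))
for i in range(M):
    for r in range(2, D + 1):
        # While time remains, count the residual life down deterministically.
        A_exp[idx[(i, r)], idx[(i, r - 1)]] = 1.0
    for j in range(M):
        for d in range(1, D + 1):
            # When r = 1 expires, jump to state j and draw a fresh duration d.
            A_exp[idx[(i, 1)], idx[(j, d)]] = A[i, j] * p_dur[j, d - 1]

assert np.allclose(A_exp.sum(axis=1), 1.0)   # each row is a valid distribution
```

The expanded chain has M·D states, which illustrates why this realization becomes computationally expensive when the maximum duration D is large, and hence why more efficient formulations such as that of Yu and Kobayashi [199] matter in practice.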
The HSMM has been successfully applied in many areas. The most successful application is in speech recognition. The
first application of the HSMM in this area was made by Ferguson [60]. Since then, more than one hundred such
papers have been published in the literature. It is the application of the HSMM to speech recognition that has enriched the theory of the HSMM
and driven the development of many of its algorithms.
Since the beginning of the 1990s, the HSMM has been applied in many other areas, such as electrocardiography (ECG)
[174], printed text recognition [4], handwritten word recognition [95], recognition of human genes in DNA [94], language
identification [118], ground target tracking [88], and document image comparison and classification at the spatial layout level
[81].
From 2000 to the present, the HSMM has received more and more attention from a wide range of application
areas, such as change-point/end-point detection for semiconductor manufacturing [64], protein structure prediction [162],
mobility tracking in cellular networks [197], analysis of branching and flowering patterns in plants [69], time-series
modelling of rain events [159], brain functional MRI sequence analysis [58], satellite propagation channel modelling [112], Internet traffic
modelling [198], event recognition in videos [79], speech synthesis [204,125], image segmentation [98], semantic learning
for a mobile robot [167], anomaly detection for network security [201], symbolic plan recognition [54], terrain modelling
[185], adaptive cumulative sum tests for change detection in non-invasive mean blood pressure trends [193], equipment
prognosis [14], financial time series modelling [22], remote sensing [147], classification of music [113], and prediction of
particulate matter in the air [52].
The rest of the paper is organized as follows: Section 2 is the major part of this paper that defines a unified HSMM and
addresses important issues related to inference, estimation and implementation. Section 3 then presents three conventional
HSMMs that are widely applied in practice. Section 4 discusses specific modelling issues regarding duration distributions,
observation distributions, variants of HSMMs, and the relationship to the conventional HMM. Finally, Section 5 highlights
major applications of HSMMs, and Section 6 concludes the paper.
2. Hidden semi-Markov model
This section provides a unified description of HSMMs. A general HSMM is defined without specific assumptions on the
state transitions, duration distributions and observation distributions. Then the important issues related to inference, esti-
mation and implementation of the HSMM are discussed. A general expression of the explicit-duration HMMs and segment
HMMs can be found in Murphy [126], and a unified view of the segment HMMs can be found in Ostendorf et al. [136].
Detailed reviews of the conventional HMM can be found in the tutorial by Rabiner [150], the overview by Ephraim and
Merhav [57], the Bayesian-network-based discussion by Ghahramani [66], and the book by Cappé et al. [29].
2.1. General model
A hidden semi-Markov model (HSMM) is an extension of the HMM obtained by allowing the underlying process to be a semi-Markov
chain with a variable duration, or sojourn time, for each state. Therefore, in addition to the notation defined for the HMM,
the duration $d$ of a given state is explicitly defined for the HSMM. The state duration is a random variable that takes
integer values in the set $\mathcal{D} = \{1, 2, \ldots, D\}$. The important difference between the HMM and the HSMM is that the HMM assumes one observation per
state, while in the HSMM each state can emit a sequence of observations. The number of observations
produced while in state $i$ is determined by the length of time spent in state $i$, i.e., the duration $d$. We now provide a unified
description of HSMMs.
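The generative picture just described, in which each visited state draws a duration $d$ and emits $d$ observations before the chain moves on, can be sketched in code. All quantities below (`pi`, `A`, `p_dur`, `B` and the toy dimensions) are illustrative assumptions, not values or notation from the survey.

```python
import numpy as np

# Illustrative HSMM generative process: sample a state, draw its duration,
# emit one observation per time step of that duration, then transition.
rng = np.random.default_rng(0)

M, D, K = 2, 3, 2                  # states, maximum duration, observation symbols
pi = np.array([0.6, 0.4])          # initial state distribution
A = np.array([[0.0, 1.0],          # state transitions (no self-transitions)
              [1.0, 0.0]])
p_dur = np.array([[0.5, 0.3, 0.2],  # p_dur[i, d-1] = P(duration = d | state i)
                  [0.2, 0.5, 0.3]])
B = np.array([[0.9, 0.1],          # B[i, k] = P(observe symbol k | state i)
              [0.2, 0.8]])

def sample_hsmm(T):
    """Sample T time steps of (state, observation) pairs from the toy HSMM."""
    states, obs = [], []
    i = rng.choice(M, p=pi)
    while len(obs) < T:
        d = 1 + rng.choice(D, p=p_dur[i])    # sojourn time in state i
        for _ in range(d):                    # emit d observations from state i
            states.append(i)
            obs.append(rng.choice(K, p=B[i]))
        i = rng.choice(M, p=A[i])             # move to the next state
    return states[:T], obs[:T]                # truncate the last segment at T

states, obs = sample_hsmm(10)
```

Note that, in contrast to an HMM sampler, the inner loop emits a whole segment of observations per state visit, which is exactly the one-state-to-many-observations difference stated above.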
Assume a discrete-time Markov chain with the set of (hidden) states $\mathcal{S} = \{1, \ldots, M\}$. The state sequence is denoted by
$S_{1:T} \triangleq S_1, \ldots, S_T$, where $S_t \in \mathcal{S}$ is the state at time $t$. A realization of $S_{1:T}$ is denoted as $s_{1:T}$. For simplicity of notation in
the following sections, we denote:
• $S_{t_1:t_2} = i$ — state $i$ that the system stays in during the period from $t_1$ to $t_2$. In other words, it means $S_{t_1} = i$, $S_{t_1+1} = i, \ldots,$
and $S_{t_2} = i$. Note that the previous state $S_{t_1-1}$ and the next state $S_{t_2+1}$ may or may not be $i$.
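The segment notation $S_{t_1:t_2} = i$ can be made concrete with a small helper (ours, not from the paper) that decomposes a realized state sequence into its maximal constant segments, so that the state just before $t_1$ and just after $t_2$ differ from $i$:

```python
def segments(s):
    """Return (t1, t2, i) triples for maximal constant runs, 1-indexed
    to match the paper's time convention t = 1, ..., T."""
    out, t1 = [], 1
    for t in range(2, len(s) + 1):
        if s[t - 1] != s[t1 - 1]:          # state changes at time t
            out.append((t1, t - 1, s[t1 - 1]))
            t1 = t
    out.append((t1, len(s), s[t1 - 1]))    # close the final segment at T
    return out

print(segments([1, 1, 2, 2, 2, 1]))        # [(1, 2, 1), (3, 5, 2), (6, 6, 1)]
```

Here the middle triple says $s_{3:5} = 2$: the system stays in state 2 from $t_1 = 3$ to $t_2 = 5$, with $s_{t_1-1} = 1 \neq 2$ and $s_{t_2+1} = 1 \neq 2$.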