Training Restricted Boltzmann Machines on Word Observations

Dahl, George E.; Adams, Ryan P.; Larochelle, Hugo

arXiv:1202.5695 (cs)

[Submitted on 25 Feb 2012 (v1), last revised 5 Jul 2012 (this version, v2)]

Abstract: The restricted Boltzmann machine (RBM) is a flexible tool for modeling complex data; however, there have been significant computational difficulties in using RBMs to model high-dimensional multinomial observations. In natural language processing applications, words are naturally modeled by K-ary discrete distributions, where K is determined by the vocabulary size and can easily be in the hundreds of thousands. The conventional approach to training RBMs on word observations is limited because it requires sampling the states of K-way softmax visible units during block Gibbs updates, an operation that takes time linear in K. In this work, we address this issue by employing a more general class of Markov chain Monte Carlo operators on the visible units, yielding updates with computational complexity independent of K. We demonstrate the success of our approach by training RBMs on hundreds of millions of word n-grams using larger vocabularies than previously feasible, and by using the learned features to improve performance on chunking and sentiment classification tasks, achieving state-of-the-art results on the latter.
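
To make the complexity claim concrete, here is a minimal sketch of one operator of this kind. The abstract does not name the operator, so the choice below is an assumption: an independence Metropolis-Hastings update whose proposal q is a fixed distribution over words (for example, unigram frequencies) sampled in constant time with an alias table. All names (`build_alias`, `alias_draw`, `score`, `mh_step`, `W`, `b`, `h`, `q`) are hypothetical and chosen for illustration only.

```python
import math
import random


def build_alias(probs):
    """Build an alias table so that one draw from `probs` costs O(1)."""
    k = len(probs)
    scaled = [p * k for p in probs]
    prob = [0.0] * k
    alias = [0] * k
    small = [i for i, s in enumerate(scaled) if s < 1.0]
    large = [i for i, s in enumerate(scaled) if s >= 1.0]
    while small and large:
        s = small.pop()
        l = large.pop()
        prob[s] = scaled[s]
        alias[s] = l
        # Column l donates mass to fill column s up to height 1.
        scaled[l] = scaled[l] + scaled[s] - 1.0
        (small if scaled[l] < 1.0 else large).append(l)
    # Leftover columns are full (up to rounding error).
    for i in small + large:
        prob[i] = 1.0
        alias[i] = i
    return prob, alias


def alias_draw(prob, alias, rng):
    """Draw one index from the alias table in O(1)."""
    i = rng.randrange(len(prob))
    return i if rng.random() < prob[i] else alias[i]


def score(k, W, b, h):
    """Unnormalized log-probability of word k given hidden vector h."""
    return b[k] + sum(w * hj for w, hj in zip(W[k], h))


def mh_step(current, W, b, h, q, prob, alias, rng):
    """One independence Metropolis-Hastings update of a softmax visible unit.

    Assumed operator, not necessarily the paper's: propose from q, then
    accept with probability min(1, p(new) q(old) / (p(old) q(new))).
    Only two rows of W are read, so the cost does not grow with K.
    """
    proposal = alias_draw(prob, alias, rng)
    log_ratio = (score(proposal, W, b, h) - score(current, W, b, h)
                 + math.log(q[current]) - math.log(q[proposal]))
    if log_ratio >= 0 or rng.random() < math.exp(log_ratio):
        return proposal
    return current
```

Building the alias table is a one-time cost linear in K. After that, each call to `mh_step` evaluates the score of only the current and proposed words, whereas an exact Gibbs draw would need the scores of all K words to normalize the softmax. The price is that a single step yields a correlated sample rather than an exact draw from the conditional distribution.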

Subjects: Machine Learning (cs.LG); Machine Learning (stat.ML)
Cite as: arXiv:1202.5695 [cs.LG] (or arXiv:1202.5695v2 [cs.LG] for this version)
DOI: https://doi.org/10.48550/arXiv.1202.5695

Submission history

From: Hugo Larochelle
[v1] Sat, 25 Feb 2012 20:23:37 UTC (116 KB)
[v2] Thu, 5 Jul 2012 12:15:40 UTC (132 KB)
