


Hirokazu Kameoka
2020 – today
- 2024
- [j34]Hirokazu Kameoka, Takuhiro Kaneko, Kou Tanaka, Nobukatsu Hojo, Shogo Seki:
VoiceGrad: Non-Parallel Any-to-Many Voice Conversion With Annealed Langevin Dynamics. IEEE ACM Trans. Audio Speech Lang. Process. 32: 2213-2226 (2024) - [c146]Chihiro Watanabe, Hirokazu Kameoka:
GE2E-AC: Generalized End-to-End Loss Training for Accent Classification. APSIPA 2024: 1-6 - [c145]Yuto Kondo, Hirokazu Kameoka, Kou Tanaka, Takuhiro Kaneko, Noboru Harada:
Learning to Assess Subjective Impressions from Speech. EUSIPCO 2024: 381-385 - [c144]Yuto Kondo, Hirokazu Kameoka, Kou Tanaka, Takuhiro Kaneko:
Selecting N-Lowest Scores for Training MOS Prediction Models. ICASSP 2024: 1451-1455 - [c143]Takuhiro Kaneko, Hirokazu Kameoka, Kou Tanaka:
Training Generative Adversarial Network-Based Vocoder with Limited Data Using Augmentation-Conditional Discriminator. ICASSP 2024: 12561-12565 - [i36]Takuhiro Kaneko, Hirokazu Kameoka, Kou Tanaka:
Training Generative Adversarial Network-Based Vocoder with Limited Data Using Augmentation-Conditional Discriminator. CoRR abs/2403.16464 (2024) - [i35]Chihiro Watanabe, Hirokazu Kameoka:
GE2E-AC: Generalized End-to-End Loss Training for Accent Classification. CoRR abs/2407.14021 (2024) - [i34]Takuhiro Kaneko, Hirokazu Kameoka, Kou Tanaka, Yuto Kondo:
FastVoiceGrad: One-step Diffusion-Based Voice Conversion with Adversarial Conditional Diffusion Distillation. CoRR abs/2409.02245 (2024) - 2023
- [j33]Shogo Seki, Hirokazu Kameoka, Takuhiro Kaneko, Kou Tanaka:
Non-Parallel Whisper-to-Normal Speaking Style Conversion Using Auxiliary Classifier Variational Autoencoder. IEEE Access 11: 44590-44599 (2023) - [j32]Li Li, Hirokazu Kameoka, Shoji Makino:
FastMVAE2: On Improving and Accelerating the Fast Variational Autoencoder-Based Source Separation Algorithm for Determined Mixtures. IEEE ACM Trans. Audio Speech Lang. Process. 31: 96-110 (2023) - [c142]Chihiro Watanabe, Hirokazu Kameoka:
DisC-VC: Disentangled and F0-Controllable Neural Voice Conversion. APSIPA ASC 2023: 1167-1171 - [c141]Keisuke Takazawa, Hirokazu Kameoka, Masahiro Yukawa:
Multiple Sound Source Tracking Based on Generative Modeling and Recursive Bayesian Filtering of Spatial Gradient Spectra. APSIPA ASC 2023: 2019-2023 - [c140]Shogo Seki, Kanami Imamura, Hirokazu Kameoka, Takuhiro Kaneko, Kou Tanaka, Noboru Harada:
W2N-AVSC: Audiovisual Extension For Whisper-To-Normal Speech Conversion. EUSIPCO 2023: 296-300 - [c139]Takuhiro Kaneko, Hirokazu Kameoka, Kou Tanaka, Shogo Seki:
Wave-U-Net Discriminator: Fast and Lightweight Discriminator for Generative Adversarial Network-Based Speech Synthesis. ICASSP 2023: 1-5 - [c138]Shogo Seki, Hirokazu Kameoka, Kou Tanaka, Takuhiro Kaneko:
JSV-VC: Jointly Trained Speaker Verification and Voice Conversion Models. ICASSP 2023: 1-5 - [c137]Kou Tanaka, Takuhiro Kaneko, Hirokazu Kameoka, Shogo Seki:
CFVC: Conditional Filtering for Controllable Voice Conversion. INTERSPEECH 2023: 2058-2062 - [c136]Takuhiro Kaneko, Hirokazu Kameoka, Kou Tanaka, Shogo Seki:
iSTFTNet2: Faster and More Lightweight iSTFT-Based Neural Vocoder Using 1D-2D CNN. INTERSPEECH 2023: 4369-4373 - [c135]Kou Tanaka, Hirokazu Kameoka, Takuhiro Kaneko:
PRVAE-VC: Non-Parallel Many-to-Many Voice Conversion with Perturbation-Resistant Variational Autoencoder. SSW 2023: 88-93 - [i33]Takuhiro Kaneko, Hirokazu Kameoka, Kou Tanaka, Shogo Seki:
Wave-U-Net Discriminator: Fast and Lightweight Discriminator for Generative Adversarial Network-Based Speech Synthesis. CoRR abs/2303.13909 (2023) - [i32]Takuhiro Kaneko, Hirokazu Kameoka, Kou Tanaka, Shogo Seki:
iSTFTNet2: Faster and More Lightweight iSTFT-Based Neural Vocoder Using 1D-2D CNN. CoRR abs/2308.07117 (2023) - 2022
- [j31]Ryuji Hamamoto, Ken Takasawa, Hidenori Machino, Kazuma Kobayashi, Satoshi Takahashi, Amina Bolatkan, Norio Shinkai, Akira Sakai, Rina Aoyama, Masayoshi Yamada, Ken Asada, Masaaki Komatsu, Koji Okamoto, Hirokazu Kameoka, Syuzo Kaneko:
Application of non-negative matrix factorization in oncology: one approach for establishing precision medicine. Briefings Bioinform. 23(4) (2022) - [c134]Natsuki Ueno, Hirokazu Kameoka:
Multiple Sound Source Localization Based on Stochastic Modeling of Spatial Gradient Spectra. EUSIPCO 2022: 31-35 - [c133]Shogo Seki, Hirokazu Kameoka, Li Li:
Investigation And Comparison of Optimization Methods for Variational Autoencoder-Based Underdetermined Multichannel Source Separation. ICASSP 2022: 511-515 - [c132]Li Li, Hirokazu Kameoka, Shogo Seki:
HBP: An Efficient Block Permutation Solver Using Hungarian Algorithm and Spectrogram Inpainting for Multichannel Audio Source Separation. ICASSP 2022: 516-520 - [c131]Hirokazu Kameoka, Shogo Seki, Li Li, Chihiro Watanabe:
Attentionpit: Soft Permutation Invariant Training for Audio Source Separation with Attention Mechanism. ICASSP 2022: 706-710 - [c130]Takuhiro Kaneko, Kou Tanaka, Hirokazu Kameoka, Shogo Seki:
ISTFTNET: Fast and Lightweight Mel-Spectrogram Vocoder Incorporating Inverse Short-Time Fourier Transform. ICASSP 2022: 6207-6211 - [c129]Hirokazu Kameoka, Takuhiro Kaneko, Shogo Seki, Kou Tanaka:
CAUSE: Crossmodal Action Unit Sequence Estimation from Speech. INTERSPEECH 2022: 506-510 - [c128]Takuhiro Kaneko, Hirokazu Kameoka, Kou Tanaka, Shogo Seki:
MISRNet: Lightweight Neural Vocoder Using Multi-Input Single Shared Residual Blocks. INTERSPEECH 2022: 1631-1635 - [c127]Kou Tanaka, Hirokazu Kameoka, Takuhiro Kaneko, Shogo Seki:
Distilling Sequence-to-Sequence Voice Conversion Models for Streaming Conversion Applications. SLT 2022: 1022-1028 - [i31]Takuhiro Kaneko, Kou Tanaka, Hirokazu Kameoka, Shogo Seki:
iSTFTNet: Fast and Lightweight Mel-Spectrogram Vocoder Incorporating Inverse Short-Time Fourier Transform. CoRR abs/2203.02395 (2022) - [i30]Kohei Suzuki, Shoki Sakamoto, Tadahiro Taniguchi, Hirokazu Kameoka:
Speak Like a Dog: Human to Non-human creature Voice Conversion. CoRR abs/2206.04780 (2022) - [i29]Chihiro Watanabe, Hirokazu Kameoka:
DisC-VC: Disentangled and F0-Controllable Neural Voice Conversion. CoRR abs/2210.11059 (2022) - 2021
- [j30]Chihiro Watanabe, Hirokazu Kameoka:
X-DC: Explainable Deep Clustering Based on Learnable Spectrogram Templates. Neural Comput. 33(7): 1853-1885 (2021) - [j29]Tomohiko Nakamura, Hirokazu Kameoka:
Harmonic-Temporal Factor Decomposition for Unsupervised Monaural Separation of Harmonic Sounds. IEEE ACM Trans. Audio Speech Lang. Process. 29: 68-82 (2021) - [j28]Hirokazu Kameoka, Wen-Chin Huang, Kou Tanaka, Takuhiro Kaneko, Nobukatsu Hojo, Tomoki Toda:
Many-to-Many Voice Transformer Network. IEEE ACM Trans. Audio Speech Lang. Process. 29: 656-670 (2021) - [j27]Wen-Chin Huang, Tomoki Hayashi, Yi-Chiao Wu, Hirokazu Kameoka, Tomoki Toda:
Pretraining Techniques for Sequence-to-Sequence Voice Conversion. IEEE ACM Trans. Audio Speech Lang. Process. 29: 745-755 (2021) - [c126]Asuka Moritani, Shoki Sakamoto, Ryo Ozaki, Hirokazu Kameoka, Tadahiro Taniguchi:
StarGAN-based Emotional Voice Conversion for Japanese Phrases. APSIPA ASC 2021: 836-840 - [c125]Shota Inoue, Hirokazu Kameoka, Li Li, Shoji Makino:
SepNet: A Deep Separation Matrix Prediction Network for Multichannel Audio Source Separation. ICASSP 2021: 191-195 - [c124]Takuhiro Kaneko, Hirokazu Kameoka, Kou Tanaka, Nobukatsu Hojo:
Maskcyclegan-VC: Learning Non-Parallel Voice Conversion with Filling in Frames. ICASSP 2021: 5919-5923 - [c123]Shoki Sakamoto, Akira Taniguchi, Tadahiro Taniguchi, Hirokazu Kameoka:
StarGAN-VC+ASR: StarGAN-Based Non-Parallel Voice Conversion Regularized by Automatic Speech Recognition. Interspeech 2021: 1359-1363 - [i28]Takuhiro Kaneko, Hirokazu Kameoka, Kou Tanaka, Nobukatsu Hojo:
MaskCycleGAN-VC: Learning Non-parallel Voice Conversion with Filling in Frames. CoRR abs/2102.12841 (2021) - [i27]Asuka Moritani, Ryo Ozaki, Shoki Sakamoto, Hirokazu Kameoka, Tadahiro Taniguchi:
StarGAN-based Emotional Voice Conversion for Japanese Phrases. CoRR abs/2104.01807 (2021) - [i26]Hirokazu Kameoka, Kou Tanaka, Takuhiro Kaneko:
FastS2S-VC: Streaming Non-Autoregressive Sequence-to-Sequence Voice Conversion. CoRR abs/2104.06900 (2021) - [i25]Shoki Sakamoto, Akira Taniguchi, Tadahiro Taniguchi, Hirokazu Kameoka:
StarGAN-VC+ASR: StarGAN-based Non-Parallel Voice Conversion Regularized by Automatic Speech Recognition. CoRR abs/2108.04395 (2021) - [i24]Li Li, Hirokazu Kameoka, Shoji Makino:
FastMVAE2: On improving and accelerating the fast variational autoencoder-based source separation algorithm for determined mixtures. CoRR abs/2109.13496 (2021) - 2020
- [j26]Li Li, Hirokazu Kameoka, Shoji Makino:
Majorization-Minimization Algorithm for Discriminative Non-Negative Matrix Factorization. IEEE Access 8: 227399-227408 (2020) - [j25]Li Li, Hirokazu Kameoka, Shota Inoue, Shoji Makino:
FastMVAE: A Fast Optimization Algorithm for the Multichannel Variational Autoencoder Method. IEEE Access 8: 228740-228753 (2020) - [j24]Hirokazu Kameoka, Kou Tanaka, Damian Kwasny, Takuhiro Kaneko, Nobukatsu Hojo:
ConvS2S-VC: Fully Convolutional Sequence-to-Sequence Voice Conversion. IEEE ACM Trans. Audio Speech Lang. Process. 28: 1849-1863 (2020) - [j23]Hirokazu Kameoka, Takuhiro Kaneko, Kou Tanaka, Nobukatsu Hojo:
Nonparallel Voice Conversion With Augmented Classifier Star Generative Adversarial Networks. IEEE ACM Trans. Audio Speech Lang. Process. 28: 2982-2995 (2020) - [c122]Mohammad Eshghi, Kazuhiro Kobayashi, Kou Tanaka, Hirokazu Kameoka, Tomoki Toda:
Phoneme Embeddings on Predicting Fundamental Frequency Pattern for Electrolaryngeal Speech. APSIPA 2020: 572-577 - [c121]Takuhiro Kaneko, Hirokazu Kameoka, Kou Tanaka, Nobukatsu Hojo:
CycleGAN-VC3: Examining and Improving CycleGAN-VCs for Mel-Spectrogram Conversion. INTERSPEECH 2020: 2017-2021 - [c120]Wen-Chin Huang, Tomoki Hayashi, Yi-Chiao Wu, Hirokazu Kameoka, Tomoki Toda:
Voice Transformer Network: Sequence-to-Sequence Voice Conversion Using Transformer with Text-to-Speech Pretraining. INTERSPEECH 2020: 4676-4680 - [c119]Li Li, Hirokazu Kameoka, Shoji Makino:
Determined Audio Source Separation with Multichannel Star Generative Adversarial Network. MLSP 2020: 1-6 - [i23]Hirokazu Kameoka, Wen-Chin Huang, Kou Tanaka, Takuhiro Kaneko, Nobukatsu Hojo, Tomoki Toda:
Many-to-Many Voice Transformer Network. CoRR abs/2005.08445 (2020) - [i22]Wen-Chin Huang, Tomoki Hayashi, Yi-Chiao Wu, Hirokazu Kameoka, Tomoki Toda:
Pretraining Techniques for Sequence-to-Sequence Voice Conversion. CoRR abs/2008.03088 (2020) - [i21]Hirokazu Kameoka, Takuhiro Kaneko, Kou Tanaka, Nobukatsu Hojo, Shogo Seki:
VoiceGrad: Non-Parallel Any-to-Many Voice Conversion with Annealed Langevin Dynamics. CoRR abs/2010.02977 (2020) - [i20]Takuhiro Kaneko, Hirokazu Kameoka, Kou Tanaka, Nobukatsu Hojo:
CycleGAN-VC3: Examining and Improving CycleGAN-VCs for Mel-spectrogram Conversion. CoRR abs/2010.11672 (2020)
2010 – 2019
- 2019
- [j22]Shogo Seki, Hirokazu Kameoka, Li Li, Tomoki Toda, Kazuya Takeda:
Underdetermined Source Separation Based on Generalized Multichannel Variational Autoencoder. IEEE Access 7: 168104-168115 (2019) - [j21]Shinichi Mogami, Yoshiki Mitsui, Norihiro Takamune, Daichi Kitamura, Hiroshi Saruwatari, Yu Takahashi, Kazunobu Kondo, Hiroaki Nakajima, Hirokazu Kameoka:
Independent Low-Rank Matrix Analysis Based on Generalized Kullback-Leibler Divergence. IEICE Trans. Fundam. Electron. Commun. Comput. Sci. 102-A(2): 458-463 (2019) - [j20]Hirokazu Kameoka, Li Li, Shota Inoue, Shoji Makino:
Supervised Determined Source Separation with Multichannel Variational Autoencoder. Neural Comput. 31(9): 1891-1914 (2019) - [j19]Hirokazu Kameoka, Takuhiro Kaneko, Kou Tanaka, Nobukatsu Hojo:
ACVAE-VC: Non-Parallel Voice Conversion With Auxiliary Classifier Variational Autoencoder. IEEE ACM Trans. Audio Speech Lang. Process. 27(9): 1432-1443 (2019) - [c118]Shogo Seki, Hirokazu Kameoka, Li Li, Tomoki Toda, Kazuya Takeda:
Generalized Multichannel Variational Autoencoder for Underdetermined Source Separation. EUSIPCO 2019: 1-5 - [c117]Shota Inoue, Hirokazu Kameoka, Li Li, Shogo Seki, Shoji Makino:
Joint Separation and Dereverberation of Reverberant Mixtures with Multichannel Variational Autoencoder. ICASSP 2019: 96-100 - [c116]Li Li, Hirokazu Kameoka, Shoji Makino:
Fast MVAE: Joint Separation and Classification of Mixed Sources Based on Multichannel Variational Autoencoder with Auxiliary Classifier. ICASSP 2019: 546-550 - [c115]Go Irie, Mirela Ostrek, Haochen Wang, Hirokazu Kameoka, Akisato Kimura, Takahito Kawanishi, Kunio Kashino:
Seeing through Sounds: Predicting Visual Semantic Segmentation Results from Multichannel Audio Signals. ICASSP 2019: 3961-3964 - [c114]Kou Tanaka, Hirokazu Kameoka, Takuhiro Kaneko, Nobukatsu Hojo:
ATTS2S-VC: Sequence-to-sequence Voice Conversion with Attention and Context Preservation Mechanisms. ICASSP 2019: 6805-6809 - [c113]Takuhiro Kaneko, Hirokazu Kameoka, Kou Tanaka, Nobukatsu Hojo:
Cyclegan-VC2: Improved Cyclegan-based Non-parallel Voice Conversion. ICASSP 2019: 6820-6824 - [c112]Takuhiro Kaneko, Hirokazu Kameoka, Kou Tanaka, Nobukatsu Hojo:
StarGAN-VC2: Rethinking Conditional Methods for StarGAN-Based Voice Conversion. INTERSPEECH 2019: 679-683 - [c111]Dongxiao Wang, Hirokazu Kameoka, Koichi Shinoda:
A Modified Algorithm for Multiple Input Spectrogram Inversion. INTERSPEECH 2019: 4569-4573 - [c110]Mohammad Eshghi, Kou Tanaka, Kazuhiro Kobayashi, Hirokazu Kameoka, Tomoki Toda:
An Investigation of Features for Fundamental Frequency Pattern Prediction in Electrolaryngeal Speech Enhancement. SSW 2019: 251-256 - [i19]Shinji Takaki, Hirokazu Kameoka, Junichi Yamagishi:
Training a Neural Speech Waveform Model using Spectral Losses of Short-Time Fourier Transform and Continuous Wavelet Transform. CoRR abs/1903.12392 (2019) - [i18]Kou Tanaka, Hirokazu Kameoka, Takuhiro Kaneko, Nobukatsu Hojo:
WaveCycleGAN2: Time-domain Neural Post-filter for Speech Waveform Generation. CoRR abs/1904.02892 (2019) - [i17]Hirokazu Kameoka, Kou Tanaka, Aaron Valero Puche, Yasunori Ohishi, Takuhiro Kaneko:
Crossmodal Voice Conversion. CoRR abs/1904.04540 (2019) - [i16]Takuhiro Kaneko, Hirokazu Kameoka, Kou Tanaka, Nobukatsu Hojo:
CycleGAN-VC2: Improved CycleGAN-based Non-parallel Voice Conversion. CoRR abs/1904.04631 (2019) - [i15]Takuhiro Kaneko, Hirokazu Kameoka, Kou Tanaka, Nobukatsu Hojo:
StarGAN-VC2: Rethinking Conditional Methods for StarGAN-Based Voice Conversion. CoRR abs/1907.12279 (2019) - [i14]Xin Wang, Junichi Yamagishi, Massimiliano Todisco, Héctor Delgado, Andreas Nautsch, Nicholas W. D. Evans, Md. Sahidullah, Ville Vestman, Tomi Kinnunen, Kong Aik Lee, Lauri Juvela, Paavo Alku, Yu-Huai Peng, Hsin-Te Hwang, Yu Tsao, Hsin-Min Wang, Sébastien Le Maguer, Markus Becker, Fergus Henderson, Rob Clark, Yu Zhang, Quan Wang, Ye Jia, Kai Onuma, Koji Mushika, Takashi Kaneda, Yuan Jiang, Li-Juan Liu, Yi-Chiao Wu, Wen-Chin Huang, Tomoki Toda, Kou Tanaka, Hirokazu Kameoka, Ingmar Steiner, Driss Matrouf, Jean-François Bonastre, Avashna Govender, Srikanth Ronanki, Jing-Xuan Zhang, Zhen-Hua Ling:
The ASVspoof 2019 database. CoRR abs/1911.01601 (2019) - [i13]Wen-Chin Huang, Tomoki Hayashi, Yi-Chiao Wu, Hirokazu Kameoka, Tomoki Toda:
Voice Transformer Network: Sequence-to-Sequence Voice Conversion Using Transformer with Text-to-Speech Pretraining. CoRR abs/1912.06813 (2019) - 2018
- [j18]Hirokazu Kameoka, Takuya Higuchi, Mikihiro Tanaka, Li Li:
Nonnegative Matrix Factorization With Basis Clustering Using Cepstral Distance Regularization. IEEE ACM Trans. Audio Speech Lang. Process. 26(6): 1025-1036 (2018) - [c109]Takuhiro Kaneko, Hirokazu Kameoka:
CycleGAN-VC: Non-parallel Voice Conversion Using Cycle-Consistent Adversarial Networks. EUSIPCO 2018: 2100-2104 - [c108]Nobukatsu Hojo, Hirokazu Kameoka, Kou Tanaka, Takuhiro Kaneko:
Automatic Speech Pronunciation Correction with Dynamic Frequency Warping-Based Spectral Conversion. EUSIPCO 2018: 2310-2314 - [c107]Keisuke Oyamada, Hirokazu Kameoka, Takuhiro Kaneko, Kou Tanaka, Nobukatsu Hojo, Hiroyasu Ando:
Generative adversarial network-based approach to signal reconstruction from magnitude spectrogram. EUSIPCO 2018: 2514-2518 - [c106]Li Li, Hirokazu Kameoka:
Deep Clustering with Gated Convolutional Networks. ICASSP 2018: 16-20 - [c105]Hideaki Kagami, Hirokazu Kameoka, Masahiro Yukawa:
Joint Separation and Dereverberation of Reverberant Mixtures with Determined Multichannel Non-Negative Matrix Factorization. ICASSP 2018: 31-35 - [c104]Lauri Juvela, Bajibabu Bollepalli, Xin Wang, Hirokazu Kameoka, Manu Airaksinen, Junichi Yamagishi, Paavo Alku:
Speech Waveform Synthesis from MFCC Sequences with Generative Adversarial Networks. ICASSP 2018: 5679-5683 - [c103]Kou Tanaka, Hirokazu Kameoka, Kazuho Morikawa:
Vae-Space: Deep Generative Model of Voice Fundamental Frequency Contours. ICASSP 2018: 5779-5783 - [c102]Hirokazu Kameoka, Takuhiro Kaneko, Kou Tanaka, Nobukatsu Hojo:
StarGAN-VC: non-parallel many-to-many Voice Conversion Using Star Generative Adversarial Networks. SLT 2018: 266-273 - [c101]Kou Tanaka, Takuhiro Kaneko, Nobukatsu Hojo, Hirokazu Kameoka:
Synthetic-to-Natural Speech Waveform Conversion Using Cycle-Consistent Adversarial Networks. SLT 2018: 632-639 - [i12]Lauri Juvela, Bajibabu Bollepalli, Xin Wang, Hirokazu Kameoka, Manu Airaksinen, Junichi Yamagishi, Paavo Alku:
Speech waveform synthesis from MFCC sequences with generative adversarial networks. CoRR abs/1804.00920 (2018) - [i11]Keisuke Oyamada, Hirokazu Kameoka, Takuhiro Kaneko, Kou Tanaka, Nobukatsu Hojo, Hiroyasu Ando:
Generative adversarial network-based approach to signal reconstruction from magnitude spectrograms. CoRR abs/1804.02181 (2018) - [i10]Hirokazu Kameoka, Takuhiro Kaneko, Kou Tanaka, Nobukatsu Hojo:
StarGAN-VC: Non-parallel many-to-many voice conversion with star generative adversarial networks. CoRR abs/1806.02169 (2018) - [i9]Hirokazu Kameoka, Li Li, Shota Inoue, Shoji Makino:
Semi-blind source separation with multichannel variational autoencoder. CoRR abs/1808.00892 (2018) - [i8]Hirokazu Kameoka, Takuhiro Kaneko, Kou Tanaka, Nobukatsu Hojo:
ACVAE-VC: Non-parallel many-to-many voice conversion with auxiliary classifier variational autoencoder. CoRR abs/1808.05092 (2018) - [i7]Kou Tanaka, Takuhiro Kaneko, Nobukatsu Hojo, Hirokazu Kameoka:
WaveCycleGAN: Synthetic-to-natural speech waveform conversion using cycle-consistent adversarial networks. CoRR abs/1809.10288 (2018) - [i6]Shogo Seki, Hirokazu Kameoka, Li Li, Tomoki Toda, Kazuya Takeda:
Generalized Multichannel Variational Autoencoder for Underdetermined Source Separation. CoRR abs/1810.00223 (2018) - [i5]Hirokazu Kameoka, Kou Tanaka, Takuhiro Kaneko, Nobukatsu Hojo:
ConvS2S-VC: Fully convolutional sequence-to-sequence voice conversion. CoRR abs/1811.01609 (2018) - [i4]Kou Tanaka, Hirokazu Kameoka, Takuhiro Kaneko, Nobukatsu Hojo:
AttS2S-VC: Sequence-to-Sequence Voice Conversion with Attention and Context Preservation Mechanisms. CoRR abs/1811.04076 (2018) - [i3]Li Li, Hirokazu Kameoka, Shoji Makino:
Fast MVAE: Joint separation and classification of mixed sources based on multichannel variational autoencoder with auxiliary classifier. CoRR abs/1812.06391 (2018) - 2017
- [c100]Keisuke Oyamada, Hirokazu Kameoka, Takuhiro Kaneko, Hiroyasu Ando, Kaoru Hiramatsu, Kunio Kashino:
Non-native speech conversion with consistency-aware recursive network and generative adversarial network. APSIPA 2017: 182-188 - [c99]Patrick Lumban Tobing, Hirokazu Kameoka, Tomoki Toda:
Deep acoustic-to-articulatory inversion mapping with latent trajectory modeling. APSIPA 2017: 1274-1277 - [c98]Li Li, Hirokazu Kameoka, Shoji Makino:
Discriminative non-negative matrix factorization with majorization-minimization. HSCMA 2017: 141-145 - [c97]Hirokazu Kameoka, Hideaki Kagami, Masahiro Yukawa:
Complex NMF with the generalized Kullback-Leibler divergence. ICASSP 2017: 56-60 - [c96]Hideaki Kagami, Hirokazu Kameoka, Masahiro Yukawa:
A majorization-minimization algorithm with projected gradient updates for time-domain spectrogram factorization. ICASSP 2017: 561-565 - [c95]Takuhiro Kaneko, Hirokazu Kameoka, Nobukatsu Hojo, Yusuke Ijima, Kaoru Hiramatsu, Kunio Kashino:
Generative adversarial network-based postfilter for statistical parametric speech synthesis. ICASSP 2017: 4910-4914 - [c94]Yusuke Tajiri, Hirokazu Kameoka, Tomoki Toda:
A noise suppression method for body-conducted soft speech based on non-negative tensor factorization of air- and body-conducted signals. ICASSP 2017: 4960-4964 - [c93]Ryotaro Sato, Hirokazu Kameoka, Kunio Kashino:
Fast algorithm for statistical phrase/accent command estimation based on generative model incorporating spectral features. ICASSP 2017: 5595-5599 - [c92]Kou Tanaka, Hirokazu Kameoka, Tomoki Toda, Satoshi Nakamura:
Physically Constrained Statistical F0 Prediction for Electrolaryngeal Speech Enhancement. INTERSPEECH 2017: 1069-1073 - [c91]Nobukatsu Hojo, Yasuhito Ohsugi, Yusuke Ijima, Hirokazu Kameoka:
DNN-SPACE: DNN-HMM-Based Generative Model of Voice F0 Contours for Statistical Phrase/Accent Command Estimation. INTERSPEECH 2017: 1074-1078 - [c90]Shinji Takaki, Hirokazu Kameoka, Junichi Yamagishi:
Direct Modeling of Frequency Spectra and Waveform Generation Based on Phase Recovery for DNN-Based Speech Synthesis. INTERSPEECH 2017: 1128-1132 - [c89]Takuhiro Kaneko, Hirokazu Kameoka, Kaoru Hiramatsu, Kunio Kashino:
Sequence-to-Sequence Voice Conversion with Similarity Metric Learned Using Generative Adversarial Networks. INTERSPEECH 2017: 1283-1287 - [c88]Li Li, Hirokazu Kameoka, Tomoki Toda, Shoji Makino:
Speech Enhancement Using Non-Negative Spectrogram Models with Mel-Generalized Cepstral Regularization. INTERSPEECH 2017: 1998-2002 - [c87]Takuhiro Kaneko, Shinji Takaki, Hirokazu Kameoka, Junichi Yamagishi:
Generative Adversarial Network-Based Postfilter for STFT Spectrograms. INTERSPEECH 2017: 3389-3393 - [c86]Li Li, Hirokazu Kameoka, Shoji Makino:
Mel-Generalized cepstral regularization for discriminative non-negative matrix factorization. MLSP 2017: 1-6 - [c85]Shogo Seki, Hirokazu Kameoka, Tomoki Toda, Kazuya Takeda:
Missing component restoration for masked speech signals based on time-domain spectrogram factorization. MLSP 2017: 1-6 - [i2]Takuhiro Kaneko, Hirokazu Kameoka:
Parallel-Data-Free Voice Conversion Using Cycle-Consistent Adversarial Networks. CoRR abs/1711.11293 (2017) - 2016
- [j17]Daichi Kitamura, Nobutaka Ono, Hiroshi Sawada, Hirokazu Kameoka, Hiroshi Saruwatari:
Determined Blind Source Separation Unifying Independent Vector Analysis and Nonnegative Matrix Factorization. IEEE ACM Trans. Audio Speech Lang. Process. 24(9): 1626-1641 (2016) - [c84]Aki Hayashi, Hirokazu Kameoka, Tatsushi Matsubayashi, Hiroshi Sawada:
Non-negative periodic component analysis for music source separation. APSIPA 2016: 1-9 - [c83]Nobutaka Ono, Kazuaki Shibata, Hirokazu Kameoka:
Self-localization and channel synchronization of smartphone arrays using sound emissions. APSIPA 2016: 1-5 - [c82]Naoki Murata, Hirokazu Kameoka, Keisuke Kinoshita, Shoko Araki, Tomohiro Nakatani, Shoichi Koyama, Hiroshi Saruwatari:
Reverberation-robust underdetermined source separation with non-negative tensor double deconvolution. EUSIPCO 2016: 1648-1652 - [c81]Naoki Murata, Shoichi Koyama, Hirokazu Kameoka, Norihiro Takamune, Hiroshi Saruwatari:
Sparse sound field decomposition with multichannel extension of complex NMF. ICASSP 2016: 345-349 - [c80]Tomohiko Nakamura, Hirokazu Kameoka:
Shifted and convolutive source-filter non-negative matrix factorization for monaural audio source separation. ICASSP 2016: 489-493 - [c79]Kou Tanaka, Hirokazu Kameoka, Tomoki Toda, Satoshi Nakamura:
Statistical F0 prediction for electrolaryngeal speech enhancement considering generative process of F0 contours within product of experts framework. ICASSP 2016: 5665-5669 - [c78]Patrick Lumban Tobing, Tomoki Toda, Hirokazu Kameoka, Satoshi Nakamura:
Acoustic-to-Articulatory Inversion Mapping Based on Latent Trajectory Gaussian Mixture Model. INTERSPEECH 2016: 953-957 - [c77]Lauri Juvela, Hirokazu Kameoka, Manu Airaksinen, Junichi Yamagishi, Paavo Alku:
Majorisation-Minimisation Based Optimisation of the Composite Autoregressive System with Application to Glottal Inverse Filtering. INTERSPEECH 2016: 968-972 - [c76]Li Li, Hirokazu Kameoka, Takuya Higuchi, Hiroshi Saruwatari:
Semi-Supervised Joint Enhancement of Spectral and Cepstral Sequences of Noisy Speech. INTERSPEECH 2016: 3753-3757 - 2015
- [j16]Ryosuke Sugiura, Yutaka Kamamoto, Noboru Harada, Hirokazu Kameoka, Takehiro Moriya:
Resolution Warped Spectral Representation for Low-Delay and Low-Bit-Rate Audio Coder. IEEE ACM Trans. Audio Speech Lang. Process. 23(2): 288-299 (2015) - [j15]Daichi Kitamura, Hiroshi Saruwatari, Hirokazu Kameoka, Yu Takahashi, Kazunobu Kondo, Satoshi Nakamura:
Multichannel Signal Separation Combining Directional Clustering and Nonnegative Matrix Factorization with Spectrogram Restoration. IEEE ACM Trans. Audio Speech Lang. Process. 23(4): 654-669 (2015) - [j14]Hirokazu Kameoka, Kota Yoshizato, Tatsuma Ishihara, Kento Kadowaki, Yasunori Ohishi, Kunio Kashino:
Generative Modeling of Voice Fundamental Frequency Contours. IEEE ACM Trans. Audio Speech Lang. Process. 23(6): 1042-1053 (2015) - [j13]Ryosuke Sugiura, Yutaka Kamamoto, Noboru Harada, Hirokazu Kameoka, Takehiro Moriya:
Optimal Coding of Generalized-Gaussian-Distributed Frequency Spectra for Low-Delay Audio Coder With Powered All-Pole Spectrum Estimation. IEEE ACM Trans. Audio Speech Lang. Process. 23(8): 1309-1321 (2015) - [c75]Daichi Kitamura, Nobutaka Ono, Hiroshi Sawada, Hirokazu Kameoka, Hiroshi Saruwatari:
Relaxation of rank-1 spatial constraint in overdetermined blind source separation. EUSIPCO 2015: 1261-1265 - [c74]Takuya Higuchi, Hirokazu Kameoka:
Unified approach for audio source separation with multichannel factorial HMM and DOA mixture model. EUSIPCO 2015: 2043-2047 - [c73]Hirokazu Kameoka:
Multi-resolution signal decomposition with time-domain spectrogram factorization. ICASSP 2015: 86-90 - [c72]Daichi Kitamura, Nobutaka Ono, Hiroshi Sawada, Hirokazu Kameoka, Hiroshi Saruwatari:
Efficient multichannel nonnegative matrix factorization exploiting rank-1 spatial model. ICASSP 2015: 276-280 - [c71]Tomohiko Nakamura, Hirokazu Kameoka:
Lp-norm non-negative matrix factorization and its application to singing voice enhancement. ICASSP 2015: 2115-2119 - [c70]Hirokazu Kameoka:
Modeling speech parameter sequences with latent trajectory Hidden Markov model. MLSP 2015: 1-6 - 2014
- [j12]Hideyuki Tachibana, Nobutaka Ono, Hirokazu Kameoka, Shigeki Sagayama:
Harmonic/percussive sound separation based on anisotropic smoothness of spectrograms. IEEE ACM Trans. Audio Speech Lang. Process. 22(12): 2059-2073 (2014) - [c69]Daichi Kitamura, Hiroshi Saruwatari, Satoshi Nakamura, Yu Takahashi, Kazunobu Kondo, Hirokazu Kameoka:
Hybrid multichannel signal separation using supervised nonnegative matrix factorization with spectrogram restoration. APSIPA 2014: 1-10 - [c68]Tomohiko Nakamura, Hirokazu Kameoka:
Fast Signal Reconstruction from Magnitude Spectrogram of Continuous Wavelet Transform Based on Spectrogram Consistency. DAFx 2014: 129-135 - [c67]Ryosuke Sugiura, Yutaka Kamamoto, Noboru Harada, Hirokazu Kameoka, Takehiro Moriya:
Representation of spectral envelope with warped frequency resolution for audio coder. EUSIPCO 2014: 51-55 - [c66]Ryosuke Sugiura, Yutaka Kamamoto, Noboru Harada, Hirokazu Kameoka, Takehiro Moriya:
Direct linear conversion of LSP parameters for perceptual control in speech and audio coding. EUSIPCO 2014: 56-60 - [c65]Takuya Higuchi, Hirokazu Kameoka:
Unified approach for underdetermined BSS, VAD, dereverberation and DOA estimation with multichannel factorial HMM. GlobalSIP 2014: 562-566 - [c64]Ryosuke Sugiura, Yutaka Kamamoto, Noboru Harada, Hirokazu Kameoka, Takehiro Moriya:
Golomb-rice coding optimized via LPC for frequency domain audio coder. GlobalSIP 2014: 1024-1028 - [c63]Daichi Kitamura, Hiroshi Saruwatari, Satoshi Nakamura, Yu Takahashi, Kazunobu Kondo, Hirokazu Kameoka:
Divergence optimization in nonnegative matrix factorization with spectrogram restoration for multichannel signal separation. HSCMA 2014: 92-96 - [c62]Masahiro Nakano, Yasunori Ohishi, Hirokazu Kameoka, Ryo Mukai, Kunio Kashino:
Mondrian hidden Markov model for music signal processing. ICASSP 2014: 2405-2409 - [c61]Takuya Higuchi, Norihiro Takamune, Tomohiko Nakamura, Hirokazu Kameoka:
Underdetermined blind separation and tracking of moving sources based on DOA-HMM. ICASSP 2014: 3191-3195 - [c60]Yasunori Ohishi, Daichi Mochihashi, Hirokazu Kameoka, Kunio Kashino:
Mixture of Gaussian process experts for predicting sung melodic contour with expressive dynamic fluctuations. ICASSP 2014: 3714-3718 - [c59]Tomohiko Nakamura, Hirokazu Kameoka, Kazuyoshi Yoshii, Masataka Goto:
Timbre replacement of harmonic and drum components for music audio signals. ICASSP 2014: 7470-7474 - [c58]Takuya Higuchi, Hirofumi Takeda, Tomohiko Nakamura, Hirokazu Kameoka:
A unified approach for underdetermined blind signal separation and source activity detection by multichannel factorial hidden Markov models. INTERSPEECH 2014: 850-854 - [c57]Kento Kadowaki, Tatsuma Ishihara, Nobukatsu Hojo, Hirokazu Kameoka:
Speech prosody generation for text-to-speech synthesis based on generative model of F0 contours. INTERSPEECH 2014: 2322-2326 - [c56]Tomohiko Nakamura, Kotaro Shikata, Norihiro Takamune, Hirokazu Kameoka:
Harmonic-Temporal Factor Decomposition Incorporating Music Prior Information for Informed Monaural Source Separation. ISMIR 2014: 623-628 - [c55]Takuya Higuchi, Hirokazu Kameoka:
Joint audio source separation and dereverberation based on multichannel factorial hidden Markov model. MLSP 2014: 1-6 - [c54]Hirokazu Kameoka, Norihiro Takamune:
Training Restricted Boltzmann Machines with auxiliary function approach. MLSP 2014: 1-6 - [c53]Norihiro Takamune, Hirokazu Kameoka:
Maximum reconstruction probability training of Restricted Boltzmann machines with auxiliary function approach. MLSP 2014: 1-6 - 2013
- [j11]Hirokazu Kameoka, Misa Sato, Takuma Ono, Nobutaka Ono, Shigeki Sagayama:
Bayesian Nonparametric Approach to Blind Separation of Infinitely Many Sparse Sources. IEICE Trans. Fundam. Electron. Commun. Comput. Sci. 96-A(10): 1928-1937 (2013) - [j10]Akisato Kimura, Masashi Sugiyama, Takuho Nakano, Hirokazu Kameoka, Hitoshi Sakano, Eisaku Maeda, Katsuhiko Ishiguro:
SemiCCA: Efficient Semi-supervised Learning of Canonical Correlations. Inf. Media Technol. 8(2): 311-318 (2013) - [j9]Akisato Kimura, Masashi Sugiyama, Hitoshi Sakano, Hirokazu Kameoka:
Designing Various Multivariate Analysis at Will via Generalized Pairwise Expression. Inf. Media Technol. 8(2): 319-328 (2013) - [j8]Gen Hori, Hirokazu Kameoka, Shigeki Sagayama:
Input-Output HMM Applied to Automatic Arrangement for Guitars. Inf. Media Technol. 8(2): 477-484 (2013) - [j7]Gen Hori, Hirokazu Kameoka, Shigeki Sagayama:
Input-Output HMM Applied to Automatic Arrangement for Guitars. J. Inf. Process. 21(2): 264-271 (2013) - [j6]Hiroshi Sawada, Hirokazu Kameoka, Shoko Araki, Naonori Ueda:
Multichannel Extensions of Non-Negative Matrix Factorization With Complex-Valued Data. IEEE Trans. Speech Audio Process. 21(5): 971-982 (2013) - [c52]Masato Tsuchiya, Kazuki Ochiai, Hirokazu Kameoka, Shigeki Sagayama:
Probabilistic model of two-dimensional rhythm tree structure representation for automatic transcription of polyphonic MIDI signals. APSIPA 2013: 1-6 - [c51]Yasunori Ohishi, Daichi Mochihashi, Tomoko Matsui, Masahiro Nakano, Hirokazu Kameoka, Tomonori Izumitani, Kunio Kashino:
Bayesian semi-supervised audio event transcription based on Markov indian buffet process. ICASSP 2013: 3163-3167 - [c50]Tatsuma Ishihara, Hirokazu Kameoka, Kota Yoshizato, Daisuke Saito, Shigeki Sagayama:
Probabilistic speech F0 contour model incorporating statistical vocabulary model of phrase-accent command sequence. INTERSPEECH 2013: 1017-1021 - [c49]Hirokazu Kameoka, Kota Yoshizato, Tatsuma Ishihara, Yasunori Ohishi, Kunio Kashino, Shigeki Sagayama:
Generative modeling of speech F0 contours. INTERSPEECH 2013: 1826-1830 - [c48]Nobukatsu Hojo, Kota Yoshizato, Hirokazu Kameoka, Daisuke Saito, Shigeki Sagayama:
Text-to-speech synthesizer based on combination of composite wavelet and hidden Markov models. SSW 2013: 129-134 - 2012
- [c47]Kazuki Ochiai, Hirokazu Kameoka, Shigeki Sagayama:
Explicit beat structure modeling for non-negative matrix factorization-based multipitch analysis. ICASSP 2012: 133-136 - [c46]Hiroshi Sawada, Hirokazu Kameoka, Shoko Araki, Naonori Ueda:
Efficient algorithms for multichannel extensions of Itakura-Saito nonnegative matrix factorization. ICASSP 2012: 261-264 - [c45]Masahiro Nakano, Yasunori Ohishi, Hirokazu Kameoka, Ryo Mukai, Kunio Kashino:
Bayesian nonparametric music parser. ICASSP 2012: 461-464 - [c44]Hideyuki Tachibana, Hirokazu Kameoka, Nobutaka Ono, Shigeki Sagayama:
Comparative evaluations of various harmonic/percussive sound separation algorithms based on anisotropic continuity of spectrogram. ICASSP 2012: 465-468 - [c43]Hirokazu Kameoka, Masahiro Nakano, Kazuki Ochiai, Yutaka Imoto, Kunio Kashino, Shigeki Sagayama:
Constrained and regularized variants of non-negative matrix factorization incorporating music-specific constraints. ICASSP 2012: 5365-5368 - [c42]Akisato Kimura, Hitoshi Sakano, Hirokazu Kameoka, Masashi Sugiyama:
Designing various component analysis at will. ICPR 2012: 2959-2962 - [c41]Kota Yoshizato, Hirokazu Kameoka, Daisuke Saito, Shigeki Sagayama:
Hidden Markov Convolutive Mixture Model for Pitch Contour Analysis of Speech. INTERSPEECH 2012: 390-393 - [c40]Yasunori Ohishi, Hirokazu Kameoka, Daichi Mochihashi, Kunio Kashino:
A Stochastic Model of Singing Voice F0 Contours for Characterizing Expressive Dynamic Components. INTERSPEECH 2012: 474-477 - [c39]Hirokazu Kameoka, Kazuki Ochiai, Masahiro Nakano, Masato Tsuchiya, Shigeki Sagayama:
Context-free 2D Tree Structure Model of Musical Notes for Bayesian Modeling of Polyphonic Spectrograms. ISMIR 2012: 307-312 - [c38]Hirokazu Kameoka, Misa Sato, Takuma Ono, Nobutaka Ono, Shigeki Sagayama:
Blind Separation of Infinitely Many Sparse Sources. IWAENC 2012 - [i1]Akisato Kimura, Masashi Sugiyama, Hitoshi Sakano, Hirokazu Kameoka:
Designing various component analysis at will. CoRR abs/1207.3554 (2012) - 2011
- [j5]Jonathan Le Roux, Hirokazu Kameoka, Nobutaka Ono, Alain de Cheveigné, Shigeki Sagayama:
Computational auditory induction as a missing-data model-fitting problem with Bregman divergence. Speech Commun. 53(5): 658-676 (2011) - [c37]Hiroshi Sawada, Hirokazu Kameoka, Shoko Araki, Naonori Ueda:
Formulations and algorithms for multichannel complex NMF. ICASSP 2011: 229-232 - [c36]Naoki Yasuraoka, Hirokazu Kameoka, Takuya Yoshioka, Hiroshi G. Okuno:
I-Divergence-based dereverberation method with auxiliary function approach. ICASSP 2011: 369-372 - [c35]Masahiro Nakano, Jonathan Le Roux, Hirokazu Kameoka, Nobutaka Ono, Shigeki Sagayama:
Infinite-state spectrum model for music signal analysis. ICASSP 2011: 1972-1975 - [c34]Jun Takagi, Yasunori Ohishi, Akisato Kimura, Masashi Sugiyama, Makoto Yamada, Hirokazu Kameoka:
Automatic audio tag classification via semi-supervised canonical density estimation. ICASSP 2011: 2232-2235 - [c33]Takuho Nakano, Akisato Kimura, Hirokazu Kameoka, Shigeki Miyabe, Shigeki Sagayama, Nobutaka Ono, Kunio Kashino, Takuya Nishimoto:
Automatic video annotation via Hierarchical Topic Trajectory Model considering cross-modal correlations. ICASSP 2011: 2380-2383 - [c32]Hiroshi Sawada, Hirokazu Kameoka, Shoko Araki, Naonori Ueda:
New formulations and efficient algorithms for multichannel NMF. WASPAA 2011: 153-156 - [c31]Masahiro Nakano, Jonathan Le Roux, Hirokazu Kameoka, Tomohiko Nakamura, Nobutaka Ono, Shigeki Sagayama:
Bayesian nonparametric spectrogram modeling based on infinite factorial infinite hidden Markov model. WASPAA 2011: 325-328 - 2010
- [j4]Hirokazu Kameoka, Nobutaka Ono, Shigeki Sagayama:
Speech Spectrum Modeling for Joint Estimation of Spectral Envelope and Fundamental Frequency. IEEE Trans. Speech Audio Process. 18(6): 1507-1516 (2010) - [c30]Jonathan Le Roux, Emmanuel Vincent, Yuu Mizuno, Hirokazu Kameoka, Nobutaka Ono, Shigeki Sagayama:
Consistent Wiener Filtering: Generalized Time-Frequency Masking Respecting Spectrogram Consistency. LVA/ICA 2010: 89-96 - [c29]Masahiro Nakano, Jonathan Le Roux, Hirokazu Kameoka, Yu Kitano, Nobutaka Ono, Shigeki Sagayama:
Nonnegative Matrix Factorization with Markov-Chained Bases for Modeling Time-Varying Patterns in Music Spectrograms. LVA/ICA 2010: 149-156 - [c28]Hirokazu Kameoka, Takuya Yoshioka, Mariko Hamamura, Jonathan Le Roux, Kunio Kashino:
Statistical Model of Speech Signals Based on Composite Autoregressive System with Application to Blind Source Separation. LVA/ICA 2010: 245-253 - [c27]Yu Kitano, Hirokazu Kameoka, Yosuke Izumi, Nobutaka Ono, Shigeki Sagayama:
A sparse component model of source signals and its application to blind source separation. ICASSP 2010: 4122-4125 - [c26]Akisato Kimura, Hirokazu Kameoka, Masashi Sugiyama, Takuho Nakano, Eisaku Maeda, Hitoshi Sakano, Katsuhiko Ishiguro:
SemiCCA: Efficient Semi-supervised Learning of Canonical Correlations. ICPR 2010: 2933-2936 - [c25]Hirokazu Kameoka, Jonathan Le Roux, Yasunori Ohishi:
A statistical model of speech F0 contours. SAPA@INTERSPEECH 2010: 43-48 - [c24]Yasunori Ohishi, Hirokazu Kameoka, Daichi Mochihashi, Hidehisa Nagano, Kunio Kashino:
Statistical modeling of F0 dynamics in singing voices based on Gaussian processes with multiple oscillation bases. INTERSPEECH 2010: 2598-2601 - [c23]Takuho Nakano, Shigeki Sagayama, Nobutaka Ono, Akisato Kimura, Hirokazu Kameoka, Kunio Kashino:
Semantic Indexing and Known Item Search Based on a Unified Model with Topic Transition Representation. TRECVID 2010 - [p1]Nobutaka Ono, Kenichi Miyamoto, Hirokazu Kameoka, Jonathan Le Roux, Yuuki Uchiyama, Emiru Tsunoo, Takuya Nishimoto, Shigeki Sagayama:
Harmonic and Percussive Sound Separation and Its Application to MIR-Related Tasks. Advances in Music Information Retrieval 2010: 213-236
2000 – 2009
- 2009
- [c22]Hirokazu Kameoka, Tomohiro Nakatani, Takuya Yoshioka:
Robust speech dereverberation based on non-negativity and sparse nature of speech spectrograms. ICASSP 2009: 45-48 - [c21]Hirokazu Kameoka, Nobutaka Ono, Kunio Kashino, Shigeki Sagayama:
Complex NMF: A new sparse representation for acoustic signals. ICASSP 2009: 3437-3440 - [c20]Hirokazu Kameoka, Kunio Kashino:
Composite Autoregressive System for Sparse Source-filter Representation of speech. ISCAS 2009: 2477-2480 - [c19]Tatsuya Kako, Yasunori Ohishi, Hirokazu Kameoka, Kunio Kashino, Kazuya Takeda:
Automatic Identification for Singing Style based on Sung Melodic Contour Characterized in Phase Plane. ISMIR 2009: 393-398 - [c18]Takuya Yoshioka, Hirokazu Kameoka, Tomohiro Nakatani, Hiroshi G. Okuno:
Statistical models for speech dereverberation. WASPAA 2009: 145-148 - 2008
- [j3]Shoichiro Saito, Hirokazu Kameoka, Keigo Takahashi, Takuya Nishimoto, Shigeki Sagayama:
Specmurt Analysis of Polyphonic Music Signals. IEEE Trans. Speech Audio Process. 16(3): 639-650 (2008) - [c17]Nobutaka Ono, Kenichi Miyamoto, Jonathan Le Roux, Hirokazu Kameoka, Shigeki Sagayama:
Separation of a monaural audio signal into harmonic/percussive components by complementary diffusion on spectrogram. EUSIPCO 2008: 1-4 - [c16]Hirokazu Kameoka, Nobutaka Ono, Shigeki Sagayama:
Auxiliary function approach to parameter estimation of constrained sinusoidal model for monaural speech separation. ICASSP 2008: 29-32 - [c15]Kenichi Miyamoto, Hirokazu Kameoka, Takuya Nishimoto, Nobutaka Ono, Shigeki Sagayama:
Harmonic-Temporal-Timbral Clustering (HTTC) for the analysis of multi-instrument polyphonic music signals. ICASSP 2008: 113-116 - [c14]Jonathan Le Roux, Hirokazu Kameoka, Nobutaka Ono, Shigeki Sagayama, Alain de Cheveigné:
Modulation analysis of speech through orthogonal FIR filterbank optimization. ICASSP 2008: 4189-4192 - [c13]Jonathan Le Roux, Hirokazu Kameoka, Nobutaka Ono, Alain de Cheveigné, Shigeki Sagayama:
Computational auditory induction by missing-data non-negative matrix factorization. SAPA@INTERSPEECH 2008: 1-6 - [c12]Yasunori Ohishi, Hirokazu Kameoka, Kunio Kashino, Kazuya Takeda:
Parameter estimation method of F0 control model for singing voices. INTERSPEECH 2008: 139-142 - [c11]Nobutaka Ono, Kenichi Miyamoto, Hirokazu Kameoka, Shigeki Sagayama:
A Real-time Equalizer of Harmonic and Percussive Components in Music Signals. ISMIR 2008: 139-144 - 2007
- [j2]Hirokazu Kameoka, Takuya Nishimoto, Shigeki Sagayama:
A Multipitch Analyzer Based on Harmonic Temporal Structured Clustering. IEEE Trans. Speech Audio Process. 15(3): 982-994 (2007) - [j1]Jonathan Le Roux, Hirokazu Kameoka, Nobutaka Ono, Alain de Cheveigné, Shigeki Sagayama:
Single and Multiple F0 Contour Estimation Through Parametric Spectrogram Modeling of Speech in Noisy Environments. IEEE Trans. Speech Audio Process. 15(4): 1135-1145 (2007) - [c10]Kenichi Miyamoto, Hirokazu Kameoka, Haruto Takeda, Takuya Nishimoto, Shigeki Sagayama:
Probabilistic Approach to Automatic Music Transcription from Audio Signals. ICASSP (2) 2007: 697-700 - [c9]Jonathan Le Roux, Hirokazu Kameoka, Nobutaka Ono, Alain de Cheveigné, Shigeki Sagayama:
Harmonic-Temporal Clustering of Speech for Single and Multiple F0 Contour Estimation in Noisy Environments. ICASSP (4) 2007: 1053-1056 - [c8]Yuichiro Yonebayashi, Hirokazu Kameoka, Shigeki Sagayama:
Automatic Decision of Piano Fingering Based on a Hidden Markov Models. IJCAI 2007: 2915-2921 - 2006
- [c7]Hirokazu Kameoka, Jonathan Le Roux, Nobutaka Ono, Shigeki Sagayama:
Speech analyzer using a joint estimation model of spectral envelope and fine structure. INTERSPEECH 2006 - 2005
- [c6]Hirokazu Kameoka, Takuya Nishimoto, Shigeki Sagayama:
Audio stream segregation of multi-pitch music signal based on time-space clustering using Gaussian kernel 2-dimensional model. ICASSP (3) 2005: 5-8 - [c5]Shoichiro Saito, Hirokazu Kameoka, Takuya Nishimoto, Shigeki Sagayama:
Specmurt Analysis of Multi-Pitch Music Signals with Adaptive Estimation of Common Harmonic Structure. ISMIR 2005: 84-91 - [c4]Hirokazu Kameoka, Takuya Nishimoto, Shigeki Sagayama:
Harmonic-Temporal Clustering via Deterministic Annealing EM Algorithm for Audio Feature Extraction. ISMIR 2005: 115-122 - 2004
- [c3]Hirokazu Kameoka, Takuya Nishimoto, Shigeki Sagayama:
Separation of harmonic structures based on tied Gaussian mixture model and information criterion for concurrent sounds. ICASSP (4) 2004: 297-300 - [c2]Shigeki Sagayama, Keigo Takahashi, Hirokazu Kameoka, Takuya Nishimoto:
Specmurt anasylis: a piano-roll-visualization of polyphonic music signal by deconvolution of log-frequency spectrum. SAPA@INTERSPEECH 2004: 128 - [c1]Takuya Nishimoto, Shigeki Sagayama, Hirokazu Kameoka:
Multi-pitch trajectory estimation of concurrent speech based on harmonic GMM and nonlinear Kalman filtering. INTERSPEECH 2004: 2433-2436
last updated on 2025-03-04 21:13 CET by the dblp team
all metadata released as open data under CC0 1.0 license