National Central University Faculty Résumé Platform

Faculty Profile

王家慶

Wang, Jia-Ching

jiacwang@gmail.com

Other Grants

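Talent Cultivation Program for Intelligent Speech and Language Processing Technology R&D
1140801~1150731
[Artificial Intelligence, Speech Recognition, Talent Cultivation]
Talent Cultivation Program for Intelligent Spoken Language Processing Technology R&D
1120801~1130731
[Intelligent Spoken Language Processing]
Talent Cultivation Program for Intelligent Spoken Language Processing Technology R&D
1110801~1120731
[Intelligent Spoken Language Processing]
Inter-University Teaching Alliance Program on Intelligent Chip Systems and Applications: Module Teaching Material Development Project
1110401~1120331
[Intelligent Chip]
Hakka Translation App for Hakka-Friendly Communication and Assisted Learning
1110110~1110831
[Hakka, Translation]
Development of Intelligent Speech and Image Recognition Functions for the Language of Hakka Ethnobotanical Knowledge
1100801~1110731
[Speech Recognition, Image Recognition, Hakka, Plant]
Inter-University Teaching Alliance Program on Intelligent Terminal Device Chip Systems and Applications: Module Teaching Material Development Project
1100701~1110331
[Intelligent Chip]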

National Science and Technology Council (NSTC) Projects

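Defense Techniques Against Malicious Speech and Text Attacks Based on Generative Models
1160801~1170731
Defense Techniques Against Malicious Speech and Text Attacks Based on Generative Models
1150801~1160731
AI Speech Technology and Industrial Applications Alliance (3/3)
1150201~1160131
Research-Based Entrepreneurship Program: Commercialization Case of Lip-Shape-Assisted Speech Recognition
1150101~1151231
Defense Techniques Against Malicious Speech and Text Attacks Based on Generative Models
1140801~1150731
Open Spoken Question-Answering System for Low-Resource Languages
1140801~1150731
Design of Lightweight Multimodal Large Language Models and Heterogeneous Acceleration Platforms for Edge Devices and Their Applications in Life Sciences (2/2)
1140501~1150430
[Multimodal]
AI Speech Technology and Industrial Applications Alliance (2/3)
1140201~1150131
[Speech Denoising, Speech Recognition]
Open Spoken Question-Answering System for Low-Resource Languages
1130801~1140731
Multimodal Speech Recognition Technology
1130601~1140531
[Multimodal Speech Recognition, Acoustic Model, Masked Language Modeling]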
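Design of Lightweight Multimodal Large Language Models and Heterogeneous Acceleration Platforms for Edge Devices and Their Applications in Life Sciences (1/2)
1130501~1140731
[Multimodal]
Open Spoken Question-Answering System for Low-Resource Languages
1120801~1130731
Low-Resource and Code-Switching Speech Recognition Based on Large-Scale Unsupervised Pre-trained Models
1120801~1130731
[Speech Recognition]
Building a Translational-Research-Oriented Breast Cancer Data and Biological Database Platform (3/4)
1120501~1130430
[Breast Cancer Database Platform]
Low-Resource and Code-Switching Speech Recognition Based on Large-Scale Unsupervised Pre-trained Models
1110801~1120731
[Speech Recognition]
SMA-SRGAN: A Super-Resolution Generative Adversarial Network Based on a Spatial-Domain Mask Attention Mechanism for Biometric Image Generation
1100801~1110731
[Biometrics]
Low-Resource and Code-Switching Speech Recognition Based on Large-Scale Unsupervised Pre-trained Models
1100801~1110731
[Speech Recognition]
Intelligent Care Interaction System: Spoken Language Processing Technology Based on Deep Intelligence (4/4)
1100101~1110331
[Intelligent Care Interaction System]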
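Intelligent Care Interaction System: Spoken Language Processing Technology Based on Deep Intelligence (3/4)
1090101~1091231
Intelligent Care Interaction System: Spoken Language Processing Technology Based on Deep Intelligence (2/4)
1080101~1081231
Intelligent Care Interaction System: Spoken Language Processing Technology Based on Deep Intelligence (1/4)
1070101~1071231
[Spoken language processing, speech separation, code-switching speech recognition, spoken language translation, speech emotion recognition, dialogue system, deep learning]
Deep Learning-Based Multimedia Data Fusion and Analysis, Subproject 6: Research on Deep Learning-Based Music Emotion Recognition
1060801~1070228
Research on Deep Learning for Blind Source Separation and Speech Enhancement
1060801~1070731
[Deep Learning, Speech Enhancement, Blind Source Separation]
Spoken-Interaction Home Companion and Recommendation System for Seniors
1051101~1061031
Research on Deep Learning for Blind Source Separation and Speech Enhancement
1050801~1060731
[Deep Learning, Speech Enhancement, Blind Source Separation]
Research on Deep Learning for Blind Source Separation and Speech Enhancement
1040801~1050731
[Deep Learning, Speech Enhancement, Blind Source Separation]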

Industry-Academia Collaboration Projects

Research Proposal on Artificial Intelligence for Spoken Language Processing
1150428~1151027
[Generative AI, Natural Language Processing]
Deployment and Validation of Common Smart Application Modules for the Electronics Manufacturing Industry
1140919~1141130
[Smart Manufacturing, AI Applications]
Sunplus-NCU Joint Laboratory
1130901~1150630
[Voiceprint Cloning]
AI Model Testing
1130501~1131031
[Testing AI Models]
Speech Recognition
1120901~1130610
[Speech Recognition]
Artificial Intelligence Technology
1120801~1131231
[Artificial Intelligence Technology]
Innolux-NCU Joint Laboratory: Fingerprint Recognition Technology and Development Platform Construction
1110801~1120930
[Fingerprint Recognition Technique]
110 NCU-Landseed Joint R&D Center Project: Automatic Verification System for General X-ray Images
1100801~1110731
[Image verification, left-right reversal, QC process]

Journal Publications

Wind Noise Reduction Based on the Double Masking and Permutation-Invariant Training
Electronics (Switzerland), 15, 5, 2026-03-01
[ dual masking learning,non-stationary noise,permutation-invariant training,wind noise reduction ]
YOWOv3: An Efficient and Generalized Framework for Spatiotemporal Action Detection
IEEE Intelligent Systems, 41, 1, 75-85, 2026-01-01
Hardware Implementation of Improved Oriented FAST and Rotated BRIEF-Simultaneous Localization and Mapping Version 2
Sensors, 25, 20, 2025-10-01
[ machine vision,ORB-SLAM2,Raspberry Pi 3,SLAM,spatial scene construction ]
A Hybrid Deep Learning and Feature Descriptor Approach for Partial Fingerprint Recognition
Electronics (Switzerland), 14, 9, 2025-05-01
[ biometric authentication,convolutional neural networks,deep learning,feature descriptor,partial fingerprint recognition ]
MLSS: Mandarin English Code-Switching Speech Recognition via Mutual Learning-Based Semi-Supervised Method
IEEE Signal Processing Letters, 32, 1510-1514, 2025-01-01
[ Automatic speech recognition,code-switching speech recognition,mutual learning-based semi-supervised learning,semi-supervised learning ]
HAPiCLR: heuristic attention pixel-level contrastive loss representation learning for self-supervised pretraining
Visual Computer, 40, 11, 7945-7960, 2024-11-01
[ Object contextual representation,Pixel-level attention,Pixel-level contrastive learning,Self-supervised learning,Visual representation learning ]
Editorial for Special Issue on Invited Papers from APSIPA ASC 2023
APSIPA Transactions on Signal and Information Processing, 13, 5, 2024-10-07
Implementation of Sound Direction Detection and Mixed Source Separation in Embedded Systems
Sensors, 24, 13, 2024-07-01
[ embedded systems,hybrid sound source separation,position detection,signal-to-interference ratio (SIR),speech recognition ]
Few-Shot Image Segmentation Using Generating Mask with Meta-Learning Classifier Weight Transformer Network
Electronics (Switzerland), 13, 13, 2024-07-01
[ few-shot image segmentation,few-shot learning,meta-learning,semantic segmentation ]
Audio Pre-Processing and Beamforming Implementation on Embedded Systems
Electronics (Switzerland), 13, 14, 2024-07-01
[ audio pre-processing,beamforming,embedded system,enhance speech ]
Semantic-Based Public Opinion Analysis System
Electronics (Switzerland), 13, 11, 2024-06-01
[ K-nearest neighbor algorithm,sentence analysis,support vector machines,topic input and commentary ]
Zero-FVeinNet: Optimizing Finger Vein Recognition with Shallow CNNs and Zero-Shuffle Attention for Low-Computational Devices
Electronics (Switzerland), 13, 9, 2024-05-01
[ attention,biometrical verification,convolution neural network,finger vein,lightweight model ]
Multi-view and multi-augmentation for self-supervised visual representation learning
Applied Intelligence, 54, 1, 629-656, 2024-01-01
[ Data augmentation policies,Metric learning,Multi-augmentation,Nuisance factors,Scale-invariant representation learning,SSL augmentation pipelines ]
Target Speaker Extraction Using Attention-Enhanced Temporal Convolutional Network
Electronics (Switzerland), 13, 2, 2024-01-01
[ automatic speech recognition (ASR),convolutional neural network (CNN),deep learning,target speaker extraction,temporal convolutional network (TCN) ]
Enhancing Breast Cancer Detection: A Novel Training Strategy and Batch Scheduler Method
Proceedings - IEEE International Conference on Advanced Video and Signal-Based Surveillance, AVSS, 2024, 2024-01-01
[ Batch Scheduler,ConvNext,Dynamic Batch Size,F1 Score,Pretrain ]
LCSL: Long-Tailed Classification via Self-Labeling
IEEE Transactions on Circuits and Systems for Video Technology, 34, 11, 12048-12058, 2024-01-01
[ Image classification,imbalance classification,long-tailed problem,self-labeling ]
Anti-Aliasing Attention U-net Model for Skin Lesion Segmentation
Diagnostics, 13, 8, 2023-04-01
[ computer-aided diagnosis,deep learning,light-weight model,medical internet of things,skin lesion segmentation ]
Electrocardiogram Heartbeat Classification for Arrhythmias and Myocardial Infarction
Sensors, 23, 6, 2023-03-01
[ deep learning,electrocardiogram (ECG) classification,MIT-BIH dataset,PTB dataset ]
The COVIDTW study: Clinical predictors of COVID-19 mortality and a novel AI prognostic model using chest X-ray
Journal of the Formosan Medical Association, 122, 3, 267-275, 2023-03-01
[ Artificial intelligence,Chest X-rays,COVID-19,Intensive care unit,Mortality,Prognosis ]
Anti-aliasing convolution neural network of finger vein recognition for virtual reality (VR) human–robot equipment of metaverse
Journal of Supercomputing, 79, 3, 2767-2782, 2023-02-01
[ Anti-aliasing,Biometrics,Convolution network,Deep learning,Finger vein recognition,Image processing,Metaverse,Pre-processing,Virtual reality (VR) human–robot ]
Deep Learning for Human Action Recognition: A Comprehensive Review
APSIPA Transactions on Signal and Information Processing, 12, 1, 2023-01-01
[ Action recognition,deep learning,deep neural networks,self-supervised learning,supervised learning ]
Editorial for the Special Issue on Learning, Security, AIoT for Emerging Communication/Networking Systems
APSIPA Transactions on Signal and Information Processing, 12, 2, 2023-01-01
Cyclic Transfer Learning for Mandarin-English Code-Switching Speech Recognition
IEEE Signal Processing Letters, 30, 1387-1391, 2023-01-01
[ code-switching speech recognition,cyclic transfer learning,Speech recognition,transfer learning ]
A DEEP LEARNING-BASED FAKE NEWS DETECTING SYSTEM
IET Conference Proceedings, 2023, 35, 172-173, 2023-01-01
[ data augmentation,deep learning,Fake news detection,word embedding ]
Fast Gated Recurrent Network for Speech Synthesis
IEICE Transactions on Information and Systems, E105D, 9, 1634-1638, 2022-09-01
[ acoustic modelling,gated recurrent neural network,long short-term memory,speech synthesis ]
Heuristic Attention Representation Learning for Self-Supervised Pretraining
Sensors, 22, 14, 2022-07-10
[ computer vision,deep learning,heuristic attention,perceptual grouping,self-supervised learning,visual representation learning ]
Spectral-Temporal Receptive Field-Based Descriptors and Hierarchical Cascade Deep Belief Network for Guitar Playing Technique Classification
IEEE Transactions on Cybernetics, 52, 5, 3684-3695, 2022-05-01
Self-Supervised Learning Framework toward State-of-the-Art Iris Image Segmentation
Sensors, 22, 6, 2022-03-01
Convolutional Blur Attention Network for Cell Nuclei Segmentation
Sensors, 22, 4, 2022-02-01
[ Cell nuclei,Convolutional neural network,Deep learning,Nucleus segmentation ]
Speech Separation Using Augmented-Discrimination Learning on Squash-Norm Embedding Vector and Node Encoder
IEEE Access, 10, 102048-102063, 2022-01-01
[ deep clustering,monophonic source separation,Speaker separation,speech enhancement,supervised speech separation,time frequency masking ]
Antialiasing Attention Spatial Convolution Model for Skin Lesion Segmentation with Applications in the Medical IoT
Wireless Communications and Mobile Computing, 2022, 2022-01-01
Sanders classification of calcaneal fractures in CT images with deep learning and differential data augmentation techniques
Injury, 52, 3, 616-624, 2021-03-01
[ computer-aided classification system ]
Teaching Yourself: A Self-Knowledge Distillation Approach to Action Recognition
IEEE Access, 9, 105711-105723, 2021-01-01
[ action recognition,convolutional neural network,deep learning,knowledge distillation,Self-knowledge distillation,self-learning ]
Ensemble and Multimodal Learning for Pathological Voice Classification
IEEE Sensors Letters, 2021-01-01
[ acoustic signal,Acoustics,binary classification,ensemble learning,Medical diagnostic imaging,Neoplasms,pathological voice,Pathology,Stacking,Standards,Support vector machines ]
A Calibration-Free 14-b 0.7-mW 100-MS/s Pipelined-SAR ADC Using a Weighted-Averaging Correlated Level Shifting Technique
IEEE Journal of Solid-State Circuits, 55, 12, 3271-3280, 2020-12-01
Sound Events Recognition and Retrieval Using Multi-Convolutional-Channel Sparse Coding Convolutional Neural Networks
IEEE/ACM Transactions on Audio Speech and Language Processing, 28, 1875-1887, 2020-01-01
[ deep learning,Sound event recognition,sound event retrieval,sparse coding convolutional neural network ]
Embedded draw-down constraint using ensemble learning for stock trading
Journal of Intelligent and Fuzzy Systems, 38, 5, 5651-5659, 2020-01-01
[ ensemble learning,Kelly criterion,money management,Monte Carlo simulation ]
Large Basic Cone and Sparse Subspace Constrained Nonnegative Matrix Factorization with Kullback-Leibler Divergence for Data Representation
IEEE Intelligent Systems, 34, 4, 39-47, 2019-07-01
[ Data representation , face recognition , facial expression recognition , nonnegative matrix factorization , projected gradient descent ]
Deep learning and SURF for automated classification and detection of calcaneus fractures in CT images
Computer Methods and Programs in Biomedicine, 171, 27-37, 2019-04-01
[ Calcaneus fracture,Computed tomography image,Convolutional neural networks,Residual network,Visual geometry group ]
Locality preserved joint nonnegative matrix factorization for speech emotion recognition
IEICE Transactions on Information and Systems, E102D, 4, 821-825, 2019-04-01
[ Information extraction , Joint dictionary learning , Locality preserving , NMF , Speech emotion recognition ]
Projective complex matrix factorization for facial expression recognition
Eurasip Journal on Advances in Signal Processing, 2018, 1, 2018-12-01
[ Complex matrix factorization , Facial expression recognition , Nonnegative matrix factorization , Projected gradient descent ]
Predicting the Probability Density Function of Music Emotion Using Emotion Space Mapping
IEEE Transactions on Affective Computing, 9, 4, 541-549, 2018-10-01
[ Emotion in music , emotion recognition from audio , predictive model and algorithm , valence-Arousal space ]
Sound Event Recognition Using Auditory-Receptive-Field Binary Pattern and Hierarchical-Diving Deep Belief Network
IEEE/ACM Transactions on Audio Speech and Language Processing, 26, 8, 1336-1351, 2018-08-01
[ Auditory receptive fields binary patterns , environmental sound , hierarchical diving deep belief network , spectrogram image feature ]
A new approach of matrix factorization on complex domain for data representation
IEICE Transactions on Information and Systems, E100D, 12, 3059-3063, 2017-12-01
[ Complex matrix factorization,Data representation,Gradient descent method,Image clustering ]
Maximum volume constrained graph nonnegative matrix factorization for facial expression recognition
IEICE Transactions on Fundamentals of Electronics, Communications and Computer Sciences, E100A, 12, 3081-3085, 2017-12-01
[ Facial expression recognition,Feature extraction,Graph regularized,Nonnegative matrix factorization,Projected gradient descent ]
Music emotion recognition using PSO-based fuzzy hyper-rectangular composite neural networks
IET Signal Processing, 11, 7, 884-891, 2017-09-01
Speaker Identification Using Discriminative Features and Sparse Representation
IEEE Transactions on Information Forensics and Security, 12, 8, 1979-1987, 2017-08-01
[ Sparse representation classifier (SRC) , speaker identification ]
Program Guardian: screening system with a novel speaker recognition approach for smart TV
Multimedia Tools and Applications, 76, 12, 13881-13896, 2017-06-01
[ Robust principal component analysis , Sparse representation classifier , Speaker recognition , Supervector ]
Spectral-temporal receptive fields and MFCC balanced feature extraction for robust speaker recognition
Multimedia Tools and Applications, 76, 3, 4055-4068, 2017-02-01
[ Feature extraction , Speaker authentication , Speaker recognition , STRF ]
Single channel source separation using graph sparse NMF and adaptive dictionary learning
Intelligent Data Analysis, 21, S1, S5-S19, 2017-01-01
[ adaptive dictionary learning , Graph regularization , non-negative matrix factorization , source separation ]
Compressive Sensing-Based Speech Enhancement
IEEE/ACM Transactions on Audio Speech and Language Processing, 24, 11, 2122-2131, 2016-11-01
[ Compressive sensing (CS) , denoising , sparse representation , speech enhancement ]
Ensemble based speaker recognition using unsupervised data selection
APSIPA Transactions on Signal and Information Processing, 5, 2016-05-10
[ Ensemble classifier , Speaker recognition , Unsupervised data selection ]
VLSI Design for Convolutive Blind Source Separation
IEEE Transactions on Circuits and Systems II: Express Briefs, 63, 2, 196-200, 2016-02-01
[ Blind source separation , convolutive blind source separation , convolutive mixing , information maximization , VLSI ]
Speaker Identification with Whispered Speech for the Access Control System
IEEE Transactions on Automation Science and Engineering, 12, 4, 1191-1199, 2015-10-01
[ Empirical mode decomposition (EMD) , Hilbert-Huang transform , instantaneous frequency , speaker recognition , whispered speech ]
Robust Environmental Sound Recognition with Fast Noise Suppression for Home Automation
IEEE Transactions on Automation Science and Engineering, 12, 4, 1235-1242, 2015-10-01
[ Environmental sound recognition , noise suppression , probability product kernel , support vector machine , wavelet transform ]
Speech emotion verification using emotion variance modeling and discriminant scale-frequency maps
IEEE Transactions on Audio, Speech and Language Processing, 23, 10, 1552-1562, 2015-10-01
[ Emotional speech recognition , Gaussian-modeled residual error , scale-frequency map , Sparse representation ]
Hierarchical Dirichlet Process Mixture Model for Music Emotion Recognition
IEEE Transactions on Affective Computing, 6, 3, 261-271, 2015-07-01
[ Discriminant method , Hierarchical Dirichlet process mixture model , Music annotation and retrieval , Music emotion recognition ]
VLSI Design for SVM-Based Speaker Verification System
IEEE Transactions on Very Large Scale Integration (VLSI) Systems, 23, 7, 1355-1359, 2015-07-01
[ Cepstral coefficient , speaker verification , support vector machine (SVM) , VLSI ]
Mixed sound event verification on wireless sensor network for home automation
IEEE Transactions on Industrial Informatics, 10, 1, 803-812, 2014-01-01
[ Blind source separation (BSS) , home automation , sound verification , support vector machine (SVM) , wireless sensor network (WSN) ]
Music emotion detection using hierarchical sparse kernel machines
The Scientific World Journal, 2014, 2014-01-01
Gabor-based nonuniform scale-frequency map for environmental sound classification in home automation
IEEE Transactions on Automation Science and Engineering, 11, 2, 607-613, 2014-01-01
[ Environmental sound classification , feature extraction , Gabor function , home automation , matching pursuit (MP) , nonuniform scale-frequency map ]
Emotion identification using extremely low frequency components of speech feature contours
The Scientific World Journal, 2014, 2014-01-01
A novel fast mode decision algorithm for h.264/AVC using particle swarm optimization
IEICE Transactions on Fundamentals of Electronics, Communications and Computer Sciences, E96-A, 11, 2154-2160, 2013-01-01
[ H.264/AVC , Mode decision , Particle swarm optimization , PSO , Video coding ]
A new hybrid and dynamic fusion of multiple experts for intelligent porch system
Expert Systems with Applications, 39, 10, 9288-9296, 2012-08-01
[ Audio-visual expert , Authentication , Automation , Fusion , Intelligent system ]
Fast mode decision for H.264/AVC based on rate-distortion clustering
IEEE Transactions on Multimedia, 14, 3 PART 2, 693-702, 2012-05-22
[ H.264/AVC , mode classification , mode decision , nearest neighbor , video coding ]
VLSI design of an SVM learning core on sequential minimal optimization algorithm
IEEE Transactions on Very Large Scale Integration (VLSI) Systems, 20, 4, 673-683, 2012-04-01
[ Field-programmable gate array (FPGA) , sequential minimal optimization (SMO) , support vector machine (SVM) , VLSI design ]
Video content summarization and augmentation based on structural semantic processing and social network analysis
Journal of the Chinese Institute of Engineers, Transactions of the Chinese Institute of Engineers,Series A/Chung-kuo Kung Ch'eng Hsuch K'an, 33, 5, 737-750, 2010-01-01
[ Content augmentation , Graph clustering , Social network analysis , Structural contents ]
Improving direction of arrival estimation based on directivity pattern analysis and adaptive cascaded classifiers
Journal of the Chinese Institute of Engineers, Transactions of the Chinese Institute of Engineers,Series A/Chung-kuo Kung Ch'eng Hsuch K'an, 33, 5, 751-760, 2010-01-01
[ Adaptive algorithm , Direction of arrival (DOA) , Directivity pattern analysis , Sound localization ]
Personal spoken sentence retrieval using two-level feature matching and MPEG-7 audio LLDs
Journal of Information Science and Engineering, 25, 4, 1221-1238, 2009-07-01
[ Audio low level descriptors , Feature-based comparison , Matching algorithm , MPEG-7 , Spoken sentence retrieval ]
A novel video summarization based on mining the story-structure and semantic relations among concept entities
IEEE Transactions on Multimedia, 11, 2, 295-312, 2009-02-01
[ Concept expansion tree , Graph entropy , Graph mining , Structural video contents , Video browsing , Video indexing , Video summarization ]
The design of a speech interactivity embedded module and its applications for mobile consumer devices
IEEE Transactions on Consumer Electronics, 54, 2, 870-876, 2008-05-01
[ Embedded system design , Speech interactivity , Speech processing ]
Intensity gradient technique for efficient intra-prediction in H.264/AVC
IEEE Transactions on Circuits and Systems for Video Technology, 18, 5, 694-698, 2008-05-01
[ H.264 , Intensity gradient , Intra-prediction , Orientation , Rate distortion ]
Design and implementation of subspace-based speech enhancement under in-car noisy environments
IEEE Transactions on Vehicular Technology, 57, 3, 1466-1479, 2008-01-01
[ In-car noise , Psychoacoustic model (PAM) , Speech enhancement , Subspace tracking , System-on-a-programmable-chip (SoPC) ]
Robust environmental sound recognition for home automation
IEEE Transactions on Automation Science and Engineering, 5, 1, 25-31, 2008-01-01
[ Home automation , Independent component analysis (ICA) , Mel-frequency cepstral coefficients (MFCCs) , Signal enhancement , Sound recognition , Support vector machines (SVMs) , Wavelet transform ]
A fast mode decision algorithm and its vlsi design for H.264/AVC intra-prediction
IEEE Transactions on Circuits and Systems for Video Technology, 17, 10, 1414-1422, 2007-10-01
[ Advanced video coding (AVC) , Dominant edge extraction , H264 , Intra-mode decision , Intra-prediction ]
Design and portable device implementation of feature-based partial matching algorithms for personal spoken sentence retrieval
IET Signal Processing, 1, 3, 139-149, 2007-09-28
Robust speaker identification and verification
IEEE Computational Intelligence Magazine, 2, 2, 52-59, 2007-05-01
An ARM-based system-on-a-programmable-chip architecture for spoken language translation
IEEE Transactions on Circuits and Systems II: Express Briefs, 54, 9, 765-769, 2007-01-01
[ ARM , language translation , speech processing , speech recognition , system-on-a-programmable-chip (SoPC) ]
Critical band subspace-based speech enhancement using SNR and auditory masking aware technique
IEICE Transactions on Information and Systems, E90-D, 7, 1055-1062, 2007-01-01
[ Human auditory system , In-car noise , Karhunen-loeve transform (KLT) , Perceptual filterbank , Signal subspace , Speech enhancement , Wavelet transform ]
A block-based architecture for lifting scheme discrete wavelet transform
IEICE Transactions on Fundamentals of Electronics, Communications and Computer Sciences, E90-A, 5, 1062-1071, 2007-01-01
[ Discrete wavelet transform , JPEG2000 , Lifting scheme , Linebased DWT , VLSI ]
Unsupervised speaker change detection using SVM training misclassification rate
IEEE Transactions on Computers, 56, 9, 1234-1244, 2007-01-01
[ Speaker change detection , Speaker segmentation , Support vector machine ]
Design and implementation of a single-chip speech-to-speech translation system
IEE Proceedings: Circuits, Devices and Systems, 153, 5, 416-426, 2006-11-06
Multiband subspace tracking speech enhancement for in-car human computer speech interaction
Journal of Information Science and Engineering, 22, 5, 1093-1107, 2006-09-01
[ Human computer interaction , In-car noise , PASTd algorithm , Perceptual filterbank , Speech enhancement , Subspace tracking ]
Efficient news video querying and browsing based on distributed news video servers
IEEE Transactions on Multimedia, 8, 2, 257-269, 2006-04-01
[ Distributed server , Maximum difference measure , News video , Querying/browsing , Semantic analysis ]
Projection based adaptive window size selection for efficient motion estimation in H.264/AVC
IEICE Transactions on Fundamentals of Electronics, Communications and Computer Sciences, E89-A, 11, 2970-2976, 2006-01-01
[ 1D projection , Adaptive window size selection , Block based motion estimation ]
Efficient Coding Translation of GSM and G.729 Speech Coders across Mobile and IP Networks
IEICE Transactions on Information and Systems, E87-D, 2, 444-452, 2004-01-01
[ Decode-then-encode (DTE) , Internet protocol (IP) , Parameter translation , Speech coding , Transcoder ]
Chip design of portable speech memopad suitable for persons with visual disabilities
IEEE Transactions on Speech and Audio Processing, 10, 8, 644-658, 2002-11-01
[ Intellectual property (IP) , Low bit-rate speech coding , Speech compression , Speech recognition , Visual disability , VLSI ]
Chip design of MFCC extraction for speech recognition
Integration, the VLSI Journal, 32, 1-2, 111-131, 2002-11-01
[ FPGA , Mel frequency , MFCC , Speech recognition , VLSI ]
A voicing-driven packet loss recovery algorithm for analysis-by-synthesis predictive speech coders over Internet
IEEE Transactions on Multimedia, 3, 1, 98-107, 2001-03-01
[ Analysis-by-synthesis , CELP , Packet loss recovery , Speech coder , Voice over IP ]
Large area Mylar foil vacuum-silver-plated equipment of RIBLL
Hedianzixue Yu Tance Jishu/Nuclear Electronics and Detection Technology, 21, 1, 2001-01-01
On the real-time implementation and packet loss recovery of the G.723.1 speech coder
Journal of the Chinese Institute of Electrical Engineering, Transactions of the Chinese Institute of Engineers, Series E/Chung KuoTien Chi Kung Chieng Hsueh K'an, 7, 2, 101-112, 2000-01-01
Experimental study on the effects of chemoembolization and intra-radiotherapeutic embolization in rats with implanted hepatic tumor
Journal of Xi'an Medical University, Chinese Edition, 20, 3, 381-383+419, 1999-01-01
[ Hepatic tumor , Implantation , Rat , Therapy ]
VLSI implementation of 3-D sound generator
IEEE Transactions on Consumer Electronics, 43, 3, 679-688, 1997-12-01
Lung volume deduction surgery in the treatment of pulmonary emphysema
, 34, 11, 697-699, 1996-11-01

Conference Publications

A Key to Effective Multi-task Learning: Separate Query Selection for Task-Synergized Handling and Node Utilization
2025-01-01
[ GNN,Multi-tasks,query selection ]
Anti-Aliased Convolutional with Data Augmentation for Speech Emotion Recognition
2025-01-01
[ Audio Classification,Data augmentation,Deep learning,Neural Networks,Speech emotion recognition ]
Impact of Glyph Information on Latent Space Diffusion Models for Accurate Handwritten Text Generation
2025-01-01
[ Diffusion models,Handwritten text generation,Image synthesis ]
HistoFS: Non-IID Histopathologic Whole Slide Image Classification via Federated Style Transfer with RoI-Preserving
30251-30260, 2025-01-01
Mixture of Ordered Scoring Experts for Cross-prompt Essay Trait Scoring
18071-18084, 2025-01-01
DIFFUSION TO CONFUSION: NATURALISTIC ADVERSARIAL PATCH GENERATION BASED ON DIFFUSION MODEL FOR OBJECT DETECTOR
2378-2383, 2025-01-01
[ Adjoint method,Adversarial patch,Diffusion model,Object detection ]
A Hybrid Attention Mechanism to Improve Tacotron 2 Performance for Indonesian Text-to-Speech Synthesis
579-582, 2025-01-01
Adversarial Learning for Duration Prediction in Indonesian Text-to-Speech: Modification to Stochastic and Deterministic Predictors
1986-1990, 2025-01-01
Atayal Speech Recognition Based on Transfer Learning
822-825, 2025-01-01
[ Speech Recognition,Speech Transcription ]
Convolutional Kolmogorov-Arnold Networks for Image Classification: Overview and Improvement
516-521, 2025-01-01
[ Computer Vision,Convolutional Kolmogorov-Arnold Networks,Kolmogorov-Arnold Networks ]
Cross-Lingual Transfer for Low-Resource Translation: Fine-Tuning mBART50 on Indonesian to Aid Atayal - Chinese Tasks
1195-1198, 2025-01-01
[ Low-Resource Translation,mBART50 ]
User-Customizable Voice Anonymization Through Personalized Style Transfer
2025-01-01
3D mapping of fire hotspot in East Rinjani forest area using GIS and remote sensing
2024-03-18
[ Geographic Information System,Normalized Difference Vegetation Index,Remote Sensing,Wildfire ]
Attention-Guided Prototype Mixing: Diversifying Minority Context on Imbalanced Whole Slide Images Classification Learning
7609-7618, 2024-01-01
[ Algorithms: Machine learning architectures, formulations, and algorithms; Applications: Biomedical / healthcare / medicine ]
Advancing Robust Few-shot Surface Defect Detection through Meta-learning
45-48, 2024-01-01
[ few-shot detection,object detection,surface defect detection ]
EVA-ASCA: Enhancing Voice Anti-Spoofing through Attention-based Similarity Weights and Contrastive Negative Attractors
537-540, 2024-01-01
[ Automatic Speaker Verification,Security,Spoofing Attacks,Voice Anti-Spoofing ]
Self-supervised Learning and Masked Language Model for Code-switching Automatic Speech Recognition
387-391, 2024-01-01
[ code-switching,masked language modeling,self-supervised learning,speech recognition ]
Self-Supervised Learning via Multi-Transformation Classification for Action Recognition
2024-01-01
[ 3D ResNet,Action Recognition,C3D,multi-transformation,Self-supervised learning ]
Lightweight Brain Tumor Diagnosis via Knowledge Distillation
2024-01-01
[ Brain tumor,deep learning,Knowledge distillation,medical ]
SCENE TEXT RECOGNITION USING PROGRESSIVE RECTIFICATION NETWORK AND SPELLING ERROR CORRECTION LANGUAGE MODEL
2008-2014, 2024-01-01
Leveraging Attention Mechanisms for Breast Cancer Diagnosis
2024-01-01
Detecting Abnormal Machine Sounds Using An Ensemble Approach with Data Augmentation Techniques
2024-01-01
Enhanced Detection of Illegally Parked Vehicles Using YOLO and Good Feature to Track Methods
2024-01-01
[ Computer Vision,Illegal Parking,Object Detection,Optical Flow,YOLO ]
Integrating VGGSK and BEATs for Enhanced Sound Event Detection: A Semi-Supervised GRU-Based System with Weak Labels and Synthetic Soundscapes
2024-01-01
Seismic-ionospheric Precursor Prediction Using Deep Learning
2024-01-01
Knowledge Sharing via Mimicking Attention Guided-Discriminative Features in Whole Slide Image Classification
2024-01-01
[ Attention,Classification,Knowledge sharing,Multiple instance learning,Whole slide image ]
Optimizing Acoustic Echo Cancellation with Variable Step Size in Adaptive Filtering
329-332, 2024-01-01
[ Acoustic Echo Cancellation,Adaptive Filter,Echo Path,Normalized Least Mean Square ]
Ensemble Learning Technique with A Novelty Multi-Source Information for Stock Price Movements
707-714, 2023-12-07
[ Ensemble learning,Natural learning processing,Stock trend forecasting,Technical analysis,Time series analysis ]
EMIX: A Data Augmentation Method for Speech Emotion Recognition
2023-01-01
[ data augmentation,EMix,Speech emotion recognition ]
Discriminative Vector Learning with Application to Single Channel Speech Separation
2023-01-01
Selinet: A Lightweight Model for Single Channel Speech Separation
2023-01-01
3D Face Reconstruction Based on Weakly-Supervised Learning Morphable Face Model
3523-3527, 2023-01-01
[ 3D Face Reconstruction,3D Morphable Face Model,Convolutional Neural Network,Deep Learning ]
Enhancing Automatic Speech Recognition Performance Through Multi-Speaker Text-to-Speech
370-375, 2023-01-01
[ Automatic Speech Recognition,Data extension,Multi-Speaker Text-to-Speech ]
Zero-Shot Voice Conversion Based on Speaker Embedding Domain Generalization
585-589, 2023-01-01
[ domain generalization,speaker embedding,speech synthesis,Zero-shot voice conversion ]
Mask Generation with Meta-Learning Classifier Weight Transformer Network for Few-Shot Image Segmentation
457-458, 2023-01-01
[ few-shot image segmentation,few-shot learning,meta-learning,semantic segmentation ]
Code-Switching Speech Synthesis Based on Self-Supervised Learning and Domain Adaptive Speaker Encoder
2023-01-01
[ Code Switching,Domain Adaptation,Self-Supervised Learning,Speech synthesis ]
Dense Adversarial Transfer Learning Based On Class-Invariance
2023-01-01
[ adversarial-based transfer learning,deep learning,Domain adaption,transfer learning ]
CNEG-VC: Contrastive Learning Using Hard Negative Example In Non-Parallel Voice Conversion
2023-01-01
[ contrastive learning,generative adversarial networks,hard negative example,non-parallel data,Voice conversion ]
On the Optimal Self-Supervised Multi-Fault Detector for Temperature Sensor Data
2168-2172, 2023-01-01
STUA-Net: A Fingerprint Reconstruction with Swin Transformer and Soft Collective Attention
2209-2212, 2023-01-01
[ Corrupted Region,Fingerprint Reconstruction,Soft Collective Attention,Swin Transformer ]
QUESTION ANSWERING SYSTEM BASED ON PRE-TRAINING MODEL AND RETRIEVAL RERANKING FOR INDUSTRY 4.0
2178-2181, 2023-01-01
[ Pre-training model,QA system,Reranking ]
The Asian Monsoon Precipitation Classification based on the ISOMAP Analysis
2022-06-07
[ ISOMAP ]
Fingerprint Liveness Detection Using Denoised-Bayes Shrink Wavelet and Aggregated Local Spatial and Frequency Features
103-108, 2022-01-01
[ Denoised Wavelet Approach,Fingerprints,Liveness Detection,Spatial and Frequency Feature ]
Code-switched Text Data Augmentation for Chinese-English Mixed Speech Recognition
922-923, 2022-01-01
[ BERT,Code-Switching,NLP,Speech Recognition ]
Fingerprint Liveness Detection Using Handcrafted Feature Descriptors and Neural Network
619-621, 2022-01-01
[ Fingerprint Liveness Detection,LPQ,MLP,Multi-radius LBP,Spoof,Wavelet ]
SELECTIVE MUTUAL LEARNING: AN EFFICIENT APPROACH FOR SINGLE CHANNEL SPEECH SEPARATION
3678-3682, 2022-01-01
[ monophonic source separation,Supervised speech separation,time domain audio separation ]
Single-Channel Target Speaker Extraction System with Attention Enhancement
433-434, 2022-01-01
NCU1415 at ROCLING 2022 Shared Task: A light-weight transformer-based approach for Biomedical Name Entity Recognition 基於 Transformer 的生醫輕量化命名實體識別系統
316-320, 2022-01-01
[ Biomedical Science,Name Entity Recognition,ROCLING 2022 Shared Task ]
Fingerprint Liveness Detection with Voting Ensemble Classifier
105-110, 2022-01-01
[ Ensemble Learning,Fingerprint Liveness Detection,Local Binary Pattern,Local Phase Quantitation,Wavelet transform ]
Lightweight End-To-End Deep Learning Model For Music Source Separation
315-318, 2022-01-01
[ Deep learning,lightweight,music source separation ]
Low-Resource Speech Recognition Based on Transfer Learning
145-149, 2022-01-01
[ computational paralinguistics,human-computer interaction,speech recognition ]
A Comparative Study of Cross-Model Universal Adversarial Perturbation for Face Forgery
2022-01-01
[ adversarial example,deep generative model,generative adversarial network,universal adversarial perturbation ]
Multi-loss Function in Robust Convolutional Autoencoder for Reconstruction Low-quality Fingerprint Image
428-431, 2022-01-01
[ convolution autoencoder,fingerprint,loss function,reconstruction ]
(2+1)D Distilled ShuffleNet: A Lightweight Unsupervised Distillation Network for Human Action Recognition
3197-3203, 2022-01-01
A Wise Matrix Factorization Model for Image Representation
2022-01-01
[ complex matrix,face recognition,matrix factorization,projected gradient descent ]
Semi-supervised Subspace Learning Via Constrained Matrix Factorization
2021-01-01
[ data representation,face recognition,matrix factorization,subspace learning ]
A Fusion Methodology of AKAZE and Neural Network for Fingerprint Recognition
1602-1606, 2021-01-01
[ AKAZE feature,fingerprint recognition,neural network ]
Partial Fingerprint on Combined Evaluation using Deep Learning and Feature Descriptor
1611-1614, 2021-01-01
[ combined matching evaluation,convolutional neural network,deep learning,feature descriptor,partial fingerprint ]
Face Anti-Spoofing Using Multi-Branch CNN
170-173, 2021-01-01
A Survey of Finger Vein Recognition
2021-01-01
[ Biometrics,deep learning,finger vein,image processing,recognition ]
Emotional Speech Analysis Based on Convolutional Neural Networks
2021-01-01
[ Convolutional Neural Network (CNN),Emotion classification,Speech detection ]
A Novel Self-Knowledge Distillation Approach with Siamese Representation Learning for Action Recognition
2021-01-01
Occluded Face Recognition Using Sparse Complex Matrix Factorization with Ridge Regularization
2021-01-01
[ Complex matrix factorization,Nonnegative matrix factorization,Occluded face recognition ]
Modified Attention Spatial Convolution Model for Skin Lesion Segmentation
2021-01-01
Single Channel Speech Separation using Enhanced Learning on Embedding Features
430-431, 2021-01-01
[ deep clustering,monaural source separation,speaker separation,Supervised speech separation ]
Sound Event Localization and Detection Based on Time-Frequency Separable Convolutional Compression Network
432-433, 2021-01-01
[ dual-branch tracking,multi-head self-attention,sound event localization and detection,time-frequency separable convolutional compression network ]
Facial Expression Recognition Using Sparse Complex Matrix Factorization with Ridge Term Regularization
45-46, 2021-01-01
[ complex matrix factorization,facial expression recognition,feature extraction,non-negative matrix factorization ]
Evaluation of Attention Mechanisms on Text to Speech
2021-01-01
Deep Residual and Deep Dense Attentions in English Chinese Translation
2021-01-01
A Survey of Vietnamese Automatic Speech Recognition
2021-01-01
An Implementation of FastAI Tabular Learner Model for Parkinson's Disease Identification
2021-01-01
Dual-Masking Wind Noise Reduction System Based on Recurrent Neural Network
2021-01-01
[ Deep learning,Dual mask,Noise reduction,Speech separation,Wind noise ]
Learning to Remember Beauty Products
4728-4732, 2020-10-12
[ attention mechanism,beauty product image retrieval,memory,triplet loss ]
Transfer Learning for Gender and Age Prediction
2020-09-28
[ age and gender prediction,convolutional neural network,neural networks,transfer learning ]
Encoder-Recurrent Decoder Network for Single Image Dehazing
4432-4436, 2020-05-01
[ encoder-recurrent decoder network,ERDN,single image dehazing ]
A Calibration-Free 71.7dB SNDR 100MS/s 0.7mW Weighted-Averaging Correlated Level Shifting Pipelined SAR ADC with Speed-Enhancement Scheme
256-258, 2020-02-01
Two-phase instance segmentation for whiteleg shrimp larvae counting
2020-01-01
Comparative Study of Masking and Mapping Based on Hierarchical Extreme Learning Machine for Speech Enhancement
2019-12-01
Sentiment Analysis Using Residual Learning with Simplified CNN Extractor
335-338, 2019-12-01
[ convolutional neural network , neural network , recurrent neural network , sentiment analysis ]
Deep Learning Based Vietnamese Diacritics Restoration
331-334, 2019-12-01
[ convolutional neural network , diacritics , diacritics restoration , neural networks , recurrent neural network ]
Age and gender recognition using multi-task CNN
1937-1941, 2019-11-01
Convolutional attention model for retinal edema segmentation
1929-1932, 2019-11-01
Compressed multimodal hierarchical extreme learning machine for speech enhancement
678-683, 2019-11-01
Audio-visual speech enhancement using hierarchical extreme learning machine
2019-09-01
[ Audio-Visual , Hierarchical Extreme Learning Machine , Multi-Modal , Speech Enhancement ]
Object Bounding Transformed Network for End-to-End Semantic Segmentation
3217-3221, 2019-09-01
[ Domain Transform Network , image semantic segmentation , Object Boundary Guide , ResNet 101 ]
Video captioning based on joint image-audio deep learning techniques
127-131, 2019-09-01
[ Acoustic scene classification , Convolutional neural networks , Long short-term memory , Sound event detection , Video captioning , Word embedding ]
Orthogonal Non-Negative Matrix Factorization using Ridge Term for Classifying Expressed Gene
2019-09-01
[ feature extraction , gene expression data , nonnegative matrix factorization , orthogonal constraint , ridge term ]
CoNet: Compact and Low-Cost CNN for Image Classification
2019-05-01
Speaker Characterization Using TDNN-LSTM Based Speaker Embedding
6211-6215, 2019-05-01
[ NIST SRE2018 , speaker embedding , TDNN-LSTM ]
A 15-bit 20 MS/s SHA-Less Pipelined ADC Achieving 73.7 dB SNDR with averaging correlated level shifting technique
2019-04-01
[ Analog-to-digital converter (ADC) , Averaging correlated level shifting (Averaging-CLS) , Correlated level shifting (CLS) , Pipelined ADC , Ring amplifier ]
Occluded Image Recognition with Extended Nonnegative Matrix Factorization
200-204, 2019-01-09
[ Face recognition , Facial expression recognition , Nonnegative matrix factorization ]
Single-channel speech separation based on Gaussian process regression
275-278, 2019-01-04
[ Gaussian process regression , single-channel speech separation ]
Locality Preserving Discriminative Complex-Valued Latent Variable Model
1169-1174, 2018-11-26
Learning a Hierarchical Latent Semantic Model for Multimedia Data
2995-3000, 2018-11-26
Playing Technique Classification Based on Deep Collaborative Learning of Variational Auto-Encoder and Gaussian Process
2018-10-08
[ collaborative learning , Gaussian process , playing technique classification , Variational autoencoder ]
Locality-preserving complex-valued Gaussian process latent variable model for robust face recognition
2696-2700, 2018-09-10
[ Complex-valued representation , Gaussian process latent variable model , Occlusion , Robust face recognition ]
Complex-Valued Gaussian Process Latent Variable Model for Phase-Incorporating Speech Enhancement
5439-5443, 2018-09-10
[ Binary mask , Complex-valued Gaussian process latent variable model , Phase ]
Image Representation Using Supervised and Unsupervised Learning Methods on Complex Domain
1248-1252, 2018-09-10
[ Complex matrix factorization , Discriminant feature , Image representation , LDA , NMF ]
Depth Human Action Recognition Based on Convolution Neural Networks and Principal Component Analysis
1543-1547, 2018-08-29
[ Convolution neural network , Feature representation , Human action recognition , Principal component analysis , View invariance ]
Asymmetric kernel convolutional neural network for acoustic scenes classification
11-12, 2018-05-04
[ Acoustic scenes classification , Convolutional neural network , Deep learning ]
A survey of deep learning for polyphonic sound event detection
75-78, 2018-04-10
[ Convolutional neural networks , Deep learning , Neural networks , Recurrent neural networks , Sound event detection ]
A new constrained nonnegative matrix factorization for facial expression recognition
79-82, 2018-04-10
[ Facial expression recognition , Graph regularization , Nonnegative matrix factorization , Projected gradient descent ]
Fast-LSTM acoustic model for distant speech recognition
1-4, 2018-03-26
[ long short-term memory , Speech recognition , time delay neural networks ]
Acoustic scene classification using convolutional neural networks and multi-scale multi-feature extraction
1-4, 2018-03-26
Hand gesture recognition based on Bayesian sensing hidden Markov models and Bhattacharyya divergence
3535-3539, 2018-02-20
[ Bayesian sensing hidden Markov model , Bhattacharyya divergence , Hand gesture recognition ]
Automatic vehicle classification using center strengthened convolutional neural network
1075-1078, 2018-02-05
[ Convolutional Neural Network , Deep learning , ROI pooling , Vehicle classification ]
Acoustic scene classification using self-determination convolutional neural network
19-22, 2018-02-05
NMF/NTF-based methods applied for user-guided audio source separation: An overview
80-83, 2018-02-01
[ Nonnegative matrix factorization , Nonnegative tensor factorization , Temporal annotation , Time-frequency annotation , User-guided constraints ]
A survey of deep face recognition in the wild
76-79, 2018-02-01
[ Deep learning , Face recognition , LFW dataset ]
Repairing IR depth image with 2D RGB image
2018-01-01
[ 3D-image , depth image complementation , depth information loss , Holistically-Nested Edge Detection method , IR noise , iterative low-pass pervasion , RGB image , RGBD image ]
Self-gated recurrent neural networks for human activity recognition on wearable devices
179-185, 2017-10-23
[ Human activity recognition , Recurrent neural network , Self-gated recurrent neural network , Wearable devices ]
Hierarchical representation based on Bayesian nonparametric tree-structured mixture model for playing technique classification
537-543, 2017-10-23
[ Hierarchical representation , Playing technique classification ]
Discriminative training of complex-valued deep recurrent neural network for singing voice separation
1327-1335, 2017-10-23
[ Deep neural networks , Phase information , Singing voice separation ]
Improved convolutional neural network based scene classification using long short-term memory and label relations
429-434, 2017-09-05
[ Convolutional neural network , long short-term memory , machine learning , pattern recognition , scene classification ]
Recognition and retrieval of sound events using sparse coding convolutional neural network
589-594, 2017-08-28
[ Sound event recognition , Sound event retrieval , Sparse coding convolutional neural network ]
Source separation using dictionary learning and deep recurrent neural network with locality preserving constraint
151-156, 2017-08-28
[ Deep recurrent neural network , Locality preserving constraint , Nonnegative matrix factorization ]
Real time validating the accuracy of physiotherapy exercises
329-330, 2017-07-25
Action recognition using three dimension convolution and long short term memory
83-84, 2017-07-25
Multi-pitch streaming of interwoven streams
311-315, 2017-06-16
[ automatic music transcription , multi-channel , Multipitch streaming , particle swarm optimization ]
Dynamic tracking attention model for action recognition
1617-1621, 2017-06-16
[ Action recognition , attention model , convolutional neural network , deep learning , long short-term memory (LSTM) ]
Kernel weighted Fisher sparse analysis on multiple maps for audio event recognition
6010-6014, 2017-06-16
[ audio event classification , damping-frequency map , kernel sparse classification , Kernel weighted Fisher sparse analysis , scale-frequency map ]
Fully complex deep neural network for phase-incorporating monaural source separation
281-285, 2017-06-16
[ Deep neural network , phase information ]
Hierarchical joint-guided networks for semantic image segmentation
1887-1891, 2017-06-16
[ hierarchical joint learning convolutional networks , hierarchical joint-guided networks , joint-guided and masking network , semantic image segmentation ]
Exemplar-embed complex matrix factorization for facial expression recognition
1837-1841, 2017-06-16
[ Complex matrix factorization , facial expression , nonnegative matrix factorization , optimization ]
Spatial dispersion constrained NMF for monaural source separation
2017-05-02
[ Graph regularization , Multiple update rule , Non-negative matrix factorization , Source separation ]
Incorporating local environment information with ensemble neural networks to robust automatic speech recognition
2017-05-02
[ Ensemble neural network,Environment clustering,Mixture of local experts,Robust ASR ]
A Survey of Polyphonic Sound Event Detection Based on Non-Negative Matrix Factorization
351-354, 2017-02-16
[ convolutive non-negative matrix factorization , Non-negative matrix factorization , sound event detection ]
Speech emotion classification using multiple kernel Gaussian process
2017-01-17
[ multiple kernel Gaussian process , semi-nonnegative matrix factorization , Speech emotion classification ]
Robust face verification via Bayesian sparse representation
2017-01-17
Locality-preserving K-SVD based joint dictionary and classifier learning for object recognition
481-485, 2016-10-01
[ D-KSVD , Joint dictionary learning , K-SVD , Locality-preserving , Object recognition ]
Transportation mode detection on mobile devices using recurrent nets
392-396, 2016-10-01
[ CGRNN , Control gate-based recurrent neural network ]
Multi-channel source clustering of polyphonic music
131-141, 2016-10-01
Improving iris image segmentation in unconstrained environments using NMF-based approach
2016-07-25
A novel approach for single channel source separation
2016-07-25
NMF-based image segmentation
2016-07-25
[ clustering , k-means , segmentation , Superpixels ]
Human action recognition system for elderly and children care using three stream ConvNet
5-9, 2016-06-22
[ action recognition , convolutional neural network , deep learning , moving , spatial , temporal , three stream ConvNet ]
A review on speech separation using NMF and its extensions
26-29, 2016-06-22
[ bilevel optimization , graph regularization , group lasso , non-negative matrix factorization , single channel source separation ]
A survey of visual lip reading and lip-password verification
22-25, 2016-06-22
[ lip biometric , Lip-password , visual lip reading , visual speaker identification , visual speech recognition ]
Gaussian process based text categorization for healthy information
30-33, 2016-06-22
[ classification , feature learning , Gaussian process , Latent Dirichlet Allocation , text categorization ]
Message from general and program co-chairs
vii-viii, 2016-06-22
Music emotion recognition using deep Gaussian process
495-498, 2016-02-19
[ classification , deep Gaussian process , feature extraction , Music emotion recognition ]
Latent dirichlet allocation based blog analysis for criminal intention detection system
73-76, 2016-01-21
[ classification , collaborative representation classifier , feature learning , latent Dirichlet allocation , text categorization ]
Lip-based visual speech recognition system
315-319, 2016-01-21
[ kernel sparse representation classifier , non-negative matrix factorization , spatiotemporal descriptor , visual speech recognition ]
Genre based emotion annotation for music in noisy environment
863-866, 2015-12-02
[ emotion and genre , hierarchical system , music emotion classification , noisy environment , sparse representation ]
VLSI design for SC-based speaker recognition
335-338, 2015-11-20
[ sparse coding , Speaker recognition , VLSI ]
Video summarization based on face recognition and speaker verification
1821-1824, 2015-11-20
[ face detection , face recognition , NMF , speaker verification , SVM , Video summarization ]
Bayesian sensing hidden markov model for hand gesture recognition
2015-10-07
[ Bayesian sensing hidden Markov models , Hand gesture recognition , Moving pose descriptor ]
Single channel source separation using sparse NMF and graph regularization
2015-10-07
[ Graph regularization,Non-negative matrix factorization,Source separation,Sparse coding ]
Monaural source separation using nonnegative matrix factorization with graph regularization constraint 結合 β 距離與圖形正規限制式之非負矩陣分解應用於單通道訊號源分離
18-26, 2015-10-01
Neural network training combining environment clustering and a mixture of experts for robust speech recognition 類神經網路訓練結合環境群集及專家混合系統於強健性語音辨識
136-147, 2015-10-01
[ Environment clustering,Mixture of experts,Neural network,Robust speech recognition ]
Kernel Sparse Representation Classifier with Center Enhanced SPM for Vehicle Classification
742-746, 2015-09-21
[ Kernel sparse representation , Spatial pyramid matching , Vehicle classification ]
Automatic recognition of audio event using dynamic local binary patterns
246-247, 2015-08-20
[ Auditory system , Feature extraction , Filtering , Pattern recognition , Spectrogram , Speech , Support vector machines ]
Liver segmentation from 3D abdominal CT images
342-343, 2015-08-20
[ 3D abdominal CT image , liver segmentation , Medical image processing , Otsu method ]
News topics categorization using latent Dirichlet allocation and sparse representation classifier
136-137, 2015-08-20
[ Computer science , Dictionaries , Resource management , Support vector machines , Testing , Training data , Vocabulary ]
MediaEval 2015: Recurrent neural network approach to emotion in music task
2015-01-01
An overview of kernel based nonnegative matrix factorization
227-231, 2014-11-12
[ Kernel based method , nonnegative matrix factorization (NMF) ]
System implementation of robust time difference of arrival estimation
157-160, 2014-11-12
[ DOA estimation , Sound activity detection , sound enhancement ]
Sparse representation based speaker identification
31-41, 2014-09-01
Spectral-temporal receptive fields and MFCC balanced feature extraction for noisy speech recognition
2014-02-12
[ Mel frequency cepstral coefficients , spectral-temporal receptive fields , speech recognition ]
Robust emotion recognition in live music using noise suppression and a hierarchical sparse representation classifier
2014-02-12
[ live , music classification , music emotion recognition , Sparse representation ]
2D semi-NMF of scale-frequency map for environmental sound classification
2014-02-12
[ Environmental sound classification , scale-frequency map (SFM) , two dimensional semi-nonnegative matrix factorization (2D Semi-NMF) ]
Emotion profile-based music recommendation
111-114, 2014-01-01
[ Emotion profile , music emotion recognition , music recommendation system ]
Automatic singing evaluating system based on acoustic features and rhythm
165-168, 2014-01-01
[ acoustic feature , dynamic time warping , melodic similarity , Singing assessment , singing evaluating ]
Efficient and portable content-based music retrieval system
153-156, 2014-01-01
[ information entropy , MFCC , multimedia database , music retrieval , pattern indexing , portability , Query by singing , symbolic sequence ]
Happiness detection in music using hierarchical SVMs with dual types of kernels
2013-12-01
[ happiness verification , Music emotion , support vector machine ]
Application of mixed sound event verification to home automation 混合聲音事件驗證在家庭自動化之應用
143-153, 2013-10-01
Music emotion classification using double-layer support vector machines
193-196, 2013-07-12
[ Music emotion , support vector machine ]
Robust speech-based happiness recognition
227-230, 2013-07-12
[ emotional speech , Happiness recognition , noise suppression , probability product kernel ]
A division-free algorithm for fixed-point power exponential function in embedded system
223-226, 2013-07-12
[ fixed-point mathematical function , Newton's method , Power exponential function ]
Convolutive blind source separation based on sparse component analysis
192-201, 2012-12-01
Intelligent appliance control using a low-cost embedded speech recognizer
311-314, 2012-12-01
[ Embedded system , intelligent appliance , SPCE061A microcontroller , speech recognition ]
Speech based boredom verification approach for modern education system
87-90, 2012-10-26
[ Boredom verification , emotional speech , modern education , support vector machine ]
Hardware/software co-design for fast-trainable speaker identification system based on SMO
1621-1625, 2011-12-23
[ Hardware/Software Co-design , Sequential Minimal Optimization (SMO) , Speaker Identification ]
Speaker identification using HHT spectrum features
145-148, 2011-12-01
[ Empirical mode decomposition (EMD) , Hilbert Huang transform , Instantaneous frequency , Speaker identification , Speaker recognition ]
2010 7th International Symposium on Chinese Spoken Language Processing, ISCSLP 2010 - Proceedings: Preface
2010-12-01
Dynamic fixed-point arithmetic design of embedded SVM-based speaker identification system
524-531, 2010-07-14
[ dynamic fixed-point design , linear prediction cepstral coefficient (LPCC) , sequential minimal optimization (SMO) , Support vector machine (SVM) ]
Gain function estimation for TDC-based subspace speech enhancement
180-183, 2010-01-01
Long distance person identification using height measurement and face recognition
2009-12-01
[ Face recognition , Height measurement , Identification , Long distance ]
Stress detection based on multi-class probabilistic support vector machines for accented English speech
346-350, 2009-11-16
VLSI design of sequential minimal optimization algorithm for SVM learning
2509-2512, 2009-10-26
Video knowledge augmentation based on summarized contents and online media
738-741, 2009-10-26
SVM-based state transition framework for dynamical human behavior identification
1933-1936, 2009-09-23
[ Image processing , Pattern recognition , User interface human factors ]
An embedded system design for speech command recognition using improved AMDF-based pitch features
88-92, 2008-12-01
[ AMDF-based pitch extraction , Dynamic time warping , Embedded system design , Speech command recognition , Speech processing ]
Ubiquitous and robust text-independent speaker recognition for home automation digital life
297-310, 2008-08-04
A long-distance time domain sound localization
616-625, 2008-08-04
Event-based segmentation of sports video using motion entropy
107-111, 2007-12-01
[ Entropy-based motion feature , Event detection , Homoscedastic error model , Video segmentation ]
Efficient intra prediction in H.264 based on intensity gradient approach
3952-3955, 2007-01-01
Environmental sound classification using hybrid SVM/KNN classifier and MPEG-7 audio low-level descriptor
1731-1735, 2006-12-01
A novel dominant edge strength algorithm for intra prediction in H.264/AVC encoders
2006-12-01
A novel fast algorithm for intra mode decision in H.264/AVC encoders
3498-3501, 2006-12-01
Content-based audio classification using support vector machines and independent component analysis
157-160, 2006-12-01
Robust speaker recognition using SNR-aware subspace-based enhancement and probabilistic SVMs
1161-1164, 2006-12-01
Programmable logic array design for H.264 context-based adaptive variable length coding
2006-01-01
An ARM-based embedded system design for speech-to-speech translation
499-508, 2006-01-01
An efficient news video browsing system for wireless network application
1377-1381, 2005-12-01
VLSI design of a very low bit rate speech decoder
239-243, 2005-12-01
[ ASIC , LPC , LSP , Speech coding , VLSI , Vocoder ]
Translation divergence analysis and processing for Mandarin-English parallel text exploitation
2005-12-01
A novel algorithm for speaker change detection based on support vector machine
2005-12-01
VLSI architecture design for concatenative speech synthesizer
2005-01-01
VLSI architecture design for BNN speech recognition
200-204, 2003-12-01
[ Bayesian neural network , Speech recognition , VLSI ]
HOME ENVIRONMENTAL SOUND RECOGNITION BASED ON MPEG-7 FEATURES
682-685, 2003-01-01
A programmable application-specific VLSI architecture for speech recognition
477-480, 2001-12-01
Single chip implementation of the 1.6 Kbps speech vocoder
2000-01-01
Chip design of mel frequency cepstral coefficients for speech recognition
3658-3661, 2000-01-01
1.2 Kbps FBLPC Vocoder with applications in phone-to-phone over Internet
414-415, 1998-01-01
VLSI implementation of 3-D sound generator
296-297, 1997-12-01

Books

Communications in Computer and Information Science
128-141, 2026-01-01
[ Feature Representation,Hierarchical Learning,MRI Brain Tumor Identification ]
Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
431-445, 2025-01-01
[ air writing,angle time-map,convolutional neural network,Doppler-time map,FMCW radar,gesture recognition,human-machine interface,mmWave,range-time map,spectrogram ]
Communications in Computer and Information Science
415-426, 2025-01-01
[ Complex matrix factorization,Data representation,Feature extraction ]
Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
386-397, 2024-01-01
[ Attention Mechanism,Deep Learning,Reconstruction of 3D Human Body Model ]
Lecture Notes in Electrical Engineering
153-162, 2021-01-01
Lecture Notes in Electrical Engineering
1031-1042, 2014-02-17
[ Attentive motion entropy,Mutual information analysis,Segmental spectro-temporal subtraction,Video summarization ]
Lecture Notes in Electrical Engineering
1049-1054, 2014-02-17
Lecture Notes in Electrical Engineering
1043-1048, 2014-02-17
Communications in Computer and Information Science
536-543, 2012-09-13
Lecture Notes in Electrical Engineering
901-908, 2010-12-01
[ Information analysis,Motion entropy analysis,Salient motion entropy,Video summarization ]
Advances in Chinese Spoken Language Processing
503-522, 2006-01-01

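University Awards

113 Distinguished Professor
112 Outstanding Research Award
109 Distinguished Professor
108 Outstanding Research Award
107 Outstanding Research Award
106 Outstanding Research Award
105 Outstanding Research Award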

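Method for building a tissue microarray image model and determining tissue type using the model [Taiwan (R.O.C.)]
Method and system for identifying hormone receptor status [Japan]
Handwritten Chinese character recognition method and handwritten Chinese character recognition device [Taiwan (R.O.C.)]
Method and system for identifying hormone receptor status [Taiwan (R.O.C.)]
Method for automatically detecting breast cancer lesions in mammography images using machine learning [Taiwan (R.O.C.)]
SOURCE SEPARATION METHOD, APPARATUS, AND NON-TRANSITORY COMPUTER-READABLE MEDIUM [United States]
MACHINE LEARNING METHOD AND MACHINE LEARNING DEVICE [United States]
COMPUTING DEVICE AND METHOD FOR GENERATING MACHINE TRANSLATION MODEL AND MACHINE-TRANSLATION DEVICE [United States]
MODEL TRAINING APPARATUS AND METHOD [Taiwan (R.O.C.)]
Image recognition method and image recognition device [Taiwan (R.O.C.)]
TRAINING DATA GENERATION METHOD FOR HUMAN FACIAL RECOGNITION AND DATA GENERATION APPARATUS [Taiwan (R.O.C.)]
COMPUTING DEVICE AND METHOD FOR GENERATING MACHINE TRANSLATION MODEL AND MACHINE-TRANSLATION DEVICE [Taiwan (R.O.C.)]
TRAINING DATA GENERATION METHOD FOR HUMAN FACIAL RECOGNITION AND DATA GENERATION APPARATUS [United States]
Object classification method and object classification device [Taiwan (R.O.C.)]
Action recognition system [Taiwan (R.O.C.)]
SOURCE SEPARATION METHOD, APPARATUS, AND NON-TRANSITORY COMPUTER READABLE MEDIUM [Taiwan (R.O.C.)]
METHODS FOR TRAINING AN ARTIFICIAL NEURAL NETWORK TO PREDICT WHETHER A SUBJECT WILL EXHIBIT A CHARACTERISTIC GENE EXPRESSION AND SYSTEMS FOR EXECUTING THE SAME [Taiwan (R.O.C.)]
METHOD FOR REPAIRING INCOMPLETE 3D DEPTH IMAGE USING 2D IMAGE INFORMATION [United States]
Multi-channel multi-pitch streaming method and system using the method [Taiwan (R.O.C.)]
METHOD FOR REPAIRING INCOMPLETE 3D DEPTH IMAGE USING 2D IMAGE INFORMATION [Taiwan (R.O.C.)]
MACHINE LEARNING METHOD AND MACHINE LEARNING DEVICE [Taiwan (R.O.C.)]
Signal source separation method and signal source separation device [Taiwan (R.O.C.)]
Action recognition method and system thereof [Taiwan (R.O.C.)]
Word correction method [Taiwan (R.O.C.)]
Unsupervised speaker change detection method [Taiwan (R.O.C.)]
VISION-AIDED HEARING ASSISTING DEVICE [United States]
HEARING ASSISTING DEVICE THROUGH VISION [Taiwan (R.O.C.)]
Environmental sound recognition method [Taiwan (R.O.C.)]
Blind signal separation system [Taiwan (R.O.C.)]
Method for separating multiple signal sources from convolutively mixed blind sources [Taiwan (R.O.C.)]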

NSTC Project Statistics

基於生成式模型之語音及文字惡意攻擊的防禦技術 1160801~1170731

基於生成式模型之語音及文字惡意攻擊的防禦技術 1150801~1160731

AI語音技術與產業應用聯盟(3/3) 1150201~1160131

科研創業計畫：結合唇形之語音辨識商業化個案 1150101~1151231

基於生成式模型之語音及文字惡意攻擊的防禦技術 1140801~1150731

低資源語言之開放式口語問答系統 1140801~1150731

適於邊緣裝置之輕量化多模態大型語言模型與異質加速平台設計及其於生命科學之應用(2/2) 1140501~1150430 [多模態 , Multimodal]

AI語音技術與產業應用聯盟(2/3) 1140201~1150131 [語音降噪、語音辨識 , Speech denoising, speech recognition]

低資源語言之開放式口語問答系統 1130801~1140731

多模態語音辨識技術 1130601~1140531 [多模態語音辨識,聲學模型,,掩碼語言模型, , Multimodal Speech Recognition,Acoustic Model,Masked Language Modeling]

適於邊緣裝置之輕量化多模態大型語言模型與 異質加速平台設計及其於生命科學之應用(1/2) 1130501~1140731 [多模態 , Multimodal]

低資源語言之開放式口語問答系統 1120801~1130731

基於大規模非監督預訓練模型之低資源及語碼轉換語音識別技術 1120801~1130731 [語音辨識 , Speech recognition]

建置轉譯導向乳癌輛資料及生物資料庫平台(3/4) 1120501~1130430 [乳癌資料庫平台 , Breas Cancer Database Platform.]

基於大規模非監督預訓練模型之低資源及語碼轉換語音識別技術 1110801~1120731 [語音辨識 , Speech recognition]

SMA-SRGAN:適用於生物辨識影像生成之基於空間域遮罩注意力機制的超解析度生成對抗式網路 1100801~1110731 [生物辨識 ]

基於大規模非監督預訓練模型之低資源及語碼轉換語音識別技術 1100801~1110731 [語音辨識 , Speech recognition]

智慧型照護互動系統-基於深度智能之口語處理技術(4/4) 1100101~1110331 [智慧型照護互動系統 ]

智慧型照護互動系統-基於深度智能之口語處理技術(3/4) 1090101~1091231

智慧型照護互動系統-基於深度智能之口語處理技術(2/4) 1080101~1081231

基於深度學習之多媒體資料融合與分析–子計畫六：基於深度學習之音樂情緒辨識研究 1060801~1070228

基於深層學習於盲訊號源分離及語音增強之研究 1060801~1070731 [Deep Learning, Speech Enhancement, Blind Source Separation ]

銀髮族口語互動式居家陪伴及推薦系統 1051101~1061031

基於深層學習於盲訊號源分離及語音增強之研究 1050801~1060731 [Deep Learning, Speech Enhancement, Blind Source Separation ]

基於深層學習於盲訊號源分離及語音增強之研究 1040801~1050731 [Deep Learning, Speech Enhancement, Blind Source Separation ]

產學合作計畫統計

口語處理人工智慧研究計畫書 1150428~1151027 [生成式人工智慧,自然語言處理 , Generative AI,Natural Language Processing]

電子製造產業智慧化共通應用模組導入驗證 1140919~1141130 [智慧製造,人工智慧應用 , Smart Manufacturing,AI Applications]

凌陽中央聯合實驗室 1130901~1150630 [聲紋複製 , Voice print]

AI 模型測試 1130501~1131031 [AI 模型測試 , Testing AI Models]

語音辨識 1120901~1130610 [語音辨識 , Speech recognition]

人工智慧技術 1120801~1131231 [人工智慧 , Artificial Intelligence Technology]

群創中央聯合實驗室-指紋辨識技術及其開發平台建置 1110801~1120930 [指紋辨識技術 , Fingerprint Recognition Technique]

110中大聯新聯合研發中心計畫-一般X光影像自動核對系統 1100801~1110731 [影像核對,左右顛倒,QC流程 , image verification,left-right reversal,QC process]

期刊著作

Wind Noise Reduction Based on the Double Masking and Permutation-Invariant Training Electronics (Switzerland), 15, 5, 2026-03-01 [ dual masking learning,non-stationary noise,permutation-invariant training,wind noise reduction ]

YOWOv3: An Efficient and Generalized Framework for Spatiotemporal Action Detection IEEE Intelligent Systems, 41, 1, 75-85, 2026-01-01

Hardware Implementation of Improved Oriented FAST and Rotated BRIEF-Simultaneous Localization and Mapping Version 2 Sensors, 25, 20, 2025-10-01 [ machine vision,ORB-SLAM2,Raspberry Pi 3,SLAM,spatial scene construction ]

A Hybrid Deep Learning and Feature Descriptor Approach for Partial Fingerprint Recognition Electronics (Switzerland), 14, 9, 2025-05-01 [ biometric authentication,convolutional neural networks,deep learning,feature descriptor,partial fingerprint recognition ]

Editorial for Special Issue on Invited Papers from APSIPA ASC 2023 APSIPA Transactions on Signal and Information Processing, 13, 5, 2024-10-07

Implementation of Sound Direction Detection and Mixed Source Separation in Embedded Systems Sensors, 24, 13, 2024-07-01 [ embedded systems,hybrid sound source separation,position detection,signal-to-interference ratio (SIR),speech recognition ]

Few-Shot Image Segmentation Using Generating Mask with Meta-Learning Classifier Weight Transformer Network Electronics (Switzerland), 13, 13, 2024-07-01 [ few-shot image segmentation,few-shot learning,meta-learning,semantic segmentation ]

Audio Pre-Processing and Beamforming Implementation on Embedded Systems Electronics (Switzerland), 13, 14, 2024-07-01 [ audio pre-processing,beamforming,embedded system,enhance speech ]

Semantic-Based Public Opinion Analysis System Electronics (Switzerland), 13, 11, 2024-06-01 [ K-nearest neighbor algorithm,sentence analysis,support vector machines,topic input and commentary ]

Zero-FVeinNet: Optimizing Finger Vein Recognition with Shallow CNNs and Zero-Shuffle Attention for Low-Computational Devices Electronics (Switzerland), 13, 9, 2024-05-01 [ attention,biometrical verification,convolution neural network,finger vein,lightweight model ]

Multi-view and multi-augmentation for self-supervised visual representation learning Applied Intelligence, 54, 1, 629-656, 2024-01-01 [ Data augmentation policies,Metric learning,Multi-augmentation,Nuisance factors,Scale-invariant representation learning,SSL augmentation pipelines ]

Target Speaker Extraction Using Attention-Enhanced Temporal Convolutional Network Electronics (Switzerland), 13, 2, 2024-01-01 [ automatic speech recognition (ASR),convolutional neural network (CNN),deep learning,target speaker extraction,temporal convolutional network (TCN) ]

Enhancing Breast Cancer Detection: A Novel Training Strategy and Batch Scheduler Method Proceedings - IEEE International Conference on Advanced Video and Signal-Based Surveillance, AVSS, 2024, 2024-01-01 [ Batch Scheduler,ConvNext,Dynamic Batch Size,F1 Score,Pretrain ]

LCSL: Long-Tailed Classification via Self-Labeling IEEE Transactions on Circuits and Systems for Video Technology, 34, 11, 12048-12058, 2024-01-01 [ Image classification,imbalance classification,long-tailed problem,self-labeling ]

Anti-Aliasing Attention U-net Model for Skin Lesion Segmentation Diagnostics, 13, 8, 2023-04-01 [ computer-aided diagnosis,deep learning,light-weight model,medical internet of things,skin lesion segmentation ]

Electrocardiogram Heartbeat Classification for Arrhythmias and Myocardial Infarction Sensors, 23, 6, 2023-03-01 [ deep learning,electrocardiogram (ECG) classification,MIT-BIH dataset,PTB dataset ]

The COVIDTW study: Clinical predictors of COVID-19 mortality and a novel AI prognostic model using chest X-ray Journal of the Formosan Medical Association, 122, 3, 267-275, 2023-03-01 [ Artificial intelligence,Chest X-rays,COVID-19,Intensive care unit,Mortality,Prognosis ]

Deep Learning for Human Action Recognition: A Comprehensive Review APSIPA Transactions on Signal and Information Processing, 12, 1, 2023-01-01 [ Action recognition,deep learning,deep neural networks,self-supervised learning,supervised learning ]

Editorial for the Special Issue on Learning, Security, AIoT for Emerging Communication/Networking Systems APSIPA Transactions on Signal and Information Processing, 12, 2, 2023-01-01

Cyclic Transfer Learning for Mandarin-English Code-Switching Speech Recognition IEEE Signal Processing Letters, 30, 1387-1391, 2023-01-01 [ code-switching speech recognition,cyclic transfer learning,Speech recognition,transfer learning ]

A DEEP LEARNING-BASED FAKE NEWS DETECTING SYSTEM IET Conference Proceedings, 2023, 35, 172-173, 2023-01-01 [ data augmentation,deep learning,Fake news detection,word embedding ]

Fast Gated Recurrent Network for Speech Synthesis IEICE Transactions on Information and Systems, E105D, 9, 1634-1638, 2022-09-01 [ acoustic modelling,gated recurrent neural network,long short-term memory,speech synthesis ]

Heuristic Attention Representation Learning for Self-Supervised Pretraining Sensors, 22, 14, 2022-07-10 [ computer vision,deep learning,heuristic attention,perceptual grouping,self-supervised learning,visual representation learning ]

Spectral-Temporal Receptive Field-Based Descriptors and Hierarchical Cascade Deep Belief Network for Guitar Playing Technique Classification IEEE Transactions on Cybernetics, 52, 5, 3684-3695, 2022-05-01

Self-Supervised Learning Framework toward State-of-the-Art Iris Image Segmentation Sensors, 22, 6, 2022-03-01

Convolutional Blur Attention Network for Cell Nuclei Segmentation Sensors, 22, 4, 2022-02-01 [ Cell nuclei,Convolutional neural network,Deep learning,Nucleus segmentation ]

Speech Separation Using Augmented-Discrimination Learning on Squash-Norm Embedding Vector and Node Encoder
IEEE Access, 10, 102048-102063, 2022-01-01
[ deep clustering,monophonic source separation,Speaker separation,speech enhancement,supervised speech separation,time frequency masking ]

Antialiasing Attention Spatial Convolution Model for Skin Lesion Segmentation with Applications in the Medical IoT
Wireless Communications and Mobile Computing, 2022, 2022-01-01

Sanders classification of calcaneal fractures in CT images with deep learning and differential data augmentation techniques
Injury, 52, 3, 616-624, 2021-03-01
[ computer-aided classification system ]

Teaching Yourself: A Self-Knowledge Distillation Approach to Action Recognition
IEEE Access, 9, 105711-105723, 2021-01-01
[ action recognition,convolutional neural network,deep learning,knowledge distillation,Self-knowledge distillation,self-learning ]

Ensemble and Multimodal Learning for Pathological Voice Classification
IEEE Sensors Letters, 2021-01-01
[ acoustic signal,Acoustics,binary classification,ensemble learning,Medical diagnostic imaging,Neoplasms,pathological voice,Pathology,Stacking,Standards,Support vector machines ]

A Calibration-Free 14-b 0.7-mW 100-MS/s Pipelined-SAR ADC Using a Weighted-Averaging Correlated Level Shifting Technique
IEEE Journal of Solid-State Circuits, 55, 12, 3271-3280, 2020-12-01

Embedded draw-down constraint using ensemble learning for stock trading
Journal of Intelligent and Fuzzy Systems, 38, 5, 5651-5659, 2020-01-01
[ ensemble learning,Kelly criterion,money management,Monte Carlo simulation ]

Deep learning and SURF for automated classification and detection of calcaneus fractures in CT images
Computer Methods and Programs in Biomedicine, 171, 27-37, 2019-04-01
[ Calcaneus fracture,Computed tomography image,Convolutional neural networks,Residual network,Visual geometry group ]

Locality preserved joint nonnegative matrix factorization for speech emotion recognition
IEICE Transactions on Information and Systems, E102D, 4, 821-825, 2019-04-01
[ Information extraction,Joint dictionary learning,Locality preserving,NMF,Speech emotion recognition ]

Projective complex matrix factorization for facial expression recognition
Eurasip Journal on Advances in Signal Processing, 2018, 1, 2018-12-01
[ Complex matrix factorization,Facial expression recognition,Nonnegative matrix factorization,Projected gradient descent ]

Predicting the Probability Density Function of Music Emotion Using Emotion Space Mapping
IEEE Transactions on Affective Computing, 9, 4, 541-549, 2018-10-01
[ Emotion in music,emotion recognition from audio,predictive model and algorithm,valence-Arousal space ]

A new approach of matrix factorization on complex domain for data representation
IEICE Transactions on Information and Systems, E100D, 12, 3059-3063, 2017-12-01
[ Complex matrix factorization,Data representation,Gradient descent method,Image clustering ]

Music emotion recognition using PSO-based fuzzy hyper-rectangular composite neural networks
IET Signal Processing, 11, 7, 884-891, 2017-09-01

Speaker Identification Using Discriminative Features and Sparse Representation
IEEE Transactions on Information Forensics and Security, 12, 8, 1979-1987, 2017-08-01
[ Sparse representation classifier (SRC),speaker identification ]