Description
The authors offer a comprehensive guide to machine learning applied to signal processing and recognition problems, and then discuss real applications in domains such as speech processing and biomedical signal processing, with a focus on handling noise. This textbook is intended for advanced undergraduate and graduate students of computer science and engineering.

Prof. Michael M. Richter completed his PhD in mathematical logic at the University of Freiburg, and his Habilitation in mathematics at the University of Tübingen. He taught at the University of Texas at Austin and at RWTH Aachen, in addition to numerous visiting professorships. Most recently he held a chair in computer science at the University of Kaiserslautern, where he was also the founding scientific director of the DFKI (German Research Center for Artificial Intelligence). He is currently an adjunct professor at the University of Calgary. He has taught, researched, and published extensively in the areas of mathematical logic and artificial intelligence. Prof. Richter is one of the pioneers of case-based reasoning: he founded the leading European event on the subject, he led many of the key academic research projects, and he demonstrated the real-world viability of the approach with successful commercial products.

Dr. Sheuli Paul completed her PhD on a dynamic automatic noisy speech recognition system in Kaiserslautern. Her interests include speech recognition and signal processing.
Part I Realms of Signal Processing
1 Digital Signal Representation 1.1 Introduction 1.2 Numbers 1.2.1 Numbers and Numerals 1.2.2 Types of Numbers 1.2.3 Positional Number Systems 1.3 Sampling and Reconstruction of Signals 1.3.1 Scalar Quantization 1.3.2 Quantization Noise 1.3.3 Signal-To-Noise Ratio 1.3.4 Transmission Rate 1.3.5 Nonuniform Quantizer 1.3.6 Companding 1.4 Data Representations 1.4.1 Fixed-Point Number Representations 1.4.2 Sign-Magnitude Format 1.4.3 One’s-Complement Format 1.4.4 Two’s-Complement Format 1.5 Fix-Point DSP’s 1.6 Fixed-Point Representations Based on Radix-Point 1.7 Dynamic Range 1.8 Precision 1.9 Background Information 1.10 Exercises
2 Signal Processing Background 2.1 Basic Concepts 2.2 Signals and Information 2.3 Signal Processing 2.4 Discrete Signal Representations 2.5 Delta and Impulse Function 2.6 Parseval’s Theorem 2.7 Gibbs Phenomenon 2.8 Wold Decomposition 2.9 State Space Signal Processing 2.10 Common Measurements 2.10.1 Convolution 2.10.2 Correlation 2.10.3 Auto Covariance 2.10.4 Coherence 2.10.5 Power Spectral Density (PSD) 2.10.6 Estimation and Detection 2.10.7 Central Limit Theorem 2.10.8 Signal Information Processing Types 2.10.9 Machine Learning 2.10.10 Exercises
3 Fundamentals of Signal Transformations 3.1 Transformation Methods 3.1.1 Laplace Transform 3.1.2 Z-Transform 3.1.3 Fourier Series 3.1.4 Fourier Transform 3.1.5 Discrete Fourier Transform and Fast Fourier Transform 3.1.6 Zero Padding 3.1.7 Overlap-Add and Overlap-Save Convolution Algorithms 3.1.8 Short Time Fourier Transform (STFT) 3.1.9 Wavelet Transform 3.1.10 Windowing Signal and the DCT Transforms 3.2 Analysis and Comparison of Transformations 3.3 Background Information 3.4 Exercises 3.5 References
4 Digital Filters 4.1 Introduction 4.1.1 FIR and IIR Filters 4.1.2 Bilinear Transform 4.2 Windowing for Filtering 4.3 Allpass Filters 4.4 Lattice Filters 4.5 All-Zero Lattice Filter 4.6 Lattice Ladder Filters 4.7 Comb Filter 4.8 Notch Filter 4.9 Background Information 4.10 Exercises
5 Estimation and Detection 5.1 Introduction 5.2 Hypothesis Testing 5.2.1 Bayesian Hypothesis Testing 5.2.2 MAP Hypothesis Testing 5.3 Maximum Likelihood (ML) Hypothesis Testing 5.4 Standard Analysis Techniques 5.4.1 Best Linear Unbiased Estimator (BLUE) 5.4.2 Maximum Likelihood Estimator (MLE) 5.4.3 Least Squares Estimator (LSE) 5.4.4 Linear Minimum Mean Square Error Estimator (LMMSE) 5.5 Exercises
6 Adaptive Signal Processing 6.1 Introduction 6.2 Parametric Signal Modeling 6.2.1 Parametric Estimation 6.3 Wiener Filtering 6.4 Kalman Filter 6.4.1 Smoothing 6.5 Particle Filter 6.6 Fundamentals of Monte Carlo 6.6.1 Importance Sampling (IS) 6.7 Non-Parametric Signal Modeling 6.8 Non-Parametric Estimation 6.8.1 Correlogram 6.8.2 Periodogram 6.9 Filter Bank Method 6.10 Quadrature Mirror Filter Bank (QMF) 6.11 Background Information 6.12 Exercises
7 Spectral Analysis 7.1 Introduction 7.2 Adaptive Spectral Analysis 7.3 Multivariate Signal Processing 7.3.1 Sub-band Coding and Subspace Analysis 7.4 Wavelet Analysis 7.5 Adaptive Beam Forming 7.6 Independent Component Analysis (ICA) 7.7 Principal Component Analysis (PCA) 7.8 Best Basis Algorithms 7.9 Background Information 7.10 Exercises
Part II Machine Learning and Recognition
8 General Learning 8.1 Introduction to Learning 8.2 The Learning Phases 8.2.1 Search and Utility 8.3 Search 8.3.1 General Search Model 8.3.2 Preference relations 8.3.3 Different learning methods 8.3.4 Similarities 8.3.5 Learning to Recognize 8.3.6 Learning again 8.4 Background Information 8.5 Exercises
9 Signal Processes, Learning, and Recognition 9.1 Learning 9.2 Bayesian Formalism 9.2.1 Dynamic Bayesian Theory 9.2.2 Recognition and Search 9.2.3 Influences 9.3 Subjectivity 9.4 Background Information 9.5 Exercises
10 Stochastic Processes 10.1 Preliminaries on Probabilities 10.2 Basic Concepts of Stochastic Processes 10.2.1 Markov Processes 10.2.2 Hidden Stochastic Models (HSM) 10.2.3 HSM Topology 10.2.4 Learning Probabilities 10.2.5 Re-estimation 10.2.6 Redundancy 10.2.7 Data Preparation 10.2.8 Proper Redundancy Removal 10.3 Envelope Detection 10.3.1 Silence Threshold Selection 10.3.2 Pre-emphasis 10.4 Several Processes 10.4.1 Similarity 10.4.2 The Local-Global Principle 10.4.3 HSM Similarities 10.5 Conflict and Support 10.6 Examples and Applications 10.7 Predictions 10.8 Background Information 10.9 Exercises
11 Feature Extraction 11.1 Feature Extractions 11.2 Basic Techniques 11.2.1 Spectral Shaping 11.3 Spectral Analysis and Feature Transformation 11.3.1 Parametric Feature Transformations and Cepstrum 11.3.2 Standard Feature Extraction Techniques 11.3.3 Frame Energy 11.4 Linear Prediction Coefficients (LPC) 11.5 Linear Prediction Cepstral Coefficients (LPCC) 11.6 Adaptive Perceptual Local Trigonometric Transformation (APLTT) 11.7 Search 11.7.1 General Search Model 11.8 Predictions 11.8.1 Purpose 11.8.2 Linear Prediction 11.8.3 Mean Squared Error Minimization 11.8.4 Computation of Probability of an Observation Sequence 11.8.5 Forward and Backward Prediction 11.8.6 Forward-Backward Prediction 11.9 Background Information 11.10 Exercises
12 Unsupervised Learning 12.1 Generalities 12.2 Clustering Principles 12.3 Cluster Analysis Methods 12.4 Special Methods 12.4.1 K-means 12.4.2 Vector Quantization (VQ) 12.4.3 Expectation Maximization (EM) 12.4.4 GMM Clustering 12.5 Background Information 12.6 Exercises
13 Markov Model and Hidden Stochastic Model 13.1 Markov Process 13.2 Gaussian Mixture Model (GMM) 13.3 Advantages of using GMM 13.4 Linear Prediction Analysis 13.4.1 Autocorrelation Method 13.4.2 Yule-Walker Approach 13.4.3 Covariance Method 13.4.4 Comparison of Correlation and Covariance methods 13.5 The ULS Approach 13.6 Comparison of ULS and Covariance Methods 13.7 Forward Prediction 13.8 Backward Prediction 13.9 Forward-Backward Prediction 13.10 Baum-Welch Algorithm 13.11 Viterbi Algorithm 13.12 Background Information 13.13 Exercises
14 Fuzzy Logic and Rough Sets 14.1 Rough Sets 14.2 Fuzzy Sets 14.2.1 Basis Elements 14.2.2 Possibility and Necessity 14.3 Fuzzy Clustering 14.4 Fuzzy Probabilities 14.5 Background Information 14.6 Exercises
15 Neural Networks 15.1 Neural Network Types 15.1.1 Neural Network Training 15.1.2 Neural Network Topology 15.2 Parallel Distributed Processing 15.2.1 Forward and Backward Uses 15.2.2 Learning 15.3 Applications to Signal Processing 15.4 Background Information 15.5 Exercises
Part III Real Aspects and Applications
16 Noisy Signals 16.1 Introduction 16.2 Noise Questions 16.3 Sources of Noise 16.4 Noise Measurement 16.5 Weights and A-Weights 16.6 Signal to Noise Ratio (SNR) 16.7 Noise Measuring Filters and Evaluation 16.8 Types of noise 16.9 Origin of noises 16.10 Box Plot Evaluation 16.11 Individual noise types 16.11.1 Residual 16.11.2 Mild 16.11.3 Steady-unsteady Time varying Noise 16.11.4 Strong Noise 16.12 Solution to Strong Noise: Matched Filter 16.13 Background Information 16.14 Exercises
17 Reasoning Methods and Noise Removal 17.1 Generalities 17.2 Special Noise Removal Methods 17.2.1 Residual Noise 17.2.2 Mild Noise 17.2.3 Steady-Unsteady Noise 17.2.4 Strong Noise 17.3 Poisson Distribution 17.3.1 Outliers and Shots 17.3.2 Underlying probability of Shots 17.4 Kalman Filter 17.4.1 Prediction Estimates 17.4.2 White noise Kalman filtering 17.4.3 Application of Kalman filter 17.5 Classification, Recognition and Learning 17.5.1 Summary of the used concepts 17.6 Principal Component Analysis (PCA) 17.7 Reasoning Methods 17.7.1 Case-Based Reasoning (CBR) 17.8 Background Information 17.9 Exercises
18 Audio Signals and Speech Recognition 18.1 Generalities of Speech 18.2 Categories of Speech Recognition 18.3 Automatic Speech Recognition 18.3.1 System Structure 18.4 Speech Production Model 18.5 Acoustics 18.6 Human Speech Production 18.6.1 The Human Speech Generation 18.6.2 Excitation 18.6.3 Voiced Speech 18.6.4 Unvoiced Speech 18.7 Silence Regions 18.8 Glottis 18.9 Lips 18.10 Plosive Speech Source 18.11 Vocal-Tract 18.12 Parametric and Non-Parametric Models 18.13 Formants 18.14 Strong Noise 18.15 Background Information 18.16 Exercises
19 Noisy Speech 19.1 Introduction 19.2 Colored Noise 19.2.1 Additional types of Colored Noise 19.3 Poisson Processes and Shots 19.4 Matched Filters 19.5 Shot Noise 19.6 Background Information 19.7 Exercises
20 Aspects Of Human Hearing 20.1 Human Ear 20.2 Human Auditory System 20.3 Critical Bands and Scales 20.3.1 Mel Scale 20.3.2 Bark Scale 20.3.3 Erb Scale 20.3.4 Greenwood Scale 20.4 Filter Banks 20.4.1 ICA Network 20.4.2 Auditory Filter Banks 20.4.3 Filter Banks 20.4.4 Mel Critical Filter Bank 20.5 Psycho-acoustic Phenomena 20.5.1 Perceptual Measurement 20.5.2 Human Hearing and Perception 20.5.3 Sound Pressure Level (SPL) 20.5.4 Absolute Threshold of Hearing (ATH) 20.6 Perceptual Adaptation 20.7 Auditory System and Hearing Model 20.8 Auditory Masking and Masking Frequency 20.9 Perceptual Spectral Features 20.10 Critical Band Analysis 20.11 Equal Loudness Pre-emphasis 20.12 Perceptual Transformation 20.13 Feature Transformation 20.14 Filters and Human Ear 20.15 Temporal Aspects 20.16 Background Information 20.17 Exercises
21 Speech Features 21.1 Generalities 21.2 Cost Functions 21.3 Special Feature Extractions 21.3.1 MFCC Features 21.3.2 Feature Transformation applying DCT 21.4 Background Information 21.5 Exercises
22 Hidden Stochastic Model for Speech 22.1 General 22.2 Hidden Stochastic Model 22.3 Forward and Backward Predictions 22.3.1 Forward Algorithm 22.3.2 Backward Algorithm 22.4 Forward-Backward Prediction 22.5 Burg Approach 22.6 Graph Search 22.6.1 Recognition Model with Search 22.7 Semantic Issues and Industrial Applications 22.8 Problems with Noise 22.9 Aspects of Music 22.10 Music reception 22.11 Background Information 22.12 Exercises
23 Different Speech Applications – Part A 23.1 Generalities 23.2 Example Applications 23.2.1 Experimental laboratory 23.2.2 Health care support (everyday actions) 23.2.3 Diagnostic support for persons with possible dementia 23.2.4 Noise 23.3 Background Information 23.4 Exercises
24 Different Speech Applications – Part B 24.1 Introduction 24.2 Discrete-Time Signals 24.3 Speech Processing 24.3.1 Framing 24.3.2 Pre-emphasis 24.3.3 Windowing 24.3.4 Fourier Transform 24.3.5 Mel-Filtering 24.3.6 Mel-Frequency Cepstral Coefficients 24.4 Speech Analysis and Sound Effects Laboratory (SASE_Lab) 24.5 Wake-Up-Word Speech Recognition 24.5.1 Introduction 24.5.2 Wake-up-Word Paradigm 24.5.3 Wake-Up-Word: Definition 24.5.4 Wake-Up-Word System 24.5.5 Front-End of the Wake-Up-Word System 24.6 Conclusion 24.6.1 Wake-Up-Word: Tool Demo 24.6.2 Elevator Simulator 24.7 Background Information 24.8 Exercises 24.9 Speech Analysis and Sound Effects Laboratory (SASE_Lab)
25 Biomedical Signals: ECG, EEG 25.1 ECG signals 25.1.1 Bioelectric Signals 25.1.2 Noise 25.2 EEG Signals 25.2.1 General properties 25.2.2 Signal types and properties 25.2.3 Disadvantages 25.3 Neural Network use 25.4 Major Research Questions 25.5 Background Information 25.6 Exercises
26 Seismic Signals 26.1 Generalities 26.2 Sources of seismic signals 26.3 Intermediate elements 26.4 Practical Data Sources 26.5 Major seismic problems 26.6 Noise 26.7 Background Information 26.8 Exercises
27 Radar Signals 27.1 Introduction 27.2 Radar Types and Applications 27.3 Doppler Equations, Ambiguity Function (AF) and Matched Filter 27.4 Moving Target Detection 27.5 Applications and Discussions 27.6 Examples 27.7 Background Information 27.8 Exercises
28 Visual Story Telling 28.1 Introduction 28.1.1 Common Visualization Approaches 28.2 Analytics and Visualization 28.2.1 Visualization 28.2.2 Visual Data Mining 28.3 Communication and Visualization 28.4 Background Information 28.5 Exercises
29 Digital Processes and Multimedia 29.1 Images 29.1.1 Digital Image Processing 29.1.2 Images as Matrices 29.1.3 Gray Scale Images 29.2 Spatial Filtering 29.2.1 Linear Filtering of Images 29.2.2 Separable Filters 29.2.3 Mechanics of Linear Spatial Filtering Operation 29.3 Median Filtering 29.4 Color Equalization 29.4.1 Image Transformations 29.4.2 Examples of Image Transformation Matrixes 29.5 Basic Image Statistics 29.6 Abstraction Levels of Images and its Representations 29.6.1 Lowest Level 29.6.2 Geometric Level 29.6.3 Domain Level 29.6.4 Segmentation 29.7 Background Information 29.8 Exercises
30 Visualizations of Emergency Operation Centre 30.1 Introduction 30.2 Communications in Emergency Situations 30.3 Emergency Scenario 30.3.1 Classification and EOC Scenario 30.4 Technical Aspects and Techniques 30.4.1 Classification 30.4.2 Clustering 30.5 Background Information 30.6 Exercises
31 Intelligent Interactive Communications 31.1 Introduction 31.2 Spoken Dialogue System 31.3 Gesture based Interaction 31.4 Object Recognition and Identification 31.5 Visual Story Telling 31.6 Virtual Environment for Personal Assistance 31.7 Sensor Fusion 31.8 Intelligent Human Machine for Communication Application Scenario 31.9 Background Information 31.10 Exercises
32 Comparisons 32.1 Generalities 32.1.1 EEG and ECG 32.1.2 Speech and biomedical applications 32.1.3 Seismic and biomedical signals 32.1.4 Speech and Images 32.2 Overall 32.3 Background Information 32.3.1 General 32.4 Exercises
Glossary




