DA321: Multimodal Data Analysis and Learning-I Exam Solutions
Question 1: Image Intensity Histogram and Convolution (4 Points)
Approach:
Constructing the Histogram:
- Create an 8 × 8 matrix with alternating black (0) and white (255) pixels.
- Count the frequency of pixel values (0 and 255) to plot the histogram.
- Resulting histogram:
- 32 pixels with intensity 0.
- 32 pixels with intensity 255.
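The histogram counts above can be checked with a short sketch. The helper names `checkerboard` and `intensity_histogram` are illustrative and not part of the exam statement.

```python
import numpy as np

def checkerboard(size=8, low=0, high=255):
    """Return a size x size image whose pixels alternate between low and high."""
    rows, cols = np.indices((size, size))
    return np.where((rows + cols) % 2 == 0, low, high).astype(np.uint8)

def intensity_histogram(image, levels=256):
    """Count how many pixels take each intensity level 0..levels-1."""
    return np.bincount(image.ravel(), minlength=levels)

image = checkerboard()
hist = intensity_histogram(image)
print(hist[0], hist[255])  # 32 32
```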
Applying the 2 × 2 Smoothing Kernel:
- Kernel: 1/4 × [[1, 1], [1, 1]], i.e. each of the four weights equals 1/4.
- Apply convolution by moving the kernel over the image, assuming mirrored borders for boundary pixels.
- Show a few detailed examples of convolution outputs. For example, a 2 × 2 block of (0, 255, 255, 0) will result in a value of 127.5.
Normalization Explanation:
- Dividing by 4 makes the kernel weights sum to 1, so each output is an average of four input pixels and stays within the original intensity range (0 to 255).
Resulting Histogram After Convolution:
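- Draw the new histogram: every interior 2 × 2 window of the checkerboard covers two black and two white pixels, so the values collapse to a single peak at 127.5 (128 after rounding). Only border pixels can differ, depending on the mirroring convention.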
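A minimal sketch of the smoothing step follows. It assumes the image is mirrored by one row at the bottom and one column at the right using NumPy's `reflect` mode (which does not repeat the edge pixel). The answer does not fix the border convention, and other choices (for example `symmetric`) change a few border values.

```python
import numpy as np

def smooth_2x2(image):
    """Average each pixel with its right, lower, and lower-right neighbours."""
    # Assumption: pad one mirrored row (bottom) and column (right) so the
    # output keeps the input size. Other border conventions are possible.
    padded = np.pad(image.astype(float), ((0, 1), (0, 1)), mode="reflect")
    return (padded[:-1, :-1] + padded[:-1, 1:]
            + padded[1:, :-1] + padded[1:, 1:]) / 4.0

# 8 x 8 checkerboard of 0 and 255
rows, cols = np.indices((8, 8))
image = np.where((rows + cols) % 2 == 0, 0, 255)

smoothed = smooth_2x2(image)
values, counts = np.unique(smoothed, return_counts=True)
print(values, counts)  # a single value, 127.5, shared by all 64 pixels
```

Under this border convention the padded image is still a checkerboard, which is why the histogram collapses to one bin.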
Question 2: RGB Color Space and Fourier Spectrum (6 Points)
(a) Total Number of Colors:
- Calculate using: 2^24 = 256 × 256 × 256 = 16,777,216.
- Each color component (R, G, B) is represented by an 8-bit number, allowing for 256 values (0-255) per component.
(b) Fourier Spectrum for Each Channel:
- Compute the 2-D Fourier transform for each channel (R, G, B) separately.
- Due to different color distributions, the spectra for R, G, and B may vary.
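The per-channel transform can be sketched as below. The test image is hypothetical, built so that each channel carries a different spatial pattern. The log scaling and `fftshift` centring are display conveniences, not something the answer requires.

```python
import numpy as np

def channel_spectra(rgb):
    """Return the centred log-magnitude spectrum of each colour channel."""
    spectra = {}
    for index, name in enumerate("RGB"):
        channel = rgb[:, :, index].astype(float)
        transform = np.fft.fftshift(np.fft.fft2(channel))
        spectra[name] = np.log1p(np.abs(transform))
    return spectra

# Hypothetical 64 x 64 image with a different pattern in each channel.
y, x = np.mgrid[0:64, 0:64]
rgb = np.stack([
    127.5 * (1 + np.sin(2 * np.pi * 4 * x / 64)),  # R: 4 cycles horizontally
    127.5 * (1 + np.sin(2 * np.pi * 8 * y / 64)),  # G: 8 cycles vertically
    np.full((64, 64), 128.0),                      # B: constant intensity
], axis=-1)

spectra = channel_spectra(rgb)
```

Here the R and G spectra peak at different off-centre positions, while the constant B channel has energy only at the centre (DC) bin.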
(c) Edge Enhancement Approach:
- Steps:
- Convert the image to grayscale (optional).
- Apply an edge detection kernel (e.g., Sobel operator).
- Pre-processing: Apply Gaussian smoothing.
- Post-processing: Normalize the output to highlight edges.
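One possible arrangement of these steps, using NumPy only, is sketched below. The grayscale weights (0.299, 0.587, 0.114), `sigma=1.0`, and the edge-replicated borders are assumptions made for illustration, and `filter2d` and `edge_map` are hypothetical helper names.

```python
import numpy as np

def filter2d(image, kernel):
    """Correlate image with a small odd-sized kernel, replicating edge pixels."""
    kh, kw = kernel.shape
    padded = np.pad(image, ((kh // 2, kh // 2), (kw // 2, kw // 2)), mode="edge")
    out = np.zeros(image.shape, dtype=float)
    for i in range(kh):
        for j in range(kw):
            out += kernel[i, j] * padded[i:i + image.shape[0], j:j + image.shape[1]]
    return out

def edge_map(rgb, sigma=1.0):
    """Grayscale -> Gaussian smoothing -> Sobel gradient -> normalise to [0, 1]."""
    gray = rgb[..., :3].astype(float) @ np.array([0.299, 0.587, 0.114])
    # Pre-processing: Gaussian smoothing to suppress noise before differentiation.
    radius = int(3 * sigma)
    ax = np.arange(-radius, radius + 1)
    g = np.exp(-ax**2 / (2 * sigma**2))
    gauss = np.outer(g, g)
    gauss /= gauss.sum()
    smooth = filter2d(gray, gauss)
    # Edge detection: horizontal and vertical Sobel responses.
    sobel_x = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], dtype=float)
    gx = filter2d(smooth, sobel_x)
    gy = filter2d(smooth, sobel_x.T)
    magnitude = np.hypot(gx, gy)
    # Post-processing: scale so the strongest edge equals 1.
    peak = magnitude.max()
    return magnitude / peak if peak > 0 else magnitude

# Hypothetical test image: black left half, white right half.
rgb = np.zeros((32, 32, 3), dtype=np.uint8)
rgb[:, 16:] = 255
edges = edge_map(rgb)
```

The edge map is largest along the vertical boundary between the two halves and zero in the flat regions.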
(d) 2-D Pattern with Increasing Frequency:
- Draw a pattern matrix whose values form a sinusoidal pattern oriented at 45°, with the spatial frequency increasing along the diagonal.
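Such a pattern can be generated as a chirp along the diagonal. The start and end frequencies below are arbitrary illustrative values (in cycles per pixel along the diagonal), and the linear frequency sweep is one assumption among several reasonable choices.

```python
import numpy as np

def diagonal_chirp(size=64, f_start=0.02, f_end=0.25):
    """Sinusoid varying along the 45-degree diagonal with rising frequency.

    The instantaneous frequency grows linearly from f_start to f_end
    (cycles per pixel, measured along the diagonal). Values lie in [0, 1].
    """
    rows, cols = np.indices((size, size))
    u = (rows + cols) / np.sqrt(2.0)         # distance along the diagonal
    rate = (f_end - f_start) / u.max()       # frequency increase per pixel
    phase = 2 * np.pi * (f_start * u + 0.5 * rate * u**2)
    return 0.5 * (1 + np.sin(phase))

pattern = diagonal_chirp()
```

All pixels with the same `rows + cols` share a value, so the stripes run perpendicular to the 45° direction and get narrower toward the bottom-right corner.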
Question 3: Spectrogram Analysis (12 Points)
Approach:
- Given temporal segments (T1 to T6), map each to sound categories:
- T1 (Fricative): High frequency, broad spectrum.
- T2 (Male Speech): Low to mid-range frequencies.
- T3 (Female Speech): Higher average frequency compared to male speech.
- T4 (Bird Chirp): Sharp, high-frequency components.
- T5 (DTMF Tone): Specific dual frequencies.
- T6 (Pause): Low energy, minimal spectral content.
- Describe the typical spectral characteristics of each category.
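As one concrete illustration of these spectral signatures, the sketch below finds the two dominant frequencies of a synthetic DTMF tone. The signal is generated here (key "5", 770 Hz and 1336 Hz, at an assumed 8 kHz sampling rate) and is not the exam's recording.

```python
import numpy as np

def dominant_frequencies(signal, fs, count=2):
    """Return the `count` strongest spectral peaks (Hz), sorted ascending."""
    spectrum = np.abs(np.fft.rfft(signal * np.hanning(len(signal))))
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / fs)
    # Keep only local maxima so leakage around one tone is not counted twice.
    is_peak = (spectrum[1:-1] > spectrum[:-2]) & (spectrum[1:-1] >= spectrum[2:])
    peak_bins = np.flatnonzero(is_peak) + 1
    strongest = peak_bins[np.argsort(spectrum[peak_bins])[-count:]]
    return sorted(freqs[strongest].tolist())

fs = 8000                                   # assumed sampling rate in Hz
t = np.arange(fs) / fs                      # one second of samples
tone = np.sin(2 * np.pi * 770 * t) + np.sin(2 * np.pi * 1336 * t)  # DTMF "5"

peaks = dominant_frequencies(tone, fs)
print(peaks)  # two peaks, close to 770 Hz and 1336 Hz
```

A pause segment would show near-zero energy in the same analysis, and a fricative would show broad high-frequency energy rather than two isolated peaks.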
Question 4: Fundamental Frequency Estimation (4 Points)
Approach:
2-D Fourier Transform:
- Treat the 2-D spectrogram as an input matrix.
- Apply the 2-D Fourier transform to extract the frequency domain representation.
Identify Harmonics:
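- Harmonics appear in the spectrogram as ridges spaced one fundamental frequency apart along the frequency axis. Locate the peak in the 2-D transform that corresponds to this spacing to obtain the fundamental frequency.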
Pre/Post-Processing:
- Use a windowing function before applying the transform.
- Perform peak detection to identify the fundamental frequency accurately.
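The approach can be sketched on a synthetic signal, under several assumptions: a stationary tone with equal-amplitude harmonics of 200 Hz, an 8 kHz sampling rate, and 50 ms frames. The idea is that harmonics spaced F0 apart along the frequency axis produce a peak in the 2-D transform at the position matching that spacing. Real speech would need extra care (for example log compression and a restricted search range).

```python
import numpy as np

def spectrogram(x, frame_len, hop):
    """Magnitude STFT with a Hann window; rows are frames, columns are bins."""
    window = np.hanning(frame_len)
    n_frames = 1 + (len(x) - frame_len) // hop
    frames = np.stack([x[i * hop:i * hop + frame_len] * window
                       for i in range(n_frames)])
    return np.abs(np.fft.rfft(frames, axis=1))

def estimate_f0(x, fs, frame_len=400, hop=200):
    """Estimate F0 from the harmonic spacing seen in the 2-D FFT of the spectrogram."""
    S = spectrogram(x, frame_len, hop)[:, :frame_len // 2]  # drop the Nyquist bin
    bin_hz = fs / frame_len
    # Pre-processing: remove the mean and apply a 2-D Hann window.
    S = S - S.mean()
    window_2d = np.outer(np.hanning(S.shape[0]), np.hanning(S.shape[1]))
    F = np.abs(np.fft.fft2(S * window_2d))
    # Row 0 holds patterns that are constant over time; search it for the
    # strongest periodicity along the frequency axis (skipping near-DC terms).
    profile = F[0, :S.shape[1] // 2]
    k = 2 + int(np.argmax(profile[2:]))
    spacing_bins = S.shape[1] / k            # harmonic spacing in bins
    return spacing_bins * bin_hz

fs = 8000
t = np.arange(fs) / fs
x = sum(np.sin(2 * np.pi * 200 * h * t) for h in range(1, 20))  # harmonics of 200 Hz
f0 = estimate_f0(x, fs)
print(f0)
```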
Question 5: Low-Frequency Spectrum Transmission (6 Points)
Approach:
Design High-Pass Filter:
- Specify a filter with a cutoff frequency > 1000 Hz.
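A minimal sketch of a high-pass filter follows, implemented as an ideal (brick-wall) filter in the FFT domain. The 1200 Hz cutoff and the two-tone test signal are hypothetical values chosen only to satisfy the "> 1000 Hz" condition. A practical design would more likely use a filter with a gradual transition band.

```python
import numpy as np

def highpass_fft(signal, fs, cutoff_hz):
    """Zero every FFT bin below cutoff_hz, then transform back to time."""
    spectrum = np.fft.rfft(signal)
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / fs)
    spectrum[freqs < cutoff_hz] = 0.0
    return np.fft.irfft(spectrum, n=len(signal))

fs = 8000                                   # assumed sampling rate in Hz
t = np.arange(fs) / fs
low = np.sin(2 * np.pi * 300 * t)           # component below the cutoff
high = 0.5 * np.sin(2 * np.pi * 2000 * t)   # component above the cutoff

filtered = highpass_fft(low + high, fs, cutoff_hz=1200.0)
```

After filtering, only the 2000 Hz component remains.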
Transmission Strategy:
- Explain how the high-frequency components can be transmitted.
Reconstruction:
- At the receiver, use an inverse transform to reconstruct the signal.
- Consider pre-filtering for noise reduction.
Question 6: Image Conversion to Binary (6 Points)
(a) Thresholding:
- Apply a threshold (e.g., Otsu’s method) for each channel separately.
- Choose thresholds based on maximizing inter-class variance.
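Otsu's criterion can be sketched directly from the histogram, as below. The code assumes 8-bit channels and returns one binary mask per channel. How the three masks are then combined into a single binary image is left open, as in the answer above.

```python
import numpy as np

def otsu_threshold(channel):
    """Return the 8-bit threshold t that maximises between-class variance."""
    hist = np.bincount(channel.ravel(), minlength=256).astype(float)
    prob = hist / hist.sum()
    levels = np.arange(256)
    w0 = np.cumsum(prob)                     # weight of class "value <= t"
    w1 = 1.0 - w0                            # weight of class "value > t"
    cum_mean = np.cumsum(prob * levels)
    total_mean = cum_mean[-1]
    with np.errstate(divide="ignore", invalid="ignore"):
        between = (total_mean * w0 - cum_mean) ** 2 / (w0 * w1)
    between = np.nan_to_num(between, nan=0.0, posinf=0.0, neginf=0.0)
    return int(np.argmax(between))

def binarize_rgb(rgb):
    """Threshold each channel of a uint8 RGB image with its own Otsu threshold."""
    thresholds = [otsu_threshold(rgb[..., c]) for c in range(3)]
    masks = np.stack([rgb[..., c] > thresholds[c] for c in range(3)], axis=-1)
    return masks, thresholds

# Hypothetical two-level image: dark left half, bright right half.
rgb = np.full((16, 16, 3), 50, dtype=np.uint8)
rgb[:, 8:] = 200
masks, thresholds = binarize_rgb(rgb)
```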
(b) k-Means Clustering:
- Steps:
- Convert each pixel’s RGB values into feature space.
- Apply k-means with k=2.
- Assign binary values to clustered pixels.
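The clustering steps can be sketched with a small NumPy implementation of Lloyd's algorithm. The initialisation (darkest and brightest pixel as starting centroids) and the fixed iteration count are assumptions made to keep the example deterministic.

```python
import numpy as np

def kmeans_binary(rgb, iterations=20):
    """Cluster pixels into k=2 groups in RGB space and return a 0/1 image."""
    pixels = rgb.reshape(-1, 3).astype(float)          # one feature vector per pixel
    # Assumption: start from the darkest and the brightest pixel.
    brightness = pixels.sum(axis=1)
    centroids = np.stack([pixels[np.argmin(brightness)],
                          pixels[np.argmax(brightness)]])
    for _ in range(iterations):
        # Assignment step: nearest centroid by Euclidean distance.
        distances = np.linalg.norm(pixels[:, None, :] - centroids[None, :, :], axis=2)
        labels = np.argmin(distances, axis=1)
        # Update step: each centroid moves to the mean of its pixels.
        for k in range(2):
            if np.any(labels == k):
                centroids[k] = pixels[labels == k].mean(axis=0)
    return labels.reshape(rgb.shape[:2]).astype(np.uint8), centroids

# Hypothetical image with one dark and one bright region.
rgb = np.zeros((16, 16, 3), dtype=np.uint8)
rgb[:, :8] = (20, 30, 40)
rgb[:, 8:] = (200, 210, 220)
binary, centroids = kmeans_binary(rgb)
```

With this initialisation, label 0 corresponds to the darker cluster and label 1 to the brighter one.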
Question 7: EEG Relaxation Detection (4 Points)
(a) Band Identification:
- Choose the alpha band (8-13 Hz) for relaxation monitoring.
(b) Power Calculation Approach:
- Compute the FFT of the EEG signal.
- Extract the power of the alpha band.
Feedback System Implementation:
- Use a visual or auditory alert when alpha power crosses a defined threshold.
- Suggest real-time tracking software implementation.
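A sketch of the alpha-power computation with a simple threshold check follows. The 1 to 40 Hz reference band, the 0.5 threshold, and the 256 Hz sampling rate are hypothetical choices, and the signals are synthetic sinusoids rather than real EEG. A deployed system would calibrate the threshold per subject.

```python
import numpy as np

def band_power(signal, fs, low_hz, high_hz):
    """Sum the periodogram power between low_hz and high_hz (inclusive)."""
    windowed = (signal - signal.mean()) * np.hanning(len(signal))
    power = np.abs(np.fft.rfft(windowed)) ** 2
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / fs)
    band = (freqs >= low_hz) & (freqs <= high_hz)
    return power[band].sum()

def relative_alpha_power(signal, fs):
    """Alpha (8-13 Hz) power as a fraction of 1-40 Hz power (assumed reference band)."""
    return band_power(signal, fs, 8.0, 13.0) / band_power(signal, fs, 1.0, 40.0)

def is_relaxed(signal, fs, threshold=0.5):
    """Flag relaxation when relative alpha power exceeds a hypothetical threshold."""
    return bool(relative_alpha_power(signal, fs) > threshold)

fs = 256                                    # assumed sampling rate in Hz
t = np.arange(4 * fs) / fs                  # four-second analysis window
relaxed = np.sin(2 * np.pi * 10 * t) + 0.2 * np.sin(2 * np.pi * 20 * t)
alert = 0.2 * np.sin(2 * np.pi * 10 * t) + np.sin(2 * np.pi * 20 * t)

print(is_relaxed(relaxed, fs), is_relaxed(alert, fs))
```

In a real-time setting the same functions would be applied to a sliding window, and the returned flag would drive the visual or auditory alert.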
Question 8: AR Model and Linear Prediction (3 Points)
(a) Stating the AR Model and Parameter Estimation Methodology:
AR Model Definition:
- An Autoregressive (AR) model of order p is defined as:
$x_t = a_1 x_{t-1} + a_2 x_{t-2} + \dots + a_p x_{t-p} + \varepsilon_t$
where $a_i$ are the model parameters and $\varepsilon_t$ is the error term.
Parameter Estimation Methodology:
- Least Squares Estimation:
- Construct a system of equations using observed data samples.
- Solve for the parameters using linear algebra techniques (e.g., solving y = Xa where X is the lagged data matrix).
- Yule-Walker Equations:
- Alternatively, derive parameters using the autocorrelation function and solve the Yule-Walker equations.
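Both estimation routes can be sketched in a few lines. The AR(2) test process (coefficients 0.6 and -0.5), the sample count, and the random seed are hypothetical and serve only to show that the two estimators recover similar parameters.

```python
import numpy as np

def fit_ar_least_squares(x, p):
    """Solve y = X a in the least-squares sense, with X holding lagged samples."""
    X = np.column_stack([x[p - k:len(x) - k] for k in range(1, p + 1)])
    y = x[p:]
    coeffs, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coeffs

def fit_ar_yule_walker(x, p):
    """Solve the Yule-Walker equations R a = r built from the sample autocorrelation."""
    x = x - x.mean()
    n = len(x)
    r = np.array([np.dot(x[:n - k], x[k:]) / n for k in range(p + 1)])
    R = np.array([[r[abs(i - j)] for j in range(p)] for i in range(p)])
    return np.linalg.solve(R, r[1:])

# Hypothetical stationary AR(2) process used as test data.
rng = np.random.default_rng(0)
noise = rng.standard_normal(5000)
x = np.zeros(5000)
for t in range(2, len(x)):
    x[t] = 0.6 * x[t - 1] - 0.5 * x[t - 2] + noise[t]

a_ls = fit_ar_least_squares(x, 2)
a_yw = fit_ar_yule_walker(x, 2)
print(a_ls, a_yw)
```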
(b) Poor Predictions for AR(3) on Validation Set:
- Potential reasons for poor performance:
- Model Order Mismatch: The actual signal might require a higher or lower order for accurate prediction.
- Non-stationarity: The signal might be non-stationary, violating the AR model assumptions.
- Overfitting: The model might fit the training data well but generalize poorly to the validation data.
- Noise Sensitivity: High noise levels might affect model accuracy.
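The model-order point can be illustrated by comparing one-step prediction error on a validation split across several orders. The test process below is hypothetical: it depends only on the sample four steps back, so an AR(3) model cannot capture it. The split sizes and seed are arbitrary.

```python
import numpy as np

def lag_matrix(x, p):
    """Matrix whose k-th column holds x delayed by k samples (k = 1..p)."""
    return np.column_stack([x[p - k:len(x) - k] for k in range(1, p + 1)])

def fit_ar(x, p):
    """Least-squares AR(p) coefficients."""
    coeffs, *_ = np.linalg.lstsq(lag_matrix(x, p), x[p:], rcond=None)
    return coeffs

def one_step_mse(x, coeffs):
    """Mean squared one-step prediction error of the AR model on x."""
    p = len(coeffs)
    return float(np.mean((x[p:] - lag_matrix(x, p) @ coeffs) ** 2))

def validation_mse_by_order(train, valid, orders):
    """Fit each order on train and score it on valid."""
    return {p: one_step_mse(valid, fit_ar(train, p)) for p in orders}

# Hypothetical process: x_t = 0.9 x_{t-4} + noise.
rng = np.random.default_rng(0)
noise = rng.standard_normal(5000)
x = np.zeros(5000)
for t in range(4, len(x)):
    x[t] = 0.9 * x[t - 4] + noise[t]

scores = validation_mse_by_order(x[:3000], x[3000:], range(1, 7))
for order, mse in scores.items():
    print(order, round(mse, 3))
```

For this process the validation error drops sharply between orders 3 and 4, which is the kind of pattern that points to an order mismatch rather than noise or non-stationarity.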
Question 9: Crossword Puzzle Solutions (10 Points)
Across:
- Echo (Persistence of sound as it reflects off surfaces in an enclosed space)
- Gaussian (Type of blur filter that uses a bell-shaped curve)
- Microphone (Device that converts sound waves to electrical signals)
- Variance (Statistical measure of the spread in a dataset)
Down:
- Speaker (Output device that converts electrical signals to sound waves)
- Neuron (Basic unit of the nervous system for transmitting signals)
- Median (Noise-reducing filter that replaces pixel values with the middle value)
- Cones (Photoreceptor cells in the eye for detecting color)
- Bayer (Pattern of colored filters on sensors for color images)
- PCA (Dimensionality reduction technique, abbreviated)