Machine Learning Key Concepts: A Quick Review
1. You have a dataset with 7 features and 3 different labels. Which of the following would be a valid NN model? All of the above
2. Which of the following is not a data preprocessing task? Data translation
3. On a neural network, the output layer represents… Labels
6. Which is the output of this code? [[0,0,1], [0, 1,0],[1,0,0]]
7. What does this confusion matrix represent, where 1= positive and 0 = negative & T=True and F=False? TP=41, TN=116, FP=9, FN=26
9. With data binning you can achieve … All of the above
10. What should we do to deal with the Bias-Variance Trade-off dilemma? Find the point that minimizes the MSE
11. The PCA method allows you to reduce the number of features of a huge dataset. Yes, by obtaining the eigenvalues and eigenvectors of the covariance matrix
12. An automatic vacuum cleaner begins navigating through your flat to make a map as a base for cleaning your home. This behavior is: Model free
13. On a CNN with stride=3, which will be the upper-left value of the next “square” to analyze? 2
14. On CNN algorithms, what is padding? Adding zeros around the borders of the input so that border values are not underrepresented
15. On a neural network, what is the purpose of the activation function… a and b are true
16. On a CNN, what is a kernel? A sort of window to perform operations on the whole dataset by moving it
17. In bagging we … Divide the dataset into bags and train and test with each
18. What are the axes on a plot for the Elbow method? All of the above
20. In k-Fold cross validation we … Split the dataset into groups and use each group in turn as test data, with the remaining groups used as training data
21. On a neural network, why do you initially choose the weights of the neurons randomly? You don’t have any information about the neurons at the beginning
23. On hierarchical clustering we begin by assigning each observation to its own 1-point cluster. Yes
24. Which clustering method would you use to cluster these observations? DBSCAN
25. What do you plot on a silhouette graphic? Silhouette coefficient of each point, grouped by its cluster
26. On NN algorithms, what is a batch? One or more samples considered by the model within an epoch before weights are updated
27. With a ReLU activation function … The neuron activates above zero and the output is linear
28. What would you conclude from this silhouette plot? We should try fewer clusters to get a better silhouette
30. On a Reinforcement ML system, what’s the value? The expected long-term return with discount
31. A robotic arm must open the leg of a patient and once the bone is reached, let the doctor finish the operation. Which is the environment on this ML system? The patient
32. On the Bias-Variance Trade-off dilemma, underfitting implies: High bias and low variance
33. A neural network, … Divides the feature space allowing classifying observations
34. GPT means Generative Pre-trained Transformer. Why do you think they are pre-trained? Like any ML algorithm, you train it first and then release it on real data
35. On a neural network, hidden layers represent… None of the above
36. A robotic arm has to open the leg of a patient and once the bone is reached, let the doctor finish the operation. Which are the actions on this ML system? Movements and actions of the robot
37. Linear regression is an ML algorithm. Yes, because it predicts outcomes based on training data
38. On the Bias-Variance Trade-off dilemma, overfitting implies: Low bias and high variance
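
Worked example for question 7: the four counts in that answer are enough to derive the usual classification metrics. The sketch below only does the arithmetic on TP, TN, FP and FN as given; the choice of metrics (accuracy, precision, recall, F1) is an addition for illustration and is not part of the original question.

```python
# Counts taken from the answer to question 7.
tp, tn, fp, fn = 41, 116, 9, 26

total = tp + tn + fp + fn                 # all classified observations
accuracy = (tp + tn) / total              # share of correct predictions
precision = tp / (tp + fp)                # how many predicted positives are real
recall = tp / (tp + fn)                   # how many real positives were found
f1 = 2 * precision * recall / (precision + recall)

print(f"total={total} accuracy={accuracy:.3f} precision={precision:.3f} "
      f"recall={recall:.3f} f1={f1:.3f}")
```

With these counts, precision (0.82) is noticeably higher than recall (about 0.61), so this classifier misses more positives than it raises false alarms.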
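
Worked example for question 11: a minimal sketch of PCA through the eigendecomposition of the covariance matrix, assuming NumPy is available. The function name `pca`, the random 100 x 7 dataset and the choice of 2 components are hypothetical and only serve the illustration.

```python
import numpy as np

def pca(X, n_components):
    """Project X onto its top principal components (hypothetical helper)."""
    X_centered = X - X.mean(axis=0)              # PCA works on centered data
    cov = np.cov(X_centered, rowvar=False)       # feature-by-feature covariance
    eigvals, eigvecs = np.linalg.eigh(cov)       # eigh: covariance is symmetric
    order = np.argsort(eigvals)[::-1]            # largest variance first
    components = eigvecs[:, order[:n_components]]
    return X_centered @ components, eigvals[order]

# Made-up data: 100 observations with 7 features.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 7))
X_reduced, eigvals = pca(X, n_components=2)
print(X_reduced.shape)
```

The eigenvalues give the variance captured along each direction, while the matching eigenvectors are the directions themselves, which is why both are needed to reduce the number of features.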
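
Worked example for questions 13, 14 and 16: the kernel, stride and padding ideas can be seen together in a small sliding-window routine. This is a sketch assuming NumPy and a single-channel 2D input; `conv2d`, the 6 x 6 input and the all-ones kernel are hypothetical, and the operation is the cross-correlation that CNN layers typically compute rather than any specific library's implementation.

```python
import numpy as np

def conv2d(image, kernel, stride=1, padding=0):
    """Slide `kernel` over `image`, summing the elementwise products."""
    # Padding: surround the input with zeros so border values are visited more often.
    padded = np.pad(image, padding, mode="constant", constant_values=0)
    kh, kw = kernel.shape
    out_h = (padded.shape[0] - kh) // stride + 1
    out_w = (padded.shape[1] - kw) // stride + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            # Stride: the upper-left corner of each window moves `stride` cells.
            r, c = i * stride, j * stride
            window = padded[r:r + kh, c:c + kw]
            out[i, j] = np.sum(window * kernel)
    return out

image = np.arange(36, dtype=float).reshape(6, 6)   # made-up 6x6 input
kernel = np.ones((3, 3))                           # made-up 3x3 kernel
out = conv2d(image, kernel, stride=3)
print(out.shape)
```

With stride=3 and a 3 x 3 kernel the windows do not overlap, so a 6 x 6 input yields a 2 x 2 output. With stride=1 and padding=1 the output keeps the 6 x 6 size of the input.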
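
Worked example for question 20: a minimal sketch of how k-fold cross validation splits the data, assuming NumPy. The generator name `k_fold_indices`, the 10 samples and k=5 are hypothetical; a real workflow would train and evaluate a model inside the loop.

```python
import numpy as np

def k_fold_indices(n_samples, k, seed=0):
    """Yield (train_idx, test_idx) pairs, one per fold (hypothetical helper)."""
    rng = np.random.default_rng(seed)
    indices = rng.permutation(n_samples)       # shuffle once before splitting
    folds = np.array_split(indices, k)         # k groups of near-equal size
    for i in range(k):
        test_idx = folds[i]                    # this group is held out
        train_idx = np.concatenate([folds[j] for j in range(k) if j != i])
        yield train_idx, test_idx

splits = list(k_fold_indices(n_samples=10, k=5))
for train_idx, test_idx in splits:
    print("train:", sorted(train_idx.tolist()), "test:", sorted(test_idx.tolist()))
```

Every observation ends up in the test set exactly once, and the reported score is usually the average over the k folds.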
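
Worked example for question 30: the value is the expectation of the discounted return, and the return for a single trajectory can be computed as below. The function name, the reward sequence and the discount factor 0.9 are made up for the illustration; estimating the value itself would require averaging this quantity over many trajectories.

```python
def discounted_return(rewards, gamma):
    """Return r_0 + gamma*r_1 + gamma^2*r_2 + ... for one trajectory."""
    g = 0.0
    # Work backwards so each step applies the discount exactly once.
    for r in reversed(rewards):
        g = r + gamma * g
    return g

rewards = [1.0, 0.0, 2.0]   # hypothetical rewards at steps 0, 1, 2
g = discounted_return(rewards, gamma=0.9)
print(round(g, 2))          # 1 + 0.9*0 + 0.81*2 = 2.62
```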