Explore topic-wise MCQs in Machine Learning (ML).

This section includes 49 curated multiple-choice questions to sharpen your Machine Learning (ML) knowledge and support exam preparation.

1.

During the last few years, many ______ algorithms have been applied to deep neural networks.

A. logical
B. classical
C. classification
D. none of above
Answer» D. none of above
2.

If I am using all features of my dataset and I achieve 100% accuracy on my training set, but only about 70% on the validation set, what should I look out for?

A. underfitting
B. nothing, the model is perfect
C. overfitting
Answer» C. overfitting
3.

SVMs directly give us the posterior probabilities P(y = 1|x) and P(y = −1|x).

A. True
B. false
Answer» B. false
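A quick way to see why this is false in practice: scikit-learn's SVC exposes a raw decision_function (a signed margin score), and only produces probability estimates when an extra Platt-scaling calibration step is requested. A minimal sketch, assuming scikit-learn is installed:

```python
# Sketch: an SVM's native output is a signed margin score, not a posterior
# probability; probabilities require an extra calibration step (Platt scaling).
from sklearn.datasets import make_classification
from sklearn.svm import SVC

X, y = make_classification(n_samples=200, random_state=0)

svm = SVC(kernel="rbf").fit(X, y)
print(svm.decision_function(X[:3]))   # margin scores, not probabilities

svm_prob = SVC(kernel="rbf", probability=True).fit(X, y)  # fits Platt scaling internally
print(svm_prob.predict_proba(X[:3]))  # calibrated estimates layered on top of the SVM
```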
4.

Which of the following is not supervised learning?

A. PCA
B. Decision Tree
C. Naive Bayesian
D. Linear regression
Answer» A. PCA
5.

Let S1 and S2 be the set of support vectors and w1 and w2 be the learnt weight vectors for a linearly separable problem.

A. s1 ⊂ s2
B. s1 may not be a subset of s2
C. w1 = w2
D. all of the above
Answer» B. s1 may not be a subset of s2
6.

What is the accuracy based on the following confusion matrix of a three-class classification?

A. 0.75
B. 0.97
C. 0.95
D. 0.85
Answer» B. 0.97
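The confusion matrix for this question did not survive extraction, but the computation itself is mechanical: accuracy is the sum of the diagonal entries (correct predictions for each class) divided by the total number of predictions. A minimal sketch with an illustrative, made-up 3x3 matrix:

```python
import numpy as np

# Hypothetical 3-class confusion matrix (rows = actual, columns = predicted);
# the question's own matrix is missing, so these values are only illustrative.
cm = np.array([[50,  2,  3],
               [ 4, 60,  1],
               [ 2,  3, 75]])

accuracy = np.trace(cm) / cm.sum()  # correct predictions / all predictions
print(f"Accuracy = {accuracy:.3f}")  # 0.925 for this matrix
```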
7.

What is true about K-Means clustering? 1. K-Means is extremely sensitive to cluster center initialization 2. Bad initialization can lead to poor convergence speed 3. Bad initialization can lead to bad overall clustering

A. 1 and 3
B. 1 and 2
C. 2 and 3
D. 1, 2 and 3
Answer» D. 1, 2 and 3
8.

Which of the following statements is true about the k-NN algorithm? 1. k-NN performs much better if all of the data have the same scale 2. k-NN works well with a small number of input variables (p), but struggles when the number of inputs is very large 3. k-NN makes no assumptions about the functional form of the problem being solved

A. 1 and 2
B. 1 and 3
C. only 1
D. 1,2 and 3
Answer» D. 1,2 and 3
9.

Which of the following can act as possible termination conditions in K-Means? 1. For a fixed number of iterations 2. Assignment of observations to clusters does not change between iterations, except for cases with a bad local minimum 3. Centroids do not change between successive iterations 4. Terminate when RSS falls below a threshold

A. 1, 3 and 4
B. 1, 2 and 3
C. 1, 2 and 4
D. 1,2,3,4
Answer» D. 1,2,3,4
10.

We usually use feature normalization before using the Gaussian kernel in SVM. What is true about feature normalization? 1. We do feature normalization so that new feature will dominate other 2. Sometimes, feature normalization is not feasible in case of categorical variables 3. Feature normalization always helps when we use Gaussian kernel in SVM

A. 1
B. 1 and 2
C. 1 and 3
D. 2 and 3
Answer» B. 1 and 2
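Statement 3 is too strong because scaling usually, but not always, helps; still, the underlying effect is easy to demonstrate. The Gaussian (RBF) kernel is computed from Euclidean distances, so a feature on a much larger scale dominates the kernel. A minimal sketch, assuming scikit-learn, with a deliberately inflated noise feature:

```python
# Sketch: the RBF kernel depends on Euclidean distances, so one feature with a
# huge scale can drown out the informative ones unless features are normalized.
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# shuffle=False keeps the informative features in the first columns
X, y = make_classification(n_samples=300, n_features=5, n_informative=2,
                           shuffle=False, random_state=1)
X[:, -1] *= 1000.0  # blow up the scale of a non-informative feature

raw = SVC(kernel="rbf")
scaled = make_pipeline(StandardScaler(), SVC(kernel="rbf"))
print("without scaling:", cross_val_score(raw, X, y, cv=5).mean())     # near chance
print("with scaling:   ", cross_val_score(scaled, X, y, cv=5).mean())  # clearly better
```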
11.

To control the size of the tree, we need to control the number of regions. One approach to

A. a and b
B. a and d
C. b, c and d
D. all of the above
Answer» A. a and b
12.

Which of the following properties are characteristic of decision trees? (a) High bias (b) High variance (c) Lack of smoothness of prediction surfaces (d) Unbounded parameter set

A. a and b
B. a and d
C. b, c and d
D. all of the above
Answer» C. b, c and d
13.

We usually use feature normalization before using the Gaussian kernel in SVM. What is true about feature normalization? 1. We do feature normalization so that new feature will dominate other 2. Sometimes, feature normalization is not feasible in case of categorical variables 3. Feature normalization always helps when we use Gaussian kernel in SVM

A. 1
B. 1 and 2
C. 1 and 3
D. 2 and 3
Answer» B. 1 and 2
14.

Regarding bias and variance, which of the following statements are true? (Here ‘high’ and ‘low’ are relative to the ideal model.) (i) Models which overfit have a high bias (ii) Models which overfit have a low bias (iii) Models which underfit have a high bias (iv) Models which underfit have a low bias

A. (i) and (ii)
B. (ii) and (iii)
C. (iii) and (iv)
D. none of these
Answer» B. (ii) and (iii)
15.

Select the correct answer for the following statements.

A. both are true
B. 1 is true and 2 is false
C. both are false
D. 1 is false and 2 is true
Answer» B. 1 is true and 2 is false
16.

Select the correct answer for the following statements.

A. 1 is true, 2 is false
B. 1 is false, 2 is true
C. 1 is true, 2 is true
D. 1 is false, 2 is false
Answer» C. 1 is true, 2 is true
17.

Which of the following metrics do we have for finding dissimilarity between two clusters in hierarchical clustering? 1. Single-link 2. Complete-link 3. Average-link

A. 1 and 2
B. 1 and 3
C. 2 and 3
D. 1, 2 and 3
Answer» D. 1, 2 and 3
18.

Given that we can select the same feature multiple times during the recursive partitioning of the input space, is it always possible to achieve 100% accuracy on the training data (given that we allow for trees to grow to their maximum size) when building decision trees?

A. yes
B. no
Answer» B. no
19.

Which of the following sentences are correct in reference to information gain? (a) It is biased towards single-valued attributes (b) It is biased towards multi-valued attributes (c) ID3 makes use of information gain (d) The approach used by ID3 is greedy

A. a and b
B. a and d
C. b, c and d
D. all of the above
Answer» C. b, c and d
20.

Imagine, you are solving a classification problem with a highly imbalanced class. The majority class is observed 99% of the time in the training data. Your model has 99% accuracy after taking the predictions on the test data. Which of the following is true in such a case? 1. The accuracy metric is not a good idea for imbalanced class problems 2. The accuracy metric is a good idea for imbalanced class problems 3. Precision and recall metrics are good for imbalanced class problems 4. Precision and recall metrics aren’t good for imbalanced class problems

A. 1 and 3
B. 1 and 4
C. 2 and 3
D. 2 and 4
Answer» A. 1 and 3
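The arithmetic behind this trap is easy to reproduce: a degenerate model that always predicts the majority class reaches 99% accuracy while having zero recall on the minority class, which is exactly why precision and recall are the metrics to inspect. A minimal sketch, assuming scikit-learn:

```python
import numpy as np
from sklearn.metrics import accuracy_score, precision_score, recall_score

# 1000 test labels: 99% majority class (0), 1% minority class (1)
y_true = np.array([0] * 990 + [1] * 10)
y_pred = np.zeros(1000, dtype=int)  # a "model" that always predicts the majority class

print("accuracy :", accuracy_score(y_true, y_pred))                    # 0.99, looks great
print("precision:", precision_score(y_true, y_pred, zero_division=0))  # 0.0
print("recall   :", recall_score(y_true, y_pred))                      # 0.0 on the minority class
```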
21.

Suppose on performing reduced error pruning, we collapsed a node and observed an improvement in the prediction accuracy on the validation set. Which among the following statements are possible reasons for the observation?

A. a and b
B. a and d
C. b, c and d
D. all of the above
Answer» D. all of the above
22.

In which of the following cases will K-Means clustering give poor results? 1. Data points with outliers 2. Data points with different densities 3. Data points with round shapes 4. Data points with non-convex shapes

A. 1 and 2
B. 2 and 3
C. 2 and 4
D. 1, 2 and 4
Answer» D. 1, 2 and 4
23.

In which of the following cases will K-Means clustering fail to give good results? 1. Data points with outliers 2. Data points with different densities 3. Data points with round shapes 4. Data points with non-convex shapes

A. 1 and 2
B. 2 and 3
C. 2 and 4
D. 1, 2 and 4
Answer» D. 1, 2 and 4
24.

Suppose, you want to apply a stepwise forward selection method for choosing the best models for an ensemble model. Which of the following is the correct order of the steps?

A. 1-2-3
B. 1-3-4
C. 2-1-3
D. none of above
Answer» D. none of above
25.

Which of the following are advantages of stacking? 1. More robust model 2. Better prediction 3. Lower time of execution

A. 1 and 2
B. 2 and 3
C. 1 and 3
D. all of the above
Answer» A. 1 and 2
26.

Which of the following is true about weighted majority votes? 1. We want to give higher weights to better performing models 2. An inferior model can overrule the best model if the collective weighted votes for the inferior models are higher than for the best model 3. Voting is a special case of weighted voting

A. 1 and 3
B. 2 and 3
C. 1 and 2
D. 1, 2 and 3
Answer» D. 1, 2 and 3
27.

PCA works better if there is 1. A linear structure in the data 2. If the data lies on a curved surface and not on a flat surface 3. If variables are scaled in the same unit

A. 1 and 2
B. 2 and 3
C. 1 and 3
D. 1,2 and 3
Answer» C. 1 and 3
28.

Below are the two ensemble models: 1. E1 (M1, M2, M3) 2. E2 (M4, M5, M6), where Mi represents an individual model. The individual models of E1 have high accuracy but are of the same type (less diverse), while the individual models of E2 have high accuracy and are of different types (more diverse). Which ensemble is more likely to perform better?

A. e1
B. e2
C. any of e1 and e2
D. none of these
Answer» B. e2
29.

In an election, N candidates are competing against each other and people are voting for either of the candidates. Voters don’t communicate with each other while casting their votes. Which of the following ensemble methods works similarly to the above-discussed election procedure?

A. bagging
B. boosting
C. a or b
D. none of these
Answer» A. bagging
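The analogy holds because bagging, like the voters, builds each base model independently (on bootstrap samples of the data) and aggregates them by majority vote, whereas boosting trains models sequentially, each one depending on the errors of the previous. A minimal sketch, assuming scikit-learn (the default base learner of BaggingClassifier is a decision tree):

```python
# Sketch: bagging = independent "voters" (models fit on bootstrap samples)
# combined by majority vote, with no communication between them.
from sklearn.datasets import make_classification
from sklearn.ensemble import BaggingClassifier

X, y = make_classification(n_samples=500, random_state=0)
bag = BaggingClassifier(n_estimators=25, random_state=0).fit(X, y)
print(bag.predict(X[:5]))  # majority vote over 25 independently trained trees
```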
30.

How can we assign the weights to outputs of different models in an ensemble? 1. Use an algorithm to return the optimal weights 2. Choose the weights using cross validation 3. Give high weights to more accurate models

A. 1 and 2
B. 1 and 3
C. 2 and 3
D. all of above
Answer» D. all of above
31.

Suppose there are 25 base classifiers, each with an error rate of e = 0.35. If the classifiers are independent of each other and the ensemble predicts by majority voting, what is the probability that the ensemble makes a wrong prediction?

A. 0.05
B. 0.06
C. 0.07
D. 0.09
Answer» B. 0.06
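The 0.06 comes from a binomial calculation: a majority vote of 25 independent classifiers is wrong only when 13 or more of them err simultaneously, i.e. the sum over i = 13..25 of C(25, i) · e^i · (1 − e)^(25−i) with e = 0.35. A quick check in Python:

```python
from math import comb

e, n = 0.35, 25
# The majority vote fails when 13 or more of the 25 independent classifiers err.
p_wrong = sum(comb(n, i) * e**i * (1 - e)**(n - i) for i in range(13, n + 1))
print(f"{p_wrong:.2f}")  # ~0.06
```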
32.

Which of the following is / are true about weak learners used in an ensemble model? 1. They have low variance and they don’t usually overfit 2. They have high bias, so they cannot solve hard learning problems 3. They have high variance and they don’t usually overfit

A. 1 and 2
B. 1 and 3
C. 2 and 3
D. none of these
Answer» A. 1 and 2
33.

Which of the following can be one of the steps in stacking? 1. Divide the training data into k folds 2. Train k models on each k-1 folds and get the out-of-fold predictions for the remaining one fold 3. Divide the test data set in k folds and get individual fold predictions by different algorithms

A. 1 and 2
B. 2 and 3
C. 1 and 3
D. all of above
Answer» A. 1 and 2
34.

Suppose you are given ‘n’ predictions on test data by ‘n’ different models (M1, M2, …, Mn) respectively. Which of the following method(s) can be used to combine the predictions of these models? 1. Median 2. Product 3. Average 4. Weighted sum 5. Minimum and Maximum 6. Generalized mean rule

A. 1, 3 and 4
B. 1,3 and 6
C. 1,3, 4 and 6
D. all of above
Answer» D. all of above
35.

Having built a decision tree, we are using reduced error pruning to reduce the size of the tree.

A. 10.8, 13.33, 14.48
B. 10.8, 13.33, 12.06
C. 7.2, 10, 8.8
D. 7.2, 10, 8.6
Answer» C. 7.2, 10, 8.8
36.

Which of the following is true about bagging? 1. Bagging can be parallel 2. The aim of bagging is to reduce bias, not variance 3. Bagging helps in reducing overfitting

A. 1 and 2
B. 2 and 3
C. 1 and 3
D. all of these
Answer» C. 1 and 3
37.

Assume that you are given a data set and a neural network model trained on the data set. You are asked to build a decision tree model with the sole purpose of understanding/interpreting the built neural network model. In such a scenario, which among the following measures would you concentrate most on maximising?

A. accuracy of the decision tree model on the given data set
B. f1 measure of the decision tree model on the given data set
C. fidelity of the decision tree model, which is the fraction of instances on which the neural network and the decision tree give the same output
D. comprehensibility of the decision tree model, measured in terms of the size of the corresponding rule set
Answer» C. fidelity of the decision tree model, which is the fraction of instances on which the neural network and the decision tree give the same output
38.

Let’s say, you are working with categorical feature(s) and you have not looked at the distribution of the categorical variable in the test data. You want to apply one-hot encoding (OHE) on the categorical feature(s). What challenges may you face if you have applied OHE on a categorical variable of the train dataset?

A. All categories of categorical variable are not present in the test dataset.
B. Frequency distribution of categories is different in train as compared to the test dataset.
C. Train and Test always have same distribution.
D. Both A and B
Answer» D. Both A and B
39.

Suppose you are using stacking with n different machine learning algorithms with k folds on data. Which of the following is true about one level (m base models + 1 stacker) stacking?

A. you will have only k features after the first stage
B. you will have only m features after the first stage
C. you will have k+m features after the first stage
D. you will have k*n features after the first stage
Answer» B. you will have only m features after the first stage
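The point is that stage one of stacking produces exactly one out-of-fold prediction column per base model, so the second-stage learner sees m features no matter how many folds k were used. A minimal sketch, assuming scikit-learn, with m = 3 base models and k = 5 folds:

```python
# Sketch: stage one of stacking builds one out-of-fold prediction column per
# base model, so the stacker sees m features regardless of the number of folds k.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_predict
from sklearn.naive_bayes import GaussianNB
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=300, random_state=0)
base_models = [LogisticRegression(max_iter=1000),
               DecisionTreeClassifier(),
               GaussianNB()]  # m = 3

k = 5
stage_one = np.column_stack(
    [cross_val_predict(m, X, y, cv=k) for m in base_models]
)
print(stage_one.shape)  # (300, 3): m feature columns, independent of k
```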
40.

Which of the following parameters can be tuned for finding a good ensemble model in bagging based algorithms? 1. Max number of samples 2. Max features 3. Bootstrapping of samples 4. Bootstrapping of features

A. 1
B. 2
C. 3&4
D. 1,2,3&4
Answer» D. 1,2,3&4
41.

What is true about an ensembled classifier? 1. Classifiers that are more “sure” can vote with more conviction 2. Classifiers can be more “sure” about a particular part of the space 3. Most of the time, it performs better than a single classifier

A. 1 and 2
B. 1 and 3
C. 2 and 3
D. all of the above
Answer» D. all of the above
42.

Which of the following are correct statement(s) about stacking? 1. A machine learning model is trained on predictions of multiple machine learning models 2. A logistic regression will definitely work better in the second stage as compared to other classification methods 3. First stage models are trained on full/partial feature space of training data

A. 1 and 2
B. 2 and 3
C. 1 and 3
D. 1,2 and 3
Answer» C. 1 and 3
43.

Which of the following can be true for selecting base learners for an ensemble? 1. Different learners can come from the same algorithm with different hyperparameters 2. Different learners can come from different algorithms 3. Different learners can come from different training spaces

A. 1
B. 2
C. 1 and 3
D. 1, 2 and 3
Answer» D. 1, 2 and 3
44.

Which of the following option is / are correct regarding benefits of ensemble model? 1. Better performance 2. Generalized models 3. Better interpretability

A. 1 and 3
B. 2 and 3
C. 1, 2 and 3
D. 1 and 2
Answer» D. 1 and 2
45.

What are the steps for using a gradient descent algorithm? 1. Calculate error between the actual value and the predicted value 2. Reiterate until you find the best weights of the network 3. Pass an input through the network and get values from the output layer 4. Initialize random weights and bias 5. Go to each neuron which contributes to the error and change its respective values to reduce the error

A. 1, 2, 3, 4, 5
B. 4, 3, 1, 5, 2
C. 3, 2, 1, 5, 4
D. 5, 4, 3, 2, 1
Answer» B. 4, 3, 1, 5, 2
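The ordering 4-3-1-5-2 becomes obvious once the loop is written out: initialize random weights (4), pass inputs through to get outputs (3), compute the error (1), adjust the contributing weights to reduce the error (5), and reiterate (2). A minimal sketch for a single linear neuron on made-up data:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))
y = X @ np.array([2.0, -1.0, 0.5]) + 0.1 * rng.normal(size=100)

w = rng.normal(size=3)            # step 4: initialize random weights
lr = 0.1
for _ in range(200):              # step 2: reiterate until the weights are good
    y_hat = X @ w                 # step 3: pass inputs through, get outputs
    err = y_hat - y               # step 1: error between actual and predicted
    w -= lr * X.T @ err / len(y)  # step 5: adjust each weight to reduce the error
print(w)  # converges near [2.0, -1.0, 0.5]
```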
46.

8 observations are clustered into 3 clusters using the K-Means clustering algorithm. After the first iteration, clusters C1, C2 and C3 have the following observations: C1: {(2,2), (4,4), (6,6)}, C2: {(0,4), (4,0)}, C3: {(5,5), (9,9)}. What will be the cluster centroids if you want to proceed for the second iteration?

A. c1: (4,4), c2: (2,2), c3: (7,7)
B. c1: (6,6), c2: (4,4), c3: (9,9)
C. c1: (2,2), c2: (0,0), c3: (5,5)
D. c1: (4,4), c2: (3,3), c3: (7,7)
Answer» A. c1: (4,4), c2: (2,2), c3: (7,7)
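Each K-Means iteration recomputes every centroid as the mean of the points currently assigned to it, which is all the arithmetic this question needs. A quick check using the cluster memberships given above:

```python
import numpy as np

# Cluster memberships after the first iteration (as given in the question)
c1 = np.array([(2, 2), (4, 4), (6, 6)])
c2 = np.array([(0, 4), (4, 0)])
c3 = np.array([(5, 5), (9, 9)])

# The next centroids are simply the per-cluster means of the assigned points
for name, pts in [("C1", c1), ("C2", c2), ("C3", c3)]:
    print(name, pts.mean(axis=0))  # C1 -> (4,4), C2 -> (2,2), C3 -> (7,7)
```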
47.

Which of the following option is / are correct regarding benefits of ensemble model? 1. Better performance 2. Generalized models 3. Better interpretability

A. 1 and 3
B. 2 and 3
C. 1 and 2
D. 1, 2 and 3
Answer» C. 1 and 2
48.

What is the sequence of the following tasks in a perceptron? 1. Initialize weights of perceptron randomly 2. Go to the next batch of dataset 3. If the prediction does not match the output, change the weights 4. For a sample input, compute an output

A. 1, 4, 3, 2
B. 3, 1, 2, 4
C. 4, 3, 2, 1
D. 1, 2, 3, 4
Answer» A. 1, 4, 3, 2
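Written as code, the sequence 1-4-3-2 is just the classic perceptron training loop: initialize weights randomly (1), compute an output for a sample (4), change the weights on a mismatch (3), then move on to the next batch or pass (2). A minimal sketch on a made-up, linearly separable toy problem:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 2))
y = np.where(X[:, 0] + X[:, 1] > 0, 1, -1)  # linearly separable labels

w, b = rng.normal(size=2), 0.0               # 1. initialize weights randomly
for _ in range(10):                          # 2. move on to the next pass over the data
    for xi, yi in zip(X, y):
        out = 1 if xi @ w + b > 0 else -1    # 4. compute an output for a sample
        if out != yi:                        # 3. prediction mismatch: change the weights
            w += yi * xi
            b += yi

# training accuracy; the perceptron converges on separable data, so this is ~1.0
print((np.where(X @ w + b > 0, 1, -1) == y).mean())
```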
49.

Bayes’ Theorem is given by P(H|E) = (P(E|H) × P(H)) / P(E), where 1. P(H) is the probability of hypothesis H being true 2. P(E) is the probability of the evidence (regardless of the hypothesis) 3. P(E|H) is the probability of the evidence given that the hypothesis is true 4. P(H|E) is the probability of the hypothesis given that the evidence is there.

A. true
B. false
Answer» A. true
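A worked numeric instance makes the theorem concrete. With made-up numbers (a 1% prior, a 90% true-positive rate and a 5% false-positive rate for the evidence), the posterior follows directly from the formula above:

```python
# Worked Bayes example with made-up numbers: hypothesis H = "has condition",
# evidence E = "test is positive".
p_h = 0.01          # P(H): prior probability of the hypothesis
p_e_given_h = 0.90  # P(E|H): probability of the evidence if H is true
p_e_given_not_h = 0.05

# P(E): total probability of the evidence, regardless of the hypothesis
p_e = p_e_given_h * p_h + p_e_given_not_h * (1 - p_h)

p_h_given_e = p_e_given_h * p_h / p_e  # P(H|E), by Bayes' theorem
print(f"P(H|E) = {p_h_given_e:.3f}")   # ~0.154
```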