Explore topic-wise MCQs in Machine Learning (ML).

This section includes 49 curated multiple-choice questions to sharpen your Machine Learning (ML) knowledge and support exam preparation.

1.

During the last few years, many ______ algorithms have been applied to deep neural networks.

A. logical
B. classical
C. classification
D. none of above
Answer» D. none of above
2.

If I am using all features of my dataset and I achieve 100% accuracy on my training set, but only about 70% on the validation set, what should I look out for?

A. underfitting
B. nothing, the model is perfect
C. overfitting
Answer» C. overfitting
3.

SVMs directly give us the posterior probabilities P(y = 1|x) and P(y = −1|x).

A. True
B. false
Answer» B. false
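A quick way to see why this is false in practice: scikit-learn's SVC exposes a raw decision_function (a signed margin score), and only produces probability estimates when an extra Platt-scaling calibration step is requested. A minimal sketch, assuming scikit-learn is installed:

```python
# Sketch: an SVM's native output is a signed margin score, not a posterior
# probability; probabilities require an extra calibration step (Platt scaling).
from sklearn.datasets import make_classification
from sklearn.svm import SVC

X, y = make_classification(n_samples=200, random_state=0)

svm = SVC(kernel="rbf").fit(X, y)
print(svm.decision_function(X[:3]))   # margin scores, not probabilities

svm_prob = SVC(kernel="rbf", probability=True).fit(X, y)  # fits Platt scaling internally
print(svm_prob.predict_proba(X[:3]))  # calibrated estimates layered on top of the SVM
```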
4.

Which of the following is not supervised learning?

A. PCA
B. Decision Tree
C. Naive Bayesian
D. Linear regression
Answer» A. PCA
5.

Let S1 and S2 be the set of support vectors and w1 and w2 be the learnt weight vectors for a linearly separable problem.

A. s1 ⊂ s2
B. s1 may not be a subset of s2
C. w1 = w2
D. all of the above
Answer» B. s1 may not be a subset of s2
6.

What is the accuracy based on the following confusion matrix of a three-class classification?

A. 0.75
B. 0.97
C. 0.95
D. 0.85
Answer» B. 0.97
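The confusion matrix for this question did not survive extraction, but the computation itself is mechanical: accuracy is the sum of the diagonal entries (correct predictions for each class) divided by the total number of predictions. A minimal sketch with an illustrative, made-up 3x3 matrix:

```python
import numpy as np

# Hypothetical 3-class confusion matrix (rows = actual, columns = predicted);
# the question's own matrix is missing, so these values are only illustrative.
cm = np.array([[50,  2,  3],
               [ 4, 60,  1],
               [ 2,  3, 75]])

accuracy = np.trace(cm) / cm.sum()  # correct predictions / all predictions
print(f"Accuracy = {accuracy:.3f}")  # 0.925 for this matrix
```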
7.

What is true about K-Means clustering? 1. K-Means is extremely sensitive to cluster center initialization 2. Bad initialization can lead to poor convergence speed 3. Bad initialization can lead to bad overall clustering

A. 1 and 3
B. 1 and 2
C. 2 and 3
D. 1, 2 and 3
Answer» D. 1, 2 and 3
8.

Which of the following statements is true about the k-NN algorithm? 1. k-NN performs much better if all of the data have the same scale 2. k-NN works well with a small number of input variables (p), but struggles when the number of inputs is very large 3. k-NN makes no assumptions about the functional form of the problem being solved

A. 1 and 2
B. 1 and 3
C. only 1
D. 1,2 and 3
Answer» D. 1,2 and 3
9.

Which of the following can act as possible termination conditions in K-Means? 1. For a fixed number of iterations 2. Assignment of observations to clusters does not change between iterations, except for cases with a bad local minimum 3. Centroids do not change between successive iterations 4. Terminate when RSS falls below a threshold

A. 1, 3 and 4
B. 1, 2 and 3
C. 1, 2 and 4
D. 1,2,3,4
Answer» D. 1,2,3,4
10.

We usually use feature normalization before using the Gaussian kernel in SVM. What is true about feature normalization? 1. We do feature normalization so that new feature will dominate other 2. Sometimes, feature normalization is not feasible in case of categorical variables 3. Feature normalization always helps when we use Gaussian kernel in SVM

A. 1
B. 1 and 2
C. 1 and 3
D. 2 and 3
Answer» B. 1 and 2
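Statement 3 is too strong because scaling usually, but not always, helps; still, the underlying effect is easy to demonstrate. The Gaussian (RBF) kernel is computed from Euclidean distances, so a feature on a much larger scale dominates the kernel. A minimal sketch, assuming scikit-learn, with a deliberately inflated noise feature:

```python
# Sketch: the RBF kernel depends on Euclidean distances, so one feature with a
# huge scale can drown out the informative ones unless features are normalized.
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# shuffle=False keeps the informative features in the first columns
X, y = make_classification(n_samples=300, n_features=5, n_informative=2,
                           shuffle=False, random_state=1)
X[:, -1] *= 1000.0  # blow up the scale of a non-informative feature

raw = SVC(kernel="rbf")
scaled = make_pipeline(StandardScaler(), SVC(kernel="rbf"))
print("without scaling:", cross_val_score(raw, X, y, cv=5).mean())     # near chance
print("with scaling:   ", cross_val_score(scaled, X, y, cv=5).mean())  # clearly better
```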
11.

To control the size of the tree, we need to control the number of regions. One approach to

A. a and b
B. a and d
C. b, c and d
D. all of the above
Answer» A. a and b
12.

Which of the following properties are characteristic of decision trees? (a) High bias (b) High variance (c) Lack of smoothness of prediction surfaces (d) Unbounded parameter set

A. a and b
B. a and d
C. b, c and d
D. all of the above
Answer» C. b, c and d
13.

We usually use feature normalization before using the Gaussian kernel in SVM. What is true about feature normalization? 1. We do feature normalization so that new feature will dominate other 2. Sometimes, feature normalization is not feasible in case of categorical variables 3. Feature normalization always helps when we use Gaussian kernel in SVM

A. 1
B. 1 and 2
C. 1 and 3
D. 2 and 3
Answer» B. 1 and 2
14.

Regarding bias and variance, which of the following statements are true? (Here ‘high’ and ‘low’ are relative to the ideal model.) (i) Models which overfit have a high bias (ii) Models which overfit have a low bias (iii) Models which underfit have a high bias (iv) Models which underfit have a low bias

A. (i) and (ii)
B. (ii) and (iii)
C. (iii) and (iv)
D. none of these
Answer» B. (ii) and (iii)
15.

Select the correct answer for the following statements.

A. both are true
B. 1 is true and 2 is false
C. both are false
D. 1 is false and 2 is true
Answer» B. 1 is true and 2 is false
16.

Select the correct answer for the following statements.

A. 1 is true, 2 is false
B. 1 is false, 2 is true
C. 1 is true, 2 is true
D. 1 is false, 2 is false
Answer» C. 1 is true, 2 is true
17.

Which of the following metrics do we have for finding dissimilarity between two clusters in hierarchical clustering? 1. Single-link 2. Complete-link 3. Average-link

A. 1 and 2
B. 1 and 3
C. 2 and 3
D. 1, 2 and 3
Answer» D. 1, 2 and 3
18.

Given that we can select the same feature multiple times during the recursive partitioning of the input space, is it always possible to achieve 100% accuracy on the training data (given that we allow for trees to grow to their maximum size) when building decision trees?

A. yes
B. no
Answer» B. no
19.

Which of the following sentences are correct in reference to information gain? (a) It is biased towards single-valued attributes (b) It is biased towards multi-valued attributes (c) ID3 makes use of information gain (d) The approach used by ID3 is greedy

A. a and b
B. a and d
C. b, c and d
D. all of the above
Answer» C. b, c and d
20.

Imagine, you are solving a classification problem with a highly imbalanced class. The majority class is observed 99% of the time in the training data. Your model has 99% accuracy after taking the predictions on the test data. Which of the following is true in such a case? 1. The accuracy metric is not a good idea for imbalanced class problems 2. The accuracy metric is a good idea for imbalanced class problems 3. Precision and recall metrics are good for imbalanced class problems 4. Precision and recall metrics aren’t good for imbalanced class problems

A. 1 and 3
B. 1 and 4
C. 2 and 3
D. 2 and 4
Answer» A. 1 and 3
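The arithmetic behind this trap is easy to reproduce: a degenerate model that always predicts the majority class reaches 99% accuracy while having zero recall on the minority class, which is exactly why precision and recall are the metrics to inspect. A minimal sketch, assuming scikit-learn:

```python
import numpy as np
from sklearn.metrics import accuracy_score, precision_score, recall_score

# 1000 test labels: 99% majority class (0), 1% minority class (1)
y_true = np.array([0] * 990 + [1] * 10)
y_pred = np.zeros(1000, dtype=int)  # a "model" that always predicts the majority class

print("accuracy :", accuracy_score(y_true, y_pred))                    # 0.99, looks great
print("precision:", precision_score(y_true, y_pred, zero_division=0))  # 0.0
print("recall   :", recall_score(y_true, y_pred))                      # 0.0 on the minority class
```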
21.

Suppose on performing reduced error pruning, we collapsed a node and observed an improvement in the prediction accuracy on the validation set. Which among the following statements are possible reasons for the observation?

A. a and b
B. a and d
C. b, c and d
D. all of the above
Answer» D. all of the above
22.

In which of the following cases will K-Means clustering give poor results? 1. Data points with outliers 2. Data points with different densities 3. Data points with round shapes 4. Data points with non-convex shapes

A. 1 and 2
B. 2 and 3
C. 2 and 4
D. 1, 2 and 4
Answer» D. 1, 2 and 4
23.

In which of the following cases will K-Means clustering fail to give good results? 1. Data points with outliers 2. Data points with different densities 3. Data points with round shapes 4. Data points with non-convex shapes

A. 1 and 2
B. 2 and 3
C. 2 and 4
D. 1, 2 and 4
Answer» D. 1, 2 and 4
24.

Suppose, you want to apply a stepwise forward selection method for choosing the best models for an ensemble model. Which of the following is the correct order of the steps?

A. 1-2-3
B. 1-3-4
C. 2-1-3
D. none of above
Answer» D. none of above
25.

Which of the following are advantages of stacking? 1. More robust model 2. Better prediction 3. Lower time of execution

A. 1 and 2
B. 2 and 3
C. 1 and 3
D. all of the above
Answer» A. 1 and 2
26.

Which of the following is true about weighted majority votes? 1. We want to give higher weights to better performing models 2. An inferior model can overrule the best model if the collective weighted votes for the inferior models are higher than for the best model 3. Voting is a special case of weighted voting

A. 1 and 3
B. 2 and 3
C. 1 and 2
D. 1, 2 and 3
Answer» D. 1, 2 and 3
27.

PCA works better if there is 1. A linear structure in the data 2. If the data lies on a curved surface and not on a flat surface 3. If variables are scaled in the same unit

A. 1 and 2
B. 2 and 3
C. 1 and 3
D. 1,2 and 3
Answer» C. 1 and 3
28.

Below are the two ensemble models: 1. E1 (M1, M2, M3) 2. E2 (M4, M5, M6), where Mi represents an individual model. The individual models of E1 have high accuracy but are of the same type (less diverse), while the individual models of E2 have high accuracy and are of different types (more diverse). Which ensemble is more likely to perform better?

A. e1
B. e2
C. any of e1 and e2
D. none of these
Answer» B. e2
29.

In an election, N candidates are competing against each other and people are voting for either of the candidates. Voters don’t communicate with each other while casting their votes. Which of the following ensemble methods works similarly to the above-discussed election procedure?

A. bagging
B. boosting
C. a or b
D. none of these
Answer» A. bagging
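The analogy holds because bagging, like the voters, builds each base model independently (on bootstrap samples of the data) and aggregates them by majority vote, whereas boosting trains models sequentially, each one depending on the errors of the previous. A minimal sketch, assuming scikit-learn (the default base learner of BaggingClassifier is a decision tree):

```python
# Sketch: bagging = independent "voters" (models fit on bootstrap samples)
# combined by majority vote, with no communication between them.
from sklearn.datasets import make_classification
from sklearn.ensemble import BaggingClassifier

X, y = make_classification(n_samples=500, random_state=0)
bag = BaggingClassifier(n_estimators=25, random_state=0).fit(X, y)
print(bag.predict(X[:5]))  # majority vote over 25 independently trained trees
```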
30.

How can we assign the weights to outputs of different models in an ensemble? 1. Use an algorithm to return the optimal weights 2. Choose the weights using cross validation 3. Give high weights to more accurate models

A. 1 and 2
B. 1 and 3
C. 2 and 3
D. all of above
Answer» D. all of above
31.

Suppose there are 25 base classifiers, each with an error rate of e = 0.35. If the classifiers are independent of each other and the ensemble predicts by majority voting, what is the probability that the ensemble makes a wrong prediction?

A. 0.05
B. 0.06
C. 0.07
D. 0.09
Answer» B. 0.06
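The 0.06 comes from a binomial calculation: a majority vote of 25 independent classifiers is wrong only when 13 or more of them err simultaneously, i.e. the sum over i = 13..25 of C(25, i) · e^i · (1 − e)^(25−i) with e = 0.35. A quick check in Python:

```python
from math import comb

e, n = 0.35, 25
# The majority vote fails when 13 or more of the 25 independent classifiers err.
p_wrong = sum(comb(n, i) * e**i * (1 - e)**(n - i) for i in range(13, n + 1))
print(f"{p_wrong:.2f}")  # ~0.06
```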
32.

Which of the following is / are true about weak learners used in an ensemble model? 1. They have low variance and they don’t usually overfit 2. They have high bias, so they cannot solve hard learning problems 3. They have high variance and they don’t usually overfit

A. 1 and 2
B. 1 and 3
C. 2 and 3
D. none of these
Answer» A. 1 and 2
33.

Which of the following can be one of the steps in stacking? 1. Divide the training data into k folds 2. Train k models on each k-1 folds and get the out-of-fold predictions for the remaining one fold 3. Divide the test data set in k folds and get individual fold predictions by different algorithms

A. 1 and 2
B. 2 and 3
C. 1 and 3
D. all of above
Answer» A. 1 and 2
34.

Suppose you are given ‘n’ predictions on test data by ‘n’ different models (M1, M2, …, Mn) respectively. Which of the following method(s) can be used to combine the predictions of these models? 1. Median 2. Product 3. Average 4. Weighted sum 5. Minimum and Maximum 6. Generalized mean rule

A. 1, 3 and 4
B. 1,3 and 6
C. 1,3, 4 and 6
D. all of above
Answer» D. all of above
35.

Having built a decision tree, we are using reduced error pruning to reduce the size of the tree.

A. 10.8, 13.33, 14.48
B. 10.8, 13.33, 12.06
C. 7.2, 10, 8.8
D. 7.2, 10, 8.6
Answer» C. 7.2, 10, 8.8
36.

Which of the following is true about bagging? 1. Bagging can be parallel 2. The aim of bagging is to reduce bias, not variance 3. Bagging helps in reducing overfitting

A. 1 and 2
B. 2 and 3
C. 1 and 3
D. all of these
Answer» C. 1 and 3
37.

Assume that you are given a data set and a neural network model trained on the data set. You are asked to build a decision tree model with the sole purpose of understanding/interpreting the built neural network model. In such a scenario, which among the following measures would you concentrate most on maximising?

A. accuracy of the decision tree model on the given data set
B. f1 measure of the decision tree model on the given data set
C. fidelity of the decision tree model, which is the fraction of instances on which the neural network and the decision tree give the same output
D. comprehensibility of the decision tree model, measured in terms of the size of the corresponding rule set
Answer» C. fidelity of the decision tree model, which is the fraction of instances on which the neural network and the decision tree give the same output
38.

Let’s say, you are working with categorical feature(s) and you have not looked at the distribution of the categorical variable in the test data. You want to apply one-hot encoding (OHE) on the categorical feature(s). What challenges may you face if you have applied OHE on a categorical variable of the train dataset?

A. All categories of categorical variable are not present in the test dataset.
B. Frequency distribution of categories is different in train as compared to the test dataset.
C. Train and Test always have same distribution.
D. Both A and B
Answer» D. Both A and B
39.

Suppose you are using stacking with n different machine learning algorithms with k folds on data. Which of the following is true about one level (m base models + 1 stacker) stacking?

A. you will have only k features after the first stage
B. you will have only m features after the first stage
C. you will have k+m features after the first stage
D. you will have k*n features after the first stage
Answer» B. you will have only m features after the first stage
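The point is that stage one of stacking produces exactly one out-of-fold prediction column per base model, so the second-stage learner sees m features no matter how many folds k were used. A minimal sketch, assuming scikit-learn, with m = 3 base models and k = 5 folds:

```python
# Sketch: stage one of stacking builds one out-of-fold prediction column per
# base model, so the stacker sees m features regardless of the number of folds k.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_predict
from sklearn.naive_bayes import GaussianNB
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=300, random_state=0)
base_models = [LogisticRegression(max_iter=1000),
               DecisionTreeClassifier(),
               GaussianNB()]  # m = 3

k = 5
stage_one = np.column_stack(
    [cross_val_predict(m, X, y, cv=k) for m in base_models]
)
print(stage_one.shape)  # (300, 3): m feature columns, independent of k
```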
40.

Which of the following parameters can be tuned for finding a good ensemble model in bagging based algorithms? 1. Max number of samples 2. Max features 3. Bootstrapping of samples 4. Bootstrapping of features

A. 1
B. 2
C. 3&4
D. 1,2,3&4
Answer» D. 1,2,3&4
41.

What is true about an ensembled classifier? 1. Classifiers that are more “sure” can vote with more conviction 2. Classifiers can be more “sure” about a particular part of the space 3. Most of the time, it performs better than a single classifier

A. 1 and 2
B. 1 and 3
C. 2 and 3
D. all of the above
Answer» D. all of the above
42.

Which of the following are correct statement(s) about stacking? 1. A machine learning model is trained on predictions of multiple machine learning models 2. A logistic regression will definitely work better in the second stage as compared to other classification methods 3. First stage models are trained on full/partial feature space of training data

A. 1 and 2
B. 2 and 3
C. 1 and 3
D. 1,2 and 3
Answer» C. 1 and 3
43.

Which of the following can be true for selecting base learners for an ensemble? 1. Different learners can come from the same algorithm with different hyperparameters 2. Different learners can come from different algorithms 3. Different learners can come from different training spaces

A. 1
B. 2
C. 1 and 3
D. 1, 2 and 3
Answer» D. 1, 2 and 3
44.

Which of the following option is / are correct regarding benefits of ensemble model? 1. Better performance 2. Generalized models 3. Better interpretability

A. 1 and 3
B. 2 and 3
C. 1, 2 and 3
D. 1 and 2
Answer» D. 1 and 2
45.

What are the steps for using a gradient descent algorithm? 1. Calculate error between the actual value and the predicted value 2. Reiterate until you find the best weights of the network 3. Pass an input through the network and get values from the output layer 4. Initialize random weights and bias 5. Go to each neuron which contributes to the error and change its respective values to reduce the error

A. 1, 2, 3, 4, 5
B. 4, 3, 1, 5, 2
C. 3, 2, 1, 5, 4
D. 5, 4, 3, 2, 1
Answer» B. 4, 3, 1, 5, 2
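The ordering 4-3-1-5-2 becomes obvious once the loop is written out: initialize random weights (4), pass inputs through to get outputs (3), compute the error (1), adjust the contributing weights to reduce the error (5), and reiterate (2). A minimal sketch for a single linear neuron on made-up data:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))
y = X @ np.array([2.0, -1.0, 0.5]) + 0.1 * rng.normal(size=100)

w = rng.normal(size=3)            # step 4: initialize random weights
lr = 0.1
for _ in range(200):              # step 2: reiterate until the weights are good
    y_hat = X @ w                 # step 3: pass inputs through, get outputs
    err = y_hat - y               # step 1: error between actual and predicted
    w -= lr * X.T @ err / len(y)  # step 5: adjust each weight to reduce the error
print(w)  # converges near [2.0, -1.0, 0.5]
```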
46.

8 observations are clustered into 3 clusters using the K-Means clustering algorithm. After the first iteration, clusters C1, C2 and C3 have the following observations: C1: {(2,2), (4,4), (6,6)}, C2: {(0,4), (4,0)}, C3: {(5,5), (9,9)}. What will be the cluster centroids if you want to proceed for the second iteration?

A. c1: (4,4), c2: (2,2), c3: (7,7)
B. c1: (6,6), c2: (4,4), c3: (9,9)
C. c1: (2,2), c2: (0,0), c3: (5,5)
D. c1: (4,4), c2: (3,3), c3: (7,7)
Answer» A. c1: (4,4), c2: (2,2), c3: (7,7)
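Each K-Means iteration recomputes every centroid as the mean of the points currently assigned to it, which is all the arithmetic this question needs. A quick check using the cluster memberships given above:

```python
import numpy as np

# Cluster memberships after the first iteration (as given in the question)
c1 = np.array([(2, 2), (4, 4), (6, 6)])
c2 = np.array([(0, 4), (4, 0)])
c3 = np.array([(5, 5), (9, 9)])

# The next centroids are simply the per-cluster means of the assigned points
for name, pts in [("C1", c1), ("C2", c2), ("C3", c3)]:
    print(name, pts.mean(axis=0))  # C1 -> (4,4), C2 -> (2,2), C3 -> (7,7)
```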
47.

Which of the following option is / are correct regarding benefits of ensemble model? 1. Better performance 2. Generalized models 3. Better interpretability

A. 1 and 3
B. 2 and 3
C. 1 and 2
D. 1, 2 and 3
Answer» C. 1 and 2
48.

What is the sequence of the following tasks in a perceptron? 1. Initialize weights of perceptron randomly 2. Go to the next batch of dataset 3. If the prediction does not match the output, change the weights 4. For a sample input, compute an output

A. 1, 4, 3, 2
B. 3, 1, 2, 4
C. 4, 3, 2, 1
D. 1, 2, 3, 4
Answer» A. 1, 4, 3, 2
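Written as code, the sequence 1-4-3-2 is just the classic perceptron training loop: initialize weights randomly (1), compute an output for a sample (4), change the weights on a mismatch (3), then move on to the next batch or pass (2). A minimal sketch on a made-up, linearly separable toy problem:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 2))
y = np.where(X[:, 0] + X[:, 1] > 0, 1, -1)  # linearly separable labels

w, b = rng.normal(size=2), 0.0               # 1. initialize weights randomly
for _ in range(10):                          # 2. move on to the next pass over the data
    for xi, yi in zip(X, y):
        out = 1 if xi @ w + b > 0 else -1    # 4. compute an output for a sample
        if out != yi:                        # 3. prediction mismatch: change the weights
            w += yi * xi
            b += yi

# training accuracy; the perceptron converges on separable data, so this is ~1.0
print((np.where(X @ w + b > 0, 1, -1) == y).mean())
```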
49.

Bayes’ Theorem is given by P(H|E) = (P(E|H) × P(H)) / P(E), where 1. P(H) is the probability of hypothesis H being true 2. P(E) is the probability of the evidence (regardless of the hypothesis) 3. P(E|H) is the probability of the evidence given that the hypothesis is true 4. P(H|E) is the probability of the hypothesis given that the evidence is there.

A. true
B. false
Answer» A. true
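A worked numeric instance makes the theorem concrete. With made-up numbers (a 1% prior, a 90% true-positive rate and a 5% false-positive rate for the evidence), the posterior follows directly from the formula above:

```python
# Worked Bayes example with made-up numbers: hypothesis H = "has condition",
# evidence E = "test is positive".
p_h = 0.01          # P(H): prior probability of the hypothesis
p_e_given_h = 0.90  # P(E|H): probability of the evidence if H is true
p_e_given_not_h = 0.05

# P(E): total probability of the evidence, regardless of the hypothesis
p_e = p_e_given_h * p_h + p_e_given_not_h * (1 - p_h)

p_h_given_e = p_e_given_h * p_h / p_e  # P(H|E), by Bayes' theorem
print(f"P(H|E) = {p_h_given_e:.3f}")   # ~0.154
```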