

MCQOPTIONS
This section includes 88 MCQs: curated multiple-choice questions to sharpen your Computer Science Engineering (CSE) knowledge and support exam preparation. Choose a topic below to get started.
1. |
In the given image, P(H) is the _____ probability. |
A. | posterior |
B. | prior |
Answer» B. prior | |
2. |
Even if there are no actual supervisors, _____ learning is also based on feedback provided by the environment. |
A. | supervised |
B. | reinforcement |
C. | unsupervised |
D. | none of the above |
Answer» B. reinforcement | |
3. |
According to _____, it's a key success factor for the survival and evolution of all species. |
A. | Claude Shannon's theory |
B. | Gini index |
C. | Darwin's theory |
D. | none of the above |
Answer» C. Darwin's theory | |
4. |
Overlearning is caused by an excessive _____. |
A. | capacity |
B. | regression |
C. | reinforcement |
D. | accuracy |
Answer» A. capacity | |
5. |
It's possible to specify whether the scaling process must include both mean and standard deviation using the parameters _____. |
A. | with_mean=True/False |
B. | with_std=True/False |
C. | both a & b |
D. | none of the mentioned |
Answer» C. both a & b | |
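Note: a minimal sketch of what the two flags control, written in plain Python rather than with scikit-learn. The helper `scale` and the sample data are hypothetical; only the parameter names `with_mean` and `with_std` come from the question. It assumes the population standard deviation is used for rescaling.

```python
from statistics import fmean, pstdev

def scale(values, with_mean=True, with_std=True):
    """Hypothetical helper: centre and/or rescale a list of numbers."""
    mu = fmean(values) if with_mean else 0.0      # subtract the mean only if asked
    sigma = pstdev(values) if with_std else 1.0   # divide by the std only if asked
    if sigma == 0:
        sigma = 1.0  # constant feature: avoid dividing by zero
    return [(v - mu) / sigma for v in values]

data = [2.0, 4.0, 6.0]
centred_only = scale(data, with_mean=True, with_std=False)   # mean removed, spread kept
scaled_only = scale(data, with_mean=False, with_std=True)    # spread normalised, mean kept
standardised = scale(data)                                   # both steps applied
```

Each flag switches one step on or off independently, so both are needed to describe the scaling fully.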
6. |
If there is only a discrete number of possible outcomes (called categories), the process becomes a _____. |
A. | regression |
B. | classification |
C. | model-free |
D. | categories |
Answer» B. classification | |
7. |
Which of the following are advantages of stacking? 1) More robust model 2) Better prediction 3) Lower time of execution |
A. | 1 and 2 |
B. | 2 and 3 |
C. | 1 and 3 |
D. | all of the above |
Answer» A. 1 and 2 | |
8. |
_____ learning can be adopted when it's necessary to categorize a large amount of data with a few complete examples or when there's the need to |
A. | supervised |
B. | semi-supervised |
C. | reinforcement |
D. | clusters |
Answer» B. semi-supervised | |
9. |
Which of the following statements are true for a design matrix X ∈ R^(n×d) with d > n? (The rows are n sample points and the columns represent d features.) |
A. | least-squares linear regression computes the weights w = (X^T X)^(-1) X^T y |
B. | the sample points are linearly separable |
C. | X has exactly d - n eigenvectors with eigenvalue zero |
D. | at least one principal component direction is orthogonal to a hyperplane that contains all the sample points |
Answer» D. at least one principal component direction is orthogonal to a hyperplane that contains all the sample points | |
10. |
A feature F1 can take the values A, B, C, D, E and F, and represents the grades of students from a college. What is the feature type? |
A. | nominal |
B. | ordinal |
C. | categorical |
D. | boolean |
Answer» B. ordinal | |
11. |
Which of the following properties are characteristic of decision trees? (a) High bias (b) High variance (c) Lack of smoothness of prediction surfaces (d) Unbounded parameter set |
A. | a and b |
B. | a and d |
C. | b, c and d |
D. | all of the above |
Answer» C. b, c and d | |
12. |
Which of the following sentences are correct in reference to information gain? a. It is biased towards single-valued attributes b. It is biased towards multi-valued attributes c. ID3 makes use of information gain d. The approach used by ID3 is greedy |
A. | a and b |
B. | a and d |
C. | b, c and d |
D. | all of the above |
Answer» C. b, c and d | |
13. |
Let S1 and S2 be the sets of support vectors and w1 and w2 be the learnt weight vectors for a linearly separable problem using hard and soft margin linear SVMs respectively. Which of the following are correct? |
A. | S1 ⊂ S2 |
B. | S1 may not be a subset of S2 |
C. | w1 = w2 |
D. | all of the above |
Answer» B. S1 may not be a subset of S2 | |
14. |
Imagine you are solving a classification problem with highly imbalanced classes. The majority class is observed 99% of the time in the training data. Your model has 99% accuracy on the test data. Which of the following is true in such a case? 1. Accuracy is not a good metric for imbalanced class problems. 2. Accuracy is a good metric for imbalanced class problems. 3. Precision and recall are good metrics for imbalanced class problems. 4. Precision and recall aren't good metrics for imbalanced class problems. |
A. | 1 and 3 |
B. | 1 and 4 |
C. | 2 and 3 |
D. | 2 and 4 |
Answer» A. 1 and 3 | |
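Note: a small sketch of why accuracy can mislead on imbalanced data. The labels below are made up for illustration: 99 negatives, 1 positive, and a model that always predicts the majority class. Precision is set to 0.0 by convention when there are no positive predictions, which is an assumption of this sketch.

```python
# Hypothetical data: 99 majority-class samples and 1 minority-class sample.
y_true = [0] * 99 + [1]
y_pred = [0] * 100  # a model that always predicts the majority class

pairs = list(zip(y_true, y_pred))
tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives

accuracy = sum(1 for t, p in pairs if t == p) / len(pairs)
precision = tp / (tp + fp) if (tp + fp) else 0.0  # convention: 0.0 when undefined
recall = tp / (tp + fn) if (tp + fn) else 0.0

print(f"accuracy={accuracy:.2f} precision={precision:.2f} recall={recall:.2f}")
```

Accuracy comes out at 0.99 while recall on the minority class is 0, which is the gap that precision and recall expose.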
15. |
What is/are true about kernels in SVM? 1. A kernel function maps low-dimensional data to a high-dimensional space 2. It's a similarity function |
A. | 1 |
B. | 2 |
C. | 1 and 2 |
D. | none of these |
Answer» C. 1 and 2 | |
16. |
We usually use feature normalization before using the Gaussian kernel in SVM. What is true about feature normalization? 1. We do feature normalization so that no single feature dominates the others 2. Sometimes, feature normalization is not feasible in the case of categorical variables 3. Feature normalization always helps when we use a Gaussian kernel in SVM |
A. | 1 |
B. | 1 and 2 |
C. | 1 and 3 |
D. | 2 and 3 |
Answer» B. 1 and 2 | |
17. |
We usually use feature normalization before using the Gaussian kernel in SVM. What is true about feature normalization? 1. We do feature normalization so that no single feature dominates the others 2. Sometimes, feature normalization is not feasible in the case of categorical variables 3. Feature normalization always helps when we use a Gaussian kernel in SVM |
A. | 1 |
B. | 1 and 2 |
C. | 1 and 3 |
D. | 2 and 3 |
Answer» B. 1 and 2 | |
18. |
_____, which can accept a NumPy RandomState generator or an integer seed. |
A. | make_blobs |
B. | random_state |
C. | test_size |
D. | training_size |
Answer» B. random_state | |
19. |
We usually use feature normalization before using the Gaussian kernel in SVM. What is true about feature normalization? 1. We do feature normalization so that no single feature dominates the others 2. Sometimes, feature normalization is not feasible in the case of categorical variables 3. Feature normalization always helps when we use a Gaussian kernel in SVM |
A. | 1 |
B. | 1 and 2 |
C. | 1 and 3 |
D. | 2 and 3 |
Answer» B. 1 and 2 | |
20. |
Suppose on performing reduced error pruning, we collapsed a node and observed an improvement in the prediction accuracy on the validation set. Which among the following statements are possible in light of the performance improvement observed? (a) The collapsed node helped overcome the effect of one or more noise-affected data points in the training set (b) The validation set had one or more noise-affected data points in the region corresponding to the collapsed node (c) The validation set did not have any data points along at least one of the collapsed branches (d) The validation set did have data points adversely affected by the collapsed node |
A. | a and b |
B. | a and d |
C. | b, c and d |
D. | all of the above |
Answer» D. all of the above | |
21. |
Which of the following are correct statement(s) about stacking? 1. A machine learning model is trained on predictions of multiple machine learning models 2. A logistic regression will definitely work better in the second stage as compared to other classification methods 3. First stage models are trained on full / partial feature space of training data |
A. | 1 and 2 |
B. | 2 and 3 |
C. | 1 and 3 |
D. | all of above |
Answer» C. 1 and 3 | |
22. |
Which of the following can be true for selecting base learners for an ensemble? 1. Different learners can come from the same algorithm with different hyperparameters 2. Different learners can come from different algorithms 3. Different learners can come from different training spaces |
A. | 1 |
B. | 2 |
C. | 1 and 3 |
D. | 1, 2 and 3 |
Answer» D. 1, 2 and 3 | |
23. |
Suppose you are using stacking with n different machine learning algorithms with k folds on data. Which of the following is true about one level (m base models + 1 stacker) stacking? Note: Here, we are working on a binary classification problem. All base models are trained on all features. You are using k folds for base models. |
A. | you will have only k features after the first stage |
B. | you will have only m features after the first stage |
C. | you will have k+m features after the first stage |
D. | you will have k*n features after the first stage |
Answer» B. you will have only m features after the first stage | |
24. |
Which of the following are correct statement(s) about stacking? 1. A machine learning model is trained on predictions of multiple machine learning models 2. A Logistic regression will definitely work better in the second stage as compared to other classification methods 3. First stage models are trained on full / partial feature space of training data |
A. | 1 and 2 |
B. | 2 and 3 |
C. | 1 and 3 |
D. | 1,2 and 3 |
Answer» C. 1 and 3 | |
25. |
Let's say you are working with categorical feature(s) and you have not looked at the distribution of the categorical variable in the test data. You want to apply one hot encoding (OHE) on the categorical feature(s). What challenges may you face if you have applied OHE on a categorical variable of the train dataset? |
A. | All categories of categorical variable are not present in the test dataset. |
B. | Frequency distribution of categories is different in train as compared to the test dataset. |
C. | Train and Test always have same distribution. |
D. | Both A and B |
Answer» D. Both A and B | |
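Note: a minimal plain-Python sketch of the first challenge, where the categories seen in training and in test do not match. The helpers `fit_categories` and `one_hot` and the colour data are hypothetical. Encoding an unseen category as an all-zero row is one possible policy assumed here; raising an error is another.

```python
def fit_categories(values):
    """Hypothetical helper: learn the category vocabulary from training data."""
    return sorted(set(values))

def one_hot(value, categories):
    """Encode one value; a category unseen in training becomes all zeros."""
    return [1 if value == c else 0 for c in categories]

train = ["red", "green", "blue", "red"]
test = ["green", "purple"]  # "purple" never appears in train

categories = fit_categories(train)
encoded_test = [one_hot(v, categories) for v in test]
unseen = [v for v in test if v not in categories]

print(categories)    # vocabulary learned from train only
print(encoded_test)  # the unseen category carries no signal
print(unseen)
```

The all-zero row for "purple" shows how information is lost when the encoder is fitted without looking at the test distribution.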
26. |
Let's say you are working with categorical feature(s) and you have not looked at the distribution of the categorical variable in the test data. You want to apply one hot encoding (OHE) on the categorical feature(s). What challenges may you face if you have applied OHE on a categorical variable of the train dataset? |
A. | all categories of categorical variable are not present in the test dataset. |
B. | frequency distribution of categories is different in train as compared to the test dataset. |
C. | train and test always have same distribution. |
D. | both a and b |
Answer» D. both a and b | |
27. |
In which of the following cases will K-Means clustering give poor results? 1. Data points with outliers 2. Data points with different densities 3. Data points with round shapes 4. Data points with non-convex shapes |
A. | 1 and 2 |
B. | 2 and 3 |
C. | 2 and 4 |
D. | 1, 2 and 4 |
Answer» D. 1, 2 and 4 | |
28. |
In which of the following cases will K-Means clustering fail to give good results? 1. Data points with outliers 2. Data points with different densities 3. Data points with round shapes 4. Data points with non-convex shapes |
A. | 1 and 2 |
B. | 2 and 3 |
C. | 2 and 4 |
D. | 1, 2 and 4 |
Answer» D. 1, 2 and 4 | |
29. |
What is/are true about ridge regression? 1. When lambda is 0, the model works like a linear regression model 2. When lambda is 0, the model doesn't work like a linear regression model 3. When lambda goes to infinity, we get very, very small coefficients approaching 0 4. When lambda goes to infinity, we get very, very large coefficients approaching infinity |
A. | 1 and 3 |
B. | 1 and 4 |
C. | 2 and 3 |
D. | 2 and 4 |
Answer» A. 1 and 3 | |
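Note: a sketch of how the ridge penalty affects a coefficient, assuming the simplest setting of one feature and no intercept, where the closed form is w = Σxy / (Σx² + lambda). The function `ridge_weight` and the data are hypothetical illustrations.

```python
def ridge_weight(xs, ys, lam):
    """One-feature, no-intercept ridge solution: sum(x*y) / (sum(x*x) + lam)."""
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    return sxy / (sxx + lam)

xs = [1.0, 2.0, 3.0]
ys = [2.0, 4.0, 6.0]  # generated by y = 2x

w_ols = ridge_weight(xs, ys, 0.0)   # lambda = 0: ordinary least squares
w_mid = ridge_weight(xs, ys, 10.0)  # moderate penalty shrinks the weight
w_big = ridge_weight(xs, ys, 1e9)   # huge penalty pushes the weight towards 0

print(w_ols, w_mid, w_big)
```

With lambda = 0 the least-squares slope is recovered, and the weight shrinks monotonically towards zero as lambda grows.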
30. |
In many classification problems, the target dataset is made up of categorical labels which cannot immediately be processed by any algorithm. An encoding is needed and scikit-learn offers at least _____ valid options. |
A. | 1 |
B. | 2 |
C. | 3 |
D. | 4 |
Answer» B. 2 | |
31. |
We have been given a dataset with n records in which we have input attribute as x and output attribute as y. Suppose we use a linear regression method to model this data. To test our linear regressor, we split the data in training set and test set randomly. Now we increase the training set size gradually. As the training set size increases, what do you expect will happen with the mean training error? |
A. | increase |
B. | decrease |
C. | remain constant |
D. | can't say |
Answer» D. can't say | |
32. |
What are the steps for using a gradient descent algorithm? 1) Calculate the error between the actual value and the predicted value 2) Reiterate until you find the best weights of the network 3) Pass an input through the network and get values from the output layer 4) Initialize random weights and bias 5) Go to each neuron that contributes to the error and change its respective values to reduce the error |
A. | 1, 2, 3, 4, 5 |
B. | 4, 3, 1, 5, 2 |
C. | 3, 2, 1, 5, 4 |
D. | 5, 4, 3, 2, 1 |
Answer» B. 4, 3, 1, 5, 2 | |
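Note: a minimal sketch of the five steps, assuming the smallest possible "network": a single weight and bias fitted to made-up data with mean squared error. The learning rate, iteration count and seed are arbitrary choices for illustration.

```python
import random

rng = random.Random(0)
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]  # hypothetical data generated by y = 2x + 1

# Step 4: initialise a random weight and bias.
w, b = rng.uniform(-1, 1), rng.uniform(-1, 1)
lr = 0.05  # learning rate (assumed value)

for _ in range(5000):  # Step 2: reiterate until the weights settle
    # Step 3: pass the inputs through the model to get outputs.
    preds = [w * x + b for x in xs]
    # Step 1: compute the error between predicted and actual values.
    errors = [p - y for p, y in zip(preds, ys)]
    # Step 5: adjust each parameter in the direction that reduces the error.
    grad_w = 2 * sum(e * x for e, x in zip(errors, xs)) / len(xs)
    grad_b = 2 * sum(errors) / len(xs)
    w -= lr * grad_w
    b -= lr * grad_b

print(round(w, 4), round(b, 4))
```

The comments follow the numbering used in the question, so the loop reads in the order initialise, forward pass, error, update, repeat.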
33. |
Regarding bias and variance, which of the following statements are true? (Here high and low are relative to the ideal model.) (i) Models which overfit are more likely to have high bias (ii) Models which overfit are more likely to have low bias (iii) Models which overfit are more likely to have high variance (iv) Models which overfit are more likely to have low variance |
A. | (i) and (ii) |
B. | (ii) and (iii) |
C. | (iii) and (iv) |
D. | none of these |
Answer» B. (ii) and (iii) | |
34. |
The most popularly used dimensionality reduction algorithm is Principal Component Analysis (PCA). Which of the following is/are true about PCA? 1. PCA is an unsupervised method 2. It searches for the directions in which the data have the largest variance 3. Maximum number of principal components <= number of features 4. All principal components are orthogonal to each other |
A. | 1 & 2 |
B. | 2 & 3 |
C. | 3 & 4 |
D. | all of the above |
Answer» D. all of the above | |
35. |
It is necessary to allow the model to develop a generalization ability and avoid a common problem called _____. |
A. | overfitting |
B. | overlearning |
C. | classification |
D. | regression |
Answer» A. overfitting | |
36. |
What is back propagation? a) It is another name given to the curvy function in the perceptron b) It is the transmission of error back through the network to adjust the inputs c) It is the transmission of error back through the network to allow weights to be adjusted so that the network can learn d) None of the mentioned |
A. | a |
B. | b |
C. | c |
D. | b&c |
Answer» C. c | |
37. |
Suppose you want to apply a stepwise forward selection method for choosing the best models for an ensemble model. Which of the following is the correct order of the steps? Note: You have more than 1000 models' predictions. 1. Add the models' predictions (or, in other terms, take the average) one by one to the ensemble, keeping those that improve the metric on the validation set. 2. Start with an empty ensemble 3. Return the ensemble from the nested set of ensembles that has maximum performance on the validation set |
A. | 1-2-3 |
B. | 1-3-4 |
C. | 2-1-3 |
D. | none of above |
Answer» C. 2-1-3 | |
38. |
In many classification problems, the target dataset is made up of categorical labels which cannot immediately be processed by any algorithm. An encoding is needed and scikit-learn offers at least _____ valid options. |
A. | 1 |
B. | 2 |
C. | 3 |
D. | 4 |
Answer» B. 2 | |
39. |
Which of the following is true about weighted majority votes? 1. We want to give higher weights to better performing models 2. Inferior models can overrule the best model if the collective weighted votes for inferior models are higher than for the best model 3. Voting is a special case of weighted voting |
A. | 1 and 3 |
B. | 2 and 3 |
C. | 1 and 2 |
D. | 1, 2 and 3 |
Answer» D. 1, 2 and 3 | |
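Note: a short sketch of weighted majority voting. The function `weighted_vote`, the labels and the weights are hypothetical; the first model is assumed to be the best performer and therefore carries the largest weight.

```python
from collections import defaultdict

def weighted_vote(predictions, weights):
    """Return the label whose supporters have the largest total weight."""
    totals = defaultdict(float)
    for label, weight in zip(predictions, weights):
        totals[label] += weight
    return max(totals, key=totals.get)

predictions = ["cat", "dog", "dog"]  # model 1 is the strongest model

# Two weaker models together outweigh the best one (0.3 + 0.3 > 0.5).
overruled = weighted_vote(predictions, [0.5, 0.3, 0.3])
# With a large enough weight the best model wins on its own (0.7 > 0.6).
dominant = weighted_vote(predictions, [0.7, 0.3, 0.3])
# Equal weights reduce to plain majority voting.
plain = weighted_vote(predictions, [1, 1, 1])

print(overruled, dominant, plain)
```

Setting all weights equal recovers ordinary voting, which is why plain voting is a special case of the weighted scheme.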
40. |
Given that we can select the same feature multiple times during the recursive partitioning of the input space, is it always possible to achieve 100% accuracy on the training data (given that we allow for trees to grow to their maximum size) when building decision trees? |
A. | yes |
B. | no |
Answer» B. no | |
41. |
Which of the following options is/are correct regarding the benefits of an ensemble model? 1. Better performance 2. Generalized models 3. Better interpretability |
A. | 1 and 3 |
B. | 2 and 3 |
C. | 1 and 2 |
D. | 1, 2 and 3 |
Answer» C. 1 and 2 | |
42. |
Which of the following options is/are correct regarding the benefits of an ensemble model? 1. Better performance 2. Generalized models 3. Better interpretability |
A. | 1 and 3 |
B. | 2 and 3 |
C. | 1, 2 and 3 |
D. | 1 and 2 |
Answer» D. 1 and 2 | |
43. |
We have been given a dataset with n records in which we have input attribute as x and output attribute as y. Suppose we use a linear regression method to model this data. To test our linear regressor, we split the data in training set and test set randomly. What do you expect will happen with bias and variance as you increase the size of training data? |
A. | bias increases and variance increases |
B. | bias decreases and variance increases |
C. | bias decreases and variance decreases |
D. | bias increases and variance decreases |
Answer» D. bias increases and variance decreases | |
44. |
Which of the following statements is true about the k-NN algorithm? 1) k-NN performs much better if all of the data have the same scale 2) k-NN works well with a small number of input variables (p), but struggles when the number of inputs is very large 3) k-NN makes no assumptions about the functional form of the problem being solved |
A. | 1 and 2 |
B. | 1 and 3 |
C. | only 1 |
D. | 1,2 and 3 |
Answer» D. 1,2 and 3 | |
45. |
Which of the following statement(s) can be true after adding a variable to a linear regression model? 1. R-Squared and Adjusted R-squared both increase 2. R-Squared increases and Adjusted R-squared decreases 3. R-Squared decreases and Adjusted R-squared decreases 4. R-Squared decreases and Adjusted R-squared increases |
A. | 1 and 2 |
B. | 1 and 3 |
C. | 2 and 4 |
D. | None of the above |
Answer» A. 1 and 2 | |
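Note: a sketch using the standard formula adjusted R² = 1 - (1 - R²)(n - 1)/(n - p - 1), where n is the number of samples and p the number of predictors. The R² values below are made-up numbers chosen for illustration, not results from a fitted model.

```python
def adjusted_r2(r2, n, p):
    """Adjusted R-squared for n samples and p predictors."""
    return 1 - (1 - r2) * (n - 1) / (n - p - 1)

n = 20  # hypothetical sample size

before = adjusted_r2(0.800, n, p=3)  # model with 3 predictors
weak = adjusted_r2(0.801, n, p=4)    # 4th predictor barely raises R-squared
strong = adjusted_r2(0.900, n, p=4)  # 4th predictor raises R-squared a lot

print(round(before, 4), round(weak, 4), round(strong, 4))
```

In both cases R² goes up after the extra variable is added, but adjusted R² rises only when the gain outweighs the penalty for the additional predictor.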
46. |
Assume that you are given a data set and a neural network model trained on the data set. You are asked to build a decision tree model with the sole purpose of understanding/interpreting the built neural network model. In such a scenario, which among the following measures would you concentrate most on optimising? |
A. | accuracy of the decision tree model on the given data set |
B. | f1 measure of the decision tree model on the given data set |
C. | fidelity of the decision tree model, which is the fraction of instances on which the neural network and the decision tree give the same output |
D. | comprehensibility of the decision tree model, measured in terms of the size of the corresponding rule set |
Answer» C. fidelity of the decision tree model, which is the fraction of instances on which the neural network and the decision tree give the same output | |
47. |
Which of the following metrics can be used for evaluating regression models? i) R Squared ii) Adjusted R Squared iii) F Statistics iv) RMSE / MSE / MAE |
A. | ii and iv |
B. | i and ii |
C. | ii, iii and iv |
D. | i, ii, iii and iv |
Answer» D. i, ii, iii and iv | |
48. |
In a simple linear regression model (one independent variable), if we change the input variable by 1 unit, how much will the output variable change? |
A. | by 1 |
B. | no change |
C. | by intercept |
D. | by its slope |
Answer» D. by its slope | |
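Note: a minimal sketch that fits a line by ordinary least squares and compares predictions one unit apart. The helper `fit_line` and the data are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    intercept = mean_y - slope * mean_x
    return slope, intercept

xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]  # made-up data lying on y = 2x + 1

slope, intercept = fit_line(xs, ys)

def predict(x):
    return slope * x + intercept

# Difference between predictions for inputs one unit apart.
change = predict(6.0) - predict(5.0)
print(slope, intercept, change)
```

The intercept cancels when the two predictions are subtracted, so the change in output equals the slope.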
49. |
Naive Bayes classifiers are a form of _____ learning. |
A. | supervised |
B. | unsupervised |
C. | both |
D. | none |
Answer» A. supervised | |
50. |
In the last decade, many researchers started training bigger and bigger models, built with several different layers; that's why this approach is called _____. |
A. | deep learning |
B. | machine learning |
C. | reinforcement learning |
D. | unsupervised learning |
Answer» A. deep learning | |