

MCQOPTIONS
This section includes 940 multiple-choice questions (MCQs) curated to sharpen your Artificial Intelligence knowledge and support exam preparation. Choose a topic below to get started.
151.
Some people are using the term ___ instead of "prediction" only to avoid the weird idea that machine learning is a sort of modern magic.
A. Inference
B. Interference
C. Accuracy
D. None of the above
Answer» A. Inference
152.
If there is only a discrete number of possible outcomes, they are called _____.
A. Modelfree
B. Categories
C. Prediction
D. None of the above
Answer» B. Categories
153.
SVMs are less effective when:
A. The data is linearly separable
B. The data is clean and ready to use
C. The data is noisy and contains overlapping points
Answer» C. The data is noisy and contains overlapping points
154.
Suppose you are building an SVM model on data X. The data X can be error-prone, which means that you should not trust any specific data point too much. Now suppose you want to build an SVM model with a quadratic kernel function (polynomial degree 2) that uses the slack-variable penalty C as one of its hyperparameters. What would happen when you use a very small C (C ~ 0)?
A. Misclassification would happen
B. Data will be correctly classified
C. Can't say
D. None of these
Answer» A. Misclassification would happen
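A minimal sketch of the effect of the slack penalty C, using scikit-learn's SVC with a degree-2 polynomial kernel. The synthetic blob dataset and all parameter values are illustrative assumptions, not taken from the question:

```python
import numpy as np
from sklearn.svm import SVC

# Two noisy classes with some overlap (illustrative synthetic data).
rng = np.random.RandomState(0)
X = np.vstack([rng.randn(50, 2), rng.randn(50, 2) + 3.0])
y = np.array([0] * 50 + [1] * 50)

# coef0=1 makes the degree-2 polynomial kernel a full quadratic.
# A very small C barely penalizes slack: the margin is wide and training
# points may be misclassified. A large C penalizes slack heavily and
# fits the training set more tightly.
loose = SVC(kernel="poly", degree=2, coef0=1.0, C=1e-3).fit(X, y)
tight = SVC(kernel="poly", degree=2, coef0=1.0, C=100.0).fit(X, y)
print(loose.score(X, y), tight.score(X, y))
```

With C near zero the optimizer prefers a wide margin over correct classification of individual training points, which is why option A is the keyed answer.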
155.
Which of the following statement(s) can be true after adding a variable to a linear regression model?
1. R-squared and adjusted R-squared both increase
2. R-squared increases and adjusted R-squared decreases
3. R-squared decreases and adjusted R-squared decreases
4. R-squared decreases and adjusted R-squared increases
A. 1 and 2
B. 1 and 3
C. 2 and 4
D. None of the above
Answer» A. 1 and 2
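A sketch of why only statements 1 and 2 are possible: OLS R-squared can never decrease when a feature is added, while adjusted R-squared carries a penalty for the predictor count and may fall. The synthetic data is an illustrative assumption:

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Synthetic data: one informative feature (illustrative assumption).
rng = np.random.RandomState(0)
x = rng.randn(100, 1)
y = 3.0 * x[:, 0] + 0.5 * rng.randn(100)

def adjusted_r2(r2, n, p):
    # Adjusted R-squared penalizes the number of predictors p.
    return 1.0 - (1.0 - r2) * (n - 1) / (n - p - 1)

r2_one = LinearRegression().fit(x, y).score(x, y)

# Add a pure-noise column: plain R-squared can only stay equal or rise,
# while adjusted R-squared may fall because of the p penalty.
X_two = np.hstack([x, rng.randn(100, 1)])
r2_two = LinearRegression().fit(X_two, y).score(X_two, y)

print(r2_two >= r2_one)  # True: R-squared never decreases
print(adjusted_r2(r2_one, 100, 1), adjusted_r2(r2_two, 100, 2))
```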
156.
What is/are true about kernels in SVM?
1. A kernel function maps low-dimensional data to a high-dimensional space
2. It is a similarity function
A. 1
B. 2
C. 1 and 2
D. None of these
Answer» C. 1 and 2
157.
Which of the following is true about "Ridge" or "Lasso" regression methods in the case of feature selection?
A. Ridge regression uses subset selection of features
B. Lasso regression uses subset selection of features
C. Both use subset selection of features
D. None of the above
Answer» B. Lasso regression uses subset selection of features
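A short sketch of why Lasso, but not Ridge, performs implicit subset selection: L1 regularization drives irrelevant coefficients exactly to zero, whereas L2 only shrinks them. The data and alpha values are illustrative assumptions:

```python
import numpy as np
from sklearn.linear_model import Lasso, Ridge

# Synthetic data: only the first of five features matters.
rng = np.random.RandomState(0)
X = rng.randn(200, 5)
y = 4.0 * X[:, 0] + 0.1 * rng.randn(200)

lasso = Lasso(alpha=0.5).fit(X, y)
ridge = Ridge(alpha=0.5).fit(X, y)

# L1 zeroes the irrelevant coefficients (implicit subset selection);
# L2 leaves them small but nonzero.
print(np.sum(lasso.coef_ == 0.0))  # exact zeros among the noise features
print(np.sum(ridge.coef_ == 0.0))  # typically none
```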
158.
Which of the following steps / assumptions in regression modeling impacts the trade-off between under-fitting and over-fitting the most?
A. The polynomial degree
B. Whether we learn the weights by matrix inversion or gradient descent
C. The use of a constant term
Answer» A. The polynomial degree
159.
To test the linear relationship between y (dependent) and x (independent) continuous variables, which of the following plots is best suited?
A. Scatter plot
B. Bar chart
C. Histogram
D. None of these
Answer» A. Scatter plot
160.
In a linear regression problem, we are using R-squared to measure goodness-of-fit. We add a feature to the linear regression model and retrain the same model. Which of the following options is true?
A. If R-squared increases, this variable is significant.
B. If R-squared decreases, this variable is not significant.
C. Individually, R-squared cannot tell us about variable importance. We can't say anything about it right now.
D. None of these.
Answer» C. Individually, R-squared cannot tell us about variable importance. We can't say anything about it right now.
161.
Let's say a linear regression model perfectly fits the training data (train error is zero). Which of the following statements is then true?
A. You will always have zero test error
B. You cannot have zero test error
C. None of the above
Answer» C. None of the above
162.
______ allows exploiting the natural sparsity of data while extracting principal components.
A. SparsePCA
B. KernelPCA
C. SVD
D. init parameter
Answer» A. SparsePCA
163.
The _____ parameter can assume different values which determine how the data matrix is initially processed.
A. run
B. start
C. init
D. stop
Answer» C. init
164.
In order to assess how much information is brought by each component, and the correlation among them, a useful tool is the _____.
A. Concurrent matrix
B. Convergence matrix
C. Supportive matrix
D. Covariance matrix
Answer» D. Covariance matrix
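A small sketch of the connection: the variance PCA attributes to each principal component is exactly an eigenvalue of the sample covariance matrix. The random correlated dataset is an illustrative assumption:

```python
import numpy as np
from sklearn.decomposition import PCA

# Random correlated data (illustrative assumption).
rng = np.random.RandomState(0)
X = rng.randn(300, 3) @ np.array([[2.0, 0.5, 0.0],
                                  [0.0, 1.0, 0.3],
                                  [0.0, 0.0, 0.2]])

# Eigenvalues of the sample covariance matrix, sorted descending.
cov = np.cov(X, rowvar=False)  # uses ddof=1, same convention as PCA
eigvals = np.sort(np.linalg.eigvalsh(cov))[::-1]

pca = PCA(n_components=3).fit(X)
print(np.allclose(eigvals, pca.explained_variance_))  # True
```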
165.
A(n) ______ dataset with many features contains information proportional to the independence of all features and their variance.
A. normalized
B. unnormalized
C. Both A & B
D. None of the mentioned
Answer» B. unnormalized
166.
scikit-learn also provides a class for per-sample normalization, _____.
A. Normalizer
B. Imputer
C. Classifier
D. All above
Answer» A. Normalizer
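A quick sketch of what "per-sample" means here: `Normalizer` rescales each row to unit norm, unlike the per-feature scalers. The sample matrix is an illustrative assumption:

```python
import numpy as np
from sklearn.preprocessing import Normalizer

X = np.array([[3.0, 4.0],
              [1.0, 0.0]])

# Normalizer rescales each SAMPLE (row) to unit norm, unlike scalers
# such as StandardScaler, which operate per feature (column).
X_norm = Normalizer(norm="l2").fit_transform(X)
print(X_norm)                           # rows [0.6, 0.8] and [1.0, 0.0]
print(np.linalg.norm(X_norm, axis=1))   # every row norm is 1.0
```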
167.
scikit-learn offers the class ______, which is responsible for filling the holes using a strategy based on the mean, median, or frequency.
A. LabelEncoder
B. LabelBinarizer
C. DictVectorizer
D. Imputer
Answer» D. Imputer
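A minimal imputation sketch. Note that the question refers to the older `Imputer` class; in current scikit-learn the same role is played by `SimpleImputer`, which is used below. The toy matrix is an illustrative assumption:

```python
import numpy as np
from sklearn.impute import SimpleImputer

# Column 0 has a hole; the mean of its observed values 1.0 and 3.0 is 2.0.
X = np.array([[1.0, 10.0],
              [np.nan, 20.0],
              [3.0, 30.0]])

# "mean" can be swapped for "median" or "most_frequent".
imp = SimpleImputer(strategy="mean")
X_filled = imp.fit_transform(X)
print(X_filled[1, 0])  # 2.0
```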
168.
_______ produce sparse matrices of real numbers that can be fed into any machine learning model.
A. DictVectorizer
B. FeatureHasher
C. Both A & B
D. None of the mentioned
Answer» C. Both A & B
169.
While using _____, all labels are turned into sequential numbers.
A. LabelEncoder class
B. LabelBinarizer class
C. DictVectorizer
D. FeatureHasher
Answer» A. LabelEncoder class
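A short sketch of `LabelEncoder` turning string labels into sequential integers (the labels themselves are an illustrative assumption):

```python
from sklearn.preprocessing import LabelEncoder

labels = ["cat", "dog", "bird", "dog", "cat"]

# LabelEncoder assigns each distinct label a sequential integer,
# ordered alphabetically: bird -> 0, cat -> 1, dog -> 2.
le = LabelEncoder()
encoded = le.fit_transform(labels)
print(list(encoded))      # [1, 2, 0, 2, 1]
print(list(le.classes_))  # ['bird', 'cat', 'dog']
```

By contrast, `LabelBinarizer` would produce one-hot vectors rather than sequential numbers, which is why option A is the keyed answer.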
170.
_____ provides some built-in datasets that can be used for testing purposes.
A. scikit-learn
B. classification
C. regression
D. None of the above
Answer» A. scikit-learn
171.
Which of the following are models for feature extraction?
A. regression
B. classification
C. None of the above
Answer» C. None of the above
172.
Overlearning is caused by an excessive ______.
A. Capacity
B. Regression
C. Reinforcement
D. Accuracy
Answer» A. Capacity
173.
A supervised scenario is characterized by the concept of a _____.
A. Programmer
B. Teacher
C. Author
D. Farmer
Answer» B. Teacher
174.
Techniques that involve the usage of both labeled and unlabeled data are called ___.
A. Supervised
B. Semi-supervised
C. Unsupervised
D. None of the above
Answer» B. Semi-supervised
175.
It is necessary to allow the model to develop a generalization ability and avoid a common problem called ______.
A. Overfitting
B. Overlearning
C. Classification
D. Regression
Answer» A. Overfitting
176.
Even if there are no actual supervisors, ________ learning is also based on feedback provided by the environment.
A. Supervised
B. Reinforcement
C. Unsupervised
D. None of the above
Answer» B. Reinforcement
177.
The linear SVM classifier works by drawing a straight line between two classes.
A. True
B. False
Answer» A. True
178.
SVM is a ________ learning algorithm.
A. Supervised
B. Unsupervised
C. Both
D. None
Answer» A. Supervised
179.
SVM is a ________ algorithm.
A. Classification
B. Clustering
C. Regression
D. All
Answer» A. Classification
180.
A Gaussian distribution, when plotted, gives a bell-shaped curve which is symmetric about the _______ of the feature values.
A. Mean
B. Variance
C. Discrete
D. Random
Answer» A. Mean
181.
The Gaussian Naïve Bayes classifier assumes a ___________ distribution.
A. Continuous
B. Discrete
C. Binary
Answer» A. Continuous
182.
The Multinomial Naïve Bayes classifier assumes a ___________ distribution.
A. Continuous
B. Discrete
C. Binary
Answer» B. Discrete
183.
The Bernoulli Naïve Bayes classifier assumes a ___________ distribution.
A. Continuous
B. Discrete
C. Binary
Answer» C. Binary
184.
Bayes' theorem describes the probability of an event based on prior knowledge of conditions that might be related to the event.
A. True
B. False
Answer» A. True
185.
Conditional probability is a measure of the probability of an event given that another event has already occurred.
A. True
B. False
Answer» A. True
186.
Features being classified are __________ of each other in the Naïve Bayes classifier.
A. Independent
B. Dependent
C. Partially dependent
D. None
Answer» A. Independent
187.
Features being classified are independent of each other in the Naïve Bayes classifier.
A. False
B. True
Answer» B. True
188.
Naive Bayes classifiers use _______________ learning.
A. Supervised
B. Unsupervised
C. Both
D. None
Answer» A. Supervised
189.
Naive Bayes classifiers are a collection of ________ algorithms.
A. Classification
B. Clustering
C. Regression
D. All
Answer» A. Classification
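A tiny sketch tying together questions 181 and 188-189: `GaussianNB` is a supervised classifier (it needs labels) that models each feature per class as a continuous Gaussian. The toy data is an illustrative assumption:

```python
from sklearn.naive_bayes import GaussianNB

# Labels are provided, so this is supervised classification.
X = [[1.0], [1.2], [0.9], [5.0], [5.3], [4.8]]
y = [0, 0, 0, 1, 1, 1]

# GaussianNB fits a per-class Gaussian to each (continuous) feature.
clf = GaussianNB().fit(X, y)
print(clf.predict([[1.1], [5.1]]))  # [0 1]
```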
190.
In the mathematical equation of linear regression Y = β1 + β2X + ϵ, (β1, β2) refers to __________.
A. (X-intercept, Slope)
B. (Slope, X-intercept)
C. (Y-intercept, Slope)
D. (Slope, Y-intercept)
Answer» C. (Y-intercept, Slope)
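A quick check of the roles of β1 and β2, fitting noiseless data generated from Y = 3 + 2X so the coefficients are recovered exactly (the data is an illustrative assumption):

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Noiseless data from Y = 3 + 2X.
X = np.arange(10, dtype=float).reshape(-1, 1)
y = 3.0 + 2.0 * X.ravel()

model = LinearRegression().fit(X, y)
print(round(model.intercept_, 6))  # 3.0 -> β1, the Y-intercept
print(round(model.coef_[0], 6))    # 2.0 -> β2, the slope
```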
191.
In the syntax of the linear model lm(formula, data, ...), data refers to ______.
A. Matrix
B. Vector
C. Array
D. List
Answer» B. Vector
192.
The function used for linear regression in R is __________.
A. lm(formula, data)
B. lr(formula, data)
C. lrm(formula, data)
D. regression.linear(formula, data)
Answer» A. lm(formula, data)
193.
If a linear regression model fits perfectly, i.e., train error is zero, then _____________________.
A. Test error is also always zero
B. Test error is non-zero
C. We cannot comment on the test error
D. Test error is equal to train error
Answer» C. We cannot comment on the test error
194.
In many classification problems, the target ______ is made up of categorical labels which cannot immediately be processed by any algorithm.
A. random_state
B. dataset
C. test_size
D. All above
Answer» B. dataset
195.
_______ adopts a dictionary-oriented approach, associating to each category label a progressive integer number.
A. LabelEncoder class
B. LabelBinarizer class
C. DictVectorizer
D. FeatureHasher
Answer» A. LabelEncoder class
196.
The parameter ______ allows specifying the percentage of elements to put into the test/training set.
A. test_size
B. training_size
C. All above
D. None of these
Answer» C. All above
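A minimal sketch of splitting with `train_test_split`. Note that in current scikit-learn the train-side parameter is spelled `train_size`; the option wording here follows the question's source text. The toy arrays are illustrative assumptions:

```python
import numpy as np
from sklearn.model_selection import train_test_split

X = np.arange(100).reshape(-1, 1)
y = np.arange(100)

# test_size=0.25 reserves 25% of the samples for the test set;
# the train share can be specified instead (or as well) via train_size.
X_tr, X_te, y_tr, y_te = train_test_split(
    X, y, test_size=0.25, random_state=0)
print(len(X_tr), len(X_te))  # 75 25
```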
197.
The ______ parameter can assume different values which determine how the data matrix is initially processed.
A. run
B. start
C. init
D. stop
Answer» C. init
198.
________ performs a PCA with non-linearly separable data sets.
A. SparsePCA
B. KernelPCA
C. SVD
D. None of the mentioned
Answer» B. KernelPCA
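A sketch of `KernelPCA` on a classic non-linearly separable dataset (two concentric circles); the kernel choice and `gamma` value are illustrative assumptions:

```python
from sklearn.datasets import make_circles
from sklearn.decomposition import KernelPCA

# Two concentric circles: not linearly separable in the original space.
X, y = make_circles(n_samples=200, factor=0.3, noise=0.05, random_state=0)

# An RBF kernel lets PCA operate in an implicit high-dimensional space
# where the two rings can be pulled apart; plain PCA cannot do this.
kpca = KernelPCA(n_components=2, kernel="rbf", gamma=10.0)
X_kpca = kpca.fit_transform(X)
print(X_kpca.shape)  # (200, 2)
```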
199.
There are also many univariate methods that can be used in order to select the best features according to specific criteria based on ________.
A. F-tests and p-values
B. Chi-square
C. ANOVA
D. All above
Answer» A. F-tests and p-values
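A sketch of univariate selection with `SelectKBest` and the ANOVA F-test scorer `f_classif`, which ranks features by F-statistic and p-value. The synthetic data, where feature 0 tracks the label and feature 1 is pure noise, is an illustrative assumption:

```python
import numpy as np
from sklearn.feature_selection import SelectKBest, f_classif

# Feature 0 tracks the class label; feature 1 is pure noise.
rng = np.random.RandomState(0)
y = np.array([0] * 50 + [1] * 50)
X = np.column_stack([y + 0.1 * rng.randn(100), rng.randn(100)])

# f_classif scores each feature with an F-test; SelectKBest keeps the
# k features with the best scores (lowest p-values).
selector = SelectKBest(f_classif, k=1).fit(X, y)
print(selector.get_support())  # [ True False]
```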
200.
If you need a more powerful scaling feature, with superior control over outliers and the possibility to select a quantile range, there's also the class ________.
A. RobustScaler
B. DictVectorizer
C. LabelBinarizer
D. FeatureHasher
Answer» A. RobustScaler
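A short sketch of `RobustScaler`: it centers on the median and scales by a selectable quantile range, so a single extreme outlier barely affects the result. The toy column is an illustrative assumption:

```python
import numpy as np
from sklearn.preprocessing import RobustScaler

# One extreme outlier; median/IQR-based statistics barely move because
# of it, whereas mean/std-based scaling would shift dramatically.
X = np.array([[1.0], [2.0], [3.0], [4.0], [1000.0]])

# Centers on the median and scales by the chosen quantile range
# (the default 25th-75th percentile range, i.e. the IQR).
scaler = RobustScaler(quantile_range=(25.0, 75.0))
X_scaled = scaler.fit_transform(X)
print(float(np.median(X_scaled)))  # 0.0 (the median maps to zero)
```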