MCQOPTIONS
This section includes 940 curated multiple-choice questions to sharpen your Artificial Intelligence knowledge and support exam preparation. Choose a topic below to get started.
| 151. |
Some people are using the term ___ instead of prediction only to avoid the weird idea that machine learning is a sort of modern magic. |
| A. | Inference |
| B. | Interference |
| C. | Accuracy |
| D. | None of above |
| Answer» A. Inference | |
| 152. |
If there is only a discrete number of possible outcomes, they are called _____. |
| A. | Modelfree |
| B. | Categories |
| C. | Prediction |
| D. | None of above |
| Answer» B. Categories | |
| 153. |
SVMs are less effective when: |
| A. | The data is linearly separable |
| B. | The data is clean and ready to use |
| C. | The data is noisy and contains overlapping points |
| Answer» C. The data is noisy and contains overlapping points | |
| 154. |
Suppose you are building an SVM model on data X. The data can be error-prone, which means you should not trust any specific data point too much. You want to build an SVM with a quadratic kernel (polynomial of degree 2) that uses the slack variable C as one of its hyperparameters. What would happen if you used a very small C (C ≈ 0)? |
| A. | Misclassification would happen |
| B. | Data will be correctly classified |
| C. | Can’t say |
| D. | None of these |
| Answer» A. Misclassification would happen | |
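The effect of the slack penalty can be checked empirically with scikit-learn's `SVC`. A minimal sketch on synthetic overlapping clusters (the data and constants are illustrative): with a tiny C the margin becomes very wide and tolerant, misclassifications are accepted, and almost every point ends up as a support vector.

```python
import numpy as np
from sklearn.svm import SVC

# Two overlapping Gaussian clusters: some points cannot be
# separated, so the value of C matters.
rng = np.random.RandomState(0)
X = np.vstack([rng.randn(40, 2) - 1.0, rng.randn(40, 2) + 1.0])
y = np.array([0] * 40 + [1] * 40)

# Quadratic (degree-2 polynomial) kernel, as in the question.
soft = SVC(kernel="poly", degree=2, coef0=1.0, C=0.01).fit(X, y)   # C ~ 0
hard = SVC(kernel="poly", degree=2, coef0=1.0, C=100.0).fit(X, y)  # large C

# A tiny C gives a wide, tolerant margin: far more points become
# support vectors and training misclassifications are accepted.
print(soft.n_support_.sum(), hard.n_support_.sum())
```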
| 155. |
Which of the following statement(s) can be true after adding a variable to a linear regression model?
1. R-Squared and Adjusted R-squared both increase
2. R-Squared increases and Adjusted R-squared decreases
3. R-Squared decreases and Adjusted R-squared decreases
4. R-Squared decreases and Adjusted R-squared increases |
| A. | 1 and 2 |
| B. | 1 and 3 |
| C. | 2 and 4 |
| D. | None of the above |
| Answer» A. 1 and 2 | |
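Options 3 and 4 are impossible because ordinary least squares can always set the new coefficient to zero, so training R-squared never decreases when a column is added, while adjusted R-squared penalizes the extra parameter and may fall. A small illustrative check on synthetic data:

```python
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.RandomState(1)
X = rng.randn(50, 2)
y = X @ np.array([1.5, -2.0]) + 0.5 * rng.randn(50)

def r2(X, y):
    # score() of a fitted LinearRegression is plain R-squared
    return LinearRegression().fit(X, y).score(X, y)

def adj_r2(X, y):
    n, p = X.shape
    return 1.0 - (1.0 - r2(X, y)) * (n - 1) / (n - p - 1)

# Append a pure-noise feature: plain R^2 can only go up (or stay
# the same), while adjusted R^2 may go down.
X_more = np.hstack([X, rng.randn(50, 1)])
print(r2(X, y), r2(X_more, y))
print(adj_r2(X, y), adj_r2(X_more, y))
```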
| 156. |
What is/are true about kernels in SVM?
1. A kernel function maps low-dimensional data to a high-dimensional space
2. A kernel is a similarity function |
| A. | 1 |
| B. | 2 |
| C. | 1 and 2 |
| D. | None of these |
| Answer» C. 1 and 2 | |
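Both statements can be verified directly: a degree-2 polynomial kernel computes a similarity that equals a dot product after an explicit mapping to a higher-dimensional feature space. A minimal sketch for (1 + x·y)² with 2-D inputs:

```python
import numpy as np

def phi(v):
    # Explicit 6-D feature map whose dot product reproduces
    # the kernel (1 + x.y)^2
    x1, x2 = v
    return np.array([1.0,
                     np.sqrt(2) * x1, np.sqrt(2) * x2,
                     x1 * x1, x2 * x2,
                     np.sqrt(2) * x1 * x2])

x = np.array([0.5, -1.0])
y = np.array([2.0, 0.3])

kernel = (1.0 + x @ y) ** 2   # similarity computed directly in 2-D
mapped = phi(x) @ phi(y)      # dot product in the 6-D feature space
print(kernel, mapped)         # both equal 2.89
```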
| 157. |
Which of the following is true about “Ridge” or “Lasso” regression methods in case of feature selection? |
| A. | Ridge regression uses subset selection of features |
| B. | Lasso regression uses subset selection of features |
| C. | Both use subset selection of features |
| D. | None of above |
| Answer» B. Lasso regression uses subset selection of features | |
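Lasso's L1 penalty drives some coefficients exactly to zero (an implicit subset selection), whereas Ridge's L2 penalty only shrinks them toward zero. A quick illustrative comparison on synthetic data where only two of ten features matter:

```python
import numpy as np
from sklearn.linear_model import Lasso, Ridge

rng = np.random.RandomState(0)
X = rng.randn(100, 10)
# Only the first two features actually drive the target.
y = 3.0 * X[:, 0] - 2.0 * X[:, 1] + 0.1 * rng.randn(100)

lasso = Lasso(alpha=0.5).fit(X, y)
ridge = Ridge(alpha=0.5).fit(X, y)

# Lasso zeroes out irrelevant features; Ridge keeps them all small
# but nonzero.
print("zeroed by Lasso:", int(np.sum(lasso.coef_ == 0.0)))
print("zeroed by Ridge:", int(np.sum(ridge.coef_ == 0.0)))
```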
| 158. |
Which of the following steps / assumptions in regression modeling impacts the trade-off between under-fitting and over-fitting the most? |
| A. | The polynomial degree |
| B. | Whether we learn the weights by matrix inversion or gradient descent |
| C. | The use of a constant-term |
| Answer» A. The polynomial degree | |
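The polynomial degree directly controls model capacity: raising it can only lower the training error, which is the classic route from under-fitting to over-fitting. A short sketch on illustrative data:

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.preprocessing import PolynomialFeatures

rng = np.random.RandomState(2)
x = np.sort(rng.uniform(-1, 1, 30)).reshape(-1, 1)
y = np.sin(3 * x).ravel() + 0.2 * rng.randn(30)

def train_mse(degree):
    # Expand x into polynomial features and fit by least squares.
    X = PolynomialFeatures(degree).fit_transform(x)
    model = LinearRegression().fit(X, y)
    return np.mean((model.predict(X) - y) ** 2)

# Degree 1 under-fits the sine; degree 15 chases the noise.
print(train_mse(1), train_mse(15))
```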
| 159. |
To test the linear relationship between y (dependent) and x (independent) continuous variables, which of the following plots is best suited? |
| A. | Scatter plot |
| B. | Barchart |
| C. | Histograms |
| D. | None of these |
| Answer» A. Scatter plot | |
| 160. |
In a linear regression problem, we are using “R-squared” to measure goodness-of-fit. We add a feature to the linear regression model and retrain the same model. Which of the following options is true? |
| A. | If R Squared increases, this variable is significant. |
| B. | If R Squared decreases, this variable is not significant. |
| C. | Individually R squared cannot tell about variable importance. We can’t say anything about it right now. |
| D. | None of these. |
| Answer» C. Individually R squared cannot tell about variable importance. We can’t say anything about it right now. | |
| 161. |
Let’s say a “Linear regression” model perfectly fits the training data (train error is zero). Which of the following statements is true? |
| A. | You will always have test error zero |
| B. | You can not have test error zero |
| C. | None of the above |
| Answer» C. None of the above | |
| 162. |
______allows exploiting the natural sparsity of data while extracting principal components. |
| A. | SparsePCA |
| B. | KernelPCA |
| C. | SVD |
| D. | init parameter |
| Answer» A. SparsePCA | |
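A brief illustrative use of scikit-learn's `SparsePCA`, whose `alpha` parameter adds an L1 penalty that pushes many component loadings to exactly zero, exploiting sparsity while extracting components (the data here is random and purely for demonstration):

```python
import numpy as np
from sklearn.decomposition import SparsePCA

rng = np.random.RandomState(0)
X = rng.randn(60, 10)

# alpha controls the sparsity of the extracted components.
spca = SparsePCA(n_components=3, alpha=1.0, random_state=0)
X_reduced = spca.fit_transform(X)

print(X_reduced.shape)                       # (60, 3)
print(int(np.sum(spca.components_ == 0.0)))  # zeroed loadings
```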
| 163. |
The_____ parameter can assume different values which determine how the data matrix is initially processed. |
| A. | run |
| B. | start |
| C. | init |
| D. | stop |
| Answer» C. init | |
| 164. |
In order to assess how much information is brought by each component, and the correlation among them, a useful tool is the_____. |
| A. | Concurrent matrix |
| B. | Convergence matrix |
| C. | Supportive matrix |
| D. | Covariance matrix |
| Answer» D. Covariance matrix | |
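The covariance matrix can be computed with NumPy: its diagonal holds each feature's variance (how much information it carries) and its off-diagonal entries the pairwise covariances (how correlated the features are). An illustrative check:

```python
import numpy as np

rng = np.random.RandomState(0)
X = rng.randn(200, 3)  # 200 samples, 3 features

# rowvar=False: each column is a feature, each row a sample.
C = np.cov(X, rowvar=False)

print(C.shape)        # 3x3, symmetric
print(np.diag(C))     # per-feature variances
```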
| 165. |
A ______ dataset with many features contains information proportional to the independence of all features and their variance. |
| A. | normalized |
| B. | unnormalized |
| C. | Both A & B |
| D. | None of the Mentioned |
| Answer» B. unnormalized | |
| 166. |
scikit-learn also provides a class for per-sample normalization,_____ |
| A. | Normalizer |
| B. | Imputer |
| C. | Classifier |
| D. | All above |
| Answer» A. Normalizer | |
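A short sketch of scikit-learn's `Normalizer`: unlike scalers that work per feature, it rescales each sample (row) independently to unit norm:

```python
import numpy as np
from sklearn.preprocessing import Normalizer

X = np.array([[3.0, 4.0],
              [1.0, 0.0],
              [0.5, 0.5]])

# Per-sample (row-wise) normalization: each row is rescaled to unit
# Euclidean norm, independently of the other samples.
X_norm = Normalizer(norm="l2").fit_transform(X)
print(np.linalg.norm(X_norm, axis=1))  # every row now has norm 1
```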
| 167. |
scikit-learn offers the class______, which is responsible for filling the holes using a strategy based on the mean, median, or frequency |
| A. | LabelEncoder |
| B. | LabelBinarizer |
| C. | DictVectorizer |
| D. | Imputer |
| Answer» D. Imputer | |
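A minimal sketch of mean imputation. Note a version caveat: the class was `sklearn.preprocessing.Imputer` in older scikit-learn releases; in current releases it is `SimpleImputer` in `sklearn.impute`, with the same mean/median/most-frequent strategies:

```python
import numpy as np
from sklearn.impute import SimpleImputer

X = np.array([[1.0, 2.0],
              [np.nan, 3.0],
              [7.0, np.nan]])

# Fill the "holes" (NaNs) with the column mean; other strategies
# are "median" and "most_frequent".
imp = SimpleImputer(strategy="mean")
X_filled = imp.fit_transform(X)
print(X_filled)  # NaNs replaced by 4.0 and 2.5
```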
| 168. |
_______produce sparse matrices of real numbers that can be fed into any machine learning model. |
| A. | DictVectorizer |
| B. | FeatureHasher |
| C. | Both A & B |
| D. | None of the Mentioned |
| Answer» C. Both A & B | |
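An illustrative `DictVectorizer` call showing the sparse real-valued matrix it produces from dictionaries of features (FeatureHasher behaves analogously, hashing feature names instead of storing a vocabulary):

```python
from scipy import sparse
from sklearn.feature_extraction import DictVectorizer

records = [{"city": "Rome", "temp": 18.0},
           {"city": "Oslo", "temp": 4.0}]

# One-hot encodes string values, passes numbers through, and
# returns a SciPy sparse matrix by default.
vec = DictVectorizer(sparse=True)
X = vec.fit_transform(records)

print(sparse.issparse(X), X.shape)  # True (2, 3)
```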
| 169. |
While using _____, all labels are turned into sequential numbers. |
| A. | LabelEncoder class |
| B. | LabelBinarizer class |
| C. | DictVectorizer |
| D. | FeatureHasher |
| Answer» A. LabelEncoder class | |
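A minimal `LabelEncoder` sketch: labels are sorted and mapped to sequential integers starting from 0:

```python
from sklearn.preprocessing import LabelEncoder

le = LabelEncoder()
codes = le.fit_transform(["cat", "dog", "cat", "bird"])

# classes_ holds the sorted vocabulary; codes index into it.
print(list(le.classes_))  # ['bird', 'cat', 'dog']
print(list(codes))        # [1, 2, 1, 0]
```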
| 170. |
_____ provides some built-in datasets that can be used for testing purposes. |
| A. | scikit-learn |
| B. | classification |
| C. | regression |
| D. | None of the above |
| Answer» A. scikit-learn | |
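For example, scikit-learn ships several small toy datasets under `sklearn.datasets` that are convenient for testing:

```python
from sklearn.datasets import load_iris

# A classic built-in dataset: 150 samples, 4 features, 3 classes.
iris = load_iris()
print(iris.data.shape, iris.target.shape)  # (150, 4) (150,)
```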
| 171. |
Which of the following are models for feature extraction? |
| A. | regression |
| B. | classification |
| C. | None of the above |
| Answer» C. None of the above | |
| 172. |
Overlearning is caused by an excessive ______. |
| A. | Capacity |
| B. | Regression |
| C. | Reinforcement |
| D. | Accuracy |
| Answer» A. Capacity | |
| 173. |
A supervised scenario is characterized by the concept of a _____. |
| A. | Programmer |
| B. | Teacher |
| C. | Author |
| D. | Farmer |
| Answer» B. Teacher | |
| 174. |
Techniques that involve the usage of both labeled and unlabeled data are called ___. |
| A. | Supervised |
| B. | Semi-supervised |
| C. | Unsupervised |
| D. | None of the above |
| Answer» B. Semi-supervised | |
| 175. |
It is necessary to allow the model to develop a generalization ability and avoid a common problem called ______. |
| A. | Overfitting |
| B. | Overlearning |
| C. | Classification |
| D. | Regression |
| Answer» A. Overfitting | |
| 176. |
Even if there are no actual supervisors, ________ learning is also based on feedback provided by the environment. |
| A. | Supervised |
| B. | Reinforcement |
| C. | Unsupervised |
| D. | None of the above |
| Answer» B. Reinforcement | |
| 177. |
The linear SVM classifier works by drawing a straight line between two classes |
| A. | True |
| B. | false |
| Answer» A. True | |
| 178. |
SVM is a ------------------ learning |
| A. | Supervised |
| B. | Unsupervised |
| C. | Both |
| D. | None |
| Answer» A. Supervised | |
| 179. |
SVM is a ------------------ algorithm |
| A. | Classification |
| B. | Clustering |
| C. | Regression |
| D. | All |
| Answer» A. Classification | |
| 180. |
Gaussian distribution when plotted, gives a bell shaped curve which is symmetric about the _______ of the feature values. |
| A. | Mean |
| B. | Variance |
| C. | Discrete |
| D. | Random |
| Answer» A. Mean | |
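The symmetry about the mean can be checked numerically: the Gaussian density takes the same value at equal offsets on either side of μ, i.e. pdf(μ + d) = pdf(μ − d):

```python
import math

def gauss_pdf(x, mu=2.0, sigma=1.5):
    # Standard Gaussian density with mean mu and std dev sigma.
    return (math.exp(-((x - mu) ** 2) / (2 * sigma ** 2))
            / (sigma * math.sqrt(2 * math.pi)))

# Equal offsets on either side of the mean give equal density:
print(gauss_pdf(2.0 + 0.7), gauss_pdf(2.0 - 0.7))
```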
| 181. |
Gaussian Naïve Bayes Classifier is based on a ___________ distribution |
| A. | Continuous |
| B. | Discrete |
| C. | Binary |
| Answer» A. Continuous | |
| 182. |
Multinomial Naïve Bayes Classifier is based on a ___________ distribution |
| A. | Continuous |
| B. | Discrete |
| C. | Binary |
| Answer» B. Discrete | |
| 183. |
Bernoulli Naïve Bayes Classifier is based on a ___________ distribution |
| A. | Continuous |
| B. | Discrete |
| C. | Binary |
| Answer» C. Binary | |
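The three Naïve Bayes variants above map to three feature types in scikit-learn. A minimal sketch on synthetic data (the arrays are illustrative): Gaussian for continuous values, Multinomial for discrete counts, Bernoulli for binary features:

```python
import numpy as np
from sklearn.naive_bayes import GaussianNB, MultinomialNB, BernoulliNB

rng = np.random.RandomState(0)
y = np.array([0] * 10 + [1] * 10)

X_cont = rng.randn(20, 3) + y[:, None]      # continuous features
X_counts = rng.randint(0, 5, size=(20, 3))  # discrete count features
X_bin = rng.randint(0, 2, size=(20, 3))     # binary features

# Each variant is fitted on the feature type it models.
print(GaussianNB().fit(X_cont, y).predict(X_cont[:2]))
print(MultinomialNB().fit(X_counts, y).predict(X_counts[:2]))
print(BernoulliNB().fit(X_bin, y).predict(X_bin[:2]))
```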
| 184. |
Bayes’ theorem describes the probability of an event, based on prior knowledge of conditions that might be related to the event. |
| A. | True |
| B. | false |
| Answer» A. True | |
| 185. |
Conditional probability is a measure of the probability of an event given that another event has already occurred. |
| A. | True |
| B. | false |
| Answer» A. True | |
| 186. |
Features being classified are __________ of each other in a Naïve Bayes Classifier |
| A. | Independent |
| B. | Dependent |
| C. | Partial Dependent |
| D. | None |
| Answer» A. Independent | |
| 187. |
Features being classified are independent of each other in a Naïve Bayes Classifier |
| A. | False |
| B. | true |
| Answer» B. true | |
| 188. |
Naive Bayes classifiers is _______________ Learning |
| A. | Supervised |
| B. | Unsupervised |
| C. | Both |
| D. | None |
| Answer» A. Supervised | |
| 189. |
Naive Bayes classifiers are a collection of ------------------ algorithms |
| A. | Classification |
| B. | Clustering |
| C. | Regression |
| D. | All |
| Answer» A. Classification | |
| 190. |
In the mathematical Equation of Linear Regression Y = β1 + β2X + ϵ, (β1, β2) refers to __________ |
| A. | (X-intercept, Slope) |
| B. | (Slope, X-Intercept) |
| C. | (Y-Intercept, Slope) |
| D. | (slope, Y-Intercept) |
| Answer» C. (Y-Intercept, Slope) | |
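The roles of β1 and β2 can be confirmed by fitting noise-free data generated from known values: the fitted intercept recovers β1 and the fitted coefficient recovers β2 (shown here with scikit-learn rather than R, as an illustration):

```python
import numpy as np
from sklearn.linear_model import LinearRegression

x = np.arange(10, dtype=float).reshape(-1, 1)
y = 4.0 + 2.5 * x.ravel()  # Y = beta1 + beta2 * X, noise-free

model = LinearRegression().fit(x, y)
# intercept_ recovers beta1 (Y-intercept), coef_ recovers beta2 (slope)
print(model.intercept_, model.coef_[0])
```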
| 191. |
In syntax of linear model lm(formula,data,..), data refers to ______ |
| A. | Matrix |
| B. | Vector |
| C. | Array |
| D. | List |
| Answer» B. Vector | |
| 192. |
Function used for linear regression in R is __________ |
| A. | lm(formula, data) |
| B. | lr(formula, data) |
| C. | lrm(formula, data) |
| D. | regression.linear(formula, data) |
| Answer» A. lm(formula, data) | |
| 193. |
If a Linear regression model perfectly fits the training data, i.e., train error is zero, then _____________________ |
| A. | Test error is also always zero |
| B. | Test error is non zero |
| C. | Couldn’t comment on Test error |
| D. | Test error is equal to Train error |
| Answer» C. Couldn’t comment on Test error | |
| 194. |
In many classification problems, the target ______ is made up of categorical labels which cannot immediately be processed by any algorithm. |
| A. | random_state |
| B. | dataset |
| C. | test_size |
| D. | All above |
| Answer» B. dataset | |
| 195. |
_______adopts a dictionary-oriented approach, associating to each category label a progressive integer number. |
| A. | LabelEncoder class |
| B. | LabelBinarizer class |
| C. | DictVectorizer |
| D. | FeatureHasher |
| Answer» A. LabelEncoder class | |
| 196. |
The parameter______ allows specifying the percentage of elements to put into the test/training set |
| A. | test_size |
| B. | training_size |
| C. | All above |
| D. | None of these |
| Answer» A. test_size | |
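A minimal `train_test_split` sketch showing how `test_size` sets the fraction of samples held out for the test set:

```python
import numpy as np
from sklearn.model_selection import train_test_split

X = np.arange(100).reshape(50, 2)
y = np.arange(50)

# test_size=0.2 reserves 20% of the 50 samples for testing.
X_tr, X_te, y_tr, y_te = train_test_split(
    X, y, test_size=0.2, random_state=0)
print(len(X_tr), len(X_te))  # 40 10
```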
| 197. |
The ______ parameter can assume different values which determine how the data matrix is initially processed. |
| A. | run |
| B. | start |
| C. | init |
| D. | stop |
| Answer» C. init | |
| 198. |
________performs a PCA with non-linearly separable data sets. |
| A. | SparsePCA |
| B. | KernelPCA |
| C. | SVD |
| D. | None of the Mentioned |
| Answer» B. KernelPCA | |
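A brief illustrative use of `KernelPCA` on a dataset that is not linearly separable (concentric circles); an RBF kernel lets the method project the data into a space where the structure becomes accessible (the `gamma` value here is illustrative):

```python
from sklearn.datasets import make_circles
from sklearn.decomposition import KernelPCA

# Concentric circles: a classic non-linearly-separable dataset.
X, y = make_circles(n_samples=100, factor=0.3, noise=0.05,
                    random_state=0)

kpca = KernelPCA(n_components=2, kernel="rbf", gamma=10.0)
X_kpca = kpca.fit_transform(X)
print(X_kpca.shape)  # (100, 2)
```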
| 199. |
There are also many univariate methods that can be used in order to select the best features according to specific criteria based on________. |
| A. | F-tests and p-values |
| B. | chi-square |
| C. | ANOVA |
| D. | All above |
| Answer» A. F-tests and p-values | |
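For example, scikit-learn's `SelectKBest` with the `f_classif` score function ranks features by an ANOVA F-test and the associated p-values:

```python
from sklearn.datasets import load_iris
from sklearn.feature_selection import SelectKBest, f_classif

X, y = load_iris(return_X_y=True)

# f_classif scores each feature with an F-test against the target.
selector = SelectKBest(score_func=f_classif, k=2)
X_best = selector.fit_transform(X, y)

print(X_best.shape)          # (150, 2): the 2 best features kept
print(selector.pvalues_)     # one p-value per original feature
```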
| 200. |
If you need a more powerful scaling feature, with a superior control on outliers and the possibility to select a quantile range, there's also the class________. |
| A. | RobustScaler |
| B. | DictVectorizer |
| C. | LabelBinarizer |
| D. | FeatureHasher |
| Answer» A. RobustScaler | |
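A minimal `RobustScaler` sketch: it centers on the median and scales by a selectable quantile range (the interquartile range by default), so a handful of extreme outliers barely affects the result:

```python
import numpy as np
from sklearn.preprocessing import RobustScaler

rng = np.random.RandomState(0)
X = rng.randn(100, 2)
X[:3] *= 50.0  # inject a few extreme outliers

# Median-centering plus IQR scaling; the quantile range is a
# tunable parameter.
scaler = RobustScaler(quantile_range=(25.0, 75.0))
X_scaled = scaler.fit_transform(X)

print(np.median(X_scaled, axis=0))  # ~0 in each column
```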