Explore topic-wise MCQs in Artificial Intelligence.

This section includes 940 curated multiple-choice questions to sharpen your Artificial Intelligence knowledge and support exam preparation.

401.

The Apriori property means

A. if a set cannot pass a test, its supersets will also fail the same test
B. to decrease the efficiency, do level-wise generation of frequent item sets
C. to improve the efficiency, do level-wise generation of frequent item sets
D. if a set can pass a test, its supersets will fail the same test
Answer» A. if a set cannot pass a test, its supersets will also fail the same test (see the sketch below)
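The Apriori (downward-closure) property is what makes level-wise candidate pruning possible. A minimal Python sketch, using hypothetical itemsets and a hypothetical set of frequent 2-itemsets, of how a candidate k-itemset is discarded when any of its (k-1)-subsets is infrequent:

```python
from itertools import combinations

def prune_candidates(candidates, frequent_prev):
    """Keep only candidate k-itemsets whose every (k-1)-subset is frequent.

    Apriori property: if an itemset fails the minimum-support test, every
    superset of it must fail too, so such candidates are dropped before
    any support counting is done.
    """
    kept = []
    for cand in candidates:
        k = len(cand)
        if all(frozenset(sub) in frequent_prev for sub in combinations(cand, k - 1)):
            kept.append(cand)
    return kept

# Hypothetical frequent 2-itemsets from a previous level-wise pass.
frequent_2 = {frozenset({"milk", "bread"}), frozenset({"milk", "eggs"}),
              frozenset({"bread", "eggs"})}
candidates_3 = [frozenset({"milk", "bread", "eggs"}),   # all 2-subsets frequent -> kept
                frozenset({"milk", "bread", "beer"})]   # {milk, beer} infrequent -> pruned
print(prune_candidates(candidates_3, frequent_2))
```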
402.

In a rule-based classifier, if there is a rule for each combination of attribute values, what do you call that rule set R?

A. exhaustive
B. inclusive
C. comprehensive
D. mutually exclusive
Answer» A. exhaustive
403.

Which association rule would you prefer?

A. high support and medium confidence
B. high support and low confidence
C. low support and high confidence
D. low support and low confidence
Answer» C. low support and high confidence
404.

Which statement is true about neural network and linear regression models?

A. both techniques build models whose output is determined by a linear sum of weighted input attribute values
B. the output of both models is a categorical attribute value
C. both models require numeric attributes to range between 0 and 1
D. both models require input attributes to be numeric
Answer» D. both models require input attributes to be numeric
405.

A good clustering method will produce high quality clusters with

A. high inter class similarity
B. low intra class similarity
C. high intra class similarity
D. no inter class similarity
Answer» C. high intra class similarity
406.

The set of frequent itemsets is a

A. superset of only closed frequent item sets
B. superset of only maximal frequent item sets
C. subset of maximal frequent item sets
D. superset of both closed frequent item sets and maximal frequent item sets
Answer» D. superset of both closed frequent item sets and maximal frequent item sets
407.

The number of iterations in Apriori ___________

A. increases with the size of the data
B. decreases with the increase in size of the data
C. increases with the size of the maximum frequent set
D. decreases with increase in size of the maximum frequent set
Answer» C. increases with the size of the maximum frequent set
408.

This clustering algorithm terminates when mean values computed for the current iteration of the algorithm are identical to the computed mean values for the previous iteration

A. k-means clustering
B. conceptual clustering
C. expectation maximization
D. agglomerative clustering
Answer» A. k-means clustering (see the sketch below)
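A minimal NumPy sketch of the k-means loop described in Q408, on toy random data (the data, k and the initialisation scheme are assumptions of the sketch): the loop stops as soon as the recomputed cluster means match the means from the previous iteration.

```python
import numpy as np

def kmeans(X, k, max_iter=100, seed=0):
    """Plain k-means: terminate when the cluster means stop changing."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(max_iter):
        # Assign each point to the nearest center (Euclidean distance).
        dists = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Recompute each cluster's mean.
        # (Empty clusters are not handled; fine for this well-separated toy data.)
        new_centers = np.array([X[labels == j].mean(axis=0) for j in range(k)])
        if np.allclose(new_centers, centers):   # means unchanged -> converged
            break
        centers = new_centers
    return centers, labels

rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0, 1, (50, 2)), rng.normal(5, 1, (50, 2))])
centers, labels = kmeans(X, k=2)
print(centers)
```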
409.

Given that we can select the same feature multiple times during the recursive partitioning of the input space, is it always possible to achieve 100% accuracy on the training data (given that we allow for trees to grow to their maximum size) when building decision trees?

A. yes
B. no
Answer» B. no
410.

Which of the following classifications would best suit the student performance classification systems?

A. if...then... analysis
B. market-basket analysis
C. regression analysis
D. cluster analysis
Answer» A. if...then... analysis
411.

What is the approach of basic algorithm for decision tree induction?

A. greedy
B. top down
C. procedural
D. step by step
Answer» A. greedy
412.

Which one of these is a tree based learner?

A. rule based
B. bayesian belief network
C. bayesian classifier
D. random forest
Answer» D. random forest
413.

Which one of these is not a tree based learner?

A. cart
B. id3
C. bayesian classifier
D. random forest
Answer» C. bayesian classifier
414.

The distance between two points calculated using Pythagoras theorem is

A. supremum distance
B. euclidean distance
C. linear distance
D. manhattan distance
Answer» B. euclidean distance
415.

The _______ step eliminates the extensions of (k-1)-itemsets which are not found to be frequent, from being considered for counting support

A. partitioning
B. candidate generation
C. itemset eliminations
D. pruning
Answer» D. pruning
416.

Hierarchical agglomerative clustering is typically visualized as a

A. dendrogram
B. binary trees
C. block diagram
D. graph
Answer» A. dendrogram
417.

Which of the following algorithms comes under classification?

A. apriori
B. brute force
C. dbscan
D. k-nearest neighbor
Answer» D. k-nearest neighbor
418.

The most general form of distance is

A. manhattan
B. euclidean
C. mean
D. minkowski
Answer» D. minkowski (see the sketch below)
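A small Python sketch tying Q414 and Q418 together: the Minkowski distance of order p generalises the others, with p = 1 giving Manhattan, p = 2 giving Euclidean (the Pythagorean form), and p → ∞ giving the supremum distance. The sample points are arbitrary.

```python
def minkowski(x, y, p):
    """Minkowski distance of order p between two equal-length vectors."""
    if p == float("inf"):                       # supremum (Chebyshev) distance
        return max(abs(a - b) for a, b in zip(x, y))
    return sum(abs(a - b) ** p for a, b in zip(x, y)) ** (1.0 / p)

a, b = (1, 2), (4, 6)
print(minkowski(a, b, 1))             # 7.0  Manhattan
print(minkowski(a, b, 2))             # 5.0  Euclidean (a 3-4-5 right triangle)
print(minkowski(a, b, float("inf")))  # 4    supremum
```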
419.

KDD represents extraction of

A. data
B. knowledge
C. rules
D. model
Answer» B. knowledge
420.

Which statement is true about the K-Means algorithm?

A. the output attribute must be categorical
B. all attribute values must be categorical
C. all attributes must be numeric
D. attribute values may be either categorical or numeric
Answer» C. all attributes must be numeric
421.

This clustering approach initially assumes that each data instance represents a single cluster.

A. expectation maximization
B. k-means clustering
C. agglomerative clustering
D. conceptual clustering
Answer» C. agglomerative clustering
422.

Attribute selection measures are also known as splitting rules.

A. true
B. false
Answer» A. true
423.

When the number of classes is large, the Gini index is not a good choice.

A. true
B. false
Answer» A. true
424.

Gini index does not favour equal sized partitions.

A. true
B. false
Answer» B. false
425.

The Gini index is not biased towards multivalued attributes.

A. true
B. false
Answer» B. false
426.

Gain ratio tends to prefer unbalanced splits in which one partition is much smaller than the other

A. true
B. false
Answer» A. true
427.

Multivariate split is where the partitioning of tuples is based on a combination of attributes rather than on a single attribute.

A. true
B. false
Answer» A. true
428.

Which of the following sentences are correct in reference to information gain? (a) It is biased towards single-valued attributes (b) It is biased towards multi-valued attributes (c) ID3 makes use of information gain (d) The approach used by ID3 is greedy

A. a and b
B. a and d
C. b, c and d
D. all of the above
Answer» C. b, c and d (see the sketch below)
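A minimal Python sketch of the information-gain computation that ID3 applies greedily at each node, using hypothetical class counts; a many-valued attribute tends to split the data into many small, pure partitions, which is why the measure is biased towards multi-valued attributes:

```python
from math import log2

def entropy(counts):
    """Entropy of a class distribution given as a list of class counts."""
    total = sum(counts)
    return -sum(c / total * log2(c / total) for c in counts if c > 0)

def information_gain(parent_counts, partitions):
    """Gain = parent entropy minus the weighted entropy of the partitions."""
    total = sum(parent_counts)
    weighted = sum(sum(part) / total * entropy(part) for part in partitions)
    return entropy(parent_counts) - weighted

# Hypothetical node with 9 positive / 5 negative tuples, split three ways.
parent = [9, 5]
branches = [[2, 3], [4, 0], [3, 2]]                  # class counts per branch
print(round(information_gain(parent, branches), 3))  # ~0.247
```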
429.

What is the Gini index?

A. it is a type of index structure
B. it is a measure of purity
C. both options except none
D. none of the options
Answer» B. it is a measure of purity (see the sketch below)
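A one-function Python sketch of the Gini index as a purity measure, on hypothetical class counts: it is 0 for a perfectly pure node and grows as the classes become more evenly mixed.

```python
def gini(counts):
    """Gini impurity of a node from its class counts: 1 - sum(p_i ** 2)."""
    total = sum(counts)
    return 1.0 - sum((c / total) ** 2 for c in counts)

print(gini([10, 0]))   # 0.0  perfectly pure node
print(gini([5, 5]))    # 0.5  maximally mixed two-class node
```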
430.

What are tree based classifiers?

A. classifiers which form a tree with each attribute at one level
B. classifiers which perform series of condition checking with one attribute at a time
C. both options except none
D. none of the options
Answer» C. both options except none
431.

Choose the correct statement with respect to ‘confidence’ metric in association rules

A. it is the conditional probability that a randomly selected transaction will include all the items in the consequent given that the transaction includes all the items in the antecedent.
B. a high value of confidence suggests a weak association rule
C. it is the probability that a randomly selected transaction will include all the items in the consequent as well as all the items in the antecedent.
D. confidence is not measured in terms of (estimated) conditional probability.
Answer» A. it is the conditional probability that a randomly selected transaction will include all the items in the consequent given that the transaction includes all the items in the antecedent.
432.

How can we best represent ‘support’ for the following association rule: “If X and Y, then Z”.

A. {x,y}/(total number of transactions)
B. {z}/(total number of transactions)
C. {z}/{x,y}
D. {x,y,z}/(total number of transactions)
Answer» D. {x,y,z}/(total number of transactions) (see the sketch below)
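A short Python sketch, on a hypothetical transaction list, computing the two measures from Q431 and Q432 for the rule {X, Y} → {Z}: support is the fraction of all transactions containing X, Y and Z, while confidence is the estimated conditional probability of Z given {X, Y}.

```python
transactions = [            # hypothetical market-basket data
    {"X", "Y", "Z"},
    {"X", "Y"},
    {"X", "Z"},
    {"X", "Y", "Z", "W"},
    {"Y", "Z"},
]

antecedent, consequent = {"X", "Y"}, {"Z"}
n = len(transactions)
n_antecedent = sum(antecedent <= t for t in transactions)            # contain X and Y
n_both = sum((antecedent | consequent) <= t for t in transactions)   # contain X, Y and Z

support = n_both / n                  # {x,y,z} / (total number of transactions) = 0.4
confidence = n_both / n_antecedent    # P(Z | X, Y) = 2/3
print(support, confidence)
```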
433.

Which of the following is a problem in multiple regression?

A. multicollinearity
B. overfitting
C. both multicollinearity & overfitting
D. underfitting
Answer» C. both multicollinearity & overfitting
434.

Which of the following statements are true for a design matrix X ∈ R^{n×d} with d > n? (The rows are n sample points and the columns represent d features.)

A. least-squares linear regression computes the weights w = (XᵀX)⁻¹Xᵀy
B. the sample points are linearly separable
C. X has exactly d − n eigenvectors with eigenvalue zero
D. at least one principal component direction is orthogonal to a hyperplane that contains all the sample points
Answer» D. at least one principal component direction is orthogonal to a hyperplane that contains all the sample points
435.

If X and Y in a regression model are totally unrelated,

A. the correlation coefficient would be -1
B. the coefficient of determination would be 0
C. the coefficient of determination would be 1
D. the sse would be 0
Answer» B. the coefficient of determination would be 0 (see the sketch below)
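A small NumPy sketch of the coefficient of determination, R² = 1 − SSE/SST, on toy data generated so that X and Y are unrelated: the fitted line barely improves on predicting the mean of Y, so R² comes out near 0.

```python
import numpy as np

def r_squared(y, y_pred):
    """Coefficient of determination: 1 - SSE / SST."""
    sse = np.sum((y - y_pred) ** 2)
    sst = np.sum((y - y.mean()) ** 2)
    return 1.0 - sse / sst

rng = np.random.default_rng(0)
x = rng.normal(size=200)
y = rng.normal(size=200)                     # unrelated to x by construction
slope, intercept = np.polyfit(x, y, deg=1)   # best-fit line anyway
print(r_squared(y, slope * x + intercept))   # close to 0
```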
436.

Which of the following statements are true for a design matrix X ∈ R^{n×d} with d > n? (The rows are n sample points and the columns represent d features.)

A. least-squares linear regression computes the weights w = (XᵀX)⁻¹Xᵀy
B. the sample points are linearly separable
C. X has exactly d − n eigenvectors with eigenvalue zero
D. at least one principal component direction is orthogonal to a hyperplane that contains all the sample points
Answer» D. at least one principal component direction is orthogonal to a hyperplane that contains all the sample points
437.

In terms of bias and variance, which of the following is true when you fit a degree 2 polynomial?

A. bias will be high, variance will be high
B. bias will be low, variance will be high
C. bias will be high, variance will be low
D. bias will be low, variance will be low
Answer» C. bias will be high, variance will be low
438.

Suppose that we have N independent variables (X1, X2, …, Xn) and the dependent variable is Y. Now imagine that you are applying linear regression by fitting the best fit line using least square error on this data. You found that the correlation coefficient for one of its variables (say X1) with Y is 0.95. Which of the following is true for X1?

A. relation between the x1 and y is weak
B. relation between the x1 and y is strong
C. relation between the x1 and y is neutral
D. correlation can’t judge the relationship
Answer» B. relation between the x1 and y is strong
439.

Which of the following indicates the fundamental of least squares?

A. arithmetic mean should be maximized
B. arithmetic mean should be zero
C. arithmetic mean should be neutralized
D. arithmetic mean should be minimized
Answer» D. arithmetic mean should be minimized
440.

Regarding bias and variance, which of the following statements are true? (Here 'high' and 'low' are relative to the ideal model.) (i) Models which overfit are more likely to have high bias (ii) Models which overfit are more likely to have low bias (iii) Models which overfit are more likely to have high variance (iv) Models which overfit are more likely to have low variance

A. (i) and (ii)
B. (ii) and (iii)
C. (iii) and (iv)
D. none of these
Answer» B. (ii) and (iii)
441.

Suppose you find that your linear regression model is underfitting the data. In such a situation, which of the following options would you consider?

A. you will add more features
B. you will remove some features
C. all of the above
D. none of the above
Answer» A. you will add more features
442.

The selling price of a house depends on many factors. For example, it depends on the number of bedrooms, number of kitchens, number of bathrooms, the year the house was built, and the square footage of the lot. Given these factors, predicting the selling price of the house is an example of a ____________ task.

A. binary classification
B. multilabel classification
C. simple linear regression
D. multiple linear regression
Answer» D. multiple linear regression
443.

Simple regression assumes a __________ relationship between the input attribute and output attribute.

A. quadratic
B. inverse
C. linear
D. reciprocal
Answer» C. linear
444.

Which of the following evaluation metrics can be used to evaluate a model while modeling a continuous output variable?

A. auc-roc
B. accuracy
C. logloss
D. mean-squared-error
Answer» D. mean-squared-error (see the sketch below)
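A tiny NumPy sketch of mean squared error, the listed metric that applies to a continuous output (the values are made up); AUC-ROC, accuracy and log-loss all presuppose class labels.

```python
import numpy as np

y_true = np.array([3.0, 5.0, 2.5, 7.0])    # hypothetical continuous targets
y_pred = np.array([2.5, 5.0, 4.0, 8.0])    # hypothetical model predictions
mse = np.mean((y_true - y_pred) ** 2)
print(mse)                                 # 0.875
```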
445.

Lasso can be interpreted as least-squares linear regression where

A. weights are regularized with the l1 norm
B. the weights have a gaussian prior
C. weights are regularized with the l2 norm
D. the solution algorithm is simpler
Answer» A. weights are regularized with the l1 norm (see the sketch below)
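A brief scikit-learn sketch of lasso as L1-regularised least squares, on toy data with an arbitrary alpha (both are assumptions of the sketch): the L1 penalty drives many coefficients exactly to zero, whereas an L2 (ridge) penalty, which corresponds to a Gaussian prior on the weights, only shrinks them.

```python
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 10))
true_w = np.array([3.0, -2.0] + [0.0] * 8)     # only two informative features
y = X @ true_w + 0.1 * rng.normal(size=100)

# Objective (up to scaling): ||y - Xw||^2 + alpha * ||w||_1
model = Lasso(alpha=0.1).fit(X, y)
print(model.coef_)                             # most weights end up exactly 0
```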
446.

Which of the following methods do we use to best fit the data in Logistic Regression?

A. least square error
B. maximum likelihood
C. jaccard distance
D. both a and b
Answer» B. maximum likelihood
447.

Which of the following methods do we use to find the best fit line for data in Linear Regression?

A. least square error
B. maximum likelihood
C. logarithmic loss
D. both a and b
Answer» A. least square error (see the sketch below)
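A short NumPy sketch of fitting the best-fit line by minimising the sum of squared errors, on toy data from an assumed line y = 2x + 1 plus noise; maximum likelihood coincides with least squares only under a Gaussian-noise assumption, and log-loss belongs to classification.

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.uniform(0, 10, size=50)
y = 2.0 * x + 1.0 + rng.normal(scale=0.5, size=50)   # noisy line y = 2x + 1

# Least squares: minimise ||A w - y||^2 with A = [x, 1].
A = np.column_stack([x, np.ones_like(x)])
(slope, intercept), *_ = np.linalg.lstsq(A, y, rcond=None)
print(slope, intercept)                              # close to 2 and 1
```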
448.

Linear Regression is a _______ machine learning algorithm.

A. supervised
B. unsupervised
C. semi-supervised
D. can't say
Answer» A. supervised
449.

Neural networks

A. optimize a convex cost function
B. always output values between 0 and 1
C. can be used for regression as well as classification
D. all of the above
Answer» C. can be used for regression as well as classification
450.

The difference between the actual Y value and the predicted Y value found using a regression equation is called the

A. slope
B. residual
C. outlier
D. scatter plot
Answer» B. residual