Explore topic-wise MCQs in Computer Science Engineering (CSE).

This section includes 607 curated multiple-choice questions (MCQs) to sharpen your Computer Science Engineering (CSE) knowledge and support exam preparation. Choose a topic below to get started.

501.

This clustering algorithm merges and splits nodes to help modify nonoptimal partitions.

A. agglomerative clustering
B. expectation maximization
C. conceptual clustering
D. k-means clustering
Answer» C. conceptual clustering
502.

This clustering algorithm initially assumes that each data instance represents a single cluster.

A. agglomerative clustering
B. conceptual clustering
C. k-means clustering
D. expectation maximization
Answer» A. agglomerative clustering
503.

With Bayes classifier, missing data items are

A. treated as equal compares.
B. treated as unequal compares.
C. replaced with a default value.
D. ignored.
Answer» D. ignored.
504.

This supervised learning technique can process both numeric and categorical input attributes.

A. linear regression
B. bayes classifier
C. logistic regression
D. backpropagation learning
Answer» B. bayes classifier
505.

This technique associates a conditional probability value with each data instance.

A. linear regression
B. logistic regression
C. simple regression
D. multiple linear regression
Answer» B. logistic regression
506.

Logistic regression is a ________ regression technique that is used to model data having a _____outcome.

A. linear, numeric
B. linear, binary
C. nonlinear, numeric
D. nonlinear, binary
Answer» D. nonlinear, binary
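A minimal sketch (Python with NumPy; the weights and the data instance are hypothetical, not taken from the question bank) of why logistic regression is called a nonlinear technique for binary outcomes: a linear combination of the inputs is passed through the nonlinear sigmoid, and the resulting conditional probability is thresholded into one of two classes.

    import numpy as np

    def sigmoid(z):
        return 1.0 / (1.0 + np.exp(-z))

    w, b = np.array([0.8, -1.2]), 0.3   # hypothetical learned weights and bias
    x = np.array([2.0, 1.5])            # one data instance
    p = sigmoid(w @ x + b)              # estimated P(class = 1 | x), about 0.52 here
    print(round(p, 3), int(p >= 0.5))   # probability and the binary prediction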
507.

The leaf nodes of a model tree are

A. averages of numeric output attribute values.
B. nonlinear regression equations.
C. linear regression equations.
D. sums of numeric output attribute values.
Answer» C. linear regression equations.
508.

Regression trees are often used to model _______ data.

A. linear
B. nonlinear
C. categorical
D. symmetrical
Answer» B. nonlinear
509.

Simple regression assumes a __________ relationship between the input attribute and output attribute.

A. linear
B. quadratic
C. reciprocal
D. inverse
Answer» A. linear
510.

The average squared difference between classifier predicted output and actual output.

A. mean squared error
B. root mean squared error
C. mean absolute error
D. mean relative error
Answer» A. mean squared error
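A short sketch (Python with NumPy; the actual and predicted values are made up for illustration) contrasting mean squared error, root mean squared error, and mean absolute error, since questions in this set ask for each of them separately.

    import numpy as np

    actual    = np.array([3.0, -0.5, 2.0, 7.0])    # hypothetical desired outputs
    predicted = np.array([2.5,  0.0, 2.0, 8.0])    # hypothetical model outputs

    mse  = np.mean((predicted - actual) ** 2)      # average squared difference: 0.375
    rmse = np.sqrt(mse)                            # square root of the MSE: about 0.612
    mae  = np.mean(np.abs(predicted - actual))     # average absolute difference: 0.5
    print(mse, rmse, mae)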
511.

The correlation coefficient for two real-valued attributes is –0.85. What does this value tell you?

A. the attributes are not linearly related.
B. as the value of one attribute increases the value of the second attribute also increases.
C. as the value of one attribute decreases the value of the second attribute increases.
D. the attributes show a curvilinear relationship.
Answer» C. as the value of one attribute decreases the value of the second attribute increases.
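A quick check (Python with NumPy; toy data) of what a strongly negative correlation coefficient means in practice: as one attribute increases, the other tends to decrease.

    import numpy as np

    x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
    y = np.array([9.8, 8.1, 6.2, 3.9, 2.0])   # decreases as x increases
    r = np.corrcoef(x, y)[0, 1]               # Pearson correlation coefficient
    print(round(r, 2))                        # close to -1 for this toy data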
512.

Bootstrapping allows us to

A. choose the same training instance several times.
B. choose the same test set instance several times.
C. build models with alternative subsets of the training data several times.
D. test a model with alternative subsets of the test data several times.
Answer» A. choose the same training instance several times.
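A small sketch (Python standard library only; the ten instances are hypothetical) of bootstrapping: the sample is drawn with replacement, so the same training instance can be selected several times while others are left out.

    import random

    data = list(range(10))                            # ten hypothetical training instances
    random.seed(0)
    bootstrap = [random.choice(data) for _ in data]   # sample with replacement
    print(bootstrap)                                  # duplicates are expected
    print(sorted(set(data) - set(bootstrap)))         # instances not chosen this time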
513.

Data used to optimize the parameter settings of a supervised learner model.

A. training
B. test
C. verification
D. validation
Answer» D. validation
514.

The standard error is defined as the square root of this computation.

A. the sample variance divided by the total number of sample instances.
B. the population variance divided by the total number of sample instances.
C. the sample variance divided by the sample mean.
D. the population variance divided by the sample mean.
Answer» B. the population variance divided by the total number of sample instances.
515.

Selecting data so as to ensure that each class is properly represented in both the training and test set.

A. cross validation
B. stratification
C. verification
D. bootstrapping
Answer» B. stratification
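A minimal example (assuming scikit-learn is available; the labels are a toy 60/40 class distribution) of stratification: the stratify argument keeps the class proportions roughly the same in both the training set and the test set.

    from sklearn.model_selection import train_test_split

    X = [[i] for i in range(10)]                 # ten hypothetical instances
    y = [0, 0, 0, 0, 0, 0, 1, 1, 1, 1]           # 60/40 class distribution
    X_tr, X_te, y_tr, y_te = train_test_split(
        X, y, test_size=0.5, stratify=y, random_state=0)
    print(sorted(y_tr), sorted(y_te))            # each half keeps the 60/40 split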
516.

The average positive difference between computed and desired outcome values.

A. root mean squared error
B. mean squared error
C. mean absolute error
D. mean positive error
Answer» C. mean absolute error
517.

Which of the following is a common use of unsupervised clustering?

A. detect outliers
B. determine a best set of input attributes for supervised learning
C. evaluate the likely performance of a supervised learner model
D. determine if meaningful relationships can be found in a dataset
Answer» B. determine a best set of input attributes for supervised learning
518.

Which statement is true about prediction problems?

A. the output attribute must be categorical.
B. the output attribute must be numeric.
C. the resultant model is designed to determine future outcomes.
D. the resultant model is designed to classify current behavior.
Answer» C. the resultant model is designed to determine future outcomes.
519.

Classification problems are distinguished from estimation problems in that

A. classification problems require the output attribute to be numeric.
B. classification problems require the output attribute to be categorical.
C. classification problems do not allow an output attribute.
D. classification problems are designed to predict future outcome.
Answer» B. classification problems require the output attribute to be categorical.
520.

Imagine a newborn that starts to learn to walk. It will try to find a suitable policy for walking through repeated falling and getting up. Which type of machine learning is best suited to this situation?

A. classification
B. regression
C. kmeans algorithm
D. reinforcement learning
Answer» D. reinforcement learning
521.

A nearest neighbor approach is best used

A. with large-sized datasets.
B. when irrelevant attributes have been removed from the data.
C. when a generalized model of the data is desirable.
D. when an explanation of what has been found is of primary importance.
Answer» B. when irrelevant attributes have been removed from the data.
522.

For a multiple regression model, SST = 200 and SSE = 50. The multiple coefficient of determination is

A. 0.25
B. 4.00
C. 0.75
D. none of the above
Answer» C. 0.75
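Worked calculation for the question above: the multiple coefficient of determination is R squared = SSR / SST, and SSR = SST - SSE, so R squared = (200 - 50) / 200 = 150 / 200 = 0.75.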
523.

The multiple coefficient of determination is computed by

A. dividing ssr by sst
B. dividing sst by ssr
C. dividing sst by sse
D. none of the above
Answer» A. dividing ssr by sst
524.

Another name for an output attribute.

A. predictive variable
B. independent variable
C. estimated variable
D. dependent variable
Answer» D. dependent variable
525.

The adjusted multiple coefficient of determination accounts for

A. the number of dependent variables in the model
B. the number of independent variables in the model
C. unusually large predictors
D. none of the above
Answer» B. the number of independent variables in the model
526.

A measure of goodness of fit for the estimated regression equation is the

A. multiple coefficient of determination
B. mean square due to error
C. mean square due to regression
D. none of the above
Answer» A. multiple coefficient of determination
527.

A multiple regression model has

A. only one independent variable
B. more than one dependent variable
C. more than one independent variable
D. none of the above
Answer» C. more than one independent variable
528.

A multiple regression model has the form: y = 2 + 3x1 + 4x2. As x1 increases by 1 unit (holding x2 constant), y will

A. increase by 3 units
B. decrease by 3 units
C. increase by 4 units
D. decrease by 4 units
Answer» A. increase by 3 units
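Worked check for the question above: with y = 2 + 3x1 + 4x2 and x2 held constant, y(x1 + 1) - y(x1) = [2 + 3(x1 + 1) + 4x2] - [2 + 3x1 + 4x2] = 3, so y increases by 3 units.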
529.

A term used to describe the case when the independent variables in a multiple regression model are correlated is

A. regression
B. correlation
C. multicollinearity
D. none of the above
Answer» C. multicollinearity
530.

A regression model in which more than one independent variable is used to predict the dependent variable is called

A. a simple linear regression model
B. a multiple regression model
C. an independent model
D. none of the above
Answer» B. a multiple regression model
531.

Supervised learning differs from unsupervised clustering in that supervised learning requires

A. at least one input attribute.
B. input attributes to be categorical.
C. at least one output attribute.
D. output attributes to be categorical.
Answer» C. at least one output attribute.
532.

Supervised learning and unsupervised clustering both require at least one

A. hidden attribute.
B. output attribute.
C. input attribute.
D. categorical attribute.
Answer» C. input attribute.
533.

Computers are best at learning

A. facts.
B. concepts.
C. procedures.
D. principles.
Answer» B. concepts.
534.

The process of forming general concept definitions from examples of concepts to be learned.

A. deduction
B. abduction
C. induction
D. conjunction
Answer» C. induction
535.

The effectiveness of an SVM depends upon:

A. selection of kernel
B. kernel parameters
C. soft margin parameter c
D. all of the above
Answer» D. all of the above
536.

What can be a major issue in Leave-One-Out Cross-Validation (LOOCV)?

A. low variance
B. high variance
C. faster runtime compared to k-fold cross validation
D. slower runtime compared to normal validation
Answer» B. high variance
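A short sketch (assuming scikit-learn and its bundled iris dataset) contrasting leave-one-out with 10-fold cross-validation: with n instances, LOOCV fits n separate models, which is why it is slow, and its error estimate also tends to have higher variance.

    from sklearn.datasets import load_iris
    from sklearn.model_selection import KFold, LeaveOneOut, cross_val_score
    from sklearn.neighbors import KNeighborsClassifier

    X, y = load_iris(return_X_y=True)
    model = KNeighborsClassifier()

    loo_scores = cross_val_score(model, X, y, cv=LeaveOneOut())   # one fit per instance
    kfold_scores = cross_val_score(
        model, X, y, cv=KFold(n_splits=10, shuffle=True, random_state=0))
    print(len(loo_scores), len(kfold_scores))    # 150 fitted models versus 10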
537.

A student's grade is a variable F1 which takes a value from A, B, C, or D. Which of the following is true in this case?

A. variable f1 is an example of nominal variable
B. variable f1 is an example of ordinal variable
C. it doesn't belong to any of the mentioned categories
D. it belongs to both ordinal and nominal category
Answer» B. variable f1 is an example of ordinal variable
538.

PCA works better if: 1. There is a linear structure in the data 2. The data lies on a curved surface and not on a flat surface 3. The variables are scaled in the same unit

A. 1 and 2
B. 2 and 3
C. 1 and 3
D. 1,2 and 3
Answer» C. 1 and 3
539.

You are given seismic data and you want to predict the next earthquake. This is an example of

A. supervised learning
B. reinforcement learning
C. unsupervised learning
D. dimensionality reduction
Answer» A. supervised learning
540.

Prediction is

A. the result of application of specific theory or rule in a specific case
B. discipline in statistics used to find projections in multidimensional data
C. value entered in database by expert
D. independent of data
Answer» A. the result of application of specific theory or rule in a specific case
541.

Which of the following is an example of feature extraction?

A. constructing a bag of words from an email
B. applying pca to project high dimensional data
C. removing stop words
D. forward selection
Answer» B. applying pca to project high dimensional data
542.

Which of the following is a reasonable way to select the number of principal components "k"?

A. choose k to be the smallest value so that at least 99% of the variance is retained.
B. choose k to be 99% of m (k = 0.99*m, rounded to the nearest integer).
C. choose k to be the largest value so that 99% of the variance is retained.
D. use the elbow method
Answer» A. choose k to be the smallest value so that at least 99% of the variance is retained.
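A sketch (assuming scikit-learn, NumPy, and the bundled iris dataset) of the rule in the answer above: pick the smallest k such that the cumulative explained variance reaches at least 99%.

    import numpy as np
    from sklearn.datasets import load_iris
    from sklearn.decomposition import PCA

    X, _ = load_iris(return_X_y=True)
    pca = PCA().fit(X)                                   # keep all components at first
    cumulative = np.cumsum(pca.explained_variance_ratio_)
    k = int(np.argmax(cumulative >= 0.99)) + 1           # smallest k retaining >= 99% variance
    print(k, round(cumulative[k - 1], 4))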
543.

The most popularly used dimensionality reduction algorithm is Principal Component Analysis (PCA). Which of the following is/are true about PCA? 1. PCA is an unsupervised method 2. It searches for the directions in which the data have the largest variance 3. The maximum number of principal components is less than or equal to the number of features 4. All principal components are orthogonal to each other

A. 1 & 2
B. 2 & 3
C. 3 & 4
D. all of the above
Answer» D. all of the above
544.

Feature can be used as a

A. binary split
B. predictor
C. both a and b
D. none of the above
Answer» C. both a and b
545.

A measurable property or parameter of the data-set is

A. training data
B. feature
C. test data
D. validation data
Answer» B. feature
546.

Following are the descriptive models

A. clustering
B. classification
C. association rule
D. both a and c
Answer» D. both a and c
547.

If a machine learning model's output does not involve the target variable, then that model is called a

A. descriptive model
B. predictive model
C. reinforcement learning
D. all of the above
Answer» A. descriptive model
548.

In simple terms, machine learning is

A. training based on historical data
B. prediction to answer a query
C. both a and b
D. automation of complex tasks
Answer» C. both a and b
549.

The "curse of dimensionality" referes

A. all the problems that arise when working with data in the higher dimensions, that did not exist in the lower dimensions.
B. all the problems that arise when working with data in the lower dimensions, that did not exist in the higher dimensions.
C. all the problems that arise when working with data in the lower dimensions, that did not exist in the lower dimensions.
D. all the problems that arise when working with data in the higher dimensions, that did not exist in the higher dimensions.
Answer» A. all the problems that arise when working with data in the higher dimensions, that did not exist in the lower dimensions.
550.

Select the correct option for the following statements. 1. Filter methods are much faster compared to wrapper methods. 2. Wrapper methods use statistical methods for evaluation of a subset of features while filter methods use cross validation.

A. both are true
B. 1 is true and 2 is false
C. both are false
D. 1 is false and 2 is true
Answer» B. 1 is true and 2 is false
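A closing illustration (assuming scikit-learn and its iris dataset; a minimal sketch, not a benchmark) of why statement 1 is true and statement 2 is false: a filter method scores features with a statistical test and never trains the model, while a wrapper method repeatedly fits the model on candidate feature subsets, which makes it much slower.

    from sklearn.datasets import load_iris
    from sklearn.feature_selection import RFE, SelectKBest, f_classif
    from sklearn.linear_model import LogisticRegression

    X, y = load_iris(return_X_y=True)

    # Filter: rank features with an ANOVA F-test; no model training involved.
    filter_sel = SelectKBest(score_func=f_classif, k=2).fit(X, y)

    # Wrapper: recursive feature elimination refits the model on shrinking subsets.
    wrapper_sel = RFE(LogisticRegression(max_iter=1000), n_features_to_select=2).fit(X, y)

    print(filter_sel.get_support(), wrapper_sel.get_support())   # selected feature masks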