Explore topic-wise MCQs on the Hadoop ecosystem.

This section offers curated multiple-choice questions on Hadoop ecosystem projects (HCatalog, Hama, Flume, Thrift, Mahout, Crunch, Drill, Solr, and Spark) to sharpen your knowledge and support exam preparation.

1.

____________ is a subproject with the aim of collecting and distributing free materials.

A. OSR
B. OPR
C. ORP
D. ORS
Answer» C. ORP
2.

____________ sink can be a text file, the console display, a simple HDFS path, or a null bucket where the data is simply deleted.

A. Collector Tier Event
B. Agent Tier Event
C. Basic
D. None of the mentioned
Answer» C. Basic
3.

Hama requires JRE _______ or higher and ssh to be set up between nodes in the cluster.

A. 1.6
B. 1.7
C. 1.8
D. 2.0
Answer» A. 1.6
4.

HCatalog is built on top of the Hive metastore and incorporates Hive’s ____________

A. DDL
B. DML
C. TCL
D. DCL
Answer» A. DDL
5.

___________ is the type supported for storing values in HCatalog tables.

A. HCatRecord
B. HCatColumns
C. HCatValues
D. All of the mentioned
Answer» A. HCatRecord
6.

_________________ property allows users to override the expiry time specified.

A. hcat.desired.partition.num.splits
B. hcatalog.hive.client.cache.expiry.time
C. hcatalog.hive.client.cache.disabled
D. hcat.append.limit
Answer» B. hcatalog.hive.client.cache.expiry.time
7.

Sally in data processing uses __________ to cleanse and prepare the data.

A. Pig
B. Hive
C. HCatalog
D. Impala
Answer» A. Pig
8.

_________ mode is used when you just have a single server and want to launch all the daemon processes.

A. Local Mode
B. Pseudo Distributed Mode
C. Distributed Mode
D. All of the mentioned
Answer» B. Pseudo Distributed Mode
9.

With HCatalog _________ does not need to modify the table structure.

A. Partition
B. Columns
C. Robert
D. All of the mentioned
Answer» C. Robert
10.

____________ generates NGrams and counts frequencies for ngrams, head and tail subgrams.

A. CollocationDriver
B. CollocDriver
C. CarDriver
D. All of the mentioned
Answer» B. CollocDriver
11.

Ambari makes Hadoop management simpler by providing a consistent, secure platform for operational control.

A. True
B. False
C. May be True or False
D. Can't Say
Answer» A. True
12.

Capacity Scheduler View helps a Hadoop operator set up YARN workload management easily to enable multi-tenant and multi-workload processing.

A. True
B. False
C. May be True or False
D. Can't Say
Answer» A. True
13.

HCatalog supports the same data types as _________

A. Pig
B. Hama
C. Hive
D. Oozie
Answer» C. Hive
14.

____________ is used when you want the sink to be the input source for another operation.

A. Collector Tier Event
B. Agent Tier Event
C. Basic
D. All of the mentioned
Answer» B. Agent Tier Event
15.

__________ is a single-threaded server using standard blocking I/O.

A. TNonblockingServer
B. TSimpleServer
C. TSocket
D. None of the mentioned
Answer» B. TSimpleServer
16.

Hama is a general ________________ computing engine on top of Hadoop.

A. BSP
B. ASP
C. MPP
D. None of the mentioned
Answer» A. BSP
17.

____________ Collection API allows for even distribution of custom replica properties.

A. BALANUNIQUE
B. BALANCESHARDUNIQUE
C. BALANCEUNIQUE
D. None of the mentioned
Answer» B. BALANCESHARDUNIQUE
18.

A __________ in a social graph is a group of people who interact frequently with each other and less frequently with others.

A. semi-cluster
B. partial cluster
C. full cluster
D. none of the mentioned
Answer» A. semi-cluster
19.

In how many ways does Spark use Hadoop?

A. 2
B. 3
C. 4
D. 5
Answer» A. 2
20.

Drill is designed from the ground up to support high-performance analysis on the ____________ data.

A. semi-structured
B. structured
C. unstructured
D. none of the mentioned
Answer» A. semi-structured
21.

A __________ server and a data node should be run on one physical node.

A. groom
B. web
C. client
D. all of the mentioned
Answer» A. groom
22.

A key of type ___________ is generated which is used later to join ngrams with their heads and tails in the reducer phase.

A. GramKey
B. Primary
C. Secondary
D. None of the mentioned
Answer» A. GramKey
23.

Which of the following performs compression using zlib?

A. TZlibTransport
B. TFramedTransport
C. TMemoryTransport
D. None of the mentioned
Answer» A. TZlibTransport
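TZlibTransport simply wraps the Thrift byte stream with zlib compression. The effect can be sketched in plain Python using the standard `zlib` module (a conceptual stand-in with hypothetical helper names, not the Thrift API itself):

```python
import zlib

def compress_payload(data: bytes) -> bytes:
    # Deflate the raw bytes, as a zlib-based transport would before writing.
    return zlib.compress(data)

def decompress_payload(blob: bytes) -> bytes:
    # Inflate on the receiving side to recover the original bytes.
    return zlib.decompress(blob)

message = b"hello thrift " * 100
wire = compress_payload(message)
assert decompress_payload(wire) == message
assert len(wire) < len(message)  # repetitive data compresses well
```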
24.

________ phase merges the counts for unique ngrams or ngram fragments across multiple documents.

A. CollocCombiner
B. CollocReducer
C. CollocMerger
D. None of the mentioned
Answer» A. CollocCombiner
25.

Mahout provides ____________ libraries for common and primitive Java collections.

A. Java
B. Javascript
C. Perl
D. Python
Answer» A. Java
26.

_______ transport is required when using a non-blocking server.

A. TZlibTransport
B. TFramedTransport
C. TMemoryTransport
D. None of the mentioned
Answer» B. TFramedTransport
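A non-blocking server needs to know a whole message's length up front so it can buffer it before dispatching, which is why TFramedTransport prefixes each message with a 4-byte length. A conceptual Python sketch of that framing idea (hypothetical helpers, not Thrift's actual implementation):

```python
import struct

def frame(payload: bytes) -> bytes:
    # TFramedTransport-style framing: a 4-byte big-endian length, then the payload.
    return struct.pack(">I", len(payload)) + payload

def deframe(buffer: bytes) -> bytes:
    # The receiver reads the length first, then exactly that many bytes.
    (length,) = struct.unpack(">I", buffer[:4])
    return buffer[4:4 + length]

assert deframe(frame(b"ping")) == b"ping"
```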
27.

A ________ is used to manage the efficient barrier synchronization of the BSPPeers.

A. GroomServers
B. BSPMaster
C. Zookeeper
D. None of the mentioned
Answer» C. Zookeeper
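In Hama's BSP model, no peer may enter the next superstep until every peer has reached the barrier; across machines this coordination goes through ZooKeeper. A minimal single-process sketch of the same barrier discipline, using Python's `threading.Barrier` as a stand-in for ZooKeeper:

```python
import threading

NUM_PEERS = 3
barrier = threading.Barrier(NUM_PEERS)  # stand-in for ZooKeeper-based coordination
results = []

def peer(peer_id: int) -> None:
    for superstep in range(2):
        # Local computation phase of this superstep.
        results.append((superstep, peer_id))
        # Barrier synchronization: wait until all peers finish the superstep.
        barrier.wait()

threads = [threading.Thread(target=peer, args=(i,)) for i in range(NUM_PEERS)]
for t in threads:
    t.start()
for t in threads:
    t.join()

# Every superstep-0 entry precedes every superstep-1 entry.
assert max(i for i, (s, _) in enumerate(results) if s == 0) < \
       min(i for i, (s, _) in enumerate(results) if s == 1)
```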
28.

For ___________ partitioning jobs, simply specifying a custom directory is not good enough.

A. static
B. semi cluster
C. dynamic
D. all of the mentioned
Answer» C. dynamic
29.

The Crunch APIs are modeled after _________, which is the library that Google uses for building data pipelines on top of their own implementation of MapReduce.

A. FlagJava
B. FlumeJava
C. FlakeJava
D. All of the mentioned
Answer» B. FlumeJava
30.

Drill analyzes semi-structured/nested data coming from _________ applications.

A. RDBMS
B. NoSQL
C. NewSQL
D. None of the mentioned
Answer» B. NoSQL
31.

Drill integrates with BI tools using a standard __________ connector.

A. JDBC
B. ODBC
C. ODBC-JDBC
D. All of the mentioned
Answer» A. JDBC
32.

MapR __________ Solution Earns Highest Score in Gigaom Research Data Warehouse Interoperability Report.

A. SQL-on-Hadoop
B. Hive-on-Hadoop
C. Pig-on-Hadoop
D. All of the mentioned
Answer» A. SQL-on-Hadoop
33.

The web UI provides information about ________ job statistics of the Hama cluster.

A. MPP
B. BSP
C. USP
D. ISP
Answer» B. BSP
34.

The tokens are passed through a Lucene ____________ to produce NGrams of the desired length.

A. ShngleFil
B. ShingleFilter
C. SingleFilter
D. Collfilter
Answer» B. ShingleFilter
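Lucene's ShingleFilter slides a window over the token stream and emits contiguous token n-grams ("shingles"). A minimal plain-Python sketch of the same idea (`shingles` is a hypothetical helper, not the Lucene API):

```python
def shingles(tokens, n=2):
    # Emit contiguous token n-grams of length n, shingle-style.
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

assert shingles(["please", "divide", "this", "sentence"]) == \
    ["please divide", "divide this", "this sentence"]
```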
35.

Which of the following is true about mahout?

A. A mahout is one who drives an elephant as its master.
B. Apache Mahout is an open source project that is primarily used for creating scalable machine learning algorithms.
C. Mahout lets applications analyze large sets of data effectively and quickly.
D. All of the above
Answer» D. All of the above
36.

Which of the following language is not supported by Spark?

A. Java
B. Pascal
C. Scala
D. Python
Answer» B. Pascal
37.

Which of the following projects is an interface definition language for Hadoop?

A. Oozie
B. Mahout
C. Thrift
D. Impala
Answer» C. Thrift
38.

____________ can be used to generate stats over the results of arbitrary numeric functions.

A. stats.field
B. sta.field
C. stats.value
D. none of the mentioned
Answer» A. stats.field
39.

The ________ class allows developers to exercise precise control over how data is partitioned, sorted, and grouped by the underlying execution engine.

A. Grouping
B. GroupingOptions
C. RowGrouping
D. None of the mentioned
Answer» B. GroupingOptions
40.

Hive does not have a data type corresponding to the ____________ type in Pig.

A. decimal
B. short
C. biginteger
D. datetime
Answer» C. biginteger
41.

New ____________ type enables Indexing and searching of date ranges, particularly multi-valued ones.

A. RangeField
B. DateField
C. DateRangeField
D. All of the mentioned
Answer» C. DateRangeField
42.

Crunch uses Java serialization to serialize the contents of all of the ______ in a pipeline definition.

A. Transient
B. DoFns
C. Configuration
D. All of the mentioned
Answer» B. DoFns
43.

PCollection, PTable, and PGroupedTable all support a __________ operation.

A. intersection
B. union
C. OR
D. None of the mentioned
Answer» B. union
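A union in Crunch concatenates the elements of its input collections rather than deduplicating them, unlike a set union. A plain-Python stand-in illustrating that semantics (the `union` helper here is hypothetical, not the Crunch API):

```python
def union(*collections):
    # Crunch-style union: concatenate the elements of each input collection.
    # Unlike a set union, duplicate elements are kept.
    combined = []
    for c in collections:
        combined.extend(c)
    return combined

assert sorted(union([1, 2], [2, 3])) == [1, 2, 2, 3]
```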
44.

Drill provides a __________ like internal data model to represent and process data.

A. XML
B. JSON
C. TIFF
D. None of the mentioned
Answer» B. JSON
45.

What is true about Apache Flume?

A. Apache Flume is a reliable and distributed system for collecting, aggregating and moving massive quantities of log data.
B. It has a simple yet flexible architecture based on streaming data flows.
C. Apache Flume is used to collect log data present in log files from web servers and aggregate it into HDFS for analysis.
D. All of the above
Answer» D. All of the above
46.

Lucene index size is roughly _______ the size of text indexed.

A. 10%
B. 20%
C. 50%
D. 70%
Answer» B. 20%
47.

Apache Hama provides complete clone of _________

A. Pragmatic
B. Pregel
C. ServePreg
D. All of the mentioned
Answer» B. Pregel
48.

Which of the following uses JSON for encoding of data?

A. TCompactProtocol
B. TDenseProtocol
C. TBinaryProtocol
D. None of the mentioned
Answer» D. None of the mentioned
49.

DoFns provide direct access to the __________ object that is used within a given Map or Reduce task via the getContext method.

A. TaskInputContext
B. TaskInputOutputContext
C. TaskOutputContext
D. All of the mentioned
Answer» B. TaskInputOutputContext
50.

The _________ collocation identifier is integrated into the process that is used to create vectors from sequence files of text keys and values.

A. lbr
B. lcr
C. llr
D. lar
Answer» C. llr
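The llr identifier scores collocations with Dunning's log-likelihood ratio over a 2x2 contingency table of co-occurrence counts. A sketch of that statistic, following the entropy-based formulation used in Mahout's LogLikelihood class (the helper names here are mine):

```python
from math import log

def _x_log_x(x: float) -> float:
    # Convention: 0 * log(0) is taken as 0.
    return x * log(x) if x > 0 else 0.0

def _entropy(*counts: float) -> float:
    # Unnormalized entropy of a list of counts, as in Mahout's LogLikelihood.
    total = sum(counts)
    return _x_log_x(total) - sum(_x_log_x(c) for c in counts)

def llr(k11: int, k12: int, k21: int, k22: int) -> float:
    # Dunning's log-likelihood ratio: 2 * (rowEntropy + columnEntropy - matrixEntropy).
    # k11 = both events, k12/k21 = one event only, k22 = neither event.
    row = _entropy(k11 + k12, k21 + k22)
    col = _entropy(k11 + k21, k12 + k22)
    mat = _entropy(k11, k12, k21, k22)
    return max(0.0, 2.0 * (row + col - mat))

# A pair that always co-occurs scores far higher than a statistically independent pair.
assert llr(10, 0, 0, 10) > llr(5, 5, 5, 5)
```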