The type of __________ strategy Cassandra performs on your data is configurable and can significantly affect read performance.
(a) compression
(b) collection
(c) compaction
(d) decompression
Correct answer is (c) compaction
The best I can explain: Using the SizeTieredCompactionStrategy or DateTieredCompactionStrategy tends to cause data fragmentation when rows are frequently updated.
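The fragmentation mentioned above can be illustrated with a toy model. This is a minimal sketch, not Cassandra's actual storage engine: each SSTable is assumed to be a plain dictionary mapping a row key to a `(timestamp, value)` pair, and the function names are hypothetical.

```python
def sstables_holding(sstables, key):
    """Count how many SSTables a read for key would have to consult."""
    return sum(1 for table in sstables if key in table)

def compact(sstables):
    """Merge SSTables into one, keeping the newest version of each row."""
    merged = {}
    for table in sstables:
        for key, (timestamp, value) in table.items():
            # Last write wins: keep the entry with the highest timestamp.
            if key not in merged or timestamp > merged[key][0]:
                merged[key] = (timestamp, value)
    return merged

# A frequently updated row ends up with one version per SSTable.
sstables = [
    {"user:1": (100, "v1"), "user:2": (100, "a")},
    {"user:1": (200, "v2")},
    {"user:1": (300, "v3")},
]
before = sstables_holding(sstables, "user:1")
compacted = compact(sstables)
after = sstables_holding([compacted], "user:1")
print(before, after, compacted["user:1"])
```

In this model a read for `user:1` touches three tables before compaction and one afterwards, which is the intuition behind the effect on read performance.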
_________ will overwrite any existing data in the table or partition.
(a) INSERT WRITE
(b) INSERT OVERWRITE
(c) INSERT INTO
(d) None of the mentioned
Right answer is (b) INSERT OVERWRITE
To elaborate: INSERT OVERWRITE replaces any existing data in the table or partition, whereas INSERT INTO appends to the table or partition, keeping the existing data intact.
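The difference between appending and overwriting can be sketched with a toy model. The code below is an illustration only: a table is assumed to be a Python list of rows, and the two function names are hypothetical stand-ins for the two statements.

```python
def insert_into(table, new_rows):
    """Append new_rows, keeping whatever the table already holds."""
    return table + list(new_rows)

def insert_overwrite(table, new_rows):
    """Discard the existing rows and keep only new_rows."""
    return list(new_rows)

existing = [("alice", 1), ("bob", 2)]
incoming = [("carol", 3)]

appended = insert_into(existing, incoming)
overwritten = insert_overwrite(existing, incoming)

print(appended)     # [('alice', 1), ('bob', 2), ('carol', 3)]
print(overwritten)  # [('carol', 3)]
```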
Point out the correct statement.
(a) With TextInputFormat and KeyValueTextInputFormat, each mapper receives a variable number of lines of input
(b) StreamXmlRecordReader, the page elements can be interpreted as records for processing by a mapper
(c) The number depends on the size of the split and the length of the lines.
(d) All of the mentioned
The correct option is (d) All of the mentioned
The explanation: Large XML documents that are composed of a series of “records” can be broken into these records using simple string or regular-expression matching to find start and end tags of records.
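The regular-expression approach described in the explanation can be sketched as follows. This is a simplified illustration in Python, not the StreamXmlRecordReader implementation; the function name `split_records` and the sample document are assumptions made for the example.

```python
import re

def split_records(document, tag):
    """Yield each <tag>...</tag> span found in document as one record."""
    # Non-greedy match from a start tag to the nearest matching end tag.
    # DOTALL lets a record span several lines. Nested elements with the
    # same tag name are not handled by this simple pattern.
    pattern = re.compile(
        r"<{0}\b[^>]*>.*?</{0}>".format(re.escape(tag)), re.DOTALL
    )
    for match in pattern.finditer(document):
        yield match.group(0)

sample = """<pages>
<page><title>A</title></page>
<page id="2">
  <title>B</title>
</page>
</pages>"""

records = list(split_records(sample, "page"))
for record in records:
    print(repr(record))
```

Each yielded string corresponds to one record that a mapper could process independently.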
Lucene provides scalable, high-performance indexing over ______ per hour on modern hardware.
(a) 1 TB
(b) 150GB
(c) 10 GB
(d) None of the mentioned
Right choice is (b) 150GB
Easy explanation: Lucene offers powerful features through a simple API.
Storm is benchmarked as processing one million _______ byte messages per second per node.
(a) 10
(b) 50
(c) 100
(d) 200
Correct answer is (c) 100
Explanation: Storm is a distributed real-time computation system.
______________ is another implementation of the MapRunnable interface that runs mappers concurrently in a configurable number of threads.
(a) MultithreadedRunner
(b) MultithreadedMap
(c) MultithreadedMapRunner
(d) SinglethreadedMapRunner
Correct choice is (c) MultithreadedMapRunner
Easiest explanation: A RecordReader is little more than an iterator over records, and the map task uses one to generate record key-value pairs, which it passes to the map function.
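The idea of a RecordReader as an iterator can be sketched in Python. This is an analogy rather than the Hadoop Java API: the reader below is assumed to produce (byte offset, line) pairs from a binary stream, and `run_map_task` and `word_lengths` are hypothetical names.

```python
import io

def line_record_reader(stream):
    """Yield (byte_offset, line) pairs, one per line of a binary stream."""
    offset = 0
    for raw in stream:
        # Key: offset of the line start; value: the line without its newline.
        yield offset, raw.rstrip(b"\r\n").decode("utf-8")
        offset += len(raw)

def run_map_task(stream, map_fn):
    """Feed every record from the reader to map_fn and collect its output."""
    output = []
    for key, value in line_record_reader(stream):
        output.extend(map_fn(key, value))
    return output

def word_lengths(key, value):
    # Hypothetical map function: emit (word, length) for each word.
    return [(word, len(word)) for word in value.split()]

pairs = list(line_record_reader(io.BytesIO(b"hello world\nfoo\n")))
result = run_map_task(io.BytesIO(b"hello world\nfoo\n"), word_lengths)
print(pairs)   # [(0, 'hello world'), (12, 'foo')]
print(result)  # [('hello', 5), ('world', 5), ('foo', 3)]
```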
Point out the correct statement.
(a) Cassandra delivers continuous availability, linear scalability, and operational simplicity across many commodity servers
(b) Cassandra has a “masterless” architecture, meaning all nodes are the same
(c) Cassandra also provides customizable replication, storing redundant copies of data across nodes that participate in a Cassandra ring
(d) All of the mentioned
Correct choice is (d) All of the mentioned
To explain: Cassandra provides automatic data distribution across all nodes that participate in a “ring” or database cluster.
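Distribution of data around a ring can be sketched with a small consistent-hashing model. This is a simplification under stated assumptions: MD5 is used as the hash only for illustration, each node gets a single token, and the `Ring` class and node names are hypothetical, not part of Cassandra.

```python
import bisect
import hashlib

def token(value):
    """Map a string to an integer position on the ring (MD5 is an assumption)."""
    return int(hashlib.md5(value.encode("utf-8")).hexdigest(), 16)

class Ring:
    def __init__(self, nodes):
        # One token per node, sorted so lookups can use binary search.
        self._entries = sorted((token(n), n) for n in nodes)
        self._tokens = [t for t, _ in self._entries]

    def replicas(self, key, replication_factor=1):
        """Return the nodes holding key: the first node at or after the
        key's token, then the next ones clockwise."""
        start = bisect.bisect_left(self._tokens, token(key))
        count = min(replication_factor, len(self._entries))
        return [
            self._entries[(start + i) % len(self._entries)][1]
            for i in range(count)
        ]

nodes = ["node-a", "node-b", "node-c"]
ring = Ring(nodes)
owners = ring.replicas("user:42", replication_factor=2)
print(owners)
```

Because every node applies the same rule, any node can work out where a key lives, which fits the "masterless" description in option (b).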
Apache Storm added open source stream data processing to the _________ Data Platform.
(a) Cloudera
(b) Hortonworks
(c) Local Cloudera
(d) MapR
Right answer is (b) Hortonworks
The best explanation: The Storm community is working to improve capabilities related to three important themes: business continuity, operations and developer productivity.
_________ hides the limitations of Java behind a powerful and concise Clojure API for Cascading.
(a) Scalding
(b) HCatalog
(c) Cascalog
(d) All of the mentioned
Right answer is (c) Cascalog
To elaborate: Cascalog also adds Logic Programming concepts inspired by Datalog. Hence the name “Cascalog” is a contraction of Cascading and Datalog.
The need for data replication can arise in various scenarios like ____________
(a) Replication Factor is changed
(b) DataNode goes down
(c) Data Blocks get corrupted
(d) All of the mentioned
Right answer is (d) All of the mentioned
Best explanation: Data is replicated across different DataNodes to ensure a high degree of fault-tolerance.
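The scenarios listed in the options can be sketched as a simple check for under-replicated blocks. This is an illustrative model, not the HDFS NameNode logic: block locations are assumed to be a dictionary of sets, and the function and node names are hypothetical.

```python
def under_replicated(block_locations, live_nodes, replication_factor):
    """Return {block: missing_copies} for blocks with too few live replicas."""
    missing = {}
    for block, nodes in block_locations.items():
        # Only replicas on live nodes count; a corrupted replica could be
        # modelled the same way, by dropping it from the location set.
        live = [n for n in nodes if n in live_nodes]
        if len(live) < replication_factor:
            missing[block] = replication_factor - len(live)
    return missing

locations = {
    "blk_1": {"dn1", "dn2", "dn3"},
    "blk_2": {"dn2", "dn3", "dn4"},
}
all_nodes = {"dn1", "dn2", "dn3", "dn4"}

healthy = under_replicated(locations, all_nodes, 3)
node_down = under_replicated(locations, all_nodes - {"dn3"}, 3)
factor_raised = under_replicated(locations, all_nodes, 4)
print(healthy, node_down, factor_raised)
```

Both a lost DataNode and a raised replication factor leave each block one copy short in this example, so new replicas would need to be created.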