This section includes 52 multiple-choice questions (MCQs) on Apache Hadoop, curated to sharpen your knowledge and support exam preparation.
| 1. |
The framework groups Reducer inputs by key in _________ stage. |
| A. | sort |
| B. | shuffle |
| C. | reduce |
| D. | none of the mentioned |
| Answer» A. sort | |
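The grouping and ordering the framework performs between map and reduce can be modeled with a minimal Python sketch (a conceptual illustration only, not Hadoop code; the function name is ours):

```python
from collections import defaultdict

def shuffle_and_sort(mapper_outputs):
    """Group intermediate (key, value) pairs by key, then order the keys,
    mimicking what the framework does between the map and reduce phases."""
    groups = defaultdict(list)
    for key, value in mapper_outputs:
        groups[key].append(value)
    # Keys are presented to the Reducer in sorted order.
    return sorted(groups.items())

# Word-count style intermediate pairs from the map phase:
pairs = [("b", 1), ("a", 1), ("b", 1)]
print(shuffle_and_sort(pairs))  # [('a', [1]), ('b', [1, 1])]
```

Each reducer then receives one call per key with the grouped list of values.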
| 2. |
The Hadoop MapReduce framework spawns one map task for each __________ generated by the InputFormat for the job. |
| A. | OutputSplit |
| B. | InputSplit |
| C. | InputSplitStream |
| D. | All of the mentioned |
| Answer» B. InputSplit | |
| 3. |
The output of the reduce task is typically written to the FileSystem via _____________ |
| A. | OutputCollector.collect |
| B. | OutputCollector.get |
| C. | OutputCollector.receive |
| D. | OutputCollector.put |
| Answer» A. OutputCollector.collect | |
| 4. |
Applications can use the ____________ to report progress and set application-level status messages. |
| A. | Partitioner |
| B. | OutputSplit |
| C. | Reporter |
| D. | All of the mentioned |
| Answer» C. Reporter | |
| 5. |
Users can control which keys (and hence records) go to which Reducer by implementing a custom: |
| A. | Partitioner |
| B. | OutputSplit |
| C. | Reporter |
| D. | All of the mentioned |
| Answer» A. Partitioner | |
| 6. |
The number of reduces for the job is set by the user via: |
| A. | JobConf.setNumTasks(int) |
| B. | JobConf.setNumReduceTasks(int) |
| C. | JobConf.setNumMapTasks(int) |
| D. | All of the mentioned |
| Answer» B. JobConf.setNumReduceTasks(int) | |
| 7. |
The right level of parallelism for maps seems to be around _________ maps per-node |
| A. | 1-10 |
| B. | 10-100 |
| C. | 100-150 |
| D. | 150-200 |
| Answer» B. 10-100 | |
| 8. |
The Mapper implementation processes one line at a time via _________ method. |
| A. | map |
| B. | reduce |
| C. | mapper |
| D. | reducer |
| Answer» A. map | |
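The WordCount Mapper's map() method is called once per input line and emits a (word, 1) pair per token; a minimal Python analogue of that behavior (a sketch, not Hadoop's API):

```python
def map_line(line):
    """Emit a (word, 1) pair for every token in one input line,
    in the spirit of the WordCount Mapper's map() method."""
    return [(word, 1) for word in line.split()]

print(map_line("hello hadoop hello"))
# [('hello', 1), ('hadoop', 1), ('hello', 1)]
```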
| 9. |
Map output larger than ___ percent of the memory allocated to copying map outputs is written directly to disk without first staging through memory. |
| A. | 10 |
| B. | 15 |
| C. | 25 |
| D. | 35 |
| Answer» C. 25 | |
| 10. |
______________ is the percentage of memory relative to the maximum heap size in which map outputs may be retained during the reduce. |
| A. | mapred.job.shuffle.merge.percent |
| B. | mapred.job.reduce.input.buffer.percent |
| C. | mapred.inmem.merge.threshold |
| D. | io.sort.factor |
| Answer» B. mapred.job.reduce.input.buffer.percent | |
| 11. |
Which of the following is the default Partitioner for MapReduce? |
| A. | MergePartitioner |
| B. | HashedPartitioner |
| C. | HashPartitioner |
| D. | None of the mentioned |
| Answer» C. HashPartitioner | |
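Hadoop's default HashPartitioner routes a record by masking the sign bit of the key's hash and taking it modulo the number of reducers; a minimal Python model of that logic (illustrative only, not the Java source):

```python
def hash_partition(key, num_reduce_tasks):
    """Mimic HashPartitioner.getPartition: clear the sign bit of the
    key's hash, then take it modulo the number of reduce tasks."""
    return (hash(key) & 0x7FFFFFFF) % num_reduce_tasks

# Every record with the same key lands on the same reducer:
p = hash_partition("apache", 4)
assert hash_partition("apache", 4) == p
```

Because the mapping depends only on the key, all values for a key are guaranteed to reach the same Reducer.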
| 12. |
____________ specifies the number of segments on disk to be merged at the same time. |
| A. | mapred.job.shuffle.merge.percent |
| B. | mapred.job.reduce.input.buffer.percent |
| C. | mapred.inmem.merge.threshold |
| D. | io.sort.factor |
| Answer» D. io.sort.factor | |
| 13. |
Running a ___________ program involves running mapping tasks on many or all of the nodes in our cluster. |
| A. | MapReduce |
| B. | Map |
| C. | Reducer |
| D. | All of the mentioned |
| Answer» A. MapReduce | |
| 14. |
________ is a utility which allows users to create and run jobs with any executable as the mapper and/or the reducer. |
| A. | Hadoop Strdata |
| B. | Hadoop Streaming |
| C. | Hadoop Stream |
| D. | None of the mentioned |
| Answer» B. Hadoop Streaming | |
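Hadoop Streaming connects any executable to the map/reduce phases through stdin and stdout, with keys and values separated by a tab. The Python sketch below models that contract (the helper function is ours, added so the logic is testable apart from stdin):

```python
import sys

def streaming_mapper(lines):
    """Model the Streaming mapper contract: for each input record,
    emit tab-separated "key\tvalue" lines (here, word-count pairs)."""
    out = []
    for line in lines:
        for word in line.split():
            out.append(f"{word}\t1")
    return out

if __name__ == "__main__":
    # A real Streaming mapper would be run as: the framework pipes
    # input records to stdin and reads emitted pairs from stdout.
    for record in streaming_mapper(sys.stdin):
        print(record)
```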
| 15. |
Which of the following nodes is responsible for executing a Task assigned to it by the JobTracker? |
| A. | MapReduce |
| B. | Mapper |
| C. | TaskTracker |
| D. | JobTracker |
| Answer» C. TaskTracker | |
| 16. |
___________ is used for writing blocks with single replica in memory. |
| A. | Hot |
| B. | Lazy_Persist |
| C. | One_SSD |
| D. | All_SSD |
| Answer» B. Lazy_Persist | |
| 17. |
____________ is used for storing one of the replicas in SSD. |
| A. | Hot |
| B. | Lazy_Persist |
| C. | One_SSD |
| D. | All_SSD |
| Answer» C. One_SSD | |
| 18. |
__________ storage is a solution to decouple growing storage capacity from compute capacity. |
| A. | DataNode |
| B. | Archival |
| C. | Policy |
| D. | None of the mentioned |
| Answer» B. Archival | |
| 19. |
The configuration file must be owned by the user running: |
| A. | DataManager |
| B. | NodeManager |
| C. | ValidationManager |
| D. | None of the mentioned |
| Answer» B. NodeManager | |
| 20. |
The ____________ requires that paths including and leading up to the directories specified in yarn.nodemanager.local-dirs |
| A. | TaskController |
| B. | LinuxTaskController |
| C. | LinuxController |
| D. | None of the mentioned |
| Answer» B. LinuxTaskController | |
| 21. |
_________ is useful for iterating the properties when all deprecated properties for currently set properties need to be present. |
| A. | addResource |
| B. | setDeprecatedProperties |
| C. | addDefaultResource |
| D. | none of the mentioned |
| Answer» B. setDeprecatedProperties | |
| 22. |
Which of the following adds a configuration resource? |
| A. | addResource |
| B. | setDeprecatedProperties |
| C. | addDefaultResource |
| D. | None of the mentioned |
| Answer» A. addResource | |
| 23. |
Which of the following writes MapFiles as output? |
| A. | DBInpFormat |
| B. | MapFileOutputFormat |
| C. | SequenceFileAsBinaryOutputFormat |
| D. | None of the mentioned |
| Answer» B. MapFileOutputFormat | |
| 24. |
_________ is the base class for all implementations of InputFormat that use files as their data source. |
| A. | FileTextFormat |
| B. | FileInputFormat |
| C. | FileOutputFormat |
| D. | None of the mentioned |
| Answer» B. FileInputFormat | |
| 25. |
Which of the following methods adds a path or paths to the list of inputs? |
| A. | setInputPaths() |
| B. | addInputPath() |
| C. | setInput() |
| D. | none of the mentioned |
| Answer» B. addInputPath() | |
| 26. |
Hadoop has a library class, org.apache.hadoop.mapred.lib.FieldSelectionMapReduce, that effectively allows you to process text data like the Unix ______ utility. |
| A. | Copy |
| B. | Cut |
| C. | Paste |
| D. | Move |
| Answer» B. Cut | |
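FieldSelectionMapReduce selects fields out of delimited records in the manner of `cut -f`. A minimal Python sketch of the idea (the helper below is hypothetical, not the Hadoop class):

```python
def select_fields(record, field_indices, sep="\t"):
    """Keep only the requested fields of a separator-delimited record,
    in the spirit of `cut -f` and FieldSelectionMapReduce."""
    fields = record.split(sep)
    return sep.join(fields[i] for i in field_indices)

# Keep the 1st and 3rd tab-separated fields of each record:
print(select_fields("id\tname\temail", [0, 2]))
```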
| 27. |
HBase provides ___________ like capabilities on top of Hadoop and HDFS. |
| A. | TopTable |
| B. | BigTop |
| C. | Bigtable |
| D. | None of the mentioned |
| Answer» C. Bigtable | |
| 28. |
_______ refers to incremental costs with no major impact on solution design, performance and complexity. |
| A. | Scale-out |
| B. | Scale-down |
| C. | Scale-up |
| D. | None of the mentioned |
| Answer» A. Scale-out | |
| 29. |
__________ is a generalization of the facility provided by the MapReduce framework to collect data output by the Mapper or the Reducer. |
| A. | Partitioner |
| B. | OutputCollector |
| C. | Reporter |
| D. | All of the mentioned |
| Answer» B. OutputCollector | |
| 30. |
_________ is the default Partitioner for partitioning key space. |
| A. | HashPar |
| B. | Partitioner |
| C. | HashPartitioner |
| D. | None of the mentioned |
| Answer» C. HashPartitioner | |
| 31. |
_________ is the primary interface for a user to describe a MapReduce job to the Hadoop framework for execution. |
| A. | Map Parameters |
| B. | JobConf |
| C. | MemoryConf |
| D. | None of the mentioned |
| Answer» B. JobConf | |
| 32. |
The number of maps is usually driven by the total size of: |
| A. | inputs |
| B. | outputs |
| C. | tasks |
| D. | None of the mentioned |
| Answer» A. inputs | |
| 33. |
__________ maps input key/value pairs to a set of intermediate key/value pairs. |
| A. | Mapper |
| B. | Reducer |
| C. | Both Mapper and Reducer |
| D. | None of the mentioned |
| Answer» A. Mapper | |
| 34. |
_________ function is responsible for consolidating the results produced by each of the Map() functions/tasks. |
| A. | Reduce |
| B. | Map |
| C. | Reducer |
| D. | All of the mentioned |
| Answer» A. Reduce | |
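The consolidation step can be modeled in a few lines of Python: a word-count style reduce sums the values emitted for one key (a conceptual sketch, not Hadoop's Reducer API):

```python
def reduce_counts(key, values):
    """Consolidate the values emitted for one key, as the WordCount
    reduce() does by summing the per-word counts."""
    return key, sum(values)

print(reduce_counts("hello", [1, 1, 1]))  # ('hello', 3)
```

The framework invokes this once per distinct key with the grouped list of values produced by the shuffle and sort.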
| 35. |
The _____________ can also be used to distribute both jars and native libraries for use in the map and/or reduce tasks. |
| A. | DistributedLog |
| B. | DistributedCache |
| C. | DistributedJars |
| D. | None of the mentioned |
| Answer» B. DistributedCache | |
| 36. |
__________ is used to filter log files from the output directory listing. |
| A. | OutputLog |
| B. | OutputLogFilter |
| C. | DistributedLog |
| D. | None of the mentioned |
| Answer» B. OutputLogFilter | |
| 37. |
A ________ node acts as the Slave and is responsible for executing a Task assigned to it by the JobTracker. |
| A. | MapReduce |
| B. | Mapper |
| C. | TaskTracker |
| D. | JobTracker |
| Answer» C. TaskTracker | |
| 38. |
Jobs can enable task JVMs to be reused by specifying the job configuration: |
| A. | mapred.job.recycle.jvm.num.tasks |
| B. | mapissue.job.reuse.jvm.num.tasks |
| C. | mapred.job.reuse.jvm.num.tasks |
| D. | all of the mentioned |
| Answer» C. mapred.job.reuse.jvm.num.tasks | |
| 39. |
___________ part of the MapReduce is responsible for processing one or more chunks of data and producing the output results. |
| A. | Maptask |
| B. | Mapper |
| C. | Task execution |
| D. | All of the mentioned |
| Answer» A. Maptask | |
| 40. |
Although the Hadoop framework is implemented in Java, MapReduce applications need not be written in: |
| A. | Java |
| B. | C |
| C. | C# |
| D. | None of the mentioned |
| Answer» A. Java | |
| 41. |
________ is a utility which allows users to create and run jobs with any executables as the mapper and/or the reducer. |
| A. | Hadoop Strdata |
| B. | Hadoop Streaming |
| C. | Hadoop Stream |
| D. | None of the mentioned |
| Answer» B. Hadoop Streaming | |
| 42. |
The standard output (stdout) and error (stderr) streams of the task are read by the TaskTracker and logged to: |
| A. | ${HADOOP_LOG_DIR}/user |
| B. | ${HADOOP_LOG_DIR}/userlogs |
| C. | ${HADOOP_LOG_DIR}/logs |
| D. | None of the mentioned |
| Answer» B. ${HADOOP_LOG_DIR}/userlogs | |
| 43. |
__________ will clear the RMStateStore and is useful if past applications are no longer needed. |
| A. | -format-state |
| B. | -form-state-store |
| C. | -format-state-store |
| D. | none of the mentioned |
| Answer» C. -format-state-store | |
| 44. |
The __________ is responsible for allocating resources to the various running applications, subject to familiar constraints of capacities, queues, etc. |
| A. | Manager |
| B. | Master |
| C. | Scheduler |
| D. | None of the mentioned |
| Answer» C. Scheduler | |
| 45. |
The queue definitions and properties such as ________, ACLs can be changed at runtime. |
| A. | tolerant |
| B. | capacity |
| C. | speed |
| D. | all of the mentioned |
| Answer» B. capacity | |
| 46. |
The updated queue configuration should be a valid one, i.e. queue capacity at each level should be equal to: |
| A. | 0.5 |
| B. | 0.75 |
| C. | 1 |
| D. | 0 |
| Answer» C. 1 | |
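The validity rule, that the capacities of a queue's children must sum to the whole (100%) at each level, can be checked with a small Python sketch (the helper is illustrative, not part of the CapacityScheduler):

```python
def capacities_valid(child_capacities, tolerance=1e-6):
    """Check the CapacityScheduler invariant that the capacities of a
    queue's children (in percent) sum to 100 at each level."""
    return abs(sum(child_capacities) - 100.0) <= tolerance

print(capacities_valid([60.0, 30.0, 10.0]))  # True
print(capacities_valid([60.0, 50.0]))        # False
```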
| 47. |
Which of the following commands runs the ResourceManager admin client? |
| A. | proxyserver |
| B. | run |
| C. | admin |
| D. | rmadmin |
| Answer» D. rmadmin | |
| 48. |
Users can bundle their YARN code in a _________ file and execute it using the jar command. |
| A. | java |
| B. | jar |
| C. | C code |
| D. | xml |
| Answer» B. jar | |
| 49. |
YARN commands are invoked by the ________ script. |
| A. | hive |
| B. | bin |
| C. | hadoop |
| D. | home |
| Answer» B. bin | |
| 50. |
The CapacityScheduler has a predefined queue called: |
| A. | domain |
| B. | root |
| C. | rear |
| D. | all of the mentioned |
| Answer» B. root | |