Free Practice Questions for Cloudera Certified Administrator for Apache Hadoop (CCAH) (CCA-500)

QUESTION 6

You need to analyze 60,000,000 images stored in JPEG format, each of which is approximately 25 KB. Because your Hadoop cluster isn’t optimized for storing and processing many small files, you decide to take the following actions:
1. Group the individual images into a set of larger files.
2. Use the set of larger files as input for a MapReduce job that processes them directly with Python using Hadoop streaming.
Which data serialization system gives the flexibility to do this?

A. CSV
B. XML
C. HTML
D. Avro
E. SequenceFiles
F. JSON

Correct Answer: E
SequenceFiles are a binary key/value format that supports record-level and block-level compression and provides direct serialization and deserialization of arbitrary data types, not just text. They can be generated as the output of other MapReduce tasks and are an efficient intermediate representation for data passing from one MapReduce job to another.
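
To see why grouping the images helps, here is a rough back-of-the-envelope calculation. The 128 MiB block size and the 1 KB = 1,024 bytes convention are assumptions, not values given in the question, and the overhead of the container format itself is ignored.

```python
NUM_IMAGES = 60_000_000                 # from the question
IMAGE_SIZE_BYTES = 25 * 1024            # ~25 KB each; assumes 1 KB = 1,024 bytes
BLOCK_SIZE_BYTES = 128 * 1024 * 1024    # assumed HDFS block size (128 MiB)

total_bytes = NUM_IMAGES * IMAGE_SIZE_BYTES

# Stored individually, each image is its own file with its own block,
# so the NameNode tracks one file entry and one block per image.
blocks_as_small_files = NUM_IMAGES

# Packed into large container files, the same bytes fill far fewer blocks.
# Integer ceiling division avoids floating-point rounding.
blocks_when_packed = -(-total_bytes // BLOCK_SIZE_BYTES)

print(f"total data:            {total_bytes / 1024**4:.2f} TiB")
print(f"blocks as small files: {blocks_as_small_files:,}")
print(f"blocks when packed:    {blocks_when_packed:,}")
```

Under these assumptions the data volume is unchanged, but the number of blocks (and files) the NameNode has to track drops by several orders of magnitude.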

QUESTION 7

In CDH4 and later, which file contains a serialized form of all the directory and file inodes in the filesystem, giving the NameNode a persistent checkpoint of the filesystem metadata?

A. fstime
B. VERSION
C. Fsimage_N (where N reflects transactions up to transaction ID N)
D. Edits_N-M (where N-M represents transactions between transaction ID N and transaction ID M)

Correct Answer: C
Reference: http://mikepluta.com/tag/namenode/
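
A minimal sketch of how the naming scheme in options C and D relates a checkpoint to the edit segments that follow it. The directory listing is made up for illustration and the helper names are hypothetical; this is not how the NameNode itself is implemented.

```python
import re

# Patterns follow the naming scheme from options C and D (lowercase on disk).
FSIMAGE_RE = re.compile(r"^fsimage_(\d+)$")
EDITS_RE = re.compile(r"^edits_(\d+)-(\d+)$")

def latest_checkpoint_txid(names):
    """Return the highest transaction ID covered by any fsimage_N file, or None."""
    txids = [int(m.group(1)) for m in map(FSIMAGE_RE.match, names) if m]
    return max(txids) if txids else None

def segments_after(names, txid):
    """Return (start, end) pairs of finalized edit segments that begin after txid."""
    segments = []
    for name in names:
        m = EDITS_RE.match(name)
        if m and int(m.group(1)) > txid:
            segments.append((int(m.group(1)), int(m.group(2))))
    return sorted(segments)

# Hypothetical contents of a NameNode metadata directory.
listing = [
    "VERSION",
    "fsimage_0000000000000000019",
    "fsimage_0000000000000000019.md5",
    "edits_0000000000000000001-0000000000000000019",
    "edits_0000000000000000020-0000000000000000042",
    "edits_inprogress_0000000000000000043",
]

checkpoint = latest_checkpoint_txid(listing)
to_replay = segments_after(listing, checkpoint)
print(checkpoint, to_replay)
```

The checkpoint file captures the namespace up to transaction N, so only edit segments starting after N would still need to be applied on top of it.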

QUESTION 8

For each YARN job, the Hadoop framework generates task log files. Where are Hadoop task log files stored?

A. Cached by the NodeManager managing the job containers, then written to a log directory on the NameNode
B. Cached in the YARN container running the task, then copied into HDFS on job completion
C. In HDFS, in the directory of the user who generates the job
D. On the local disk of the slave node running the task

Correct Answer: D

QUESTION 9

You are migrating a cluster from MapReduce version 1 (MRv1) to MapReduce version 2 (MRv2) on YARN. You want to maintain your MRv1 TaskTracker slot capacities when you migrate. What should you do?

A. Configure yarn.applicationmaster.resource.memory-mb and yarn.applicationmaster.resource.cpu-vcores so that ApplicationMaster container allocations match the capacity you require.
B. You don’t need to configure or balance these properties in YARN as YARN dynamically balances resource management capabilities on your cluster
C. Configure mapred.tasktracker.map.tasks.maximum and mapred.tasktracker.reduce.tasks.maximum in yarn-site.xml to match your cluster’s capacity set by the yarn-scheduler.minimum-allocation
D. Configure yarn.nodemanager.resource.memory-mb and yarn.nodemanager.resource.cpu-vcores to match the capacity you require under YARN for each NodeManager

Correct Answer: D
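
One way to arrive at values for the two properties in option D is to total what the MRv1 slots on a node were sized for. The sketch below assumes each slot maps to one vcore and that per-slot memory figures are known; the slot counts and memory sizes are hypothetical, and the helper is illustrative rather than a Cloudera tool.

```python
def nodemanager_capacity(map_slots, reduce_slots, map_slot_mb, reduce_slot_mb):
    """Estimate per-NodeManager capacity from MRv1 slot settings.

    Assumes one vcore per slot and fixed memory per slot type.
    """
    memory_mb = map_slots * map_slot_mb + reduce_slots * reduce_slot_mb
    vcores = map_slots + reduce_slots
    return {
        "yarn.nodemanager.resource.memory-mb": memory_mb,
        "yarn.nodemanager.resource.cpu-vcores": vcores,
    }

# Hypothetical node: 8 map slots at 1024 MB and 4 reduce slots at 2048 MB.
capacity = nodemanager_capacity(
    map_slots=8, reduce_slots=4, map_slot_mb=1024, reduce_slot_mb=2048
)
for name, value in capacity.items():
    print(f"{name} = {value}")
```

The resulting numbers would then be set in yarn-site.xml on each NodeManager, adjusted for memory reserved for the operating system and other daemons.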

QUESTION 10

What does CDH packaging do on install to facilitate Kerberos security setup?

A. Automatically configures permissions for log files at $MAPRED_LOG_DIR/userlogs
B. Creates users for hdfs and mapreduce to facilitate role assignment
C. Creates directories for temp, hdfs, and mapreduce with the correct permissions
D. Creates a set of pre-configured Kerberos keytab files and their permissions
E. Creates and configures your kdc with default cluster values

Correct Answer: B
