CCA-500 Dumps

CCA-500 Free Practice Test

Cloudera CCA-500: Cloudera Certified Administrator for Apache Hadoop (CCAH)

QUESTION 6

You need to analyze 60,000,000 images stored in JPEG format, each of which is approximately 25 KB. Because your Hadoop cluster isn’t optimized for storing and processing many small files, you decide to take the following actions:
1. Group the individual images into a set of larger files
2. Use the set of larger files as input for a MapReduce job that processes them directly with Python using Hadoop Streaming.
Which data serialization system gives the flexibility to do this?
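To make the scale of the small-files problem concrete, the arithmetic can be sketched as follows. The 128 MB HDFS block size and the reading of 1 KB as 1024 bytes are assumptions for illustration; the question states neither.

```python
# Rough sizing of the image set from the question.
# Assumptions (not given in the question): 1 KB = 1024 bytes,
# and an HDFS block size of 128 MB.
NUM_IMAGES = 60_000_000
IMAGE_SIZE_BYTES = 25 * 1024
BLOCK_SIZE_BYTES = 128 * 1024 * 1024


def ceil_div(a, b):
    """Integer ceiling division, avoiding floating-point rounding."""
    return -(-a // b)


total_bytes = NUM_IMAGES * IMAGE_SIZE_BYTES

# Stored individually, every file needs at least one block of its own,
# so the NameNode tracks one file entry and one block per image.
blocks_as_small_files = NUM_IMAGES

# Packed into large container files, the same data needs far fewer blocks.
blocks_when_packed = ceil_div(total_bytes, BLOCK_SIZE_BYTES)

print(f"total data:            {total_bytes:,} bytes")
print(f"blocks as small files: {blocks_as_small_files:,}")
print(f"blocks when packed:    {blocks_when_packed:,}")
```

Under these assumptions the data totals about 1.5 TB, which is modest; the strain comes from the number of files and blocks the NameNode has to track, not from the volume of data.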

Correct Answer: E
Sequence files can be block-compressed and provide direct serialization and deserialization of several arbitrary data types (not just text). Sequence files can be generated as the output of other MapReduce tasks and are an efficient intermediate representation for data that is passing from one MapReduce job to another.
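A minimal sketch of the Python side of step 2 is shown below. It assumes the streaming job is configured so that each packed record reaches the mapper as one text line of the form `name<TAB>payload`, and that the image bytes were base64-encoded when the images were grouped. Both are illustrative assumptions, not something the question specifies, and the function names are hypothetical.

```python
import base64
import sys


def map_records(lines):
    """Yield (image_name, size_in_bytes) for each 'name<TAB>base64-payload' line.

    The record layout is an assumption made for this sketch.
    """
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue  # skip blank lines
        name, _, payload = line.partition("\t")
        image_bytes = base64.b64decode(payload)
        # Real image analysis would go here; the sketch only reports the size.
        yield name, len(image_bytes)


def run(stdin, stdout):
    """Read records from stdin and write 'name<TAB>size' lines to stdout."""
    for name, size in map_records(stdin):
        stdout.write(f"{name}\t{size}\n")


# In an actual mapper script, the entry point would be:
#     run(sys.stdin, sys.stdout)
```

Hadoop Streaming passes records to the mapper on standard input and reads tab-separated key/value lines from standard output, which is why the sketch is written around plain line-oriented I/O.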

QUESTION 7

In CDH4 and later, which file contains a serialized form of all the directory and file inodes in the filesystem, giving the NameNode a persistent checkpoint of the filesystem metadata?

Correct Answer: C
Reference: http://mikepluta.com/tag/namenode/

QUESTION 8

For each YARN job, the Hadoop framework generates task log files. Where are Hadoop task log files stored?

Correct Answer: D

QUESTION 9

You are migrating a cluster from MapReduce version 1 (MRv1) to MapReduce version 2 (MRv2) on YARN. You want to maintain your MRv1 TaskTracker slot capacities when you migrate. What should you do?

Correct Answer: D

QUESTION 10

What does CDH packaging do on install to facilitate Kerberos security setup?

Correct Answer: B