Happy Learning: August 2014

Hi Reader,

Below are the questions I faced in a recent interview. I hope they help.

Hadoop Framework:

What is the difference between an existing file system and HDFS? Why do we need HDFS?
What are the different modes in which Hadoop can run?
What are the configuration files and their properties?
Where do we set the NameNode, DataNode, and TaskTracker addresses/locations?
How many instances of the JobTracker run on a cluster?
What is the difference between a job and a task?
How does the JobTracker manage jobs?
How many TaskTrackers exist on a DataNode?
What happens if the JobTracker fails?
What happens if the NameNode fails?
How does the Secondary NameNode get the data present in the NameNode?

Map Reduce:

What is the flow of the word count example / a MapReduce job?
What are the phases of a reducer?
What is speculative execution?
If two instances of the same mapper complete at the same time, what factors does the JobTracker consider when selecting the completed task?
What is a combiner?
Where do we need to use a combiner?
Which class/interface is used to write a combiner?
What is partitioning?
How do you implement a custom partitioning method?
What is a map-side/reduce-side join? When do we go for each? What are the advantages and disadvantages?
What is the distributed cache?
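
For the word count flow, combiner, and partitioning questions above, a small sketch may help when rehearsing an answer. It is plain Python that only imitates the stages of a MapReduce job in memory; it is not Hadoop code. The function names, the one-mapper-per-line assumption, and the byte-sum partitioner are all made up for illustration.

```python
from collections import defaultdict


def map_phase(line):
    """Mapper: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def combine(pairs):
    """Combiner: pre-aggregate one mapper's output locally to shrink the shuffle."""
    local = defaultdict(int)
    for word, count in pairs:
        local[word] += count
    return list(local.items())


def partition(key, num_reducers):
    """Partitioner: choose a reducer for a key (illustrative stand-in for a hash partitioner)."""
    return sum(key.encode("utf-8")) % num_reducers


def shuffle(mapper_outputs, num_reducers):
    """Shuffle: route each pair to its reducer and group the values by key."""
    buckets = [defaultdict(list) for _ in range(num_reducers)]
    for pairs in mapper_outputs:
        for word, count in pairs:
            buckets[partition(word, num_reducers)][word].append(count)
    return buckets


def reduce_phase(bucket):
    """Reducer: sum the grouped counts for each key, visiting keys in sorted order."""
    return {word: sum(counts) for word, counts in sorted(bucket.items())}


def word_count(lines, num_reducers=2):
    """Run the whole flow, assuming each input line is handled by its own mapper."""
    mapper_outputs = [combine(map_phase(line)) for line in lines]
    buckets = shuffle(mapper_outputs, num_reducers)
    result = {}
    for bucket in buckets:
        result.update(reduce_phase(bucket))
    return result


counts = word_count(["the quick brown fox", "the lazy dog the end"])
```

The combiner here can reuse the reducer's logic only because addition is associative and commutative, which is the usual condition to mention when asked where a combiner is safe to use.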

Hive:

When is Hive used?
How do you change the location of a schema while creating it?
What are the different properties that can be set while defining a schema?
What are the types of partitions?
Explain a scenario where we need a partition.
What is bucketing?
What are the properties that can be set in a Hive query?
What does an explain plan contain?
How is the number of mappers and reducers decided for a Hive query? Give an example.
How many MapReduce jobs will be created for a join query on 3 tables by the same key?
What properties should be set for query optimization?
What is Avro?
Write a query to find the details of the student ranked second by marks.
Write a query to filter out all the duplicate records.
Which methods need to be implemented for a UDF?
Which commands need to be run before using a UDF in a Hive query?
Explain the factors to be considered in schema/table design.
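
For the two "write a query" questions above, here is one possible shape of an answer. It runs against an in-memory SQLite database from Python purely as a stand-in for Hive, so the syntax may need adjusting for your Hive version. The students table, its columns, and the sample rows are hypothetical, and the duplicates question is read here as "find the records that occur more than once".

```python
import sqlite3

# In-memory SQLite database used only as a stand-in for a Hive table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE students (id INTEGER, name TEXT, marks INTEGER)")
conn.executemany(
    "INSERT INTO students VALUES (?, ?, ?)",
    [(1, "Asha", 91), (2, "Ravi", 85), (3, "Meena", 78), (2, "Ravi", 85)],
)

# Second-highest marks: the highest value strictly below the overall maximum.
SECOND_HIGHEST = """
SELECT DISTINCT id, name, marks
FROM students
WHERE marks = (SELECT MAX(marks) FROM students
               WHERE marks < (SELECT MAX(marks) FROM students))
"""

# Duplicates: group on every column and keep groups with more than one row.
DUPLICATES = """
SELECT id, name, marks, COUNT(*) AS copies
FROM students
GROUP BY id, name, marks
HAVING COUNT(*) > 1
"""

second = conn.execute(SECOND_HIGHEST).fetchall()
duplicates = conn.execute(DUPLICATES).fetchall()
```

In Hive itself, a windowing function such as DENSE_RANK() over marks in descending order is a common alternative for the first query.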

PIG:

Other:

Please add more questions in the comments if you have any. All the very best :)

Happy Learning