Hadoop or Big Data Interview Questions and Answers (Part 1)

We hosted a webinar on November 11th, 2017, answering several Hadoop and Big Data interview questions that were asked in real interviews. A couple of weeks before the webinar, we asked our wonderful Hadoop In Real World community to share interesting or challenging questions they had been asked in real interviews, and we received several great ones.

We had so much fun answering those questions in the webinar. The participants were super engaged, and we even answered additional questions that participants asked live during the webinar.

We host webinars like these quite often. Sign up below to get invitations to join one of our webinars.

Interesting & Challenging Questions

  1. Explain your recent project, roles & responsibilities?
  2. Explain MapReduce flow in detail
  3. Assume a 10 GB dataset, how many mappers and reducers will be created?
  4. How does data get transferred from Mapper to Reducer?
  5. If you have more than one reducer, how does data get to the correct reducer?
  6. When to use Hive and when to use Spark?
  7. Does Hive support ACID & CRUD?
  8. When to use Partitions & when to use Buckets in Hive?
  9. What is Map Join & SMB Join in Hive?
  10. When to use columnar format and when to use row format?
  11. What is the difference between RDD & DataFrame?
  12. What is Oozie?
  13. What is Flume?
  14. What is Sentry?
  15. What is Kafka?
  16. How do you promote code to production?
  17. Why an odd number of nodes (1, 3, 5, ...)?
  18. Is it possible to make a Mapper multithreaded?
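
To give a flavor of questions 3 and 5, here is a minimal Python sketch. It is not taken from the webinar, and it rests on some assumptions: the 10 GB input is a single splittable file, the split size equals a 128 MB HDFS block, and the job uses a hash-based partitioner that masks the sign bit of the key's hash and takes the remainder by the number of reducers, which is the idea behind Hadoop's default HashPartitioner. The function names (`estimate_mappers`, `java_string_hash`, `hash_partition`) are hypothetical helpers made up for this illustration.

```python
import math

def estimate_mappers(input_bytes: int, split_bytes: int = 128 * 1024 * 1024) -> int:
    """Roughly one map task per input split (assumes one splittable file)."""
    return math.ceil(input_bytes / split_bytes)

def java_string_hash(key: str) -> int:
    """Mimic Java's String.hashCode() as a signed 32-bit int.

    Assumption: keys contain only characters in the Basic Multilingual Plane,
    so Python code points match Java's UTF-16 code units.
    """
    h = 0
    for ch in key:
        h = (31 * h + ord(ch)) & 0xFFFFFFFF
    return h - 0x100000000 if h >= 0x80000000 else h

def hash_partition(key: str, num_reducers: int) -> int:
    """Mask the sign bit, then take the remainder by the reducer count."""
    return (java_string_hash(key) & 0x7FFFFFFF) % num_reducers

# 10 GB of input with 128 MB splits.
mappers = estimate_mappers(10 * 1024**3)

# Every occurrence of the same key lands on the same reducer index.
reducer_for_hello = hash_partition("hello", 4)
```

Under these assumptions the mapper count comes from the input size, while the reducer count does not: it is whatever the job is configured to use. With many small files, a non-splittable compression codec, or a custom partitioner, the numbers and routing would differ.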

We are certainly planning to host another session to cover more interview questions. So, if you have an interesting interview question that you would like us to answer, please email that question(s) to

Here is the full recording of the webinar. Enjoy!

Hadoop Team

We are a group of Senior Big Data Consultants who are passionate about Hadoop, Spark and Big Data technologies. Our collective experience spans finance, retail, social media and gaming. We have worked with clusters ranging from 100 nodes to over 1,000 nodes.
