Today, we are surrounded by data, and existing tools are increasingly inadequate for processing such large data sets. Hadoop, and large-scale distributed data processing in general, is rapidly becoming an important skill for many programmers. Hadoop is an open-source framework for writing and running distributed applications that process large amounts of data. Distributed computing is a wide and varied field, but the key distinctions of Hadoop are that it is accessible, robust, scalable, and simple. This course introduces Hadoop in terms of distributed systems and data processing. You will learn the basics of the distributed file system (DFS) and the Hadoop architecture, and get an overview of the MapReduce programming model through simple word-counting examples.
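The word-counting example mentioned above can be sketched in plain Python to show the shape of the MapReduce model. This is an illustrative simulation of the map, shuffle, and reduce phases, not Hadoop API code; the function names (`map_phase`, `shuffle`, `reduce_phase`) are made up for this sketch.

```python
from collections import defaultdict

def map_phase(line):
    # Map: emit a (word, 1) pair for every word in the input line.
    return [(word, 1) for word in line.split()]

def shuffle(pairs):
    # Shuffle: group all values by key, as the Hadoop framework
    # does between the map and reduce phases.
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped

def reduce_phase(key, values):
    # Reduce: sum the counts emitted for each word.
    return (key, sum(values))

lines = ["the quick brown fox", "the lazy dog", "the fox"]
mapped = [pair for line in lines for pair in map_phase(line)]
counts = dict(reduce_phase(k, v) for k, v in shuffle(mapped).items())
print(counts)  # {'the': 3, 'quick': 1, 'brown': 1, 'fox': 2, 'lazy': 1, 'dog': 1}
```

In real Hadoop, the map and reduce functions run in parallel across many machines, and the framework handles the shuffle, fault tolerance, and data locality.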

Skills covered

  • MapReduce
  • Hadoop and YARN architecture

Course Syllabus

Getting Started : Hadoop

  • Introduction to big data
  • What is Hadoop?
  • Hadoop architecture
  • HDFS basics
  • Demo: HDFS basics
  • Introduction to Hadoop YARN
  • Introduction to MapReduce
  • The MapReduce approach, with a demonstration
  • HDFS in Hadoop 1.x vs. Hadoop 2.x (YARN)
  • HDFS advanced concepts
  • HDFS configuration files
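The HDFS topics above revolve around two core ideas: files are split into fixed-size blocks, and each block is replicated across several datanodes. The following is a minimal sketch of that idea in Python, assuming a toy block size and a round-robin placement policy; real HDFS uses a 128 MB default block size (in Hadoop 2.x) and rack-aware replica placement.

```python
def split_into_blocks(data, block_size):
    # HDFS splits each file into fixed-size blocks; the last
    # block may be smaller than block_size.
    return [data[i:i + block_size] for i in range(0, len(data), block_size)]

def place_replicas(blocks, datanodes, replication=3):
    # Each block is stored on `replication` distinct datanodes
    # (the HDFS default replication factor is 3). Round-robin
    # placement here; real HDFS placement is rack-aware.
    placement = {}
    for i in range(len(blocks)):
        placement[i] = [datanodes[(i + r) % len(datanodes)]
                        for r in range(replication)]
    return placement

file_bytes = b"x" * 300          # a 300-byte "file"
blocks = split_into_blocks(file_bytes, block_size=128)
print(len(blocks))               # 3 blocks: 128 + 128 + 44 bytes
placement = place_replicas(blocks, ["dn1", "dn2", "dn3", "dn4"])
print(placement[0])              # ['dn1', 'dn2', 'dn3']
```

In Hadoop, the NameNode keeps this block-to-datanode mapping as metadata, while the datanodes store the actual block data.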

