Introduction to Big Data Hadoop and Ecosystem


IBM Open Platform (IOP) with Apache Hadoop is a collaborative platform that enables Big Data solutions to be developed on a common set of Apache Hadoop technologies. This course gives an in-depth introduction to the main components of the ODP core, namely Apache Hadoop (including HDFS, YARN, and MapReduce) and Apache Ambari, and covers the main open source components that are generally made available with the ODP core in a production Hadoop cluster. Participants work with the product through interactive exercises.

Topics covered in this course include:

  • What is Big Data and Data Analytics
  • Overview of HDP
  • Introduction to Apache Ambari
  • Hadoop and the Hadoop Distributed File System (HDFS)
  • MapReduce and YARN
  • Apache Spark
  • Overview of data file formats, HBase, Pig, Hive, R, and Python
  • ZooKeeper, Slider, and Knox
  • Flume and Sqoop
  • DataPlane Service
  • Stream Computing

Requirements

  • Basic knowledge of computers and internet technologies

Intended Audience

  • Big data engineers, data scientists, developers or programmers, and administrators