Our ability to collect and analyze data is advancing rapidly. We collect vast quantities of data every second and are only beginning to understand the impact it can have on our businesses. All this data is an ever-expanding mountain of gold, waiting to be mined and transformed into new capabilities that help us become more adept at predicting the future. Fundamentally, this capability transforms organizations from reactive environments, managed with static and aging data, into automated environments that learn continuously in real time. Today’s analytical capabilities don’t stop at the physical or virtual boundaries of an organization. Parties across the entire business model and value chain, including customers, suppliers and partners, do and should share real-time data. This allows companies to extend or acquire knowledge and feedback, significantly reducing risk, waste and costs while driving performance and growth. In this course you will learn what real-time data is and how to analyze it with Spark and Spark Streaming to get useful insights.

Skills covered

  • PySpark
  • Spark
  • Spark Streaming
  • Real-time analytics

Course Syllabus

Spark Basics and Streaming

  • Introduction to Spark
  • Spark vs Hadoop
  • Spark architecture
  • RDDs
  • Spark terminology
  • Hands-on PySpark
  • Spark MLlib
  • Moving from RDDs to the DataFrame API
  • Clustering with PySpark
  • Music data case studies
  • Overview of real-time analytics and Spark Streaming
  • Spark Streaming architecture
  • Understanding real-time analytics with a Twitter example
  • Ad tech case study
