By Padma Priya Chitturi

ISBN-10: 1785880101

ISBN-13: 9781785880100

Key Features

  • Use Apache Spark for data processing with these hands-on recipes
  • Implement end-to-end, large-scale data analysis better than ever before
  • Work with powerful libraries such as MLlib, SciPy, NumPy, and Pandas to gain insights from your data

Book Description

Spark has emerged as the most promising big data analytics engine for data science professionals. The true power and value of Apache Spark lies in its ability to execute data science tasks with speed and accuracy. Spark's selling point is that it combines ETL, batch analytics, real-time stream analysis, machine learning, graph processing, and visualizations. It lets you tackle the complexities that come with raw, unstructured data sets with ease.

This guide will get you comfortable and confident performing data science tasks with Spark. You will learn about implementations including distributed deep learning, numerical computing, and scalable machine learning. You will be shown effective solutions to problematic concepts in data science using Spark's data science libraries such as MLlib, Pandas, NumPy, SciPy, and more. These simple and efficient recipes will show you how to implement algorithms and optimize your work.

What you will learn

  • Explore the topics of data mining, text mining, Natural Language Processing, information retrieval, and machine learning.
  • Solve real-world analytical problems with large data sets.
  • Address data science challenges with analytical tools on a distributed system like Spark (apt for iterative algorithms), which offers in-memory processing and more flexibility for data analysis at scale.
  • Get hands-on experience with algorithms like classification, regression, and recommendation on real datasets using the Spark MLlib package.
  • Learn about numerical and scientific computing using NumPy and SciPy on Spark.
  • Use Predictive Model Markup Language (PMML) in Spark for statistical data mining models.

About the Author

Padma Priya Chitturi is Analytics Lead at Fractal Analytics Pvt Ltd and has over 5 years of experience in big data processing. Currently, she is part of capability development at Fractal and responsible for solution development for analytical problems across multiple business domains at large scale. Prior to this, she worked for an airlines product on a real-time processing platform serving one million user requests/sec at Amadeus Software Labs. She has worked on realizing large-scale deep networks (Jeffrey Dean's work at Google Brain) for image classification on the big data platform Spark. She works closely with big data technologies such as Spark, Storm, Cassandra, and Hadoop. She was an open source contributor to Apache Storm.

Table of Contents

  1. Big Data Analytics with Spark
  2. Tricky Statistics with Spark
  3. Data Analysis with Spark
  4. Clustering, Classification, and Regression
  5. Working with Spark MLlib
  6. NLP with Spark
  7. Working with Sparkling Water - H2O
  8. Data Visualization with Spark
  9. Deep Learning on Spark
  10. Working with SparkR

Read Online or Download Apache Spark for Data Science Cookbook PDF

Best data modeling & design books

New PDF release: Writing and Querying MapReduce Views in CouchDB: Tools for

In order to use CouchDB to support real-world applications, you will need to create MapReduce views that let you query this document-oriented database for meaningful data. With this short and concise book, you will create several MapReduce views to help you query and aggregate data in CouchDB's large, distributed datasets.

Download e-book for iPad: Data Modeling for MongoDB: Building Well-Designed and by Steve Hoberman

Congratulations! You completed the MongoDB application within the given tight timeframe, and there is a party to celebrate your application's release into production. Although people are congratulating you at the celebration, you feel some uneasiness inside. Completing the project on time required making a lot of assumptions about the data, such as what terms meant and how calculations are derived.

Get Machine Learning with R Cookbook - 110 Recipes for Building PDF

Key Features: Apply R to simplify predictive modeling with short and simple code. Use machine learning to solve problems ranging from small to big data. Build a training and testing dataset from the churn dataset, applying different classification methods. Book Description: The R language is a powerful open source functional programming language.

Read e-book online Learning Predictive Analytics with R PDF

Familiarize yourself with key data visualization and predictive analytic skills using R. About This Book: Acquire predictive analytic skills using various tools of R. Make predictions about future events by discovering valuable information from data using R. Comprehensible guidelines that focus on predictive model design with real-world data. Who This Book Is For: If you are a statistician, chief information officer, data scientist, ML engineer, ML practitioner, quantitative analyst, or student of machine learning, this is the book for you.

Apache Spark for Data Science Cookbook by Padma Priya Chitturi

by Ronald

Rated 4.03 of 5 – based on 33 votes