By Padma Priya Chitturi
- Use Apache Spark for data processing with these hands-on recipes
- Implement end-to-end, large-scale data analysis better than ever before
- Work with powerful libraries such as MLlib, SciPy, NumPy, and Pandas to gain insights from your data
Spark has emerged as the most promising big data analytics engine for data science professionals. The true power and value of Apache Spark lies in its ability to execute data science tasks with speed and accuracy. Spark's selling point is that it combines ETL, batch analytics, real-time stream analysis, machine learning, graph processing, and visualizations. It lets you tackle the complexities that come with raw, unstructured data sets with ease.
This guide will get you comfortable and confident performing data science tasks with Spark. You will learn about implementations including distributed deep learning, numerical computing, and scalable machine learning. You will be shown effective solutions to problematic concepts in data science using Spark's data science libraries such as MLlib, Pandas, NumPy, SciPy, and more. These simple and efficient recipes will show you how to implement algorithms and optimize your work.
What you will learn
- Explore the topics of data mining, text mining, Natural Language Processing, information retrieval, and machine learning.
- Solve real-world analytical problems with large data sets.
- Address data science challenges with analytical tools on a distributed system like Spark (apt for iterative algorithms), which offers in-memory processing and more flexibility for data analysis at scale.
- Get hands-on experience with algorithms like classification, regression, and recommendation on real datasets using the Spark MLlib package.
- Learn about numerical and scientific computing using NumPy and SciPy on Spark.
- Use Predictive Model Markup Language (PMML) in Spark for statistical data mining models.
About the Author
Padma Priya Chitturi is Analytics Lead at Fractal Analytics Pvt Ltd and has over five years of experience in big data processing. Currently, she is part of capability development at Fractal and is responsible for solution development for analytical problems across multiple business domains at large scale. Prior to this, she worked for an airline product on a real-time processing platform serving one million user requests/sec at Amadeus Software Labs. She has worked on realizing large-scale deep networks (Jeffrey Dean's work at Google Brain) for image classification on the big data platform Spark. She works closely with big data technologies such as Spark, Storm, Cassandra, and Hadoop. She was an open source contributor to Apache Storm.
Table of Contents
- Big Data Analytics with Spark
- Tricky Statistics with Spark
- Data Analysis with Spark
- Clustering, Classification, and Regression
- Working with Spark MLlib
- NLP with Spark
- Working with Sparkling Water - H2O
- Data Visualization with Spark
- Deep Learning on Spark
- Working with SparkR
Best data modeling & design books
To use CouchDB to support real-world applications, you will need to create MapReduce views that let you query this document-oriented database for meaningful data. With this short and concise book, you will create a variety of MapReduce views to help you query and aggregate data in CouchDB's large, distributed datasets.
Congratulations! You completed the MongoDB application within the given tight timeframe, and there is a party to celebrate your application's release into production. Although people are congratulating you at the celebration, you feel some uneasiness inside. Completing the project on time required making a lot of assumptions about the data, such as what terms meant and how calculations are derived.
Key Features: Apply R to simplify predictive modeling with short and simple code. Use machine learning to solve problems ranging from small to big data. Build a training and testing dataset from the churn dataset, applying different classification methods. Book Description: The R language is a powerful open source functional programming language.
Familiarize yourself with key data visualization and predictive analytic skills using R. About This Book: Acquire predictive analytic skills using various tools of R. Make predictions about future events by discovering valuable information from data using R. Comprehensible guidelines that focus on predictive model design with real-world data. Who This Book Is For: If you are a statistician, chief information officer, data scientist, ML engineer, ML practitioner, quantitative analyst, or student of machine learning, this is the book for you.
- Metamodeling for Method Engineering (Information Systems)
- An Introduction to Programming with IDL: Interactive Data Language
- Data Science Essentials in Python: Collect - Organize - Explore - Predict - Value (The Pragmatic Programmers)
- Advanced Machine Learning with Python
- Sharing Data and Models in Software Engineering
- Johdanto Tietovarastointiin (Finnish Edition)
Apache Spark for Data Science Cookbook by Padma Priya Chitturi