Large Scale Machine Learning with Spark

Nonfiction, Computers, Advanced Computing, Engineering, Computer Vision, Theory, Database Management
Author: Md. Rezaul Karim, Md. Mahedi Kaysar
ISBN: 9781785883712
Publisher: Packt Publishing
Publication: October 27, 2016
Imprint: Packt Publishing
Language: English

Discover everything you need to build robust machine learning applications with Spark 2.0

About This Book

  • Get the most up-to-date book on the market that focuses on design, engineering, and scalable solutions in machine learning with Spark 2.0.0
  • Use Spark's machine learning library in a big data environment
  • Learn how to develop high-value applications at scale with ease, and how to develop a personalized design

Who This Book Is For

This book is for data science engineers and scientists who work with large and complex data sets. You should be familiar with the basics of machine learning concepts, statistics, and computational mathematics. Knowledge of Scala and Java is advisable.

What You Will Learn

  • Gain a solid theoretical understanding of ML algorithms
  • Configure Spark on cluster and cloud infrastructure to develop applications using Scala, Java, Python, and R
  • Scale up ML applications on large clusters or cloud infrastructure
  • Use Spark ML and MLlib to develop ML pipelines for recommendation systems, classification, regression, clustering, sentiment analysis, and dimensionality reduction
  • Handle large texts for developing ML applications, with a strong focus on feature engineering
  • Use Spark Streaming to develop ML applications for real-time streaming
  • Tune ML models with cross-validation, hyperparameter tuning, and train-validation split
  • Enhance ML models to make them adaptable for new data in dynamic and incremental environments

In Detail

Data processing, implementing related algorithms, tuning, scaling up and finally deploying are some crucial steps in the process of optimising any application.

Spark can handle large-scale batch and streaming data, deciding when to cache data in memory and processing it up to 100 times faster than Hadoop-based MapReduce. This means predictive analytics can be applied to both streaming and batch data to develop complete machine learning (ML) applications much more quickly, making Spark an ideal candidate for large data-intensive applications.

This book focuses on design, engineering, and scalable solutions using ML with Spark. First, you will learn how to install Spark with all the new features of the Spark 2.0 release. Moving on, you'll explore important concepts such as advanced feature engineering with RDDs and Datasets. After studying how to develop and deploy applications, you will see how to use external libraries with Spark.

In summary, you will be able to develop complete and personalised ML applications, from data collection, model building, tuning, and scaling up to deploying on a cluster or the cloud.

Style and approach

This book takes a practical approach where all the topics explained are demonstrated with the help of real-world use cases.

More books from Packt Publishing

  • Reactive Programming in Kotlin
  • Oracle Business Intelligence 11g R1 Cookbook
  • Machine Learning with TensorFlow 1.x
  • Google Cloud Platform for Developers
  • HTML5 Games Development by Example: Beginners Guide
  • Mastering JavaScript High Performance
  • OpenLayers 3: Beginner's Guide
  • Mastering Microservices with Java 9 - Second Edition
  • Instant Slic3r
  • Learning Continuous Integration with TeamCity
  • Implementing Oracle API Platform Cloud Service
  • Salesforce Essentials for Administrators
  • INSTANT Windows PowerShell
  • Mastering phpMyAdmin 3.1 for Effective MySQL Management
  • Learn ARCore - Fundamentals of Google ARCore