Data Algorithms

Recipes for Scaling Up with Hadoop and Spark

Nonfiction, Computers, Database Management, Data Processing
Author: Mahmoud Parsian
ISBN: 9781491906132
Publisher: O'Reilly Media
Publication: July 13, 2015
Imprint: O'Reilly Media
Language: English

If you are ready to dive into the MapReduce framework for processing large datasets, this practical book takes you step by step through the algorithms and tools you need to build distributed MapReduce applications with Apache Hadoop or Apache Spark. Each chapter provides a recipe for solving a massive computational problem, such as building a recommendation system. You’ll learn how to implement the appropriate MapReduce solution with code that you can use in your projects.

Dr. Mahmoud Parsian covers basic design patterns, optimization techniques, and data mining and machine learning solutions for problems in bioinformatics, genomics, statistics, and social network analysis. This book also includes an overview of MapReduce, Hadoop, and Spark.

Topics include:

  • Market basket analysis for a large set of transactions
  • Data mining algorithms (K-means, KNN, and Naive Bayes)
  • Using huge genomic data to sequence DNA and RNA
  • Naive Bayes theorem and Markov chains for data and market prediction
  • Recommendation algorithms and pairwise document similarity
  • Linear regression, Cox regression, and Pearson correlation
  • Allelic frequency and mining DNA
  • Social network analysis (recommendation systems, counting triangles, sentiment analysis)

More books from O'Reilly Media

  • Building Internet Firewalls
  • The Ruby Programming Language
  • DNS und Bind im IPv6 kurz & gut
  • Building Maintainable Software, C# Edition
  • CouchDB: The Definitive Guide
  • Beginning Perl for Bioinformatics
  • Visualizing Streaming Data
  • Photoshop Elements 8 for Windows: The Missing Manual
  • Hot Seat
  • C in a Nutshell
  • Building Hypermedia APIs with HTML5 and Node
  • NetBeans: The Definitive Guide
  • QuickBase: The Missing Manual
  • JUNOS High Availability
  • Digital Audio Essentials