Data-Intensive Text Processing with MapReduce

Nonfiction, Computers, Advanced Computing, Natural Language Processing, Artificial Intelligence, Reference & Language, Language Arts, Linguistics
Author: Jimmy Lin, Chris Dyer
ISBN: 9781608453436
Publisher: Morgan & Claypool Publishers
Publication: October 10, 2010
Imprint: Morgan & Claypool Publishers
Language: English

Our world is being revolutionized by data-driven methods: access to large amounts of data has generated new insights and opened exciting new opportunities in commerce, science, and computing applications. Processing the enormous quantities of data necessary for these advances requires large clusters, making distributed computing paradigms more crucial than ever.

MapReduce is both a programming model for expressing distributed computations on massive datasets and an execution framework for large-scale data processing on clusters of commodity servers. The programming model provides an easy-to-understand abstraction for designing scalable algorithms, while the execution framework transparently handles many system-level details, ranging from scheduling to synchronization to fault tolerance.

This book focuses on MapReduce algorithm design, with an emphasis on text processing algorithms common in natural language processing, information retrieval, and machine learning. We introduce the notion of MapReduce design patterns, which represent general reusable solutions to commonly occurring problems across a variety of problem domains. The book is intended not only to help the reader "think in MapReduce" but also to discuss the limitations of the programming model.

Table of Contents: Introduction / MapReduce Basics / MapReduce Algorithm Design / Inverted Indexing for Text Retrieval / Graph Algorithms / EM Algorithms for Text Processing / Closing Remarks
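
As a rough illustration of the programming model described in the blurb, the sketch below counts words with a mapper and a reducer. It is not taken from the book: the names map_fn, reduce_fn, and run_mapreduce are hypothetical, and run_mapreduce is only a single-process stand-in for the execution framework, with no cluster, scheduling, or fault tolerance.

```python
from collections import defaultdict


def map_fn(doc_id, text):
    """Mapper: emit (word, 1) for every whitespace-separated token."""
    for word in text.lower().split():
        yield word, 1


def reduce_fn(word, counts):
    """Reducer: sum all partial counts collected for one word."""
    yield word, sum(counts)


def run_mapreduce(records, mapper, reducer):
    """Hypothetical in-memory driver; a real framework distributes these steps."""
    groups = defaultdict(list)
    # Map phase, followed by an in-memory "shuffle" that groups values by key.
    for key, value in records:
        for out_key, out_value in mapper(key, value):
            groups[out_key].append(out_value)
    # Reduce phase: one reducer call per distinct key, in sorted key order.
    results = {}
    for key in sorted(groups):
        for out_key, out_value in reducer(key, groups[key]):
            results[out_key] = out_value
    return results


# Two tiny made-up documents, keyed by an illustrative document id.
docs = [("d1", "to be or not to be"), ("d2", "to think in MapReduce")]
word_counts = run_mapreduce(docs, map_fn, reduce_fn)
print(word_counts)
```

The point of the split is that map_fn and reduce_fn contain only the problem-specific logic, while grouping and ordering live in the driver, which is the part a real framework would run across many machines.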


More books from Morgan & Claypool Publishers

Resource-Oriented Architecture Patterns for Webs of Data
Special and General Relativity
Numerical Solutions of Boundary Value Problems with Finite Difference Method
Kinematic Labs with Mobile Devices
Creating Materials with a Desired Refraction Coefficient
P2P Techniques for Decentralized Applications
Theory of Electromagnetic Pulses
Automatic Parallelization
An Introduction to the Physics of Nuclear Medicine
An Introduction to Planetary Nebulae
Computational Approaches in Physics
HCI Theory
Natural Language Processing for Social Media
Testing iOS Apps with HadoopUnit
Musical Sound, Instruments, and Equipment