Learning Cascading

Michael Covert, Victoria Loewengart

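  • Publisher: Packt Publishing
  • Publication date: 2015-05-30
  • List price: $1,860
  • VIP price: $1,767 (5% off)
  • Language: English
  • Pages: 281
  • Binding: Paperback
  • ISBN: 1785288911
  • ISBN-13: 9781785288913
  • Ordered from the supplier once you place your order (approx. 3-4 weeks)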

Product Description

Efficiently build reliable, robust, and high-performance big data applications using the Cascading application development framework

About This Book

  • Understand how Cascading fits into the big data landscape and hides the complexity of MapReduce to enable the development of streamlined, maintainable, and concise applications
  • Develop a real-life Cascading application that can be easily customized for your specific needs
  • Learn basic and advanced features of Cascading through a practical, hands-on approach with step-by-step instructions and code samples

Who This Book Is For

This book is intended for software developers, system architects and analysts, big data project managers, and data scientists who wish to deploy big data solutions using the Cascading framework. You must have a basic understanding of the big data paradigm and should be familiar with Java development techniques.

What You Will Learn

  • Familiarize yourself with tuples, pipes, taps, and flows and build your first Cascading application
  • Discover how to design, develop, and use custom operations
  • Design, develop, use, and reuse code with subassemblies and Cascades
  • Acquire the skills you need to integrate Cascading with external systems
  • Gain expertise in testing, QA, and performance tuning to run an efficient and successful Cascading project
  • Explore project management methodologies and steps to develop workable solutions
  • Discover the future of big data frameworks and understand how Cascading can help your software to evolve with it
  • Uncover sources of additional information and other tools that can make development tasks a lot easier

In Detail

Cascading is open source software that is used to create and execute complex data processing workflows on big data clusters. The book starts by explaining how Cascading relates to core big data technologies such as Hadoop MapReduce. With that foundation in place, the book provides a comprehensive introduction to the Cascading paradigm and its components using code examples. You will not only learn more advanced Cascading features but also write code that uses them. Furthermore, you will gain in-depth knowledge of how to optimize a Cascading application. To deepen your knowledge and experience with Cascading, you will work through a real-life case study that uses Natural Language Processing to perform text analysis and search on large volumes of unstructured text. Throughout the book, you will receive expert advice on how to use the portions of the product that are undocumented or have limited documentation. By the end of the book, you will be able to build practical Cascading applications.