Learning Apache Mahout
- Publisher: Packt Publishing
- Publication date: 2015-04-03
- List price: $1,620
- Sale price: $1,480 (91% of list price)
- VIP price: $1,406 (87% of list price)
- Language: English
- Pages: 275
- Binding: Paperback
- ISBN: 1783555211
- ISBN-13: 9781783555215
Acquire practical skills in Big Data Analytics and explore data science with Apache Mahout
About This Book
- Learn to use Apache Mahout for Big Data Analytics.
- Understand machine learning concepts and algorithms and their implementation in Mahout.
- A comprehensive guide with numerous code examples and end-to-end case studies on Customer Analytics and Text Analytics.
Who This Book Is For
If you are a Java developer and want to use Mahout and machine learning to solve Big Data Analytics use cases, then this book is for you. Familiarity with shell scripts is assumed, but no prior experience with Mahout is required.
What You Will Learn
- Configure Mahout on Linux systems and set up the development environment
- Become familiar with the Mahout command line utilities and Java APIs
- Understand the core concepts of machine learning and the classes that implement them
- Integrate Apache Mahout with newer platforms such as Apache Spark
- Solve classification, clustering, and recommendation problems with Mahout
- Explore frequent pattern mining and topic modeling, two important application areas of machine learning
- Understand feature extraction, reduction, and the curse of dimensionality
In the past few years, the generation of data and our capability to store and process it have grown exponentially. There is a need for scalable analytics frameworks and for people with the right skills to extract the information they need from this Big Data. Apache Mahout is one of the first and most prominent Big Data machine learning platforms. It implements machine learning algorithms on top of distributed processing platforms such as Hadoop and Spark.
Starting with the basics of Mahout and machine learning, you will explore prominent algorithms and their implementation in Mahout. You will learn about Mahout's building blocks, covering feature extraction, feature reduction, and the curse of dimensionality. You will then delve into classification use cases with the random forest and Naive Bayes classifiers, followed by item-based and user-based recommendation. Next, you will work with clustering in Mahout using the K-means algorithm and learn to use Mahout without MapReduce. Finish with a flourish by exploring end-to-end use cases on customer analytics and text analytics to gain real-life, practical know-how of analytics projects.