Big Data Analytics with Java

Rajat Mehta

Product Description

Key Features

  • Acquire a real-world set of tools for building enterprise-level data science applications
  • Get past the barrier of other languages in data science and learn to create useful object-oriented code
  • Make extensive use of Java-compliant big data tools such as Apache Spark and Hadoop

Book Description

This book covers case studies such as sentiment analysis on a tweet dataset, recommendations on the MovieLens dataset, customer segmentation on an e-commerce dataset, and graph analysis on a real flights dataset.

This book is an end-to-end guide to implementing analytics on big data with Java. Java is the de facto language for major big data environments, including Hadoop, and this book teaches you how to perform analytics on big data with production-friendly Java. The book is divided into two parts. The first part is an introduction that helps readers get acquainted with big data environments, and the second part is an in-depth discussion of the concepts behind analytics on big data. It takes you from data analysis and data visualization to the core concepts and advantages of machine learning, real-life usage of regression and classification using Naive Bayes, a deep discussion of clustering, and a review of simple neural networks on big data using Deeplearning4j or plain Java Spark code. It is a must-have for Java developers who want to start learning big data analytics and use it in the real world.

What you will learn

  • Start from simple analytic tasks on big data
  • Get into more complex tasks with predictive analytics on big data using machine learning
  • Learn real-time analytic tasks
  • Understand the concepts with examples and case studies
  • Prepare and refine data for analysis
  • Create charts in order to understand the data
  • Work with various real-world datasets

About the Author

The author is a VP (Technical Architect) in technology at JP Morgan Chase in New York. He is a Sun Certified Java Developer and has worked on Java-related technologies for more than 16 years. For the past few years, his role has heavily involved using the big data stack and running analytics on it. He is also a contributor to various open source projects available on his GitHub repository and a frequent writer for dev magazines.

Table of Contents

  1. Big Data Analytics with Java
  2. First Steps on Data Analysis
  3. Data Visualization
  4. Basics of Machine Learning
  5. Regression on Big Data
  6. Naive Bayes and Sentiment Analysis
  7. Classification using Decision Trees
  8. Classification on ensemble of Decision Trees
  9. Recommendations on Big Data
  10. Clustering in Action on Big Data
  11. Building graphs on Big Data
  12. Streaming on Big Data
  13. Deep Learning Using Big Data
