Sams Teach Yourself Big Data Analytics with Microsoft HDInsight in 24 Hours (Paperback)

Manpreet Singh, Arshad Ali

  • Publisher: SAMS
  • Publication date: 2019-12-01
  • List price: $1,650
  • VIP price: $1,568 (5% off)
  • Language: English
  • Pages: 592
  • Binding: Paperback
  • ISBN: 0672337274
  • ISBN-13: 9780672337277
  • Related categories: Big Data, Data Science
  • Ships immediately (stock: 1)

Product description


With Microsoft HDInsight, business professionals and data analysts can rapidly leverage the power of Hadoop on a flexible, scalable cloud-based platform, using Microsoft's accessible business intelligence, visualization, and productivity tools. Now, in just 24 lessons of one hour or less, you can learn all the skills and techniques you'll need to provision, configure, monitor, troubleshoot, and use HDInsight, even if you're new to big data analytics. Each short, easy lesson builds on all that's come before: you'll learn all of HDInsight's essentials as you solve real data analytics problems. Sams Teach Yourself Big Data Analytics with Microsoft HDInsight in 24 Hours covers all this, and much more:
  • Introduction to Big Data and NoSQL systems, their business value proposition, and example use cases
  • Introduction to Hadoop, its architecture and ecosystem, and Microsoft HDInsight
  • Getting to know Hadoop 2.0 and the innovations it provides, such as HDFS2 and YARN
  • Quickly installing, configuring, and monitoring Hadoop (HDInsight) clusters in the cloud and automating cluster provisioning
  • Customizing the HDInsight cluster and installing additional Hadoop ecosystem projects using Script Actions
  • Administering HDInsight from the Hadoop command prompt or Microsoft PowerShell
  • Using the Microsoft Azure HDInsight Emulator for learning or development
  • Understanding HDFS, HDFS vs. Azure Blob Storage, the MapReduce job framework, and the job execution pipeline
  • Doing big data analytics with MapReduce, writing your MapReduce programs in the .NET programming language of your choice, such as C#
  • Using Hive for big data analytics, with an end-to-end scenario and a look at how Apache Tez improves performance severalfold
  • Consuming HDInsight data from Microsoft BI tools over the Hive ODBC Driver, and using HDInsight with Microsoft BI and Power BI to simplify data integration, analysis, and reporting
  • Using Pig for big data transformation workflows, step by step
  • Apache HBase on HDInsight: its architecture and data model, HBase vs. Hive, and programmatically managing HBase data with C# and Apache Phoenix
  • Using Sqoop or SSIS (SQL Server Integration Services) to move data to/from HDInsight and build data integration workflows for transferring data
  • Using Oozie for scheduling, coordinating, and managing data processing workflows in an HDInsight cluster
  • Using the R programming language with HDInsight to perform statistical computing on big data sets
  • Using Apache Spark's in-memory computation model to run big data analytics up to 100 times faster than Hadoop MapReduce
  • Performing real-time stream analytics on high-velocity big data streams with Storm
  • Integrating an enterprise data warehouse with Hadoop and Microsoft Analytics Platform System (APS), formerly known as SQL Server Parallel Data Warehouse (PDW)
Step-by-step instructions walk you through common questions, issues, and tasks; Q-and-As, Quizzes, and Exercises build and test your knowledge; "Did You Know?" tips offer insider advice and shortcuts; and "Watch Out!" alerts help you avoid problems. By the time you're finished, you'll be comfortable going beyond the book to create any HDInsight app you can imagine!
