HADOOP. Integration in IBM, Microsoft and SAS

James Braselton

  • Publisher: CreateSpace Independ
  • Publication date: 2014-08-06
  • Price: $950
  • Member price: $903 (5% off)
  • Language: English
  • Pages: 148
  • Binding: Paperback
  • ISBN: 1500755656
  • ISBN-13: 9781500755652
  • Related categories: Hadoop




There are mountains of untapped potential in our information. Until now, analyzing these massive volumes has been cost prohibitive. Of course, there’s also been a staggering opportunity cost associated with not tapping into this information, as the potential of this yet-to-be-analyzed information is near-limitless. And we’re not just talking about the ubiquitous “competitive differentiation” marketing slogan here; we’re talking about innovation, discovery, association, and pretty much anything else that could make the way you work tomorrow very different from the way you work today, with even more tangible results and insight. People and organizations have attempted to tackle this problem from many different angles. The approach that currently leads the pack in popularity for massive data analysis is an open source project called Hadoop. Hadoop ships as part of IBM tools, Microsoft SQL Server 2014, and SAS applications.
On the IBM side, Hadoop ships as part of the IBM InfoSphere BigInsights (BigInsights) platform. Quite simply, BigInsights embraces, hardens, and extends the Hadoop open source framework with enterprise-grade security, governance, availability, integration into existing data stores, tooling that simplifies and improves developer productivity, scalability, analytic toolkits, and more. BigInsights is (and will always be) based on the nonforked core Hadoop distribution, and backward compatibility is maintained.
Hadoop also ships as part of SAS tools. SAS has incorporated Hadoop into its applications (Base SAS, SAS Data Integration, SAS Enterprise Guide, and SAS Enterprise Miner). Some SAS applications work in memory on Hadoop (SAS In-Memory Statistics, SAS Visual Analytics, and SAS Visual Statistics).
Finally, Hadoop also ships as part of Microsoft SQL Server 2014 and HDInsight. SQL Server 2014 works in memory across Hadoop. HDInsight is a Hadoop-based platform that you can use to process data of all kinds in the cloud.
In particular, HDInsight is useful for processing high volumes of structured and unstructured data, which traditional relational database systems typically cannot support for a variety of reasons. HDInsight allows you to quickly establish an infrastructure for big data analysis, whether you want to develop a proof of concept for a big data solution or support ongoing analytical requirements in a production environment. Furthermore, HDInsight integrates with Microsoft’s business-intelligence tools to enable users to enhance big data with additional sources and then explore and analyze the results to gain deeper insights. Most functionality within HDInsight and other Hadoop distributions is similar. Consequently, any current experience with Hadoop is largely transferable. Keep in mind that interaction with HDInsight requires you to use Windows Azure PowerShell commands, so a basic knowledge of PowerShell is required to work with the cluster.