Doing Data Science: Straight Talk from the Frontline (Paperback)

Cathy O'Neil, Rachel Schutt

Product Description

Now that people are aware that data can make the difference in an election or a business model, data science as an occupation is gaining ground. But how can you get started working in a wide-ranging, interdisciplinary field that’s so clouded in hype? This insightful book, based on Columbia University’s Introduction to Data Science class, tells you what you need to know.

In many of these chapter-long lectures, data scientists from companies such as Google, Microsoft, and eBay share new algorithms, methods, and models by presenting case studies and the code they use. If you’re familiar with linear algebra, probability, and statistics, and have programming experience, this book is an ideal introduction to data science.

Topics include:

  • Statistical inference, exploratory data analysis, and the data science process
  • Algorithms
  • Spam filters, Naive Bayes, and data wrangling
  • Logistic regression
  • Financial modeling
  • Recommendation engines and causality
  • Data visualization
  • Social networks and data journalism
  • Data engineering, MapReduce, Pregel, and Hadoop

Doing Data Science is a collaboration between course instructor Rachel Schutt, Senior VP of Data Science at News Corp, and data science consultant Cathy O’Neil, a senior data scientist at Johnson Research Labs, who attended and blogged about the course.
