Data Science Essentials in Python: Collect - Organize - Explore - Predict - Value

Dmitry Zinoviev

Product Description

Go from messy, unstructured artifacts stored in SQL and NoSQL databases to a neat, well-organized dataset with this quick reference for the busy data scientist. Understand text mining, machine learning, and network analysis; process numeric data with the NumPy and Pandas modules; describe and analyze data using statistical and network-theoretical methods; and see actual examples of data analysis at work. This one-stop solution covers the essential data science you need in Python.

Data science is one of the fastest-growing disciplines in terms of academic research, student enrollment, and employment. Python, with its flexibility and scalability, is quickly overtaking the R language for data-scientific projects. Keep Python data-science concepts at your fingertips with this modular, quick reference to the tools used to acquire, clean, analyze, and store data.

This one-stop solution covers essential Python, databases, network analysis, natural language processing, elements of machine learning, and visualization. Access structured and unstructured text and numeric data from local files, databases, and the Internet. Arrange, rearrange, and clean the data. Work with relational and non-relational databases, data visualization, and simple predictive analysis (regressions, clustering, and decision trees). See how typical data analysis problems are handled. And try your hand at your own solutions to a variety of medium-scale projects that are fun to work on and look good on your resume.

Keep this handy quick guide at your side whether you're a student, an entry-level data science professional converting from R to Python, or a seasoned Python developer who doesn't want to memorize every function and option.

What You Need:

You need a decent distribution of Python 3.3 or above that includes at least NLTK, Pandas, NumPy, Matplotlib, NetworkX, scikit-learn, and BeautifulSoup. A great distribution that meets the requirements is Anaconda, available for free from www.continuum.io. If you plan to set up your own database servers, you also need MySQL (www.mysql.com) and MongoDB (www.mongodb.com). Both packages are free and run on Windows, Linux, and Mac OS.
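
A quick way to confirm that an existing Python installation meets the requirements above is to check the interpreter version and look up each library without importing it. The following is a minimal sketch, not part of the book: the function name check_environment and the mapping of library names to import names (for example, "sklearn" for scikit-learn and "bs4" for BeautifulSoup) are assumptions made for illustration.

```python
import sys
from importlib.util import find_spec

# Assumed mapping from the library names in the requirements list
# to the module names typically used to import them.
REQUIRED_MODULES = {
    "NLTK": "nltk",
    "Pandas": "pandas",
    "NumPy": "numpy",
    "Matplotlib": "matplotlib",
    "NetworkX": "networkx",
    "scikit-learn": "sklearn",
    "BeautifulSoup": "bs4",
}


def check_environment(modules=REQUIRED_MODULES):
    """Return a dict mapping each library name to True if its module can be found."""
    # find_spec() locates a top-level module without importing it
    # and returns None when the module is not installed.
    return {name: find_spec(module) is not None
            for name, module in modules.items()}


if __name__ == "__main__":
    version_ok = sys.version_info >= (3, 3)
    print("Python {} ({})".format(sys.version.split()[0],
                                  "OK" if version_ok else "too old"))
    for name, found in check_environment().items():
        print("{:15s} {}".format(name, "found" if found else "MISSING"))
```

Any library reported as MISSING can usually be added with the distribution's package manager. The MySQL and MongoDB servers are separate installations and are not covered by this check.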
