Thoughtful Data Science: Working with data by creating visually intuitive insights with Jupyter and Pixiedust
Approaching the practice of data science by scripting your own data pipeline and dashboards
- David Taieb teaches how to build a new data pipeline using Pixiedust
- How to get the most out of Jupyter notebooks
- Think about the data and their visualisations before worrying about the algorithms
Data science has become the one scientific endeavour every business has to contend with today. We need to learn why data algorithms work but, even more importantly, we need to be able to create new insights from our data that we can actually work with. The why is addressed in many publications today, but creating insights is harder: too often the data scientist looks like a mountebank, writing opaque notebook code before getting to the visually compelling parts of data science. The data science process itself has to be transparent, easy to understand, and straightforward to optimise.
David Taieb created Pixiedust in Python to teach non-data scientists to use Jupyter notebooks without having to slog through the considerable amount of Jupyter code required to create simple, and sometimes not-so-simple, insights into data. Pixiedust can be used by writing just a few lines of HTML and CSS, while retaining the ability to add or remove algorithms and visualisation options, adjust the data pipeline to the requirements posed by the data, or just get some very quick results. The case studies form a carefully graded ladder of progress, ranging from data mined from social media to geo-analytical data helpful in business decision making.
It is, however, possible to use both Python and Scala to add features to the Pixiedust data pipeline, and ultimately, to bring the power of the Spark big data framework to the data scientist.
What you will learn
- Write basic Pixiedust dashboards
- Build your own data pipelines without writing connecting pipeline code
- Use Jupyter notebooks without the pain
- Create compelling data visualisations in Pixiedust
- Write applications running on Spark, without writing Spark code
Who This Book Is For
To produce a functioning Pixiedust dashboard, only a modicum of HTML and CSS is required. Fluency in data interpretation and visualisation is also necessary, since this book is addressed to data professionals, e.g. business and general data analysts. The later chapters also have much to offer to the budding data scientist, and to developers on a path to becoming data scientists, since they get to play with Python code running in Jupyter notebooks.