Analyze, manipulate, and process datasets of varying sizes efficiently using Haskell
About This Book
- Create portable databases using SQLite3 and use these databases to quickly pull large amounts of data into your Haskell programs
- Visualize data using EasyPlot and create publication-ready charts
- An easy-to-follow guide to analyzing real-world data using the most commonly used statistical techniques
Who This Book Is For
If you are a developer, analyst, or data scientist who wants to learn data analysis methods using Haskell and its libraries, then this book is for you. Prior experience with Haskell and a basic knowledge of data science will be beneficial.
What You Will Learn
- Learn the essential Haskell tools needed to handle large datasets
- Migrate your data to a database and learn to interact with your data quickly
- Clean data with the power of regular expressions
- Plot data with the Gnuplot tool and the EasyPlot library
- Formulate a hypothesis test to evaluate the significance of your data
- Evaluate the relationship between columns of data using a correlation statistic and perform regression analysis
Haskell is gaining traction in data science as a powerful platform for robust data science practices. This book gives you the skills to handle large amounts of data, even when that data is in a less-than-perfect state. Each chapter builds a small library of code that is used to solve that chapter's problem. The book starts with creating databases from existing datasets, cleaning that data, and interacting with databases from within Haskell in order to produce charts for publication. It then moves on to the more theoretical concepts that are fundamental to introductory data analysis, presented in the context of a real-world problem with real-world data. As you progress through the book, you will rely on code from previous chapters to create new solutions quickly. By the end of the book, you will be able to find, manipulate, and analyze large and small datasets using your own Haskell libraries.