Databricks Certified Associate Developer for Apache Spark Using Python: The ultimate guide to getting certified in Apache Spark using practical exampl
Shah, Saba
- Publisher: Packt Publishing
- Publication date: 2024-06-14
- Price: $1,500
- VIP price: 5% off, $1,425
- Language: English
- Pages: 274
- 裝訂: Quality Paper - also called trade paper
- ISBN: 1804619787
- ISBN-13: 9781804619780
- Related categories: Spark
Overseas purchase-on-demand book (requires separate checkout)
Product description
Learn the concepts and exercises needed to get certified as a Databricks Associate Developer for Apache Spark 3.0 and validate your skills as a Spark expert with an industry-recognized credential
Key Features
- Understand the fundamentals of Apache Spark to help you design robust and fast Spark applications
- Delve into various data manipulation components for each phase of your data engineering project
- Prepare for the certification exam with sample questions and mock exams, and get closer to your goal
- Purchase of the print or Kindle book includes a free PDF eBook
Book Description
With vast amounts of data being collected every second, computing power cannot keep up with this rapid growth. To make use of all that data, Spark has become a de facto standard for big data processing. Migrating data processing to Spark not only saves resources, letting you focus on your business, but also lets you modernize your workloads by leveraging the capabilities of Spark and the modern technology stack to create new business opportunities.
This book is a comprehensive guide that lets you explore the core components of Apache Spark, its architecture, and its optimization. You'll become familiar with the Spark DataFrame API and the components needed for data manipulation. Next, you'll find out what Spark Streaming is and why it's important for modern data stacks, before learning about machine learning in Spark and its different use cases. What's more, you'll find sample questions at the end of each section, along with two mock exams, to help you prepare for the certification exam.
By the end of this book, you'll know what to expect in the exam and how to pass it with enough understanding of Spark and its tools. You'll also be able to apply this knowledge in a real-world setting and take your skillset to the next level.
What you will learn
- Create and manipulate SQL queries in Spark
- Build complex Spark functions using Spark UDFs
- Architect big data apps with Spark fundamentals for optimal design
- Apply techniques to manipulate and optimize big data applications
- Build real-time or near-real-time applications using Spark Streaming
- Work with Apache Spark for machine learning applications
Who this book is for
This book is for you if you're a professional looking to venture into the world of big data and data engineering, a data professional who wants to validate your knowledge of Spark, or a student. Although working knowledge of Python is required, no prior Spark knowledge is needed. Additionally, experience with PySpark will be beneficial.