Apache Iceberg: The Definitive Guide: Data Lakehouse Functionality, Performance, and Scalability on the Data Lake

Tomer Shiran, Jason Hughes, Alex Merced

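  • Publisher: O'Reilly
  • Publication date: 2024-06-11
  • List price: $2,450
  • Sale price: $2,328 (5% off list)
  • Language: English
  • Pages: 341
  • Binding: Quality Paper - also called trade paper
  • ISBN: 1098148622
  • ISBN-13: 9781098148621
  • Related category: JVM languages
  • Ships immediately (in stock: 1)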


Traditional data architecture patterns are severely limited. To use these patterns, you have to ETL data into each tool, a cost-prohibitive process for making warehouse features available to all of your data. This lack of flexibility forces you to adjust your workflow to the tool your data is locked in, which creates data silos and data drift. This book shows you a better way.

Apache Iceberg provides the capabilities, performance, scalability, and savings that fulfill the promise of an open data lakehouse. By following the lessons in this book, you'll be able to achieve interactive, batch, machine learning, and streaming analytics with this lakehouse. Authors Tomer Shiran, Jason Hughes, and Alex Merced from Dremio guide you through the process.

With this book, you'll learn:

  • The architecture of Apache Iceberg tables
  • What happens under the hood when you perform operations on Iceberg tables
  • How to further optimize Apache Iceberg tables for maximum performance
  • How to use Apache Iceberg with popular data engines such as Apache Spark, Apache Flink, and Dremio Sonar
  • How Apache Iceberg can be used in streaming and batch ingestion

Discover why Apache Iceberg is a foundational technology for implementing an open data lakehouse.


