Building Clustered Linux Systems
Robert W. Lucke
Praise for Building Clustered Linux Systems
"The author does an outstanding job of presenting a very complicated subject. I very much commend this work. The author sets the pace and provides vital resources and tips along the way. He also has a very good sense of humor that is crafted in the text in such a way that makes the reading enjoyable just when the subject may demand a break. This book should be a requirement for those that are clustering or considering clustering and especially those considering investing a great deal of financial resource toward that goal."
--Joe Brazeal, Information Technician III, Southwest Power Pool
"This book is for Beginner and Intermediate level system administrators, engineers, and researchers, who want to learn how to build Linux clusters. The book covers everything very well."
--Ibrahim Haddad, Senior Researcher, Ericsson Corporate Unit of Research
"Nothing that I know of exists yet that covers this subject in as much depth and detail. The practical 'hands-on' approach of this book on how to build a Linux cluster makes this a very valuable reference for a very popular, highly demanded technology."
--George Vish, II, Linux Curriculum Program Manager and Senior Education Consultant, HP
"In my opinion there is a significant lack of literature on this subject. Most of the currently available books are either dated or do not address the complete picture of the range of decisions that must go into building a Linux cluster. I feel comfortable recommending this to anyone interested in building a Linux cluster to better understand both the technical aspects of building and designing a Linux cluster, but also the business aspects of the same."
--Randall Splinter Ph.D., Senior Solution Architect, HP
"The author has set a precedent in the cluster design and integration process that is lacking in the industry today."
--Stephen Gray, Senior Applications Engineer, Altair Engineering, Inc.
The Practical, Step-by-Step Guide to Building and Running Linux Clusters
Low-cost, high-performance Linux clusters are the best solution for an increasingly wide range of technical and business problems. Until now, however, building and managing Linux clusters has required more specialized knowledge than most IT organizations possess. This book dramatically lowers the learning curve, bringing together all the hands-on knowledge and step-by-step techniques you'll need to get the job done.
Using practical examples, Robert Lucke simplifies every facet of cluster design and integration: networking, hardware, architecture, operating environments, data sharing, applications, and more. Lucke, who helped prototype and implement one of the world's largest Linux clusters, systematically addresses the key issues you'll encounter and the key decisions you'll have to make. Coverage includes:
- Basic clustering concepts, hardware components, and architectural models
- A step-by-step cluster creation process: design, installation, and testing
- Choosing and implementing the optimal hardware configuration for your environment
- Life in the fast LAN: high-speed cluster interconnects
- Software issues: distributions, bootup, disks, partitioning, file systems, middleware, and more
Table of Contents:
List of Figures.
List of Tables.
I. INTRODUCTION TO CLUSTER CONCEPTS.
1. Parallel Power: Defining the Clustered System Approach.
Avoiding Difficulties with the Word Cluster.
Defining a Cluster.
The Evolution of a Clustered Solution.
Collapsed Network Computing for Engineering.
Scientific Cluster Computing.
Revisiting the Definition of Cluster.
Commercial Cluster Computing.
High Performance, High Throughput, and High Availability.
A Formal Definition of Cluster.
The Why and Wherefore of Clusters.
2. One Step at a Time: A Process for Building Clusters.
Building Clusters as a Complex Endeavor.
Talking about the "P Word".
Presenting a Formal Cluster Creation Process.
Formal Cluster Process Summary.
II. CLUSTER ARCHITECTURE AND HARDWARE COMPONENTS.
3. Underneath the Hood: Cluster Hardware Components and Architecture.
Hardware Categories in a Cluster.
A Survey of Cluster Hardware Configurations.
High-Throughput Cluster Configurations.
High-Availability Cluster Configurations.
High-Performance Cluster Configurations.
Common Cluster Hardware Architecture.
Cluster Hardware Architecture Summary.
4. Any Way You Slice It: Work and Master Nodes in a Cluster.
Criteria for Selecting Compute Slices.
An Example Compute Slice from Hewlett-Packard.
Thirty-two Bit and 64-Bit Compute Slices.
Memory and Cache Latency.
Number of Processors in a Compute Slice.
I/O Interface Capacity and Performance.
Compute Slice Operating System Support.
Master Node Characteristics.
Compute Slice and Master Node Summary.
5. Packet In: Cluster Networking Basics and Example Devices.
A Short View of Ethernet Networking History.
The Open System Interconnect (OSI) Communication Model.
Ethernet Network Topologies.
Internet Protocol and Addressing.
Ethernet Switching Technology.
Ethernet Networking Summary.
6. Tying It Together: Cluster Data, Management, and Control Networks.
Networked System Management and Serial Port Access.
Cluster Ethernet Network Design.
An Example Cluster Ethernet Network Design.
Cluster Network Design Summary.
7. Life in the Fast LAN: HSIs and Your Cluster.
HSI Latency and Bandwidth.
Examining HSI Topologies.
Ethernet for HSI.
Myricom's Myrinet HSI.
HSI Technology Summary and Comparison.
III. CLUSTER SOFTWARE ARCHITECTURE.
8. The Right Stuff: Linux as the Basis for Clusters.
Choosing a Cluster Operating System.
Introducing the Linux Operating System and Licensing.
Managing Open-Source Software "Churn".
Commercial Linux Distributions.
Free Linux Distributions.
Conclusions about Linux for Clusters.
9. Round and Round It Goes: Booting, Disks, Partitioning, and Local File Systems.
Disk Partitioning, Booting, and the BIOS.
Booting the Linux Kernel.
The Linux Initial RAM Disk Image.
Linux Local Disk Storage.
Linux File System Types.
The Linux /proc and devfs Pseudo File Systems.
The Linux ext2 and ext3 Physical File Systems.
Standard Mount Options for All File Systems.
The Temporary File System.
Other Available File System Types.
Advanced Performance Tuning.
A Word about SMART Monitoring for Disks.
Local Disks and File Systems Summary.
10. Supporting Role: Infrastructure Services and Administration.
The Big Infrastructure Picture.
Initializing Your Cluster's Software Infrastructure.
Infrastructure Implementation Recommendations.
Protecting Active Configuration Information.
Preparation for Infrastructure Installation.
Enabling and Starting Linux Services.
Infrastructure Services Summary.
11. Reach Out and Access Something: Remote Access Services, DHCP, and System Logging.
Continuing Infrastructure Installation.
"Traditional" User Login and Authentication.
Remote Access Services.
Using BSD Remote Access Services.
Kerberized Versions of BSD/ARPA Remote Services.
The Secure Shell.
The Parallel Distributed Shell.
Logging System Activity.
Access and Logging Services Summary.
12. Installment Plan: Introduction to Compute Slice Configuration and Installation.
Compute Slice Configuration Considerations.
One Thousand Pieces Flying in Close Formation.
The Single-System View.
A Generalized Network Boot Facility: pxelinux.
Configuring Network kickstart.
NFS Diskless Configuration.
Introduction to Compute Slice Installation Summary.
13. Improving Your Images: System Installation with SystemImager.
Using the SystemImager Software.
The SI flamethrower Facility.
System Installation with SI Summary.
14. To Protect and Serve: Providing Data to Your Cluster.
Introduction to Cluster File Systems.
A Survey of Some Open-Source Parallel File Systems.
Commercially Available Cluster File Systems.
Cluster File System Summary.
15. Stuck in the Middle: Cluster Middleware.
Introduction to Cluster Middleware.
The MPICH Library.
The Simple Linux Utility for Resource Management.
The Maui Scheduler.
The Ganglia Distributed Monitoring and Execution System.
Monitoring with Nagios.
Cluster Middleware Summary.
An Afterword on Linux High-Availability and Open-Source.
16. Put Tab A in Slot C: OSCAR, Rocks, OpenMOSIX, and the Globus Toolkit.
Introducing Cluster-Building Toolkits.
General Cluster Toolkit Installation Process.
Installing a Cluster with OSCAR.
Installing a Cluster with NPACI Rocks.
The OpenMOSIX Project.
Introduction to the Grid Concept.
The Globus Toolkit.
Cluster-Building Toolkit Summary.
IV. BUILDING AND DEPLOYING YOUR CLUSTER.
17. Dollars and Sense: Cluster Economics.
Setting the Ground Rules.
Cluster Cabling and Complexity.
Eight-Compute Slice Cluster Hardware Costs.
Sixteen-Compute Slice Cluster Hardware Costs.
Thirty-two-Compute Slice Hardware Costs.
Sixty-four-Compute Slice Hardware Costs.
One Hundred Twenty-eight-Compute Slice Hardware Costs.
The Land beyond 128 Compute Slices.
Hardware Cost Trends and Analysis.
Cluster Economics Summary.
18. Racking Your Brains: Example Cluster Rack Assembly Steps.
Examining the Cluster Assembly Process.
Some "Rules of Thumb" for Physical Cluster Assembly.
Detailed Cluster Assembly Steps.
Learning from the Example Steps.
Physical Assembly Conclusions.
19. Getting Your Cluster Wired: An Example Cable-Labeling Scheme.
Defining the Cable Problem.
Different Classes of Cabling.
A First Pass at a Cable-Labeling Scheme.
Refining the Cable Documentation Scheme.
Calculating the Work in Cable Installation.
Minimizing Interrack Cabling.
Cable Labeling System Summary.
20. Physical Constraints: Heat, Space, and Power.
Identifying Physical Constraints for Your Cluster.
Space, the Initial Frontier.
System Power Utilization.
Taking the Heat.
Physical Constraints Summary.
Appendix A. Acronym List.
Appendix B. List of URLs and Software Sources.