Data Center Design and Management: A Comprehensive Guide for System Administrators

Chapter 1: Workstation Operating System Maintenance

Critical Issues

Maintaining workstation operating systems involves three key tasks:

  1. Initial system software and application loading
  2. System software and application updates
  3. Network parameter configuration

Automating these tasks is crucial for cost-effective site management.
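
As a rough illustration of automating the third task, the sketch below plans network parameters for a batch of workstations from a single subnet. The function name plan_network, the host names, and the convention that the first usable address in the subnet is the gateway are all assumptions made for this example; a real site would normally hand out these values through a service such as DHCP.

```python
import ipaddress


def plan_network(hostnames, cidr):
    """Assign each host an address from the subnet (hypothetical helper).

    Assumption: the first usable address is reserved for the gateway and
    workstations receive the addresses that follow it, in order.
    """
    network = ipaddress.ip_network(cidr)
    usable = iter(network.hosts())
    gateway = next(usable)

    plan = {}
    for name in hostnames:
        try:
            address = next(usable)
        except StopIteration:
            # The subnet has fewer usable addresses than hosts requested.
            raise ValueError(f"subnet {cidr} is too small for {name}") from None
        plan[name] = {
            "address": str(address),
            "netmask": str(network.netmask),
            "gateway": str(gateway),
        }
    return plan
```

Calling plan_network(["ws1", "ws2"], "192.0.2.0/29") returns one dictionary of parameters per workstation. Generating the values from one source of truth, rather than typing them on each machine, is the point of the exercise.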

Stop-Gap Measures

To prevent temporary solutions from becoming permanent, create a ticket to document the need for a permanent fix.

Cloning and Other Methods

Some sites use hard disk cloning to create new machines with identical software configurations. This involves setting up a “golden host” and copying its hard disk to new computers.
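
One way to check that a clone is faithful is to compare a checksum of the cloned disk image against the golden host's image before any per-host customization is applied. The following is a minimal sketch under that assumption; the function names and the idea of working on image files rather than raw devices are illustrative choices, not part of the method described above.

```python
import hashlib


def sha256_of(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in 1 MiB chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as image:
        # Read in chunks so large disk images never have to fit in memory.
        for chunk in iter(lambda: image.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def clone_matches_golden(golden_image, clone_image):
    """True when the clone is byte-for-byte identical to the golden image."""
    return sha256_of(golden_image) == sha256_of(clone_image)
```

A mismatch only shows that the two images differ; it does not say where. Once a clone has been given its own host name or network settings, it is expected to differ from the golden image.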

Chapter 2: Server Management

Definition

A server is a program or a computer that provides services to other programs or computers. Because many clients depend on a single server, servers are built and operated for reliability and uptime.

Hardware for Servers

1. Buy Server Hardware

While desktop hardware may seem cost-effective, server hardware offers features that justify the higher cost, such as:

  • Extensibility
  • Enhanced CPU performance
  • High-performance I/O
  • Upgrade options
  • Rack mountability
  • No need for side access (the machine can be serviced while it stays in the rack)

2. Choose Reliable Vendors

Consult with other system administrators to identify reputable vendors.

3. Consider Maintenance Contracts and Spare Parts

Explore maintenance contract options and understand return merchandise authorization (RMA) and cross-shipping processes.

4. Maintain Data Integrity

Servers hold critical data and unique configurations. Protect both, for example with regular backups.

5. Put Servers in the Data Center

Install servers in a controlled environment with proper power, fire protection, networking, cooling, and physical security.

Server Appliances

Server appliances are devices dedicated to a single task, such as file serving, web serving, or email.

Redundant Power Supplies

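Use redundant power supplies so that a system keeps running when one supply fails. With n+1 redundancy, the system has one more supply than the n it needs to carry its load, so any single supply can fail without an outage. Alternatively, consider fully redundant systems in a failover configuration.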
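
The n+1 arithmetic can be sketched in a few lines. The function names and wattage figures below are hypothetical, and the sketch assumes identical supplies that share the load evenly; real sizing should follow the vendor's specifications.

```python
import math


def supplies_needed(load_watts, supply_watts):
    """Smallest number of supplies (n) able to carry the load with no spare."""
    if load_watts <= 0 or supply_watts <= 0:
        raise ValueError("load and supply capacity must be positive")
    return math.ceil(load_watts / supply_watts)


def survives_failures(load_watts, supply_watts, installed, failures=1):
    """True if the remaining supplies still cover the load after failures."""
    remaining = installed - failures
    return remaining * supply_watts >= load_watts


def is_n_plus_one(load_watts, supply_watts, installed):
    """True if at least one supply beyond the required n is installed."""
    return installed >= supplies_needed(load_watts, supply_watts) + 1
```

For a hypothetical 900 W load on 500 W supplies, two supplies are required, so three installed supplies give n+1 redundancy and two do not.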

Chapter 3: Data Center Design and Management

Data Center Basics

Data centers house shared resources and typically include systems for cooling, humidity control, power, and fire suppression. Consider the following aspects when designing a data center:

  1. Location
  2. Access
  3. Security
  4. Power and Cooling
  5. Fire Suppression
  6. Racks
  7. Wiring
  8. Labeling
  9. Communication
  10. Console Access
  11. Workbench
  12. Tools and Supplies
  13. Parking Spaces

Location

Choose a location that is not prone to natural disasters.

Access

Comply with local laws governing access, and plan how racks and equipment will be moved into and out of the room.

Security

Implement access control measures such as proximity badge systems and consider two-person entry protocols for high-security environments.

Power and Cooling

Direct airflow using raised floors or ceiling-mounted systems to maintain optimal operating temperatures.

Fire Suppression

Install a fire suppression system and consider linking it with a power shutoff switch.

Racks

Organize equipment and optimize airflow using racks.

Wiring

Plan for cable management to maintain a tidy data center.

Labeling

Label all equipment clearly on both front and back.

Communication

Give system administrators working in the data center a way to communicate with customers, vendors, and one another.

Console Access

Utilize console servers for remote access to equipment.

Workbench

Provide a workbench near the machine room where equipment can be unpacked, assembled, and repaired before it is installed.

Tools and Supplies

Maintain a fully stocked inventory of cables, tools, and spare parts.

Parking Spaces

Designate labeled parking spaces for mobile items such as tool carts so that they are returned to a known place after use.