Enhancing Server Reliability and Serviceability
Server Definition
A server is a computer program that provides services to other computer programs on the same or other computers. The computer that a server program runs on is also frequently referred to as a server.
-Unlike a workstation, which is dedicated to a single customer, a server has multiple customers depending on it.
-Reliability (shorter repair times, a better operating environment) and uptime are therefore a high priority.
-A server may have hundreds, thousands, or millions of clients relying on it.
1. Buy Server Hardware for Servers
It’s sometimes tempting to “save money” by purchasing desktop hardware and loading it with server software. Doing so may work in the short term but is not the best solution for the long term or in large installations. Server hardware usually costs more but has additional features that justify the cost.
Some features are:
1. Extensibility: Servers have more physical space inside for hard drives and more slots for cards and CPUs.
2. More CPU performance: Servers often have multiple CPUs and advanced features such as pre-fetch, multi-stage processor checking, and the ability to dynamically allocate resources among CPUs.
3. High-performance I/O: Servers usually do more I/O than clients, so they have high-speed internal buses, fast network interfaces, and so on.
4. Upgrade options: Servers are often upgraded, rather than simply replaced; they are designed for growth.
5. Rack mountable: Servers should be rack-mounted. The rack contains multiple mounting slots called bays, each designed to hold a hardware unit secured in place with screws.
6. No side-access needs: A rack-mounted host is easier to repair or maintain if the work can be done while it remains in the rack. Such tasks must therefore be possible without access to the sides of the machine.
2. Choose Vendors Known for Reliable Products
Talk with other system administrators (SAs) to find out which vendors they use and which ones they avoid.
3. Consider Maintenance Contracts and Spare Parts
Maintenance contracts come in different types, for example onsite service with a 4-hour response time, a 12-hour response time, or next-day service. Vendors usually require notification and authorization before a broken part is returned; this authorization is called a return merchandise authorization (RMA). Some vendors will not ship the replacement part until they receive the broken part. Better vendors will ship the replacement immediately and expect you to return the broken part within a certain amount of time. This is called cross-shipping.
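The practical difference between the two RMA styles is how long you wait for the part. The sketch below is only an illustration: the function name and the assumption that every shipment takes the same number of days each way (with no vendor handling time) are hypothetical simplifications, not figures from any vendor.

```python
def days_until_replacement(cross_ship: bool, transit_days: float) -> float:
    """Estimate days from RMA approval until the replacement part arrives.

    Assumption (hypothetical): each shipment takes `transit_days` one way,
    and vendor handling time is ignored.
    """
    if transit_days < 0:
        raise ValueError("transit_days must not be negative")
    if cross_ship:
        # The vendor ships the replacement immediately: one trip.
        return transit_days
    # The vendor waits for the broken part first: two trips in sequence.
    return 2 * transit_days


if __name__ == "__main__":
    for cross_ship in (True, False):
        wait = days_until_replacement(cross_ship, transit_days=2)
        print(f"cross-shipping={cross_ship}: about {wait} days without the part")
```

Under these assumptions, cross-shipping halves the wait, which is one reason to keep spare parts on hand when a vendor does not offer it.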
4. Maintaining Data Integrity
Servers have critical data and unique configurations that must be protected.
5. Put Servers in the Data Center
Servers should be installed in an environment with proper power, fire protection, networking, cooling, and physical security. Physical space should be allocated for a server when it is purchased.
A server room is any room that happens to be (mostly) used to store servers. A data center is a whole building dedicated to (and, in most cases, specially designed to) contain and support a large amount of computing hardware. The main difference is size, but it is linked to design, scale, and purpose. There will be several server rooms in almost any modern office building, but only very large companies, or large companies whose business is processing data, will have data centers.
The difference is a bit like that between a car and a coach: they have the same primary function, but a coach is a more specialized tool that is more efficient at moving large numbers of people on specific routes. Companies will own coaches only if they are large enough to have many people to move at once, or if their business is about moving people.
Enhancing Reliability and Serviceability
-*- Server Appliances
An appliance is a device designed specifically for a particular task. Toasters make toast, blenders blend, etc.
In computing, examples include file server appliances, web server appliances, and email appliances.
-*- Redundant Power Supplies
After hard drives, the next most failure-prone component is the power supply.
Having redundant power supplies means that the system remains operational if one power supply fails. This is called n+1 redundancy. Sometimes a fully loaded system requires two power supplies to receive enough power. In this case, redundant means having three power supplies.
-*- Full Redundancy
Two complete sets of hardware are linked by a failover configuration. The first system is performing a service and the second system sits idle, waiting to take over in case the first one fails. This failover might happen manually or automatically.
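The failover arrangement described above can be sketched as a small model. This is a minimal illustration, not a real clustering tool: the `Node` and `FailoverPair` names, the `healthy` flag, and the `check` method are all hypothetical, and a real system would detect failure with heartbeats or monitoring rather than a boolean.

```python
from dataclasses import dataclass


@dataclass
class Node:
    """One complete set of hardware (hypothetical model)."""
    name: str
    healthy: bool = True


class FailoverPair:
    """Two nodes: one active, one idle standby waiting to take over."""

    def __init__(self, primary: Node, standby: Node, automatic: bool = True):
        self.active = primary
        self.standby = standby
        self.automatic = automatic

    def fail_over(self) -> None:
        """Swap roles. An operator calls this for a manual failover."""
        if not self.standby.healthy:
            raise RuntimeError("standby is not healthy; cannot fail over")
        self.active, self.standby = self.standby, self.active

    def check(self) -> str:
        """Run one health check and return the name of the serving node."""
        if not self.active.healthy and self.automatic:
            self.fail_over()
        return self.active.name


if __name__ == "__main__":
    a, b = Node("server-a"), Node("server-b")
    pair = FailoverPair(a, b)
    print(pair.check())   # primary is healthy, so it keeps serving
    a.healthy = False     # simulate a failure of the first system
    print(pair.check())   # automatic failover to the standby
```

With `automatic=False`, `check` leaves the failed node in place until someone calls `fail_over` by hand, which mirrors the manual option mentioned above.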
-*- Hot Swap Components
Redundant components should be hot-swappable. Hot-swap refers to the ability to remove and replace a component while the system is running, so no downtime is required. (It is like changing a tire while the car is driving down the highway.)
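Hot swapping and n+1 redundancy work together: a power supply can be pulled from a running system only if the remaining supplies still cover the load. The arithmetic can be sketched as follows; the function names are hypothetical, and the model assumes identical supplies that share the load evenly.

```python
def supplies_for_n_plus_1(required: int) -> int:
    """Number of supplies to install for n+1 redundancy.

    `required` is n, the number of supplies a fully loaded system
    needs to receive enough power.
    """
    if required < 1:
        raise ValueError("a system needs at least one power supply")
    return required + 1


def can_hot_swap(installed: int, required: int) -> bool:
    """True if one supply can be removed while the system keeps running."""
    return installed - 1 >= required


if __name__ == "__main__":
    n = 2  # a fully loaded system that needs two supplies
    installed = supplies_for_n_plus_1(n)
    print(installed)                      # three supplies, as in the text
    print(can_hot_swap(installed, n))     # one can be replaced live
    print(can_hot_swap(n, n))             # no spare, so no live swap
```

For the two-supply system in the text, this gives three installed supplies, and removing any one of them still leaves the two that the load requires.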