The following companies provide (or are in the process of providing) commercial High-Availability software or capabilities for Linux:
IBM Tivoli System Automation for Multiplatforms is easily installed and configured, and lets you manage complex groups of interrelated resources across multiple nodes within a cluster. Its goal-driven recovery algorithms are intended to offer more flexibility than fixed failover scripts. It also reacts to failures during the recovery process, adapting its in-flight recovery to maintain your most critical applications. Based upon the same automation engine that drives IBM's SA for z/OS automation, it manages the availability of your applications in response to goals you set and the changing state of your systems.
SteelEye's LifeKeeper is a High-Availability product for Linux. LifeKeeper provides application recovery kits for simple yet comprehensive support of Oracle, Informix, Apache, Sendmail, SAP R/3 and other major applications. In addition, it can easily be customized to support almost any application by using its generic application recovery kit. LifeKeeper supports virtually any data-sharing arrangement, from replication through SCSI systems like IBM's ServeRAID to Fibre Channel disk systems. 30-day free trials are available. In the U.K., Openminds is the SteelEye Competence and Support Centre, distributing and supporting the LifeKeeper software throughout the region. Evaluations, consultations and web demos are also available from Openminds.
High-Availability.com provides RSF-1 (Resilient Software Facility - 1). RSF-1 is available on every major Linux distribution, provides agents for many popular software packages, and supports clusters of up to 64 nodes.
Veritas Cluster Server reduces planned and unplanned downtime, facilitates server consolidation, and manages a wide range of applications in heterogeneous environments. With support for clusters of up to 32 nodes, Veritas Cluster Server can protect everything from a single critical database instance to the largest globally dispersed, multi-application clusters.
Polyserve's MatrixServer is shared data clustering software that enables multiple Linux-based servers to function as a single, easy-to-use, highly available system. It comprises high availability services that increase system uptime, a true symmetric cluster file system that enables scalable data sharing, and cluster and storage management capabilities for managing servers and storage as one. MatrixServer delivers scalability, availability, and manageability, and supports database, file serving, web and media serving applications. An evaluation copy is available from their web site.
HP Serviceguard for Linux is a high availability solution that brings HP-UX 11i technologies to the Linux environment, providing continuous access to critical applications, information and services.
Emic Application Clustering products provide a high level of reliability for mission critical software applications, protecting them from unwanted downtime. Emic clustering products include m/cluster for MySQL, a/cluster for Apache, lamp/cluster for the LAMP stack, and lamj/cluster for the LAMJ stack. Trial versions of Emic clustering software are available for download.
Radiant Data's PeerFS is a peer-to-peer file replication technology that allows multiple sources and multiple targets, providing reliable file sharing on a global basis. Read operations are performed locally, without tying up expensive network bandwidth, and only the changed bytes within a file are sent over the network. PeerFS provides seamless failover among endpoints in the event of storage failures, so applications keep running.
SOS's Standby Server provides an automatic server failover capability for your file servers. In the event of main server failure, the standby server can take over file serving functions for the server, allowing users to continue working uninterrupted until the main server is restored. [Editor's note: analogous to a subset of DRBD + Heartbeat]
McObject's eXtremeDB High Availability is McObject's eXtremeDB in-memory embedded database with an additional high availability subsystem based on a rugged, time-cognizant two-phase commit protocol that ensures changes to the main instance (MI) database and identical standby instance (SI) databases succeed or fail together. The architecture enables deployment of multiple fully synchronized databases within separate hardware instances, connected via standard or proprietary communications protocols. eXtremeDB HA is available on numerous Linux variants, including MontaVista Carrier Grade Linux, and the available source code allows the database to be ported to other distributions. The eXtremeDB HA protocol detects any main instance hardware or software failure and notifies a standby copy of the application to elect one standby database instance to take over the role of master database. Similarly, the protocol recognizes failure in standby database instances, terminating the SI's connection to the MI database and notifying the application so that corrective measures can be taken.
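To make the "succeed or fail together" idea concrete, here is a minimal sketch of a generic two-phase commit between one main instance and its standbys. It is not the eXtremeDB HA API: the `Instance` class, the `healthy` flag, and `two_phase_commit` are hypothetical names chosen for illustration, and the time-cognizant part (timeouts on each phase) is left out.

```python
class Instance:
    """Hypothetical in-memory database instance (not the eXtremeDB API)."""

    def __init__(self, name, healthy=True):
        self.name = name
        self.healthy = healthy   # assumption: stands in for real failure detection
        self.data = {}
        self.pending = None

    def prepare(self, key, value):
        # Phase 1: stage the change and vote; an unhealthy instance votes no.
        if not self.healthy:
            return False
        self.pending = (key, value)
        return True

    def commit(self):
        # Phase 2 (all voted yes): make the staged change visible.
        key, value = self.pending
        self.data[key] = value
        self.pending = None

    def rollback(self):
        # Phase 2 (someone voted no): discard the staged change.
        self.pending = None


def two_phase_commit(main, standbys, key, value):
    """Apply key=value on every instance, or on none of them."""
    participants = [main] + list(standbys)
    prepared = []
    for inst in participants:
        if inst.prepare(key, value):
            prepared.append(inst)
        else:
            # One "no" vote aborts the transaction everywhere.
            for p in prepared:
                p.rollback()
            return False
    for inst in participants:
        inst.commit()
    return True
```

With one healthy standby the write lands on both instances. Adding a standby created with `healthy=False` makes the call return `False` and leaves every instance unchanged.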
Babel's KeyCluster is high availability (HA) software for mission-critical applications running on Solaris (Sparc and x86), Linux and AIX. In case of software or hardware errors, KeyCluster keeps services running and data accessible with only a few seconds of downtime. Compared to other clustering software on the market, KeyCluster is cost effective and notably easy to install and maintain.
Evidian's SafeKit is a software solution for disaster protection on Linux. SafeKit 6.1 enables real-time replication of critical data plus failover, helping to build disaster-proof Linux applications at a tenth of the price of dedicated hardware solutions. More information can be found at http://www.evidian.com/safekit.
ShaoLin HA Cluster is an affordable and easy-to-use high-availability solution with simple management for eliminating system disruptions in the event of server failures. Through a shared image in shared storage, all cluster servers can be mastered concurrently. ShaoLin HA Cluster provides out-of-the-box high availability and supports failover recovery for the total system rather than individual applications. It uses kernel heartbeat technology, a crash-safe process driven by the system real time clock (RTC), to ensure accurate sub-second system monitoring, control and failure detection.
Clustra Database: Clustra's core product, the Clustra Database, is a zero-downtime, relational SQL database.
TurboLinux High Availability Cluster. This is a commercial product which is similar in intent to the Linux Virtual Server project. In fact, it has used Wensong Zhang's kernel pieces in the past.
PRIMECLUSTER offers enterprises a way of linking servers together to maximize the availability and scalability of their IT infrastructure. A variety of mechanisms is provided to implement high availability and scalability on any layer of today's multi-tier enterprise applications.
Mod_Redundancy is an Apache module that provides high availability for the web server through a master/slave mechanism. As soon as the slave notices that the master is no longer available, it automatically takes over the IP address and the web service.
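The master/slave takeover described here can be sketched as a heartbeat timeout on the standby side. This is only an illustration of the general pattern, not Mod_Redundancy's actual implementation: `StandbyMonitor`, `timeout_s`, and the `takeover` callback are hypothetical, and the real work of claiming the IP address and starting the web service is left to that callback.

```python
class StandbyMonitor:
    """Hypothetical standby-side monitor: take over when the master goes quiet."""

    def __init__(self, timeout_s, takeover):
        self.timeout_s = timeout_s   # assumed: seconds of silence before failover
        self.takeover = takeover     # assumed: callable that claims the IP/service
        self.last_seen = None
        self.active = False

    def heartbeat(self, now):
        # Called whenever a sign of life from the master arrives.
        self.last_seen = now

    def tick(self, now):
        # Called periodically; returns True once this node is serving.
        if self.active or self.last_seen is None:
            return self.active
        if now - self.last_seen > self.timeout_s:
            self.active = True
            self.takeover()          # runs exactly once
        return self.active
```

Passing timestamps in explicitly (rather than reading a clock inside the class) keeps the logic easy to test. In a real deployment the caller would supply something like `time.monotonic()`.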
IBM's WebSphere Performance Pack: rumored to be soon ported to Linux.
North Fork Network's SANi.q. product combines with inexpensive hardware (IDE/SCSI disks, Ethernet networks) to create scalable, fault-tolerant, and easy-to-manage Storage Area Networks. An evaluation copy is available from their web site.
Red Hat Cluster Suite: n-node server clusters for failover; can load balance incoming IP network requests across a farm of servers.
Twin Peaks Replication and Clustering: provides a solution for HA, load balancing, DR, and online file backup.
Linux NetworX makes hardware specifically for Linux clusters.
VMIC Reflective Memory: Reflective Memory is a high-speed, real-time, deterministic network. With Reflective Memory, each node on the network has a local copy of shared data. Writing to the reflective memory on one node causes the local copy on every node to be updated.
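The write-propagation behavior can be modeled in a few lines. The sketch below is a pure software simulation under stated assumptions, not the VMIC hardware or its driver interface: `ReflectiveNode` and `connect` are hypothetical names, and propagation is a direct in-process call rather than a deterministic network.

```python
class ReflectiveNode:
    """Hypothetical node holding a local copy of the shared memory region."""

    def __init__(self, size):
        self.memory = bytearray(size)
        self.peers = []

    def _apply(self, offset, data):
        # Reject writes outside the region so the local copy never grows.
        if offset < 0 or offset + len(data) > len(self.memory):
            raise ValueError("write outside shared region")
        self.memory[offset:offset + len(data)] = data

    def write(self, offset, data):
        # A write updates the local copy and is reflected to every peer.
        self._apply(offset, data)
        for peer in self.peers:
            peer._apply(offset, data)

    def read(self, offset, length):
        # Reads are always served from the local copy.
        return bytes(self.memory[offset:offset + length])


def connect(nodes):
    """Make every node a peer of every other node."""
    for node in nodes:
        node.peers = [p for p in nodes if p is not node]
```

After `connect([a, b, c])`, a call such as `a.write(2, b"hi")` is visible through `b.read(2, 2)` and `c.read(2, 2)` without either reader touching the network at read time.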
BIG-IP: F5 Networks' turnkey load balancing/HA solution.