For AtScale Clusters Only: Returning a Failed AtScale Database Instance to an AtScale Cluster

If an AtScale database instance in an AtScale cluster fails, you can bring that instance back into the cluster as a standby instance after resolving the issues that caused it to fail.

If the master database instance fails, the cluster fails over to one of the standby database instances on another node. Use the instructions in this topic to detect when such a failover has happened and to bring the failed database instance back into the cluster as a standby database instance.

If a standby instance fails, you can use the instructions in this topic for detecting the failure and for bringing the database instance back into the cluster as a standby instance.

Important: The master database instance might not be running on the node that was designated as the master node during the installation of Clustered AtScale. Such a situation does not constitute a problem. The master database instance is initially created on the node that is brought up first during the installation; however, one of the other database instances in the cluster can become the master during the operation of the cluster, so long as there is only one master database instance at a time.

Checking the status of the AtScale database instances

An instance of the AtScale database runs on each node in an AtScale cluster. Only the master database instance is active. The others are on warm standby, with the master instance continuously replicating changes to the standby instances.

To check the status of all database instances in an AtScale cluster, and to find out which node is currently running the master (Leader) database instance, run the following command on any of the AtScale nodes:

/opt/atscale/current/bin/database/postgres_nodes

The output is a table that looks like this:

Cluster                   Member      Host              Role    State    TL  Lag in MB
atscale_postgres_cluster  atscale-01  atscale-01:10520  Leader  running  1
atscale_postgres_cluster  atscale-02  atscale-02:10520          running  1
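
If you want to pick out the current Leader in a script rather than by eye, the output can be filtered. The following is a minimal sketch, not part of AtScale. The function name find_leader is hypothetical, and the sketch assumes that the output has the columns shown above (separated by whitespace or by "|" characters) and that member names contain no spaces.

```shell
# Hypothetical helper, not shipped with AtScale.
# Reads postgres_nodes output on stdin and prints the Member name of the
# row whose Role is "Leader". Any "|" border characters are turned into
# spaces first, so the Role value is the fourth field on the Leader row.
find_leader() {
  tr '|' ' ' | awk '$4 == "Leader" { print $2 }'
}
```

To use it, pipe the command output into the function: /opt/atscale/current/bin/database/postgres_nodes | find_leader. For the sample table above, this would print atscale-01.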

Bringing a failed database instance back into an AtScale cluster as a standby database instance

If a database instance has failed and the problem has been resolved, do the following to bring the instance back up as a standby database instance in the cluster.

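  1. Confirm that the failed database instance is not a running member of the database cluster by running the following command on any node. In the output, the State of the failed instance should be something other than running, such as stopped.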

    /opt/atscale/current/bin/database/postgres_nodes

    The output is a table that looks like this:

    Cluster                   Member      Host              Role    State    TL  Lag in MB
    atscale_postgres_cluster  atscale-01  atscale-01:10520          stopped      unknown
    atscale_postgres_cluster  atscale-02  atscale-02:10520  Leader  running  2
  2. On the failed node, move or remove the /opt/atscale/data/database directory.

  3. On the failed node, restart the database service.

    /opt/atscale/bin/atscale_service_control start database
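  4. Confirm that the database instance is now a running member of the database cluster by running the following command. In the output, the instance should have a State of running and an empty Role.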

    /opt/atscale/current/bin/database/postgres_nodes

    The output is a table that looks like this:

    Cluster                   Member      Host              Role    State    TL  Lag in MB
    atscale_postgres_cluster  atscale-01  atscale-01:10520          running  2
    atscale_postgres_cluster  atscale-02  atscale-02:10520  Leader  running  2
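
The four steps above could be wrapped in a small script to be run on the failed node. The following is only a sketch under stated assumptions, not an AtScale-provided tool. The function names member_state and rejoin_as_standby, the overridable variables, and the timestamped backup directory name are all hypothetical. The sketch also assumes the column layout shown in the tables above, and it moves the old data directory aside rather than deleting it.

```shell
# Paths are taken from the steps above; the variables exist only so the
# paths can be overridden (for example, when testing the functions).
POSTGRES_NODES="${POSTGRES_NODES:-/opt/atscale/current/bin/database/postgres_nodes}"
SERVICE_CONTROL="${SERVICE_CONTROL:-/opt/atscale/bin/atscale_service_control}"
DATA_DIR="${DATA_DIR:-/opt/atscale/data/database}"

# Hypothetical helper: reads postgres_nodes output on stdin and prints the
# State of the given member. The Role cell is empty for standby instances,
# so State is the fourth field unless the fourth field is "Leader".
member_state() {
  tr '|' ' ' | awk -v m="$1" '$2 == m { print ($4 == "Leader" ? $5 : $4) }'
}

# Hypothetical helper: runs steps 1 to 4 for the failed node it is run on.
# Usage: rejoin_as_standby <member name as shown in the Member column>
rejoin_as_standby() {
  member="$1"

  # Step 1: refuse to continue if the instance is still a running member.
  state=$("$POSTGRES_NODES" | member_state "$member")
  if [ "$state" = "running" ]; then
    echo "$member is already running; not touching its data directory" >&2
    return 1
  fi

  # Step 2: move the old data directory aside (backup name is an assumption).
  if [ -d "$DATA_DIR" ]; then
    mv "$DATA_DIR" "$DATA_DIR.failed.$(date +%Y%m%d%H%M%S)" || return 1
  fi

  # Step 3: start the database service on this node.
  "$SERVICE_CONTROL" start database || return 1

  # Step 4: print the State that the cluster now reports for this member.
  "$POSTGRES_NODES" | member_state "$member"
}
```

For example, on node atscale-01 you might run rejoin_as_standby atscale-01. The instance may need some time to rejoin the cluster, so if the last line printed is not running, check the postgres_nodes output again after a short wait before investigating further.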