    RAC voting disk

    http://oracleinaction.com/voting-disk/

     

    11g R2 RAC : VOTING DISK DEMYSTIFIED

                                          Voting disk in 11g

     
    In this post, I will write about the voting disk: what it contains, who updates it, how it is used, where it is stored, and so on.
     
    The voting disk is a key component of the clusterware, and its failure can make the cluster inoperable.
     
    In RAC, the clusterware must know at any point in time which nodes are members of the cluster so that
    - it can perform load balancing
    - if a node fails, it can fail over resources as defined in the resource profiles
    - if a node joins, it can start resources on it as defined in OCR/OLR
    - if a node joins, it can assign a VIP to it in case GNS is in use
    - if a node fails, it can execute callouts if defined
     
    and so on.
    Hence, there must be a way for the clusterware to find out about node membership.
     
    That is where the voting disk comes into the picture. It is the place where nodes mark their attendance. Consider an analogy in which a manager wants to find out which of his subordinates are present: he can just check the attendance register and assign them their tasks accordingly. Similarly, the CSSD process on every node makes entries in the voting disk so that the membership of that node can be ascertained. The voting disk thus records node membership information. If it ever fails, the entire Oracle 11g RAC clustered environment is adversely affected, and an outage may result if the voting disks are lost.
     
    Also, in a cluster, communication between the nodes is of paramount importance. Nodes that cannot communicate with other nodes should be evicted from the cluster. While marking their own presence, all the nodes also register in the voting disk the information about their communicability with other nodes. This write is the disk heartbeat; the network heartbeat itself is exchanged between the CSSD processes over the private interconnect.
     
    The CSSD process on each RAC node maintains its heartbeat in a block of size 1 OS block, in the hot block of the voting disk at a specific offset. The written block has a header area with the node name. The heartbeat counter is incremented on every write call, once per second. Thus the heartbeats of the various nodes are recorded at different offsets in the voting disk. In addition to maintaining its own disk block, each CSSD process also monitors the disk blocks maintained by the CSSD processes running on the other cluster nodes.
     
    Healthy nodes exchange continuous network and disk heartbeats. A break in the heartbeat indicates a possible error scenario. If a node's disk block is not updated within a short timeout period, that node is considered unhealthy and may be rebooted to protect the database information. In this case, a message to this effect is written in the kill block of the node. Each node reads its kill block once per second; if the kill block has been overwritten, the node commits suicide.
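
    The disk heartbeat and kill block described above can be pictured with a small Python simulation. This is only a sketch: the class names, the in-memory "disk", and the 5-second timeout are assumptions made for illustration and do not reflect Oracle's actual on-disk layout or timeout settings.

```python
from dataclasses import dataclass, field


@dataclass
class NodeSlot:
    """Per-node area on the simulated voting disk (hypothetical layout)."""
    node_name: str            # stands in for the header area with the node name
    counter: int = 0          # disk heartbeat counter
    last_write: float = 0.0   # simulated time of the last heartbeat write
    kill_block: str = ""      # non-empty once a peer has requested eviction


@dataclass
class VotingDisk:
    slots: dict = field(default_factory=dict)

    def heartbeat(self, node: str, now: float) -> None:
        """A node writes its own block: bump the counter and stamp the time."""
        slot = self.slots.setdefault(node, NodeSlot(node))
        slot.counter += 1
        slot.last_write = now

    def stale_nodes(self, now: float, timeout: float) -> list:
        """Nodes whose block has not been updated within the timeout."""
        return sorted(
            name for name, slot in self.slots.items()
            if now - slot.last_write > timeout
        )

    def write_kill_block(self, victim: str, reason: str) -> None:
        """A healthy peer records the eviction message for the victim."""
        self.slots[victim].kill_block = reason

    def must_shut_down(self, node: str) -> bool:
        """Each node polls its own kill block; any content means eviction."""
        return bool(self.slots[node].kill_block)


if __name__ == "__main__":
    disk = VotingDisk()
    for second in range(5):        # seconds 0..4: both nodes write
        disk.heartbeat("node1", float(second))
        disk.heartbeat("node2", float(second))
    for second in range(5, 15):    # seconds 5..14: only node1 keeps writing
        disk.heartbeat("node1", float(second))

    for victim in disk.stale_nodes(now=14.0, timeout=5.0):
        disk.write_kill_block(victim, "missed disk heartbeat")

    print(disk.must_shut_down("node1"), disk.must_shut_down("node2"))
```

    In this run node2 stops writing after second 4, so by second 14 it is the only stale node and the only one that finds a message in its kill block.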
     
    During a reconfiguration (a node joining or leaving), CSSD monitors all nodes and determines whether each node has a disk heartbeat, including those with no network heartbeat. If no disk heartbeat is detected, the node is declared dead.
     
    What is stored in voting disk?
    ------------------------------
    Voting disks contain static and dynamic data.
    Static data : Info about nodes in the cluster
    Dynamic data : Disk heartbeat logging
     
    It holds important details about cluster node membership, such as
    - which nodes are part of the cluster,
    - which node is joining the cluster, and
    - which node is leaving the cluster.
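
    One way to picture how these three membership facts relate to each other: if the node list is viewed as successive snapshots, the joining and leaving nodes are simply the set differences between two snapshots. The snippet below is a toy model under that assumption; the function name and the snapshot representation are hypothetical and say nothing about how the voting disk actually encodes this data.

```python
def membership_changes(previous, current):
    """Compare two membership snapshots (collections of node names)."""
    previous, current = set(previous), set(current)
    return {
        "members": sorted(current),             # nodes that are part of the cluster
        "joining": sorted(current - previous),  # present now, absent before
        "leaving": sorted(previous - current),  # present before, absent now
    }


if __name__ == "__main__":
    before = {"node1", "node2", "node3"}
    after = {"node1", "node3", "node4"}
    print(membership_changes(before, after))
```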
     
    Why is voting disk needed ?
    ---------------------------
    Oracle Clusterware uses the voting disk files as a health check:
     
    - CSS uses them to determine which nodes are currently members of the cluster.
     
    - In concert with other cluster components such as CRS, they are used to shut down, fence, or reboot one or more nodes whenever network communication is lost between nodes within the cluster. This prevents the dreaded split-brain condition, in which two or more instances attempt to control the RAC database, and thus protects the database information.
     
    - The CSS daemon uses them to arbitrate with peers that it cannot see over the private interconnect in the event of an outage, allowing it to salvage the largest fully connected subcluster for further operation. It checks the voting disk to determine whether any other node in the cluster has failed. During this operation, NM makes an entry in the voting disk to record its vote on availability. Similar operations are performed by the other instances in the cluster. The three voting disks configured also provide a method to determine who in the cluster should survive. For example, if one of the nodes has to be evicted because it is unresponsive, the node that has access to two voting disks will start evicting the other node. NM alternates its action between the heartbeat and the voting disk to determine the availability of the other nodes in the cluster.
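
    The idea of salvaging the largest subcluster can be sketched in a few lines of Python. This is a simplified illustration, not the CSS algorithm: it treats interconnect reachability as transitive (plain connected components, which match fully connected subclusters only when the interconnect splits cleanly), and the tie-break in favour of the group holding the lowest node number is an assumption. The function names are hypothetical.

```python
def subclusters(nodes, links):
    """Group node numbers into sets that can still reach each other."""
    adjacency = {n: set() for n in nodes}
    for a, b in links:                 # each link is a working interconnect path
        adjacency[a].add(b)
        adjacency[b].add(a)

    seen, groups = set(), []
    for start in sorted(nodes):
        if start in seen:
            continue
        group, stack = set(), [start]
        while stack:                   # depth-first walk over reachable peers
            node = stack.pop()
            if node in group:
                continue
            group.add(node)
            stack.extend(adjacency[node] - group)
        seen |= group
        groups.append(sorted(group))
    return groups


def surviving_subcluster(nodes, links):
    """Pick the largest group; on a tie, the one with the lowest node number."""
    return min(subclusters(nodes, links), key=lambda g: (-len(g), g[0]))


if __name__ == "__main__":
    # Nodes 1-2 can talk to each other, nodes 3-4-5 can talk to each other.
    print(surviving_subcluster([1, 2, 3, 4, 5], [(1, 2), (3, 4), (4, 5)]))
```

    Here the interconnect has split the cluster into {1, 2} and {3, 4, 5}, so the three-node group is kept and nodes 1 and 2 would be the candidates for eviction.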
     
    The voting disk is the key communication mechanism within Oracle Clusterware, where all nodes in the cluster read and write heartbeat information. The CSSD processes (Cluster Synchronization Services Daemon) monitor the health of RAC nodes using two distinct heartbeats: the network heartbeat and the disk heartbeat. Healthy nodes exchange continuous network and disk heartbeats. A break in the heartbeat indicates a possible error scenario. There are a few different scenarios possible with missing heartbeats:
    1. Network heartbeat is successful, but disk heartbeat is missed.
    2. Disk heartbeat is successful, but network heartbeat is missed.
    3. Both heartbeats failed.
    In addition, with numerous nodes, there are other possible scenarios too. A few of them:
    1. Nodes have split into N sets of nodes, communicating within each set, but not with members of the other sets.
    2. Just one node is unhealthy.
    Nodes with quorum will maintain active membership of the cluster, and the other node(s) will be fenced/rebooted.
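
    The three missing-heartbeat scenarios listed above amount to a small truth table over the two heartbeats. A minimal Python sketch follows; the function name and the returned labels are made up for illustration and are not clusterware messages.

```python
def classify_heartbeats(network_ok: bool, disk_ok: bool) -> str:
    """Map the state of the two heartbeats to the scenarios in the list."""
    scenarios = {
        (True, True): "healthy: both heartbeats present",
        (True, False): "scenario 1: network heartbeat ok, disk heartbeat missed",
        (False, True): "scenario 2: disk heartbeat ok, network heartbeat missed",
        (False, False): "scenario 3: both heartbeats failed",
    }
    return scenarios[(network_ok, disk_ok)]


if __name__ == "__main__":
    for network_ok in (True, False):
        for disk_ok in (True, False):
            print(network_ok, disk_ok, "->", classify_heartbeats(network_ok, disk_ok))
```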
     
    Why should we have an odd number of voting disks?
    -------------------------------------------------
    An odd number of voting disks provides a method to determine who in the cluster should survive.
    A node must be able to access more than half of the voting disks at any time. For example, take a two-node cluster with an even number of voting disks, say 2. Suppose Node1 can access only voting disk 1 and Node2 can access only voting disk 2. Then there is no common file in which the clusterware can check the heartbeat of both nodes. If we have 3 voting disks and both nodes can access more than half of them, i.e. 2 voting disks, there will be at least one disk that is accessible by both nodes. The clusterware can use that disk to check the heartbeat of both nodes. Hence, each node should be able to access more than half of the voting disks. A node that cannot do so has to be evicted from the cluster by another node that can access more than half of the voting disks, to maintain the integrity of the cluster. After the cause of the failure has been corrected and access to the voting disks has been restored, you can instruct Oracle Clusterware to recover the failed node and restore it to the cluster.
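
    The arithmetic behind the "more than half" rule is easy to check in Python. The sketch below relies only on the rule as stated in this post; the function names are hypothetical. It computes how many disks a node must reach, how many disk failures can be tolerated, and verifies by brute force that any two majority sets of disks always share at least one disk.

```python
from itertools import combinations


def required_disks(total: int) -> int:
    """Smallest number of voting disks that is more than half of the total."""
    return total // 2 + 1


def tolerated_failures(total: int) -> int:
    """Disks that can be lost while a node can still reach a majority."""
    return total - required_disks(total)


def majorities_always_overlap(total: int) -> bool:
    """Check that every pair of minimal majority sets has a disk in common."""
    need = required_disks(total)
    majority_sets = [set(c) for c in combinations(range(total), need)]
    return all(a & b for a in majority_sets for b in majority_sets)


if __name__ == "__main__":
    for total in range(1, 6):
        print(total, required_disks(total), tolerated_failures(total),
              majorities_always_overlap(total))
```

    With 3 disks a node needs 2 and one failure is tolerated; with 4 disks a node needs 3 and still only one failure is tolerated. The extra even disk buys no additional resilience, which is why an odd count is the natural choice.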
     
    Loss of half or more of your voting disks will cause the entire cluster to fail, because no node can then access more than half of them!
    Category: Oracle DB administering | Added by: basil (08.09.2015)