Ciblint

From Linux-HA

ciblint

The ciblint program examines your CIB in detail, looking for inconsistencies, possible errors, and things you might not have noticed, and prints whatever it finds. Not everything it reports is an error, but each item is probably worth investigating until you understand it. The version below works with current versions of Pacemaker.

Source to ciblint

media:Ciblint.gz

You may have to change the value of the HA_LIBHBDIR constant in the code. This will be fixed once the program is placed under source control.

ciblint usage message

usage: ./ciblint [-C] -f cib-file 
       ./ciblint [-C] -L
       ./ciblint [-w] (-A|--list-meta_attributes-config-options)
       ./ciblint [-w] (-l|--list-crm-config-options)
       ./ciblint [-w] -h --help

  -f cib-filename               analyze CIB from this XML file
  -L --live-cib                 analyze live CIB gotten via cibadmin -Q
  -C --ignore-non-defaults      don't print messages for non-default crm_config values
  -w --wiki-format              print usage or crm-config-options in wiki format
  -l --list-crm_config-options  print all valid names for use in <nvpair> sections
                                inside the <crm_config> section
  -A --list-meta_attributes-config-options
                                print all valid names for use in <nvpair> sections
                                inside <meta_attributes> sections

CIB file can either have a status section or not.  Either is acceptable.
This program is a work-in-progress, but for many CIBs it's probably useful now.

It currently looks for a number of classes of possible errors, including these:
  - Non-unique 'id' strings for the given <tag>
  - 'id' strings (outside the status section) which are not globally unique
  - Incorrect <nvpair> 'name's or 'value's
  - Duplicate <nvpair> 'name's in a list
  - Incorrect XML attribute names or values
  - references to non-existent resources
  - references to non-existent nodes
  - invalid values for the data type (integer, boolean, enum, etc.) involved
  - resources you're not monitoring
  - non-negative values for default-resource-failure-stickiness
  - STONITH not enabled
  - No STONITH resources configured
  - Use of for-testing-only ssh or external/ssh STONITH resource agents
  - validation of <nvpair> names and values in <meta_attributes> sections
  - validation of class and type for <primitive> resources
  - validation of <attributes> names and values in <nvpair>s for <primitive> resources
  - check if clone form resource names are used for non-clone resources
  - ensure that clone form resource names have integral clone numbers
  - check for ids with ":" characters in them in <primitive>, <group>, <clone>, or <master_slave> tags
  - check for resources with non-zero failcounts
  - check for <rule> tags with both score and score_attribute
  - check for missing values and attributes in expressions
  - make sure attributes mentioned in the CIB are defined somewhere in the cluster

More documentation can be found online at http://linux-ha.org/ciblint

The section above is the output of running ciblint -w -h.
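
To make two of the checks listed above more concrete (ids that are not globally unique outside the status section, and duplicate <nvpair> names in a list), here is a minimal Python sketch. It is not ciblint's own code: the function names and the tiny sample CIB are hypothetical and exist only for illustration, and a real CIB contains far more structure.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Hypothetical, heavily trimmed CIB used only for illustration.
SAMPLE_CIB = """
<cib>
  <configuration>
    <crm_config>
      <cluster_property_set id="cib-bootstrap-options">
        <nvpair id="opt-1" name="stonith-enabled" value="true"/>
        <nvpair id="opt-2" name="stonith-enabled" value="false"/>
      </cluster_property_set>
    </crm_config>
    <resources>
      <primitive id="vip1" class="ocf" provider="heartbeat" type="IPaddr"/>
      <primitive id="vip1" class="ocf" provider="heartbeat" type="IPaddr"/>
    </resources>
  </configuration>
  <status/>
</cib>
"""

def duplicate_ids(root):
    """Return ids that occur more than once outside the <status> section."""
    counts = Counter()
    for section in root:
        if section.tag == "status":
            continue  # the status section is deliberately skipped
        for elem in section.iter():
            if "id" in elem.attrib:
                counts[elem.attrib["id"]] += 1
    return sorted(i for i, n in counts.items() if n > 1)

def duplicate_nvpair_names(root):
    """Return (parent id, name) pairs where an <nvpair> name repeats in one list."""
    problems = []
    for parent in root.iter():
        names = Counter(nv.get("name") for nv in parent.findall("nvpair"))
        problems += [(parent.get("id"), n) for n, c in names.items() if c > 1]
    return problems

root = ET.fromstring(SAMPLE_CIB.strip())
print(duplicate_ids(root))
print(duplicate_nvpair_names(root))
```

With the sample above, the first call reports the repeated resource id and the second reports the option name that is set twice in the same property set.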

Special Notes

For ciblint to perform the fullest set of checks, it needs to run some harmless lrmadmin commands as root. It currently attempts to do this using sudo, which may result in one or more password prompts.

Sample Output

INFO: CIB has non-default value for expected-quorum-votes [3].  Default value is [2]
      Explanation of expected-quorum-votes option: The number of nodes
      expected to be in the cluster
      Used to calculate quorum in openais based clusters.
INFO: CIB has non-default value for startup-fencing [false].  Default value is [true]
      Explanation of startup-fencing option: STONITH unseen nodes
      Advanced Use Only! Not using the default is very unsafe!
INFO: CIB has non-default value for pe-input-series-max [5000].  Default value is [-1]
      Explanation of pe-input-series-max option: The number of other
      PE inputs to save
      Zero to disable, -1 to store unlimited.
INFO: CIB has non-default value for dc-deadtime [5s].  Default value is [60s]
      Explanation of dc-deadtime option: How long to wait for a
      response from other nodes during startup.
      The "correct" value will depend on the speed/load of your
      network and the type of switches used.
INFO: CIB has non-default value for no-quorum-policy [ignore].  Default value is [stop]
      Explanation of no-quorum-policy option: What to do when the
      cluster does not have quorum
      What to do when the cluster does not have quorum Allowed values:
      stop, freeze, ignore, suicide
INFO: CIB has non-default value for cluster-recheck-interval [15m].  Default value is [15min]
      Explanation of cluster-recheck-interval option: Polling interval
      for time based changes to options, resource parameters and
      constraints.
      The Cluster is primarily event driven, however the configuration
      can have elements that change based on time. To ensure these
      changes take effect, we can optionally poll the cluster's status
      for changes. Allowed values: Zero disables polling. Positive
      values are an interval in seconds (unless other SI units are
      specified. eg. 5min)
INFO: CIB has non-default value for batch-limit [10].  Default value is [30]
      Explanation of batch-limit option: The number of jobs that the
      TE is allowed to execute in parallel
      The "correct" value will depend on the speed and load of your
      network and cluster nodes.
WARNING: external/ssh STONITH resource NOT approved for production
INFO: Resource vip1 running on node xen-e
INFO: Resource ping-1:0 running on node xen-e
WARNING: Resource stateful-1:0 not running anywhere.
INFO: Resource ping-1:2 running on node xen-d
INFO: Resource app1 running on node xen-d
INFO: Resource d2 running on node xen-d
INFO: Resource migrator running on node xen-d
INFO: Resource ping-1:1 running on node xen-f
WARNING: Resource FencingChild not running anywhere.
INFO: Resource stateful-1:1 running on node xen-f
INFO: Resource stateful-1:2 running on node xen-d
INFO: Resource d1 running on node xen-d
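
Because the INFO lines can easily outnumber the WARNING lines, it may help to post-process the output. The following Python sketch groups messages by severity. It assumes, based only on the sample above, that each message starts with an upper-case severity word followed by a colon and that indented lines continue the previous message. The function name group_by_severity is hypothetical and not part of ciblint.

```python
import re
from collections import defaultdict

# A few lines copied from the sample output above.
SAMPLE_OUTPUT = """\
INFO: CIB has non-default value for batch-limit [10].  Default value is [30]
      Explanation of batch-limit option: The number of jobs that the
      TE is allowed to execute in parallel
WARNING: external/ssh STONITH resource NOT approved for production
INFO: Resource vip1 running on node xen-e
WARNING: Resource stateful-1:0 not running anywhere.
"""

# Assumed format: "SEVERITY: text" starts a message; indented lines continue it.
MESSAGE_START = re.compile(r"^([A-Z]+): (.*)$")

def group_by_severity(text):
    """Map each severity to a list of messages, joining continuation lines."""
    messages = defaultdict(list)
    current = None
    for line in text.splitlines():
        match = MESSAGE_START.match(line)
        if match:
            current = [match.group(2)]
            messages[match.group(1)].append(current)
        elif current is not None and line.strip():
            current.append(line.strip())
    return {sev: [" ".join(parts) for parts in msgs]
            for sev, msgs in messages.items()}

grouped = group_by_severity(SAMPLE_OUTPUT)
for warning in grouped.get("WARNING", []):
    print(warning)
```

Run against saved ciblint output, this prints only the warnings, which are usually the first things to investigate.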