一篇读懂BBU

| 分类 Storage  | 标签 megacli 

前言

之前已经写了一些关于RAID和MegaCli的文章,这些文章大部分属于常识,但是这些常识分散在互联网的各处,到了用的时候,不太容易查找。每次排查BBU的问题,都会讲这些常识遗忘,然后再浪费时间去查,所以我痛定思痛,把相关的知识点都整理到这里。

啥是BBU

BBU到底是个啥?

因为我们不是搞硬件的,我们可以粗略地把BBU当成一块电池,这块电池用来保护RAID卡上的Cache。一般来讲,RAID卡都会有1G的Cache,这一层Cache对于RAID的写入至关重要,使用RAID卡的Cache可以显著地提升RAID的写入性能。

               HW Configuration
                ================
SAS Address      : 500605b009e97840
BBU              : Present
Alarm            : Present
NVRAM            : Present
Serial Debugger  : Present
Memory           : Present
Flash            : Present
Memory Size      : 1024MB             <--------1024MB 即RAID Cache的大小

但是这里面有个问题,就是掉电保护。我们都知道文件系统fsync或者sync相关的系统调用,会将页高速缓存(Page Cache)中的dirty flush到底层块设备,对于RAID管理的块设备,如果使用了RAID Cache,那么sync也只保证写入RAID Cache,并不保证一定落在底层持久化的磁盘上。这就带来了致命的风险,即异常掉电的时候,RAID Cache中的数据可能会丢失。

这里面有一个dirty冲刷周期的参数:

Cache Flush Interval             : 4s

因此,掉电的时候,可能会有过去4秒的dirty内容还未刷入持久化的设备,

那怎么办?BBU就粉墨登场了。

注意,如果你的RAID卡没有BBU,却非要使用WriteBack 的Write Policy,那么掉电之后,丢数据几乎是肯定的。所以要享受RAID 卡上Cache带来的性能提升,就必须配置BBU。

掉电之后,BBU能保护RAID Cache中的数据多久

BBU是能保护RAID Cache中尚未来得及写入底层持久化设备的数据,但是BBU不是核反应堆,它的保护能力是有限度的。一般来讲BBU可以保护RAID Cache中的数据2天左右。

Design Mode  : 48+ Hrs retention with a non-transparent learn cycle and moderate service life.

上述内容的含义是说,BBU承诺48小时的保护。

我们曾经有过案例,关机是强制关机,开机之后发现有data loss,后来才知道,强制关机7天之后,才开机,BBU守护了RAID Cache中dirty 内容一段时间之后,终于油尽灯枯。

如果设置Write Policy为NO Write Cache if Bad BBU,即BBU剩余电量不足与保护Cache中的内容1天以上,那么会自动讲Write Policy从WB调整成WT。

注意,机器上电之后,RAID cache中的数据会写入底层的磁盘。

如何检查是否存在BBU

MegaCli AdpAllInfo -A0中会在HW Configuration部分显示是否存在BBU:

               HW Configuration
                ================
SAS Address      : 500605b009e97840
BBU              : Present                 <-------------- present表示存在BBU
Alarm            : Present
NVRAM            : Present
Serial Debugger  : Present
Memory           : Present
Flash            : Present
Memory Size      : 1024MB

RAID的WritePolicy 何时会变成WT

我们都知道WB的效能比WT的效能要高,有以下情况会变成WT

  • 用户设置WritePolicy 为WT
MegaCli LDSetProp WT -L2 -A0
  • 设置了Write Policy 为WB并且设置了NoCachedBadBBU,但是电池坏了或者有故障了
  • battery处在“low-charge state”阶段

简单来说,就是说电池的电量不足以保持24小时的数据时,就会讲策略从WB改成WT

seqNum: 0x00016bef
Time: Fri Apr 28 05:08:02 2017

Code: 0x000000c3
Class: 1
Locale: 0x08
Event Description: BBU disabled; changing WB virtual disks to WT, Forced WB VDs are not affected
Event Data:
===========
None

这种行为一般发生在Learn Cycle的 Discharging Cycle步骤,因为会主动放点,会导致电池处于low-charge state。在这个情况下,我们通过MegaCli命令是无法将Write Policy改成WB的:

root@EBS-005-PUB01-JD-Xian:/var/log# /opt/MegaRAID/MegaCli/MegaCli64 -LDSetProp WB -immediate -L2 -A0
                                     
Battery capacity is below the threshold value

Adapter 0: Set Property Failed on Vd object

Exit Code: 0x01

注意,并不是说电池电量满足可以保证24小时数据时就会恢复成WB,而是更迟一些,电池电量充满到85%之后,才会从WT改成WB。

seqNum: 0x00016bf6
Time: Fri Apr 28 05:09:04 2017

Code: 0x00000093
Class: 0
Locale: 0x08
Event Description: Battery started charging
Event Data:
===========
None


seqNum: 0x00016bf7
Time: Fri Apr 28 06:59:34 2017

Code: 0x000000c2
Class: 0
Locale: 0x08
Event Description: BBU enabled; changing WT virtual disks to WB
Event Data:
===========
None

理解BBU的的Capacity信息

root@node01:~# /opt/MegaRAID/MegaCli/MegaCli64 -AdpBbuCmd -A0
                                     
BBU status for Adapter: 0

BatteryType: iBBU-09
Voltage: 3986 mV
Current: 0 mA
Temperature: 35 C
Battery State: Optimal
Design Mode  : 48+ Hrs retention with a non-transparent learn cycle and moderate service life.

BBU Firmware Status:

  Charging Status              : None
  Voltage                                 : OK
  Temperature                             : OK
  Learn Cycle Requested	                  : Yes
  Learn Cycle Active                      : No
  Learn Cycle Status                      : OK
  Learn Cycle Timeout                     : No
  I2c Errors Detected                     : No
  Battery Pack Missing                    : No
  Battery Replacement required            : No
  Remaining Capacity Low                  : No
  Periodic Learn Required                 : No
  Transparent Learn                       : No
  No space to cache offload               : No
  Pack is about to fail & should be replaced : No
  Cache Offload premium feature required  : No
  Module microcode update required        : No

BBU GasGauge Status: 0x0180 
  Relative State of Charge: 89 %
  Charger System State: 1
  Charger System Ctrl: 0
  Charging current: 0 mA
  Absolute state of charge: 70 %
  Max Error: 0 %
  Battery backup charge time : 48 hours +

BBU Capacity Info for Adapter: 0

  Relative State of Charge: 89 %
  Absolute State of charge: 70 %
  Remaining Capacity: 1055 mAh
  Full Charge Capacity: 1198 mAh
  Run time to empty: Battery is not being charged.  
  Average time to empty: 2 Hour, 7 Min. 
  Estimated Time to full recharge: Battery is not being charged.  
  Cycle Count: 12

BBU Design Info for Adapter: 0

  Date of Manufacture: 03/19, 2012
  Design Capacity: 1500 mAh
  Design Voltage: 4100 mV
  Specification Info: 0
  Serial Number: 562
  Pack Stat Configuration: 0x0000
  Manufacture Name: LS36691
  Firmware Version   : 
  Device Name: iBBU-09
  Device Chemistry: LION
  Battery FRU: N/A
  Transparent Learn = 0
  App Data = 0

BBU Properties for Adapter: 0

  Auto Learn Period: 28 Days
  Next Learn time: Thu Jun  1 14:17:44 2017
  Learn Delay Interval:0 Hours
  Auto-Learn Mode: Enabled
  BBU Mode = 5

Exit Code: 0x00
                           

首先要关注的信息是:

BatteryType: iBBU-09

这个显示的是BBU的型号信息。

BBU由锂离子电池和电子控制电路组成。 锂离子电池的寿命取决于其老化程度,从出厂之后,无论它是否被充电及它的充放电次数多与少,锂离子电池的容量将慢慢的减少。这意味着一个老电池无法像新电池那么持久。 也就决定了BBU的相对充电状态(Relative State of Charge)不会等于绝对充电状态(Absolute State of Charge)。

这个方面并不难理解,就想笔记本的电池一样,新买来的笔记本电池充满电之后,能够支撑笔记本使用5小时,但是随着时间流逝,使用很久之后,笔记本电池充饱电,也不能支撑笔记本使用5小时,可能只能支撑3小时,这是由于电池老化引起的。BBU也是同样的道理:

  Relative State of Charge: 89 %
  Absolute State of charge: 70 %
  Remaining Capacity: 1055 mAh
  Full Charge Capacity: 1198 mAh
  
BBU Design Info for Adapter: 0
  Design Capacity: 1500 mAh

这一段的含义是说,刚出厂的BBU,设计容量是1500 mAh,但是随着长时间的使用,电池开始老化,目前充饱电,最多只能充到1198 mAh, 另外当前的电量是1055 mAh。

Relative State of Charge = 1055 / 1198 ~ 89%
Absolute State of charge = 1055 / 1500 ~ 70%

另外一个和BBU Capacity相关的参数为:

Remaining Capacity Low                  : No

注意,一般来讲,BBU要保持住能够保护RAID Cache48小时的电量,如果保持不了,那么这个值会是Yes。 对于BBU 07这个值一般是600mAh,而BBU 08,这个值需要是674 mAh,BBU 09我暂时没有查到。

一般来讲,这个参数可以用来确定BBU是否需要替换。正常情况下,Remain Capacity Low应该是No,始终是No,因为会BBU会及时充放电,所以这个值不应该是Yes,如果这个值为Yes,这说明BBU需要更换了。

这种情况下会有以下现象发生:

  1. 一般来讲,会从eventlog中发现这样的输出:
Current capacity of  the battery is below threshold
  1. 如果Write Policy为WB,并且 No Write Cache if Bad BBU,那么BBU的状态被强制改成WT
  2. 无法将WT改成WB,会报如下错误:
root@EBS-005-PUB01-JD-Xian:/var/log# /opt/MegaRAID/MegaCli/MegaCli64 -LDSetProp WB -immediate -L2 -A0
                                     
Battery capacity is below the threshold value

Adapter 0: Set Property Failed on Vd object

Exit Code: 0x01

我们在现场,确实曾经发现有BBU出现问题

root@EBS-005-PUB01-JD-Xian:/var/log# /opt/MegaRAID/MegaCli/MegaCli64 -AdpBBUCMD -Aall
                                     
BBU status for Adapter: 0

BatteryType: iBBU-09
Voltage: 3893 mV
Current: 536 mA
Temperature: 46 C
Battery State: Degraded(Need Attention) 
		A manual learn is required.
Design Mode  : 48+ Hrs retention with a non-transparent learn cycle and moderate service life.

BBU Firmware Status:

  Charging Status              : Charging
  Voltage                                 : OK
  Temperature                             : OK
  Learn Cycle Requested	                  : Yes
  Learn Cycle Active                      : No
  Learn Cycle Status                      : OK
  Learn Cycle Timeout                     : No
  I2c Errors Detected                     : No
  Battery Pack Missing                    : No
  Battery Replacement required            : No
  Remaining Capacity x                  : Yes
  Periodic Learn Required                 : No
  Transparent Learn                       : No
  No space to cache offload               : No
  Pack is about to fail & should be replaced : No
  Cache Offload premium feature required  : No
  Module microcode update required        : No

BBU GasGauge Status: 0x0100 
  Relative State of Charge: 34 %
  Charger System State: 1
  Charger System Ctrl: 0
  Charging current: 536 mA
  Absolute state of charge: 29 %
  Max Error: 0 %
  Battery backup charge time : 19 hours

BBU Capacity Info for Adapter: 0

  Relative State of Charge: 35 %
  Absolute State of charge: 30 %
  Remaining Capacity: 452 mAh
  Full Charge Capacity: 1320 mAh
  Run time to empty: Battery is not being charged.  
  Average time to empty: 54 Min. 
  Estimated Time to full recharge: 2 Hour, 32 Min. 
  Cycle Count: 0

BBU Design Info for Adapter: 0

  Date of Manufacture: 07/08, 2015
  Design Capacity: 1500 mAh
  Design Voltage: 4100 mV
  Specification Info: 0
  Serial Number: 475
  Pack Stat Configuration: 0x0000
  Manufacture Name: LS36691
  Firmware Version   : 
  Device Name: iBBU-09
  Device Chemistry: LION
  Battery FRU: N/A
  Transparent Learn = 0
  App Data = 0

BBU Properties for Adapter: 0

  Auto Learn Period: 28 Days
  Next Learn time: Wed May  3 10:49:00 2017
  Learn Delay Interval:0 Hours
  Auto-Learn Mode: Enabled
  BBU Mode = 5

Exit Code: 0x00


root@EBS-005-PUB01-JD-Xian:/var/log# /opt/MegaRAID/MegaCli/MegaCli64 -AdpBBUCMD GetBbuCapacityInfo -A0
                                     

BBU Capacity Info for Adapter: 0

  Relative State of Charge: 37 %
  Absolute State of charge: 32 %
  Remaining Capacity: 485 mAh
  Full Charge Capacity: 1320 mAh
  Run time to empty: Battery is not being charged.  
  Average time to empty: 58 Min. 
  Estimated Time to full recharge: 2 Hour, 28 Min. 
  Cycle Count: 0

Exit Code: 0x00

注意上图,BBU 的Remain Capacity为 452 mAh,而出厂标称值为1500 mAh, Absolute State of charge: 30 %,这时候,其实已经出现问题了,注意如下输出:

Battery State: Degraded(Need Attention) 
		A manual learn is required.

检查BBU的健康情况

如何检查BBU是否需要更换呢?

RAID batteries - time to replace?这篇文章给出了一个思路:

root@EBS-005-PUB01-JD-Xian:/var/log# /opt/MegaRAID/MegaCli/MegaCli64 -AdpBBUCMD GetBbuCapacityInfo -A0
                                     

BBU Capacity Info for Adapter: 0

  Relative State of Charge: 37 %
  Absolute State of charge: 32 %             ---->查看Absolute State of charge的值
  Remaining Capacity: 485 mAh
  Full Charge Capacity: 1320 mAh
  Run time to empty: Battery is not being charged.  
  Average time to empty: 58 Min. 
  Estimated Time to full recharge: 2 Hour, 28 Min. 
  Cycle Count: 0

Exit Code: 0x00

文章给出了如下shell脚本

  bbugood() { [ $(megacli adpbbucmd GetBbuCapacityInfo ${1:-a0} | sed -ne '/^[aA]bsolute/{s/^[^[:digit:]]*\([[:digit:]]\+\).*$/\1/p}') -gt 54 ] ;}

这个shell语句有些问题,因为很明显Absolute并不出现在行首,行首还有额外的空格,因此可以使用如下方法获取Absolute State of charge的值:

/opt/MegaRAID/MegaCli/MegaCli64 adpbbucmd GetBbuCapacityInfo -a0 | sed -ne '/[aA]bsolute/{s/^[^[:digit:]]*\([[:digit:]]\+\).*$/\1/p}'
70

作者给出了经验数值,54%,低于54%则判定BBU not good。

注意,Learn Cycle的时候会先放电,然后充电,这个过程中可能会低于经验值,但是要进一步分析Battery State和Learn Cycle Status来判断是否BBU已经老化。

另外一个有价值的参数是:

Battery Replacement required            : No

当这个值为Yes的时候当然需要更换电池,但是这个参数一般比较滞后,很难通过它预判BBU需要更换。

BBU的Auto Relearn

为了记录电池的放电曲线,以便控制器了解电池的状态,例如最大和最小电压等,同时为了延长电池的寿命,默认会启用自动校准模式(AutoLearn Mode).

我们首先通过EventLog对 Auto Learn全过程做一个直观了解:

(注意BBU的时间和shell命令输出的时间比较,相差8个小时,不要被时间迷惑)

  • 预告阶段,在relearn start之前4天,2天,1天,5小时,分别预告,要开始relearn了
seqNum: 0x00012827
Time: Mon Apr 24 20:49:28 2017

Code: 0x0000009d
Class: 0
Locale: 0x08
Event Description: Battery relearn will start in 4 days
Event Data:
===========
None


seqNum: 0x00012828
Time: Wed Apr 26 20:48:58 2017

Code: 0x0000009e
Class: 0
Locale: 0x08
Event Description: Battery relearn will start in 2 day
Event Data:
===========
None

seqNum: 0x00012829
Time: Thu Apr 27 20:48:43 2017

Code: 0x0000009f
Class: 0
Locale: 0x08
Event Description: Battery relearn will start in 1 day
Event Data:
===========
None


seqNum: 0x0001282a
Time: Fri Apr 28 15:49:28 2017

Code: 0x000000a0
Class: 0
Locale: 0x08
Event Description: Battery relearn will start in 5 hours
Event Data:
===========
None
  • 第一阶段:控制器把BBU电池充满电(该步骤可能是放电后充电或直接充电,如果电池刚好满电,则直接进入第二阶段)
seqNum: 0x0001282b
Time: Fri Apr 28 20:50:38 2017

Code: 0x0000009b
Class: 0
Locale: 0x08
Event Description: Battery relearn pending: Battery is under charge
Event Data:
===========
None



第一阶段的典型时间是1分钟左右。

  • 第二阶段 开始校准, 对BBU电池执行放电。 注意,随着放电的进行,当电量低到一定程度的时候,会讲Write Policy从WB改成WT(Force WB的除外)。

注意,当relearn开始的时候,Learn Cycle Active开始从No变成Yes。

这个阶段进一步细分,其实可以细化成两个阶段

当放电量较少,剩余电量仍然超过安全门限,2A阶段:

Battery State: Optimal

随着时间流逝,放电量越来也多,剩余的电量越来越少,当剩余电量低于安全门限的时候,会将WT改成WB,此时

Battery State: Learning

我看了我的一台机器,当剩余电量为 888 mAh的时候,尚为2A阶段,当剩余电量为883 mAh的时候,已经为2B阶段了。

整个第二阶段,相关的event log如下:


seqNum: 0x0001282c
Time: Fri Apr 28 20:51:43 2017

Code: 0x00000097
Class: 0
Locale: 0x08
Event Description: Battery relearn started
Event Data:
===========
None


seqNum: 0x0001282d
Time: Fri Apr 28 20:52:48 2017

Code: 0x00000094
Class: 0
Locale: 0x08
Event Description: Battery is discharging
Event Data:
===========
None


seqNum: 0x0001282e
Time: Fri Apr 28 20:52:48 2017

Code: 0x00000098
Class: 0
Locale: 0x08
Event Description: Battery relearn in progress
Event Data:
===========
None


seqNum: 0x0001282f
Time: Fri Apr 28 21:17:45 2017

Code: 0x000000c3
Class: 1
Locale: 0x08
Event Description: BBU disabled; changing WB virtual disks to WT, Forced WB VDs are not affected
Event Data:
===========
None

从开始校准到校准结束,这个时间的典型值是1个半小时。

  • 放电完成后,完成校准,并重新开始充电, 直接达到最大电量, 整个Learn Cycle才算完成 注意: 如果第二或第三阶段被中断,重新校准的任务会停止,而不会重新执行
seqNum: 0x00012835
Time: Fri Apr 28 22:22:43 2017

Code: 0x00000099
Class: 0
Locale: 0x08
Event Description: Battery relearn completed
Event Data:
===========
None


seqNum: 0x00012836
Time: Fri Apr 28 22:23:48 2017

Code: 0x00000093
Class: 0
Locale: 0x08
Event Description: Battery started charging
Event Data:
===========
None


seqNum: 0x00012837
Time: Fri Apr 28 22:42:13 2017

Code: 0x000000c2
Class: 0
Locale: 0x08
Event Description: BBU enabled; changing WT virtual disks to WB
Event Data:
===========
None


seqNum: 0x0001283b
Time: Fri Apr 28 23:11:28 2017

Code: 0x000000f2
Class: 0
Locale: 0x08
Event Description: Battery charge complete
Event Data:
===========
None

一般充电20分钟之后,就可以讲Write Policy从WT改成WB。充电过程会继续,知道充电完成,一般充电过程过程会在1小时之内。

其实我们更关心的是,从WB改成WT,到WT改回WB,这段时间是多久,因为这段时间内,写入性能会显著下降,典型值是一个半小时左右。

seqNum: 0x0000fac8
Time: Fri Feb  3 21:18:10 2017

Code: 0x000000c3
Class: 1
Locale: 0x08
Event Description: BBU disabled; changing WB virtual disks to WT, Forced WB VDs are not affected
Event Data:
===========
None

seqNum: 0x0000fad0
Time: Fri Feb  3 22:44:48 2017

Code: 0x000000c2
Class: 0
Locale: 0x08
Event Description: BBU enabled; changing WT virtual disks to WB
Event Data:
===========
None

一般来讲,为了延长寿命,都会打开AutoLearn的模式,如果不打开,会降低BBU的寿命。

root@EBS-005-PUB01-JD-Xian:/var/log# /opt/MegaRAID/MegaCli/MegaCli64 -AdpBBUCMD GetBbuProperties -A0
                                     

BBU Properties for Adapter: 0

  Auto Learn Period: 28 Days
  Next Learn time: Wed May  3 10:49:00 2017
  Learn Delay Interval:0 Hours
  Auto-Learn Mode: Enabled
  BBU Mode = 5

Exit Code: 0x00

另一个需要关心的问题是,多久之行一次Auto Learn,这取决于硬件,比如我们这个是28天一次。

手动发起Relearn

注意,我们可以通过如下命令,手动发起Relearn

/opt/MegaRAID/MegaCli/MegaCli64 -AdpBbuCmd -BbuLearn -aALL

这个命令一旦执行,以下事情就会发生:

  • Remaining Capacity会不断升高,慢慢逼近Full Charge Capacity。

    可以通过如下指令观察:

/opt/MegaRAID/MegaCli/MegaCli64 -AdpBbuCmd -GetBbuCapacityInfo -A0

输出如下:

Every 2.0s: /opt/MegaRAID/MegaCli/MegaCli64 -AdpBBUCMd -GetBBUCapacityInfo -A0                                              Mon May 29 19:26:58 2017



BBU Capacity Info for Adapter: 0

  Relative State of Charge: 95 %
  Absolute State of charge: 75 %
  Remaining Capacity: 1127 mAh
  Full Charge Capacity: 1198 mAh
  Run time to empty: Battery is not being charged.
  Average time to empty: 2 Hour, 15 Min.
  Estimated Time to full recharge: 26 Min.
  Cycle Count: 12

Exit Code: 0x00

Every 2.0s: /opt/MegaRAID/MegaCli/MegaCli64 -AdpBBUCMd -GetBBUCapacityInfo -A0                                              Mon May 29 19:38:59 2017



BBU Capacity Info for Adapter: 0

  Relative State of Charge: 98 %
  Absolute State of charge: 78 %
  Remaining Capacity: 1171 mAh
  Full Charge Capacity: 1198 mAh
  Run time to empty: Battery is not being charged.
  Average time to empty: 2 Hour, 21 Min.
  Estimated Time to full recharge: 13 Min.
  Cycle Count: 12

Exit Code: 0x00
  • 等待稍许时间,Charging Status会变成 Charging
BBU status for Adapter: 0

BatteryType: iBBU-09
Voltage: 4091 mV
Current: 169 mA
Temperature: 38 C
Battery State: Optimal
Design Mode  : 48+ Hrs retention with a non-transparent learn cycle and moderate service life.

BBU Firmware Status:

  Charging Status              : Charging           <----------变成charging状态
  Voltage                                 : OK
  Temperature                             : OK
  Learn Cycle Requested                   : Yes
  Learn Cycle Active                      : No
  Learn Cycle Status                      : OK
  Learn Cycle Timeout                     : No
  I2c Errors Detected                     : No
  Battery Pack Missing                    : No
  Battery Replacement required            : No
  Remaining Capacity Low                  : No
  Periodic Learn Required                 : No
  Transparent Learn                       : No
  No space to cache offload               : No
  Pack is about to fail & should be replaced : No
  Cache Offload premium feature required  : No
  Module microcode update required        : No
  • 当充电完成,Charging Statue变成None。

# 11:46:00 完成充电

seqNum: 0x0000a640
Time: Mon May 29 11:46:00 2017

Code: 0x000000f2
Class: 0
Locale: 0x08
Event Description: Battery charge complete
Event Data:
===========
None

Every 2.0s: /opt/MegaRAID/MegaCli/MegaCli64 -AdpBBUCMd -GetBBUCapacityInfo -A0                                              Mon May 29 19:46:31 2017



BBU Capacity Info for Adapter: 0

  Relative State of Charge: 100 %
  Absolute State of charge: 79 %
  Remaining Capacity: 1188 mAh
  Full Charge Capacity: 1198 mAh
  Run time to empty: Battery is not being charged.
  Average time to empty: 2 Hour, 23 Min.
  Estimated Time to full recharge: Battery is not being charged.
  Cycle Count: 12

Exit Code: 0x00



Every 2.0s: /opt/MegaRAID/MegaCli/MegaCli64 -AdpBBUCmd -A0                                                                  Mon May 29 19:46:50 2017


BBU status for Adapter: 0

BatteryType: iBBU-09
Voltage: 4067 mV
Current: 0 mA
Temperature: 38 C
Battery State: Optimal
Design Mode  : 48+ Hrs retention with a non-transparent learn cycle and moderate service life.

BBU Firmware Status:

  Charging Status              : None
  Voltage                                 : OK
  Temperature                             : OK
  Learn Cycle Requested                   : Yes
  Learn Cycle Active                      : No
  Learn Cycle Status                      : OK
  Learn Cycle Timeout                     : No
  I2c Errors Detected                     : No
  Battery Pack Missing                    : No
  Battery Replacement required            : No
  Remaining Capacity Low                  : No
  Periodic Learn Required                 : No
  Transparent Learn                       : No
  No space to cache offload               : No
  Pack is about to fail & should be replaced : No
  Cache Offload premium feature required  : No
  Module microcode update required        : No

BBU GasGauge Status: 0x0200

  • 过一段时间,Relearn 开始,开始放电,此时 Learn Cycle Active 变成Yes,Charging Status变成 Discharging
seqNum: 0x0000a641
Time: Mon May 29 12:18:30 2017

Code: 0x00000097
Class: 0
Locale: 0x08
Event Description: Battery relearn started
Event Data:
===========
None



BBU status for Adapter: 0

BatteryType: iBBU-09
Voltage: 3989 mV
Current: -313 mA
Temperature: 37 C
Battery State: Optimal
Design Mode  : 48+ Hrs retention with a non-transparent learn cycle and moderate service life.

BBU Firmware Status:

  Charging Status              : Discharging          <------------relearn的开始是discharging
  Voltage                                 : OK
  Temperature                             : OK
  Learn Cycle Requested                   : Yes
  Learn Cycle Active                      : Yes        <-----------relearn start ,此处从No变成Yes
  Learn Cycle Status                      : OK
  Learn Cycle Timeout                     : No
  I2c Errors Detected                     : No
  Battery Pack Missing                    : No
  Battery Replacement required            : No
  Remaining Capacity Low                  : No
  Periodic Learn Required                 : No
  Transparent Learn                       : No
  No space to cache offload               : No
  Pack is about to fail & should be replaced : No
  Cache Offload premium feature required  : No
  Module microcode update required        : No

在这个Discharing过程中,电量逐渐降低:


Every 2.0s: /opt/MegaRAID/MegaCli/MegaCli64 -AdpBBUCMd -GetBBUCapacityInfo -A0                                              Mon May 29 20:38:16 2017



BBU Capacity Info for Adapter: 0

  Relative State of Charge: 91 %
  Absolute State of charge: 72 %
  Remaining Capacity: 1088 mAh                  <---------电量在逐渐减少
  Full Charge Capacity: 1198 mAh
  Run time to empty: 3 Hour, 32 Min.
  Average time to empty: 2 Hour, 11 Min.
  Estimated Time to full recharge: Battery is not being charged.
  Cycle Count: 12

Exit Code: 0x00

当电量低到一定程度时,就会将Write Policy 从WB改成WT,Remaining Capacity Low 从No变成Yes

seqNum: 0x0000a671
Time: Mon May 29 14:10:22 2017

Code: 0x000000c3
Class: 1
Locale: 0x08
Event Description: BBU disabled; changing WB virtual disks to WT, Forced WB VDs are not affected
Event Data:
===========
None


Mon May 29 22:10:50 CST 2017

BBU status for Adapter: 0

BatteryType: iBBU-09
Voltage: 3793 mV
Current: -297 mA
Temperature: 40 C
Battery State: Learning
Design Mode  : 48+ Hrs retention with a non-transparent learn cycle and moderate service life.

BBU Firmware Status:

  Charging Status              : Discharging
  Voltage                                 : OK
  Temperature                             : OK
  Learn Cycle Requested                   : Yes
  Learn Cycle Active                      : Yes
  Learn Cycle Status                      : OK
  Learn Cycle Timeout                     : No
  I2c Errors Detected                     : No
  Battery Pack Missing                    : No
  Battery Replacement required            : No
  Remaining Capacity Low                  : Yes       <-------电量低到了一定程度,从Remaining Capacity Low 从NO变成Yes
  Periodic Learn Required                 : No
  Transparent Learn                       : No
  No space to cache offload               : No
BBU Capacity Info for Adapter: 0

  Relative State of Charge: 67 %
  Absolute State of charge: 58 %
  Remaining Capacity: 883 mAh
  Full Charge Capacity: 1319 mAh
  Run time to empty: 2 Hour, 58 Min.
  Average time to empty: 1 Hour, 46 Min.
  Estimated Time to full recharge: Battery is not being charged.
  Cycle Count: 12

最终BBU的电量会变得非常低,然后校准完成:

seqNum: 0x0000a67a
Time: Mon May 29 17:21:02 2017

Code: 0x00000099
Class: 0
Locale: 0x08
Event Description: Battery relearn completed
Event Data:
===========
None

校准完成以后,Learn Cycle Required和Learn Cycle Active 都变成No,同时,Battery State可能暂时变成Failed状态。

Tue May 30 01:21:15 CST 2017
^M
BBU status for Adapter: 0

BatteryType: iBBU-09
Voltage: 3267 mV
Current: 0 mA
Temperature: 34 C
Battery State: Failed
Design Mode  : 48+ Hrs retention with a non-transparent learn cycle and moderate service life.

BBU Firmware Status:

  Charging Status              : Discharging
  Voltage                                 : Low
  Temperature                             : OK
  Learn Cycle Requested                   : No
  Learn Cycle Active                      : No
  Learn Cycle Status                      : OK
  Learn Cycle Timeout                     : No
  I2c Errors Detected                     : No
  Battery Pack Missing                    : No
  Battery Replacement required            : No
  Remaining Capacity Low                  : Yes
  
  ...
  BBU Capacity Info for Adapter: 0

  Relative State of Charge: 11 %
  Absolute State of charge: 9 %
  Remaining Capacity: 139 mAh
  Full Charge Capacity: 1316 mAh
  Run time to empty: Battery is not being charged.
  Average time to empty: 17 Min.
  Estimated Time to full recharge: 5 Hour, 41 Min.
  Cycle Count: 13

注意这个状态不会停留太久。校准完成之后,就开始充电,因此,Battery State变成 Degraded(Charging),同时Charging Status变成Charging。

Tue May 30 01:22:15 CST 2017
^M
BBU status for Adapter: 0

BatteryType: iBBU-09
Voltage: 3492 mV
Current: 522 mA
Temperature: 34 C
Battery State: Degraded(Charging)                          <----------状态变成Degraded(Charging)
Design Mode  : 48+ Hrs retention with a non-transparent learn cycle and moderate service life.

BBU Firmware Status:

  Charging Status              : Charging                   <--------从Discharing变成Charging
  Voltage                                 : Low
  Temperature                             : OK
  Learn Cycle Requested                   : No
  Learn Cycle Active                      : No
  Learn Cycle Status                      : OK
  Learn Cycle Timeout                     : No
  I2c Errors Detected                     : No
  Battery Pack Missing                    : No
  Battery Replacement required            : No
  Remaining Capacity Low                  : Yes
  Periodic Learn Required                 : No
  Transparent Learn                       : No

当充电电量涨到一定程度时,Remaining Capacity Low变成No,Battery Status从Degraded(Charing)变成Optimal,这个时候会将WT改回WB。


seqNum: 0x0000a67c
Time: Mon May 29 18:47:42 2017

Code: 0x000000c2
Class: 0
Locale: 0x08
Event Description: BBU enabled; changing WT virtual disks to WB       
Event Data:
===========
None


Exit Code: 0x00
Tue May 30 02:48:26 CST 2017
^M
BBU status for Adapter: 0

BatteryType: iBBU-09
Voltage: 3981 mV
Current: 524 mA
Temperature: 38 C
Battery State: Optimal                             <-------------Battery State 从Degraded(Charging)变成Optimal
Design Mode  : 48+ Hrs retention with a non-transparent learn cycle and moderate service life.

BBU Firmware Status:

  Charging Status              : Charging
  Voltage                                 : OK
  Temperature                             : OK
  Learn Cycle Requested                   : No
  Learn Cycle Active                      : No
  Learn Cycle Status                      : OK
  Learn Cycle Timeout                     : No
  I2c Errors Detected                     : No
  Battery Pack Missing                    : No
  Battery Replacement required            : No
  Remaining Capacity Low                  : No      <-----------Remaining Capacity Low 变成No
  Periodic Learn Required                 : No
  Transparent Learn                       : No
  No space to cache offload               : No
  Pack is about to fail & should be replaced : No
  Cache Offload premium feature required  : No
  Module microcode update required        : No
                                                
BBU Capacity Info for Adapter: 0

  Relative State of Charge: 69 %
  Absolute State of charge: 59 %
  Remaining Capacity: 897 mAh                         <----------电量已经足够高了
  Full Charge Capacity: 1316 mAh
  Run time to empty: Battery is not being charged.
  Average time to empty: 1 Hour, 48 Min.
  Estimated Time to full recharge: 1 Hour, 36 Min.
  Cycle Count: 13

继续充电,知道充电完成,此时Charging Status变成None

seqNum: 0x0000a684
Time: Mon May 29 19:06:07 2017

Code: 0x000000f2
Class: 0
Locale: 0x08
Event Description: Battery charge complete
Event Data:
===========
None



Tue May 30 03:06:28 CST 2017
^M
BBU status for Adapter: 0

BatteryType: iBBU-09
Voltage: 3955 mV
Current: 0 mA
Temperature: 38 C
Battery State: Optimal
Design Mode  : 48+ Hrs retention with a non-transparent learn cycle and moderate service life.

BBU Firmware Status:

  Charging Status              : None                   ---------------Charging Status从Charging变成None
  Voltage                                 : OK
  Temperature                             : OK
  Learn Cycle Requested                   : No
  Learn Cycle Active                      : No
  Learn Cycle Status                      : OK
  Learn Cycle Timeout                     : No
  I2c Errors Detected                     : No
  Battery Pack Missing                    : No
  Battery Replacement required            : No
  Remaining Capacity Low                  : No
  Periodic Learn Required                 : No
  Transparent Learn                       : No
  No space to cache offload               : No
  Pack is about to fail & should be replaced : No
  Cache Offload premium feature required  : No
  Module microcode update required        : No

BBU Capacity Info for Adapter: 0

  Relative State of Charge: 80 %
  Absolute State of charge: 69 %
  Remaining Capacity: 1043 mAh
  Full Charge Capacity: 1316 mAh
  Run time to empty: Battery is not being charged.
  Average time to empty: 2 Hour, 5 Min.
  Estimated Time to full recharge: 5 Day, 16 Hour, 30 Min.

参考文献

  1. RAID batteries - time to replace?

上一篇     下一篇