Piotr Szymaniak
2014-02-26 13:32:02 UTC
Hi,
My system crashed after 160+ days of uptime. After a hard reboot I
noticed that my rrd database looks corrupted.
So I changed some recent checkpoints to snapshots, mounted them and...
all the rrd files are the same!
Here's some info about the current state:
wloczykij ~ # lscp /dev/sda3 | grep ss
211211 2014-02-22 16:58:11 ss - 119 54904
211219 2014-02-22 18:18:28 ss - 124 54910
211811 2014-02-25 00:39:21 ss - 140 54922
211872 2014-02-25 09:47:16 ss - 160 54922
212008 2014-02-26 01:13:14 ss - 114 54929
212026 2014-02-26 03:22:45 ss - 28 54928
212042 2014-02-26 04:13:48 ss - 29 54928
212045 2014-02-26 04:24:00 ss - 29 54928
wloczykij ~ # mount | grep cp
/dev/sda3 on /tmp/211219 type nilfs2 (ro,cp=211219)
/dev/sda3 on /tmp/211211 type nilfs2 (ro,cp=211211)
/dev/sda3 on /tmp/212026 type nilfs2 (ro,cp=212026)
/dev/sda3 on /tmp/212045 type nilfs2 (ro,cp=212045)
wloczykij ~ # for sumrrd in 211219 211211 212026 212045; do md5sum /tmp/$sumrrd/var/www/grubelek.pl/termometr/temp0.rrd; done
71f60c620a493021bb5e1c32c555abe8 /tmp/211219/var/www/grubelek.pl/termometr/temp0.rrd
71f60c620a493021bb5e1c32c555abe8 /tmp/211211/var/www/grubelek.pl/termometr/temp0.rrd
71f60c620a493021bb5e1c32c555abe8 /tmp/212026/var/www/grubelek.pl/termometr/temp0.rrd
71f60c620a493021bb5e1c32c555abe8 /tmp/212045/var/www/grubelek.pl/termometr/temp0.rrd
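The comparison above can also be scripted so that it reports a single number instead of a list to eyeball. This is only a minimal sketch: the function name count_distinct_sums is hypothetical (not part of nilfs-utils or coreutils), and it assumes md5sum, awk, sort and wc are available.

```shell
#!/bin/sh
# Hypothetical helper (not from the original post): print how many
# distinct MD5 sums occur among the files given as arguments.
# A result of 1 means every copy has identical content.
count_distinct_sums() {
    md5sum "$@" | awk '{ print $1 }' | sort -u | wc -l | tr -d ' '
}

# Example call, using the snapshot mount points from the post:
#   count_distinct_sums /tmp/211219/var/www/grubelek.pl/termometr/temp0.rrd \
#                       /tmp/212045/var/www/grubelek.pl/termometr/temp0.rrd
```

For the four snapshots listed above this would print 1, since all four checksums match.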
This is bad news! What should I do next? All the rrd dumps have the same
last update time:
<lastupdate>1376166602</lastupdate> <!-- 2013-08-10 22:30:02 CEST -->
(looks like the time of the previous boot, before the crash?)
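The epoch value in the <lastupdate> element can be checked from the shell. A small sketch, assuming GNU date (the -d @SECONDS form is a GNU coreutils extension); the function name epoch_to_utc is hypothetical:

```shell
#!/bin/sh
# Hypothetical helper: convert a Unix timestamp to a UTC date string.
# Assumes GNU date; BSD date uses a different syntax (-r SECONDS).
epoch_to_utc() {
    date -u -d "@$1" '+%Y-%m-%d %H:%M:%S'
}

epoch_to_utc 1376166602   # prints 2013-08-10 20:30:02
```

20:30:02 UTC corresponds to 22:30:02 CEST (UTC+2), which matches the comment in the dump.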
I just moved the rrd to btrfs and made a subvolume snapshot, and after about an
hour the rrd files differ:
wloczykij ~ # md5sum /home/services/termometr/temp0.rrd /home/snapshot-2014-02-26/services/termometr/temp0.rrd
2999dc7071d94e701d5246d79ccc488f /home/services/termometr/temp0.rrd
1621f31fb7c27f1f3c0b0d8f0f5ede9e /home/snapshot-2014-02-26/services/termometr/temp0.rrd
wloczykij ~ # nilfs-tune -l /dev/sda3
nilfs-tune 2.1.5
Filesystem volume name: (none)
Filesystem UUID: f18e80b1-f3c1-49ec-baa5-39c0edc4c0b9
Filesystem magic number: 0x3434
Filesystem revision #: 2.0
Filesystem features: (none)
Filesystem state: invalid or mounted
Filesystem OS type: Linux
Block size: 4096
Filesystem created: Sat Aug 13 10:36:21 2011
Last mount time: Wed Feb 26 09:33:53 2014
Last write time: Wed Feb 26 14:15:29 2014
Mount count: 59
Maximum mount count: 50
Reserve blocks uid: 0 (user root)
Reserve blocks gid: 0 (group root)
First inode: 11
Inode size: 128
DAT entry size: 32
Checkpoint size: 192
Segment usage size: 16
Number of segments: 465
Device size: 3908042752
First data block: 1
# of blocks per segment: 2048
Reserved segments %: 5
Last checkpoint #: 212170
Last block address: 546866
Last sequence #: 35128
Free blocks count: 227328
Commit interval: 600
# of blks to create seg: 0
CRC seed: 0x1a1e847d
CRC check sum: 0x57f59c5c
CRC check data size: 0x00000118
wloczykij ~ # uname -sr
Linux 3.4.56
Piotr Szymaniak.
--
(...) they acted as if they had discovered the principles governing quantum
physics and then used them to design a new TV game show
- and then, what is worse, concluded that this is all quantum physics
is good for...
-- Stephen King, "Dreamcatcher"