Category Archives: Storage

My weird obsession with computer storage

GRUB physical volume pv0 not found – one fix

I have had one of those days today…

One of the mount points on my server had gotten a little tight for space, so I decided to grow it before it became a problem. Naturally I use LVM, so that's not an issue, assuming you have free space on an LVM physical volume, that is. Thankfully this server runs under virtualisation, so adding an additional physical volume was a simple task, and the expansion went as you would expect: flawlessly.

Whilst I was at it, I thought I'd upgrade this box from Debian Squeeze to Wheezy too. I'm already running Wheezy elsewhere, and it's an upgrade I have done enough times to trust (mostly 😉 ). That went well, but there were a few little grumbles from GRUB about a missing PV that worried me. So I took a snapshot (including the VM's memory) before running the magical restart. I am VERY glad I did!

Alas, it seems there is a bug in GRUB 1.99 that, from my Googling, has been around a while! It isn't too happy with an LVM configuration that has multiple PVs. After reverting my non-booting box to its running snapshot, I was able to attempt 'update-grub', only to be greeted with a long list of errors about pv0 not being found.

Thankfully, I think I have found a workaround, although it's not pretty and it is a bit time consuming. Essentially, present a new PV to the box that is large enough to hold the data from all the existing PVs (and probably a little extra room for growth), then move your data from all the original PVs to this new PV with pvmove. This will take some time; running pvmove -v in another session/terminal lets you at least see the progress of the migration.

Once all the data is on your new PV, remove the old PVs from the active VG, then strip their PV labels with pvremove. (The disks can stay connected and still show up as disks, just not as PVs.) With that done, run update-grub again and, fingers crossed, you should have a bootable system once more.
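For anyone who wants the sequence spelled out, here is a rough sketch of the steps above. It is illustrative rather than a transcript of what was actually run: the VG name (vg0) and the device names are made up, and pvcreate, vgextend and vgreduce are my assumption of the usual LVM commands for "present a new PV" and "remove the old PVs from the VG". It defaults to a dry run that only prints the commands.

```shell
#!/bin/sh
# Sketch of the PV consolidation workaround.
# ASSUMPTIONS: the VG name and device paths passed in are hypothetical;
# substitute your own. Needs root when DRY_RUN=0.

DRY_RUN="${DRY_RUN:-1}"   # 1 = only print the commands, 0 = execute them

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# usage: consolidate_pvs <vg> <new_pv> <old_pv>...
consolidate_pvs() {
    vg="$1"; new_pv="$2"; shift 2
    run pvcreate "$new_pv"                 # initialise the new disk as a PV
    run vgextend "$vg" "$new_pv"           # add it to the existing VG
    for old_pv in "$@"; do
        run pvmove -v "$old_pv" "$new_pv"  # migrate extents off the old PV
    done
    for old_pv in "$@"; do
        run vgreduce "$vg" "$old_pv"       # drop the emptied PV from the VG
        run pvremove "$old_pv"             # wipe its PV label
    done
    run update-grub                        # regenerate the GRUB configuration
}
```

Calling consolidate_pvs vg0 /dev/sdd1 /dev/sdb1 /dev/sdc1 prints the plan; only set DRY_RUN=0 once the plan looks right and you have a snapshot to fall back on.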

I would certainly recommend you fire off a new snapshot, including memory, before rebooting. It's an excellent safety net to get you back on the box.

Hopefully this will help someone else get out of this situation if they are unlucky enough to find themselves in it.

Brocade fabric switch error (AD VF conflict)

I just had this error on an ISL between two switches. My initial Googling didn't yield much of use, so I thought I may as well blog it and the solution I used here, in the hope it will be of help to others.

The scenario: connecting a new Brocade 5100 switch with no configuration to an existing Brocade 5000 based fabric that uses Administrative Domains (AD). Both the new 5100 and the 5000 are running the same release of Fabric OS, and the 5100 has nothing but the barest of bare configurations (its IP address and authentication credentials). With the ports connected, a switchshow on each switch yields the following for the ISL port between them:

5100

LS Attributes:    [FID: 128, Base Switch: No, Default Switch: Yes, Address Mode 0]

Index Port Address Media Speed State     Proto
==============================================
  0   0   0a0000   id    N4   Online      FC  LS E-Port  segmented,(AD VF conflict)

5000

Address Mode:    0

Index Port Address Media Speed State     Proto
==============================================
  0   0   030000   id    N4   Online      FC  LS E-Port  segmented,(AD conflict)

I have included the last line of the information block from each switch as it gives a clue: the 5100 has additional lines here and more features. FID relates to Virtual Fabrics (VF), a feature not supported on the 5000 in this case (I suspect it is not supported at all on that model, but am not 100% sure of this).
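On a switch with a lot of ports it can be handy to pull out just the segmented ones. A minimal sketch, assuming you have saved the switchshow output to a file or are piping it in, and that the lines look like the ones above (the function name is my own invention):

```shell
# Sketch: print "port <n>: <reason>" for every port flagged as segmented.
# ASSUMPTION: input is switchshow output in the layout shown above, where
# the second column is the port number and the reason is in parentheses.
segmented_ports() {
    awk '/segmented/ {
        reason = $0
        sub(/^.*segmented,[ ]*\(/, "", reason)  # drop everything up to "("
        sub(/\).*$/, "", reason)                # drop the closing ")" onwards
        print "port " $2 ": " reason
    }' "$@"
}
```

Fed the 5100 line above, it prints "port 0: AD VF conflict". It reads from stdin when no file name is given.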

The big clue for me is the “(AD VF conflict)”. A bit of Googling revealed that it is not permissible to use Virtual Fabrics and Administrative Domains together; I believe VF is the new AD, and hence the two are mutually exclusive.  So the fix?

Simple, disable one or the other.  Now as the established fabric in this case is utilising AD, I am not going to disable that (as much as I would love to bin it 🙂 ) so disabling the VF on the new 5100 is the order of the day.  This is thankfully simple enough to do.

Use the command

fosconfig --show

to check whether Virtual Fabrics is enabled. If you have created any logical switches beyond the default one (lscfg --show lists them), remove those first.  If you haven’t configured any, you won’t have any. Then

fosconfig --disable vf

will prompt you to confirm that you wish to disable VF. This requires a reboot of the switch you are on, and a full reboot it is too: the switch will stop passing frames.  Once the reboot has completed, the switch should come back up and happily merge with your existing fabric.

Hopefully this will be of use to someone else. As with anything I write on here, use this information at your own risk; I am not going to accept responsibility if you wipe out your company’s fabric, or set fire to your gran, doing this.

Storage Jenga

If you look after storage arrays you may have come to a situation like this.

You have a disk array and you want to turn it off (for whatever reason: age, replacement, it smells of old cheddar, whatever).  Alas, there is some customer still using a portion of it, so you now need to wait for them to leave before you can decommission it.

Well, I have had an idea to ‘help’ them in their exodus from the shelf.  Quite simply, you enact storage Jenga.  How does this work?  Easy: every day, couple of days or couple of weeks (depending on your desired timescale and array size), you send a member of staff to site. Once on site, that member of staff physically removes a single disk from the array and leaves.  Now the customer has a chance of surviving this, depending on your configuration and hot spares, but each disk removed increases their odds of goodbye 🙂

I imagine this will work with maximum excitement on RAID 10 arrays with a couple of hot spares, possibly pulling a disk JUST before the rebuild onto the hot spare completes 🙂

Wonder if management will let me try it some time? Hmmmmm 😀