Discussion:
c0t0d0 versus c0t${blah}d0 ... and partitions and mirroring in solaris 11 express
Edward Ned Harvey
2011-07-22 00:15:19 UTC
Can anyone explain why this is, or needs to be?

In some systems, disks are named c0t0d0 etc

In some systems, they're named like this: c0t5000C5003424396Bd0



Worse yet...

In my present system,

foo=5000C5003424396B

bar=600C0FF000000000092C4D22A708D800

when I run format, I see c0t${foo}d0

If I ls /dev/rdsk/c0t${foo}* I see all the usual suspects... p0 to p4, and
s0 to s15 ... But I don't see any "d0" without any "p" or "s"

If I ls /dev/rdsk/*d0 then I see $bar



Here's why I care:

I installed s11e to a partition of a 2T drive. Now I want to mirror it, so
I want to replicate the fdisk partitions & partition slices onto the 2nd
disk... Nothing I do inside of "format" seems to make them identical, so I
considered using a low-level dd copy of the raw disk, but the device name
for the raw disk doesn't exist (nothing /dev/rdsk/*d0 matching the name of
the disk in my zpool), unless it's one of those other $bar things... In
which case I don't know which one is which. I have a 50/50 chance of doing
it right, or destroying the original. That's all assuming it's even valid
to attempt doing such a thing.



Right now, the only technique I can think of that will work is ... I'll
ignore the partition&slice tables on the original disk, and create a new
fdisk partition & partition slice scheme on the 2nd disk as I wish. Then
I'll zfs send the rpool onto the 2nd disk, install grub etc, and wipe out
the first disk. Boot from the 2nd disk. Then re-partition the first disk
the same as the 2nd disk and start mirroring. This sounds like a pointless
hassle that must be avoidable SOME how.



;-) Thanks for any answers/info/suggestions.
Benoit
2011-07-22 02:06:04 UTC
Multipathing I/O on Solaris is certainly an interesting topic ...

A quick start : http://www.petertribble.co.uk/Solaris/mpxio.html
More at : http://hub.opensolaris.org/bin/view/Project+mpxio/WebHome
And the original documentation :
http://download.oracle.com/docs/cd/E19957-01/819-0139/

Best regards
-benoit
------------------------------------------------------------------------
_______________________________________________
opensolaris-discuss mailing list
Edward Ned Harvey
2011-07-22 12:33:08 UTC
Sent: Thursday, July 21, 2011 10:06 PM
Multipathing I/O on Solaris is certainly an interesting topic ...
A quick start : http://www.petertribble.co.uk/Solaris/mpxio.html
More at : http://hub.opensolaris.org/bin/view/Project+mpxio/WebHome
http://download.oracle.com/docs/cd/E19957-01/819-0139/
That is interesting. Thanks for the info... However...

The really interesting thing is that the lu itself is the s2 inside the
solaris partition... So it's not the whole disk...
sudo mpathadm list lu
/dev/rdsk/c0t5000C5003424396Bd0s2
Total Path Count: 1
Operational Path Count: 1
/dev/rdsk/c0t5000C5002637311Fd0s2
Total Path Count: 1
Operational Path Count: 1

So again, for clarity:

I installed sol11e, and during installation, opted to use a partition of the
hard drive instead of whole drive. End result is the fdisk partition
occupies 5% of the disk, and inside it, there is the solaris partition
slices, s2 occupies 100% of the 5% fdisk partition. I want to replicate
this config onto the 2nd disk identically for the purposes of mirroring.

Since the multipath device $foo (c0t5000C5003424396Bd0s2) is itself just
inside the fdisk partition, it means I can't use that device to replicate
the fdisk partition. I'll have to somehow figure out which d0 device $bar
(/dev/rdsk/c0t600C0FF000000000092C4D22A708D800d0) correlates to the first
disk versus the second disk.

I guess I'll just have to remove the 2nd disk, see which one disappears, and
go from there.
Peter Tribble
2011-07-22 12:37:52 UTC
On Fri, Jul 22, 2011 at 1:15 AM, Edward Ned Harvey
Post by Edward Ned Harvey
Can anyone explain why this is, or needs to be?
In some systems, disks are named c0t0d0 etc
In some systems, they're named like this: c0t5000C5003424396Bd0
That's multipathing, usually by the name of MPXIO (or sometimes the
old unbundled STMS - which is where the name of the stmsboot command
comes from).

The c0t0d0 names tend to refer to physical locations; the longer ones
use the WWN or serial number of the device. Do an ls -l on the long
one and it'll probably point to a device node with scsi_vhci in the name.

Try:

mpathadm show LU /dev/rdsk/c0t5000C5003424396Bd0s2

and you'll get more information.
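
For illustration only (this is not part of the original reply): the `ls -l` check described above could be wrapped in a small shell function. The name `is_mpxio_link` is hypothetical, and the sketch assumes a POSIX-style shell such as bash or ksh93 and that multipathed device nodes sit under a `scsi_vhci` directory, as the text says they probably do.

```shell
# Hypothetical helper: succeed if the given /dev/rdsk (or /dev/dsk) symlink
# points at a device node under scsi_vhci, i.e. it is handled by MPxIO.
is_mpxio_link() {
    # "ls -l" prints "... name -> target" for a symlink; keep only the target.
    target=$(ls -l "$1" 2>/dev/null | sed -n 's/.* -> //p')
    case "$target" in
        */scsi_vhci/*) return 0 ;;
        *)             return 1 ;;
    esac
}
```

Something like `is_mpxio_link /dev/rdsk/c0t5000C5003424396Bd0s2 && echo multipathed` would then report whether a given name goes through the multipathing layer.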

Note for the future: many current systems come with multipathing on
from the start. For example, the X4170 and T3 systems I'm currently
building come up with c0t5000C500397689DBd0 style names by default.
Mapping those back to physical slots is fun. Doesn't really matter on the
T3 which, being sparc, can boot sanely off any device. A bit more work
on the X4170 matching up the boot order in the BIOS with the devices
you can see from Solaris. (Hint: install the Hardware Management Pack
and then you can get everything from the Storage tab in the ilom web
interface.) That's just an aside, though.
Post by Edward Ned Harvey
Worse yet...
In my present system,
foo=5000C5003424396B
bar=600C0FF000000000092C4D22A708D800
when I run format, I see c0t${foo}d0
If I ls /dev/rdsk/c0t${foo}* I see all the usual suspects... p0 to p4, and
s0 to s15 ... But I don't see any "d0" without any "p" or "s"
If I ls /dev/rdsk/*d0 then I see $bar
That often happens. It shouldn't matter. If the d0 node doesn't exist, then
telling zfs to use it as a vdev will usually force it to be created.

However, if this is a root pool then you should be using slices rather
than the whole disk anyway. (And SMI labels rather than EFI labels
if I remember correctly.)
Post by Edward Ned Harvey
I installed s11e to a partition of a 2T drive.  Now I want to mirror it, so
I want to replicate the fdisk partitions & partition slices onto the 2nd
disk... Nothing I do inside of "format" seems to make them identical,
format, or fdisk? You need to get the fdisk identical first. And if they're
different drives, then good luck...
Post by Edward Ned Harvey
so I
considered using a low-level dd copy of the raw disk, but the device name
for the raw disk doesn't exist (nothing /dev/rdsk/*d0 matching the name of
the disk in my zpool), unless it's one of those other $bar things...  In
which case I don't know which one is which.  I have a 50/50 chance of doing
it right, or destroying the original.  That's all assuming it's even valid
to attempt doing such a thing.
The device names should match up the world wide number on the
drive. Or serial. Whatever, everything $foo refers to one drive, everything
$bar to the other one.

As I said, if this is an rpool then don't use d0 anyway. Which may well be
why d0 doesn't exist. If it's not, then give zfs the d0 name and it should
create it.
--
-Peter Tribble
http://www.petertribble.co.uk/ - http://ptribble.blogspot.com/
Edward Ned Harvey
2011-07-22 14:29:29 UTC
Sent: Friday, July 22, 2011 8:38 AM
Post by Edward Ned Harvey
I installed s11e to a partition of a 2T drive.  Now I want to mirror it, so
I want to replicate the fdisk partitions & partition slices onto the 2nd
disk... Nothing I do inside of "format" seems to make them identical,
format, or fdisk? You need to get the fdisk identical first. And if they're
different drives, then good luck...
I go into format, select the first disk (0) and go into fdisk. Initially
there was only one solaris2 partition occupying 5% of the drive. So I
created another one.

Then I switch to disk 1, and try to replicate the fdisk partitions from the
first disk, and nothing I do seems to make it quite the same...
The device names should match up the world wide number on the
drive. Or serial. Whatever, everything $foo refers to one drive, everything
$bar to the other one.
Nope, perhaps I wasn't perfectly clear. Let's try this:

Currently there are 2 disks attached.
I run "format" and I see:
c0t5000C5003424396Bd0
c0t5000C5002637311Fd0
That's what I'm calling $foo
ls /dev/rdsk/*d0
/dev/rdsk/c0t600C0FF000000000092C4D22A708D800d0
/dev/rdsk/c0t600C0FF000000000092C6D3F973A8E00d0
That's what I'm calling $bar
Notice it doesn't match.

I figure I can yank out the 2nd disk... And see which one disappears. Then
reattach it, and I can use dd to copy 1 MB from the first to the second disk.
That should take care of fdisk partitions (and maybe partition slices too,
I'm not sure.)

Side note:
You know why this is so dang confusing? Because fdisk partitions contain
partition slices, and there's no natural way to reference all those things
individually... Never mind the ambiguous use of the term "partition" in both
cases. I think a more sane model would be something like this...
c0t0d0p0s0
c0t0d0p0s2
c0t0d0p1s0
c0t0d0p1s2
...
End side note. ;-)
As I said, if this is an rpool then don't use d0 anyway. Which may well be
why d0 doesn't exist. If it's not, then give zfs the d0 name and it should
create it.
I have no intent to use d0 as an argument to zpool in this case... But I
don't know any way to replicate the fdisk partitions without using d0.

Thanks for your feedback...
Edward Ned Harvey
2011-07-22 18:23:11 UTC
Sent: Friday, July 22, 2011 8:38 AM
That often happens. It shouldn't matter. If the d0 node doesn't exist, then
telling zfs to use it as a vdev will usually force it to be created.
For the 2nd disk, that works.
sudo ls -l /dev/rdsk/c0t5000C5002637311Fd0
ls: cannot access /dev/rdsk/c0t5000C5002637311Fd0: No such file or directory

sudo zpool create junk c0t5000C5002637311Fd0
sudo ls -l /dev/rdsk/c0t5000C5002637311Fd0
lrwxrwxrwx 1 root root 53 2011-07-22 14:01 /dev/rdsk/c0t5000C5002637311Fd0 -> ../../devices/scsi_vhci/***@g5000c5002637311f:wd,raw

However, the goal is to replicate the partition tables from the first disk
to the second disk. Which means (as far as I can tell) that the d0 must
exist for the first disk too... In order to have a device to copy *from*.

I'm still practicing my google-fu at the moment, but if anyone knows how to
create that device node for the first disk (containing rpool) that would be
much appreciated...

Thanks...
Peter Tribble
2011-07-23 11:51:18 UTC
On Fri, Jul 22, 2011 at 7:23 PM, Edward Ned Harvey
Post by Edward Ned Harvey
Sent: Friday, July 22, 2011 8:38 AM
That often happens. It shouldn't matter. If the d0 node doesn't exist, then
telling zfs to use it as a vdev will usually force it to be created.
For the 2nd disk, that works.
       sudo ls -l /dev/rdsk/c0t5000C5002637311Fd0
       ls: cannot access /dev/rdsk/c0t5000C5002637311Fd0: No such file or
directory
       sudo zpool create junk c0t5000C5002637311Fd0
       sudo ls -l /dev/rdsk/c0t5000C5002637311Fd0
       lrwxrwxrwx 1 root root 53 2011-07-22 14:01
/dev/rdsk/c0t5000C5002637311Fd0 ->
However, the goal is to replicate the partition tables from the first disk
to the second disk.  Which means (as far as I can tell) that the d0 must
exist for the first disk too... In order to have a device to copy *from*.
The d0 device node is completely irrelevant to that.

To replicate the disk, you need to use fdisk to copy the partition table.
That uses the ...p0 device. Assuming identical drives it should be:

fdisk -W outfile /dev/rdsk/c*disk1*p0

to save the partition table from the first disk, and

fdisk -F outfile /dev/rdsk/c*disk2*p0

to write it to the second.

Then to copy the solaris slice information use s2:

prtvtoc /dev/rdsk/c*disk1*s2 | fmthard -s - /dev/rdsk/c*disk2*s2

If the 2nd disk has an EFI label on it, then you need to go into format -e
and relabel it with an SMI label.
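
The three commands above could be strung together roughly as follows. This is a sketch under assumptions, not a tested procedure: the function name `replicate_layout`, the `DRY_RUN` switch and the temporary file path are all made up for illustration, and it assumes identical drives and an SMI label on the target, as stated above. By default it only prints the commands it would run.

```shell
# Hypothetical wrapper around the fdisk / prtvtoc / fmthard steps.
# DRY_RUN=1 (the default here) prints each command instead of running it.
DRY_RUN=${DRY_RUN:-1}

run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "$1"
    else
        sh -c "$1"
    fi
}

# Usage: replicate_layout SRC DST
# SRC and DST are disk names without a p/s suffix, e.g. c0t5000C5003424396Bd0.
replicate_layout() {
    src=$1
    dst=$2
    # Refuse to run without two different disk names.
    if [ -z "$src" ] || [ -z "$dst" ] || [ "$src" = "$dst" ]; then
        echo "usage: replicate_layout SRC DST (two different disks)" >&2
        return 1
    fi
    table=${TMPDIR:-/tmp}/fdisk-table.$$
    # 1. Save the fdisk table of the source, 2. write it to the target,
    # 3. copy the Solaris slice table (VTOC) via s2.
    run "fdisk -W $table /dev/rdsk/${src}p0" &&
    run "fdisk -F $table /dev/rdsk/${dst}p0" &&
    run "prtvtoc /dev/rdsk/${src}s2 | fmthard -s - /dev/rdsk/${dst}s2"
}
```

Calling `replicate_layout c0t5000C5003424396Bd0 c0t5000C5002637311Fd0` in dry-run mode shows the exact device names before anything is written, which guards against copying in the wrong direction.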
--
-Peter Tribble
http://www.petertribble.co.uk/ - http://ptribble.blogspot.com/
Edward Ned Harvey
2011-07-23 13:33:06 UTC
Sent: Saturday, July 23, 2011 7:51 AM
The d0 device node is completely irrelevant to that.
To replicate the disk, you need to use fdisk to copy the partition
table. That uses
Yes, that seems to have worked, thank you! :-)
...
So ... Is the p0 device synonymous with the d0 device?
Cindy Swearingen
2011-07-25 14:59:01 UTC
Hi Ned,

The p* devices represent the fdisk partition.

The disk naming is very confusing so let's summarize:

c*t*d*p* - represents the larger fdisk partition. Do not use these
for your storage pools.

c*t*d* - represents the whole disk. Use these for your pools
except when creating a root pool for booting. Do not use s2, the
slice that represents the whole disk in the VTOC label, for
anything.

c*t*d*s* - represents a disk slice. Only use this for a
root pool.

Thanks,

Cindy
Edward Ned Harvey
2011-07-25 22:56:15 UTC
Sent: Monday, July 25, 2011 10:59 AM
The p* devices represent the fdisk partition.
c*t*d*p* - represents the larger fdisk partition. Do not use these
for your storage pools.
c*t*d* - represents the whole disk. Use these for your pools
except when creating a root pool for booting. Do not use s2, a
slice that represented the whole disk on the VTOC label for
anything.
c*t*d*s* - represents a disk slice. Only use this for a
root pool.
Thank you - But I already knew all that. Here's what's weird, and where the
confusion is coming from:

First, fdisk only supports 4 partitions. So the existence of p0, p1, p2,
p3, p4 is weird. Cuz that's five.

Second, there is no d0 device. There is only p0-p4, and s0-s15

Third, the goal is to replicate the fdisk partitions from one disk to
another. I therefore expected I would need something broader than the p0-p4
devices... I thought I would need d0. But then somebody here told me to
simply use p0 in place of d0, and it worked. Which was rather mind blowing.
I thought p0 might be synonymous with d0, but on some device where p0 and d0
both exist, I checked... They're not the same.

If you simply reference the d0 in zpool create... Then it will create the d0
device. By looking at the major,minor numbers, it seems to be synonymous
with another random thing, like s5 or something... But not p0, not s2, and
not s0. So the logic of all of this evades me...
Peter Tribble
2011-07-26 20:03:16 UTC
On Mon, Jul 25, 2011 at 11:56 PM, Edward Ned Harvey
Thank you - But I already knew all that.  Here's what's weird, and where the
First, fdisk only supports 4 partitions.  So the existence of p0, p1, p2,
p3, p4 is weird.  Cuz that's five.
That's because p1-p4 are the fdisk partitions, and p0 is the special device
that refers to the disk as a whole. Same naming convention, but 2 different
things. (At least they're not overloaded the way that s2 is for slices.)
If you simply reference the d0 in zpool create... Then it will create the d0
device.  By looking at the major,minor numbers, it seems to be synonymous
with another random thing, like s5 or something...  But not p0, not s2, and
not s0.  So the logic of all of this evades me...
Logic? There's probably logic if you look at the historical development
of the naming scheme, but I think you're trying to read too much into this.
There are separate names for different purposes.

Just use p* from fdisk.
Just use s* to refer to Solaris slices.
Just use d0 for zfs to refer to the whole drive.
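
As a rough illustration of the naming rules summarised in this thread (not part of the original reply), a shell function could label a device name by its suffix. The function name `describe_dev` and its output strings are hypothetical; the ranges p0 to p4 and s0 to s15 are the ones reported earlier in the thread for this x86 system, and the glob patterns are deliberately loose.

```shell
# Hypothetical helper: say what a cXtYdZ[pN|sN] device name refers to.
describe_dev() {
    name=${1##*/}    # strip a leading /dev/dsk/ or /dev/rdsk/ if present
    case "$name" in
        c*t*d*p0)                    echo "whole disk (fdisk view)" ;;
        c*t*d*p[1-4])                echo "fdisk partition" ;;
        c*t*d*s[0-9]|c*t*d*s1[0-5])  echo "Solaris slice" ;;
        c*t*d*[0-9])                 echo "whole disk" ;;
        *)                           echo "unrecognised" ;;
    esac
}
```

For example, `describe_dev /dev/rdsk/c0t5000C5003424396Bd0s2` reports a Solaris slice, while the same name ending in `p0` is reported as the fdisk view of the whole disk.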
--
-Peter Tribble
http://www.petertribble.co.uk/ - http://ptribble.blogspot.com/
Edward Ned Harvey
2011-07-25 22:01:26 UTC
Sent: Friday, July 22, 2011 8:38 AM
(Hint: install the Hardware Management Pack
and then you can get everything from the Storage tab in the ilom web
interface.) That's just an aside, though.
Where does one obtain the HMP nowadays? The links on sun.com go to
support.oracle.com, where I log in, and find nothing in the way of
downloads... And so on... And so on... Hassle hassle hassle.... In other
words, I have been trying for half an hour now to find it...

Thanks for any pointers...
Peter Tribble
2011-07-26 19:44:30 UTC
On Mon, Jul 25, 2011 at 11:01 PM, Edward Ned Harvey
Sent: Friday, July 22, 2011 8:38 AM
(Hint: install the Hardware Management Pack
and then you can get everything from the Storage tab in the ilom web
interface.) That's just an aside, though.
Where does one obtain the HMP nowadays?  The links on sun.com go to
support.oracle.com, where I log in, and find nothing in the way of
downloads...  And so on...  And so on...  Hassle hassle hassle....  In other
words, I have been trying for half an hour now to find it...
Thanks for any pointers...
Getting a bit off the subject but people running current Sun T and X series
servers ought to know about the HMP. I didn't until recently, so Oracle are
doing a really good job of publicizing it and making it easy to find
and install.
Yeah, right. Sufficiently so that I decided to put up a blog entry describing
what it is and how to get it:

http://ptribble.blogspot.com/2011/07/oracle-hardware-management-pack.html
--
-Peter Tribble
http://www.petertribble.co.uk/ - http://ptribble.blogspot.com/
Edward Ned Harvey
2011-07-27 01:18:20 UTC
Sent: Tuesday, July 26, 2011 3:45 PM
Getting a bit off the subject but people running current Sun T and X series
servers ought to know about the HMP. I didn't until recently, so Oracle are
doing a really good job of publicizing it and making it easy to find
and install.
Yeah, right. Sufficiently so that I decided to put up a blog entry describing
http://ptribble.blogspot.com/2011/07/oracle-hardware-management-pack.html
You da man!

Incidentally, there doesn't seem to be anything explicitly related to
solaris 11 express there. So I'm just aiming at solaris 10 and hoping for
the best. I'll probably blow up my system and have to start from scratch.
;-)

Are you just using the other OSes? Such as windows, linux, solaris 10...
You know, the supported OSes, unlike solaris 11...
Edward Ned Harvey
2011-07-27 04:00:21 UTC
Post by Edward Ned Harvey
Incidentally, there doesn't seem to be anything explicitly related to
solaris 11 express there. So I'm just aiming at solaris 10 and hoping for
the best. I'll probably blow up my system and have to start from scratch.
Oh well, it doesn't seem to be supported (or possible to run) on solaris 11
express, at least for now. I get "The required IPMI driver software is not
installed on this system." ... but ipmitool is clearly present. And all the
software download matrices etc all say solaris 10. Nothing about
opensolaris or solaris 11 express.

So, who knows. Some day.
Edward Ned Harvey
2011-08-06 13:57:09 UTC
Sent: Tuesday, July 26, 2011 3:45 PM
Getting a bit off the subject but people running current Sun T and X series
servers ought to know about the HMP. I didn't until recently, so Oracle are
doing a really good job of publicizing it and making it easy to find
and install.
Yeah, right. Sufficiently so that I decided to put up a blog entry describing
http://ptribble.blogspot.com/2011/07/oracle-hardware-management-pack.html
So, interesting development... Due to lack of OHMP on solaris 11, we
decided to abandon solaris 11 and downgrade to solaris 10 until such time as
sol11 will hopefully eventually support it... So now the OHMP is installed
on sol10...

I first built the system using a 2-disk mirror, and tested OHMP in that
configuration. It all looked good. Then I added the other 10 disks, and
now things don't look so good...

So I'm back to the drawing board again. How can I locate a particular
physical disk based on its device name, and vice-versa?

Historically I've done a loop of dd for a second, sleep for a second,
repeat. But this is far from ideal or foolproof. I'd like a better way...
Either a map, or a blinking orange light, or something like that.
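
For reference, the read-then-pause loop mentioned above might look roughly like this. It is only a sketch: the function name `blink_disk`, the default cycle count and the 50 MB read size are assumptions (the text says "dd for a second", which is approximated here by a fixed-size read), and it only ever reads from the device.

```shell
# Hypothetical helper: make a disk's activity light flash by alternating
# a burst of raw reads with a one-second pause.
# Usage: blink_disk DEVICE [CYCLES]
blink_disk() {
    dev=$1
    cycles=${2:-10}
    # Bail out early if the device cannot be read.
    [ -r "$dev" ] || return 1
    i=0
    while [ "$i" -lt "$cycles" ]; do
        # Read-only burst; output is discarded.
        dd if="$dev" of=/dev/null bs=1024k count=50 2>/dev/null
        sleep 1
        i=$((i + 1))
    done
}
```

One would run something like `blink_disk /dev/rdsk/c2t5000C50034242B55d0s2 30` and watch the front panel for the LED that pulses in step.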

I have also tried and failed to disable multipath by editing
/kernel/drv/fp.conf

Thanks....
Edward Ned Harvey
2011-08-30 03:30:07 UTC
Post by Peter Tribble
people running current Sun T and X series
servers ought to know about the HMP. I didn't until recently, so Oracle are
doing a really good job of publicizing it and making it easy to find
and install.
Yeah, right. Sufficiently so that I decided to put up a blog entry describing
http://ptribble.blogspot.com/2011/07/oracle-hardware-management-pack.html
Hey, for what it's worth - I installed the HMP, and long story short, it
didn't do me any good. (But it was definitely progress in the right
direction, thank you for suggesting it!)

I've been failing and failing, and finally somebody in the support community
forum put me onto this. This is a scripted way to do the job the HMP does.
But it's much faster, and I have a certain preference for using the cmd-line
utility instead of the web interface. But most importantly, IT WORKS! :-)
So I think this is preferable over the HMP...

When you log in to MOS, there's a search field in the upper right, "Search
Knowledge Base." Paste 6981695 into that field; only one document comes
up.

Skip down to the bottom and click "Instructions for Workaround".

There was no initial setup required on my system, because it's a standard
system (X4270 without HBA) and one of the included config files was already
correct for my system. So I'm using that now, and it works perfectly.
Observe:

[***@foo jumpstart_solution062011]# ./slot.sh -a
Oracle.slot.conf/slot.conf.X4270M2-12-PCIe03
disk=c2t5000C50034242B55d0 slot=2
disk=c2t5000C50034242E61d0 slot=11
disk=c2t5000C50034243F75d0 slot=10
disk=c2t5000C500263765B1d0 slot=9
disk=c2t5000C5003424594Dd0 slot=6
disk=c2t5000C5002637311Dd0 slot=1
disk=c2t5000C50026372801d0 slot=3
disk=c2t5000C50034243969d0 slot=0
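
Output in this `disk=... slot=...` form is easy to query from a script. The following is a minimal sketch, assuming the two-field line format shown above and a POSIX-style shell such as bash or ksh93; the function names `slot_of` and `disk_in_slot` are hypothetical and are not part of the slot.sh package.

```shell
# Hypothetical helpers that read "disk=NAME slot=N" lines on standard input.

# Print the slot number for a given disk name; fail if it is not listed.
slot_of() {
    while read -r disk slot; do
        if [ "$disk" = "disk=$1" ]; then
            echo "${slot#slot=}"
            return 0
        fi
    done
    return 1
}

# Print the disk name found in a given slot; fail if the slot is not listed.
disk_in_slot() {
    while read -r disk slot; do
        if [ "$slot" = "slot=$1" ]; then
            echo "${disk#disk=}"
            return 0
        fi
    done
    return 1
}
```

Piping the listing into `slot_of c2t5000C5002637311Dd0` would print the matching slot number, and `disk_in_slot 9` would go the other way.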

Thank you!
