San Francisco

dave spink toolset


POWERPATH:

INSTALL UNINSTALL VERITAS TEST
HBA REPLACE CLEAN UP COMMANDS


INSTALL

The host below has no PowerPath installed. It has one HBA and one CX LUN mapped via both SPs, i.e. one active path.

# format
AVAILABLE DISK SELECTIONS:
       0. c1t0d0 SUN72G cyl 14087 alt 2 hd 24 sec 424
          /pci@1f,700000/scsi@2/sd@0,0
       1. c1t1d0 SUN72G cyl 14087 alt 2 hd 24 sec 424
          /pci@1f,700000/scsi@2/sd@1,0
       2. c3t0d0 drive type unknown
          /pci@1d,700000/fibre-channel@1/sd@0,0
       3. c3t2d0 DGC-RAID10-0219 cyl 61438 alt 2 hd 256 sec 16
          /pci@1d,700000/fibre-channel@1/sd@2,0

If you plan to use Veritas, install it before PowerPath. If Veritas is already installed, ensure the /etc/system entries for vx* drivers are above the entries for PowerPath drivers.
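
The ordering rule above can be checked with a small script. This is a minimal sketch, not an EMC or Veritas tool: the function name check_vx_before_emcp is made up here, and it assumes the driver entries are "forceload: drv/vx..." and "forceload: drv/emcp..." lines. Adjust the patterns if your /etc/system uses different entries.

```shell
# Sketch: succeed only if every vx* forceload entry appears before
# the first emcp* forceload entry in the given file.
# Assumption: entries look like "forceload: drv/vxdmp" / "forceload: drv/emcp".
check_vx_before_emcp() {
    awk '
        $1 == "forceload:" && $2 ~ /^drv\/emcp/ { if (!first_emcp) first_emcp = NR }
        $1 == "forceload:" && $2 ~ /^drv\/vx/   { last_vx = NR }
        END {
            # Fail only when both kinds exist and a vx entry follows an emcp entry
            if (first_emcp && last_vx && last_vx > first_emcp) exit 1
            exit 0
        }
    ' "$1"
}
# Usage on a live host:
#   check_vx_before_emcp /etc/system && echo "vx entries are above emcp entries"
```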

# vi /etc/system
set sd:sd_max_throttle = 20
set sd:sd_io_time = 0x3C

# vi /kernel/drv/qla2300.conf
hba0-link-down-error=1;
hba0-link-down-timeout=60;
hba0-fast-error-reporting=1;

Installing PowerPath.

# mount -F hsfs -r /dev/dsk/cXtXdXzX /cdrom/cdrom0
# cd /cdrom/cdrom0/UNIX/SOLARIS
# /usr/sbin/pkgadd -d .

Register PowerPath on the host.

# /etc/emcpreg -install
# umount /cdrom/cdrom0
# reboot -- -r

Verify PowerPath is installed.

# pkginfo -l EMCpower
   PKGINST:  EMCpower
      NAME:  EMC PowerPath
  CATEGORY:  system
      ARCH:  sparc
   VERSION:  5.2.0_b146
   BASEDIR:  /opt
    VENDOR:  EMC Corporation
    PSTAMP:  beavis951018123443
  INSTDATE:  Aug 06 2008 17:05
    STATUS:  completely installed
     FILES:      318 installed pathnames
                  42 directories
                 127 executables
              183371 blocks used (approx)

Verify PowerPath kernel extension is loaded on the host.

# modinfo | grep emcp
 39  13b6ebf   3686 247   1  emcpsf (PP SF 5.2.0.0.0)
 40  13b9f85  28d18 236   1  emcp (PP Driver 5.2.0.0.0)
 52  13fca02   2abb   -   1  emcpgpx (PP GPX Ext 5.2.0.0.0)
 53 784d6000  28ed7   -   1  emcpmpx (PP MPX Ext 5.2.0.0.0)
 54 78502000  d6eee   -   1  emcpsapi (PP SAPI Ext 5.2.0.0.0)
 55 785da000  10d49   -   1  emcpcg (PP CG Ext 5.2.0.0.0)
 56 785ea000   6c7f   -   1  emcpvlumd (PP VLUMD Manager 5.2.0.0.0)
 57 785f0000  1b9b0   -   1  emcpxcrypt (PP XCRYPT Manager 5.2.0.0.0)
 58 7860a000   b587   -   1  emcpdm (PP DM Manager 5.2.0.0.0)
 59 7837fdd0    4c1   -   1  emcpioc (PP PIOC 5.2.0.0.0)
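
To avoid reading the module list by eye, the check can be scripted. A minimal sketch, assuming the module name is the sixth field of modinfo output as in the listing above; missing_pp_modules is a made-up helper name and the list of modules to require is your choice.

```shell
# Sketch: read modinfo output on stdin and print each requested
# module name that is NOT present (nothing printed means all loaded).
# Assumption: the module name is field 6, as in the listing above.
missing_pp_modules() {
    awk -v want="$*" '
        BEGIN { n = split(want, names, " ") }
        { loaded[$6] = 1 }
        END {
            for (i = 1; i <= n; i++)
                if (!(names[i] in loaded)) print names[i]
        }
    '
}
# Usage on a live host:
#   modinfo | missing_pp_modules emcp emcpmpx emcpioc
```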

Verify PowerPath devices are configured on the host.

# format
AVAILABLE DISK SELECTIONS:
       0. c1t0d0 SUN72G cyl 14087 alt 2 hd 24 sec 424
          /pci@1f,700000/scsi@2/sd@0,0
       1. c1t1d0 SUN72G cyl 14087 alt 2 hd 24 sec 424
          /pci@1f,700000/scsi@2/sd@1,0
       2. c3t0d0 DGC-RAID10-0219 cyl 61438 alt 2 hd 256 sec 16
          /pci@1d,700000/fibre-channel@1/sd@0,0
       3. c3t2d0 DGC-RAID10-0219 cyl 61438 alt 2 hd 256 sec 16
          /pci@1d,700000/fibre-channel@1/sd@2,0
       4. emcpower0a DGC-RAID10-0219 cyl 61438 alt 2 hd 256 sec 16
          /pseudo/emcp@0

Check that the PowerPath policy is set; if not, run "powermt set policy".

# powermt display dev=all
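
With many LUNs it helps to reduce the display output to one line per device. A minimal sketch, assuming each device block contains a "Pseudo name=..." line followed by a line with "policy=...;" (the usual layout, but check it against your PowerPath version); list_policies is a made-up helper name.

```shell
# Sketch: read "powermt display dev=all" output on stdin and print
# "<pseudo name> <policy>" for each device.
# Assumption: blocks contain "Pseudo name=X" and "...; policy=Y; ..." lines.
list_policies() {
    awk '
        /^Pseudo name=/ { sub(/^Pseudo name=/, ""); name = $0 }
        /policy=/ {
            pol = $0
            sub(/.*policy=/, "", pol)   # drop everything up to the value
            sub(/;.*/, "", pol)         # drop everything after the value
            print name, pol
        }
    '
}
# Usage on a live host:
#   powermt display dev=all | list_policies
```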

Disable the setup script (optional) - the PowerPath setup script may interfere with other applications.

# vi /etc/profile
###/opt/EMCpower/scripts/emcp_setup.sh

# vi /etc/.login
###source /opt/EMCpower/scripts/emcp_setup.csh

Update profile.

MANPATH=$MANPATH:/opt/EMCpower/man
PATH=$PATH:/opt/EMCpower/bin/sparcv9
PATH=$PATH:/etc/emc/bin

Ensure sufficient stack size - the minimum required is 0x6000. If you ever install another application that modifies this setting, make sure it stays at least 0x6000. A stack size that is too small can cause a kernel panic and/or stack overflow error messages.

# cat /etc/system
...
set emcp:bPxEnableInit=1
set lwp_default_stksize=0x6000
set rpcmod:svc_default_stksize=0x6000
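
The minimum can be verified after any change to /etc/system. A minimal sketch, assuming the tunables are written as "set name=value" lines as shown above; stksize_ok is a made-up helper name.

```shell
# Sketch: succeed if tunable NAME in FILE is set and is >= MIN.
# Values may be hex (0x6000) or decimal; the last matching line wins.
stksize_ok() {
    file=$1; name=$2; min=$3
    val=$(sed -n "s/^set[ ]*$name[ ]*=[ ]*\([0-9a-fA-Fx]*\).*/\1/p" "$file" | tail -1)
    [ -n "$val" ] && [ "$(printf '%d' "$val")" -ge "$(printf '%d' "$min")" ]
}
# Usage on a live host:
#   stksize_ok /etc/system lwp_default_stksize 0x6000 || echo "stack size too small"
#   stksize_ok /etc/system rpcmod:svc_default_stksize 0x6000 || echo "stack size too small"
```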


UNINSTALL

Before removing devices from a server, remove them from PowerPath with powermt remove, then unmask/unmap them. This way, when the devices are presented back to the host, they are brought in correctly for PowerPath to configure. When you remove devices from PowerPath control, it may complain that the devices are open. PowerPath counts the number of open and close requests; if open requests exceed close requests, the device can't be removed without first rebooting.

# umount /test
# vxvol -g netbdg stopall
# vxdg deport netbdg
# vxdisk list
# vxdisk rm c3t2d0
# powermt remove dev=c3t2d0

Remove packages.

# /usr/sbin/pkgrm EMCpower
# /etc/emcpv_cleanup
# reboot -- -r


VERITAS

DMP and PowerPath both appear to control the devices, but PowerPath is really the sole controller. In a dual environment, both products show that they have control of all of the LUNs (hence "both controlling"). Because of the way multipathing works, only one of them can truly control how the I/O is split and when a path is marked dead. In this case, PowerPath sits significantly lower on the I/O tree than Veritas does, so by agreement, when both are present PowerPath does all of the actual heavy lifting: the multipathing and failover. It reports any changes it makes to Veritas, further up the I/O tree, so that customers get the benefit of seeing them clearly from the VG level in addition to the driver level.

The ability to set both DMP and PP to control failover separately no longer exists on any of the currently supported software. DMP 3.5 and above is "powerpath aware" so no matter what you set it to, if powerpath is installed, it will defer to powerpath.

The procedure below applies to old versions of Veritas. The pseudo devices are now recommended and are detected automatically.

# format
AVAILABLE DISK SELECTIONS:
       0. c1t0d0 SUN72G cyl 14087 alt 2 hd 24 sec 424
          /pci@1f,700000/scsi@2/sd@0,0
       1. c1t1d0 SUN72G cyl 14087 alt 2 hd 24 sec 424
          /pci@1f,700000/scsi@2/sd@1,0
       2. c3t0d0 DGC-RAID10-0219 cyl 61438 alt 2 hd 256 sec 16
          /pci@1d,700000/fibre-channel@1/sd@0,0
       3. c3t2d0 DGC-RAID10-0219 cyl 61438 alt 2 hd 256 sec 16
          /pci@1d,700000/fibre-channel@1/sd@2,0
       4. emcpower0a DGC-RAID10-0219 cyl 61438 alt 2 hd 256 sec 16
          /pseudo/emcp@0
# vxdisk list
DEVICE       TYPE            DISK         GROUP        STATUS
c1t0d0s2     auto:sliced     rootdisk     rootdg       online
c1t1d0s2     auto:sliced     rootmirror   rootdg       online
c3t2d0s2     auto            -            -            error
c3t2d0s2     auto            -            -            error
c3t2d0s2     auto            -            -            error

Use the following procedure to prevent a duplicate disk ID that can cause VERITAS commands to fail.

# vxddladm listjbod
VID      PID              Opcode Page Code Page Offset SNO length Policy
==========================================================================
SEAGATE  ALL PIDs             18        -1          36         12 Disk
SUN      SESS01               18        -1          36         12 Disk

# vxddladm addjbod vid=DGC pagecode=0x83 offset=8 length=16
# reboot -- -r

# vxddladm listjbod
VID      PID              Opcode Page Code Page Offset SNO length Policy
==========================================================================
SEAGATE  ALL PIDs             18        -1          36         12 Disk
SUN      SESS01               18        -1          36         12 Disk
DGC      ALL PIDs             18       131           8         16 Disk
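
Note that the page code was entered as 0x83 on the addjbod command, while listjbod prints it in decimal as 131. A one-line sketch of the conversion (hex_to_dec is just an illustrative helper name):

```shell
# Convert a hex value such as the addjbod pagecode to the decimal
# form that listjbod displays.
hex_to_dec() { printf '%d\n' "$1"; }
hex_to_dec 0x83    # prints 131
```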

# vxdisk list
DEVICE       TYPE            DISK         GROUP        STATUS
c1t0d0s2     auto:sliced     rootdisk     rootdg       online
c1t1d0s2     auto:sliced     rootmirror   rootdg       online
c3t2d0s2     auto:cdsdisk    -            -            online

If you remove PowerPath from the host, return VERITAS DMP to its default state.

# vxddladm rmjbod vid=DGC
# init 6


TEST

Prepare a couple of scripts for read and write tests. Remove cables and perform SP LUN trespass. Use the following PowerPath commands during these tests:

# powermt display dev=all
# powermt display dev=all every=2
# powermt restore
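
During cable pulls it is easier to watch a single number than the full display. A minimal sketch, assuming path rows start with a numeric HW path id and carry the state ("alive" or "dead") as a separate field, which is the usual layout but should be checked against your PowerPath version; count_dead_paths is a made-up helper name.

```shell
# Sketch: read "powermt display dev=all" output on stdin and print
# the number of path rows whose state is "dead".
# Assumption: path rows begin with a number and contain the state
# as a whitespace-separated field.
count_dead_paths() {
    awk '
        /^[ ]*[0-9]/ {
            for (i = 1; i <= NF; i++)
                if ($i == "dead") { n++; break }
        }
        END { print n + 0 }
    '
}
# Usage on a live host:
#   powermt display dev=all | count_dead_paths
```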

Write failover test - the following script writes a series of 10M files. Run the script and trespass the LUN via Navisphere. The writes will continue without any disruption in service.

# cd /dspinkA
# vi t.sh
#!/bin/ksh
typeset -i num
typeset -i cnt
num=200
cnt=1
while (( cnt < num ))
do
  mkfile 10M testfile$cnt
  (( cnt = cnt + 1 ))
done
# chmod +x t.sh
# ./t.sh

Read failover test - the following script reads the files created by the write script. Run the script and repeat the LUN trespass exercise. The reads will continue without any disruption in service.

# cd /dspinkA
# vi t.sh
#!/bin/ksh
typeset -i num
typeset -i cnt
num=200
cnt=1
while (( cnt < num ))
do
  ls -l -R
  (( cnt = cnt + 1 ))
done
# chmod +x t.sh
# ./t.sh

If you have caused LUNs to trespass, restore LUNs to their original SP with the following PowerPath command:

# powermt restore

Message File Warnings - LUN trespasses can cause the Solaris disk driver to log warning and/or error messages. You can ignore these messages, as PowerPath intercepts them and hides them from the application sending I/O. See the extract of /var/adm/messages below, taken when a LUN is trespassed.

Dec 29 14:51:35 d1pr9tmp        Error for Command: write                   Error Level: Fatal
Dec 29 14:51:35 d1pr9tmp scsi: [ID 107833 kern.notice]  Requested Block: 170848 Error Block: 170848
Dec 29 14:51:35 d1pr9tmp scsi: [ID 107833 kern.notice]  Vendor: DGC Serial Number: 030000DAC9CL
Dec 29 14:51:35 d1pr9tmp scsi: [ID 107833 kern.notice]  Sense Key: Not Ready
Dec 29 14:51:35 d1pr9tmp scsi: [ID 107833 kern.notice]  ASC: 0x4 ASCQ: 0x3, FRU: 0x0
Dec 29 14:51:35 d1pr9tmp emcp: [ID 801593 kern.notice] Info: Volume ...FE860673DA11 followed to SPA
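
To pick the PowerPath trespass notices out of a busy messages file, filter on the emcp lines. A minimal sketch based only on the line format shown above ("... emcp: ... followed to SPA"); other PowerPath versions may word the message differently, and trespass_events is a made-up helper name.

```shell
# Sketch: read syslog lines on stdin and print the timestamp and
# target SP of each PowerPath "followed to" notice.
# Assumption: the message format matches the extract above.
trespass_events() {
    awk '/emcp:/ && /followed to/ { print $1, $2, $3, $NF }'
}
# Usage on a live host:
#   trespass_events < /var/adm/messages
```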


HBA REPLACE

Let's assume you had previously configured the host with a single zone to SP B and created a storage group with LUN access. You later add a zone for SP A and confirm the host has connectivity to both ports. For the HBA LUN masking software to get access via SP A, you must register PowerPath on the alternate path. You sometimes need this procedure when replacing a failed HBA.

Windows on CX HBA replacement

  1. Update switch zoning as HBA WWN for replacement card is different.
  2. Engineering Mode: press Ctrl+Shift+F12 and enter the password "messner". The words "Engineering Mode" appear on the tool bar.
  3. Right click the host storage group.
  4. Select properties, select host, and advanced.
  5. Tick new HBA path; this registers the HBA with powerpath.
  6. Update host connectivity; select hosts tab, select host, right click, select Update Now.
  7. Reboot Windows Server; second adaptor should be visible, however, new HBA will not have I/O traffic.

From the Windows server command prompt, verify both paths are active.

# powermt display dev=all
# powermt config
# powermt check
# powermt save

Verify LUNs have trespassed back via Navisphere.



CLEAN UP

PowerPath sits on top of the kernel and acts as a traffic cop. However, its configuration files are compiled at boot. Therefore, online commands do not always work, and sometimes these files need to be recompiled and PowerPath reloaded. The easiest and most effective way to do this is a reboot. The only other EMC-approved method is to remove the powermt.custom file and run powermt config, which hardly ever works. This is why EMC recommends a reboot.

If PowerPath gets duplicate pseudo entries caused by incorrect unmapping and remapping, first move ALL of the powermt.custom files to a backup location.

# cd /etc
# mkdir powermt_dirback
# mv powermt.custom* powermt_dirback
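
The same move can be wrapped in a function that reports how many files it moved, which makes it obvious when there was nothing to back up. A minimal sketch; backup_custom_files is a made-up helper name and the directory names are the ones used above.

```shell
# Sketch: move every powermt.custom* file from SRC into DEST
# (created if needed) and print how many files were moved.
backup_custom_files() {
    src=$1; dest=$2
    mkdir -p "$dest" || return 1
    moved=0
    for f in "$src"/powermt.custom*; do
        [ -e "$f" ] || continue      # glob matched nothing
        mv "$f" "$dest"/ || return 1
        moved=$((moved + 1))
    done
    echo "$moved"
}
# Usage on a live host:
#   backup_custom_files /etc /etc/powermt_dirback
```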

Reboot the host with a reconfiguration boot.

# reboot -- -r

When the server comes back up, rebuild and save the PowerPath configuration.

# powercf -q
# powermt config
# powermt save

Then run powermt display to confirm that the saved configuration is correct.

# powermt display dev=all	;check pseudo names exist e.g. Pseudo name=emcpower20a
# powermt display		;check I/O paths match and are optimal


COMMANDS

emcpreg -list ;see capabilities
emcpreg -check key ;check valid
powercf ;configure all detected emcpower devices, executed on boot
powermt config ;configure all detected logical devices, executed on boot
powermt display ;see logical device count, number of paths, errors etc
powermt display dev=all ;see pseudo devices, paths alive or dead, errors
powermt display dev=all every=2 ;see pseudo devices, paths alive or dead, every 2 seconds
powermt display ports ;see status of SP ports
powermt display paths ;see status of host paths
powermt remove ;removes devices from PowerPath control
powermt check ;check each path, if path is dead you will be prompted to remove it
powermt remove force dev=xxxx ;force delete of device with dead path
powermt check force ;deletes all dead paths
powermt save ;saves configuration across reboots in the /etc/powermt.custom file
powermt restore ;tests dead paths and marks a path alive again if it passes the test
powermt set policy=co dev=all ;set policy on full license for CLARiiON to overcome inactive paths due to license bug
powermt set policy=so dev=all ;set policy on full license for DMX to overcome inactive paths due to license bug
powermt set policy=re dev=all ;set policy on base license to overcome inactive paths due to license bug
powermt set mode=active dev=all ;set active mode, only works with Fibre Channel Devices
powermt display options ;to display storage class and state, see powermt manage
powermt manage class=clariion ;to place storage class under PowerPath control, need reboot to take effect