dave spink toolset
NEW SHARE

New NFS Share Using Quotas

Determine, based on the host IP, which NAS server interface to use. If possible, run a dfshares command from the host to verify it has network access to the NAS server. Then set a NAS server variable; in this example we are using "server_4".

*nas server*
# server_ifconfig ALL -all
# mydm=server_4

*unix server*
# dfshares us-tameclr014450-tlsf01     ***example of data mover NIC interface selected

Set the new share name, size and pool. Review available space in the pool.
# newshr=santeam
# mygb=1
# nas_pool -list
# mypl=clar_r5_performance
# nas_pool -size $mypl

Select the file system to use (review quotas assigned to the fs).
# nas_fs -list
# nas_quotas -t -list -fs fs_dm4_01
# nas_fs -size fs_dm4_01
# myfs=fs_dm4_01

Create a new tree quota.
# server_export $mydm | grep $newshr
# echo $myfs $newshr
# echo "nas_quotas -on -tree -fs $myfs -path /$newshr -comment uxnbpr18"
# nas_quotas -on -tree -fs $myfs -path /$newshr -comment 'uxnbpr18'
# nas_quotas -t -list -fs $myfs

Set hard and soft block limits; first get the tree node number. The blocks are in KB, hence this example sets a 1024 MB file system hard limit.
# nas_quotas -t -list -fs $myfs
# myid=1     ***this example only
# echo $myfs $myid
# nas_quotas -edit -tree -fs $myfs -block 1048576:819200 -comment 'uxnbpr18' $myid
# nas_quotas -t -report -fs $myfs

Create NFS share permissions. Note, if using server_4 the directory export file is /nbsnas/server/server_3. You can validate this by reviewing the root file systems.
# nas_fs -list | more
id inuse type acl volume name server
1 n 1 0 10 root_fs_1
2 y 1 0 12 root_fs_2 1
3 y 1 0 14 root_fs_3 2
4 y 1 0 16 root_fs_4 3
5 y 1 0 18 root_fs_5 4
6 n 1 0 20 root_fs_6
# cd /nbsnas/server/server_3
# cp export export-`date +%Y-%m-%d`
# ls -l export*
# vi export
export "/fs_dm4_01/santeam" name="/santeam" root=10.26.102.18/23 access=10.26.102.18/23
# diff export export-`date +%Y-%m-%d`

Export the NFS share.
# echo $newshr $mydm $myfs
# server_export $mydm -Protocol nfs /$myfs/$newshr
# server_export $mydm | grep $newshr

Confirm the export is visible and re-check the share size before advising the user.

*unix server*
# newshr=santeam
# dfshares us-tameclr014450-tlsf01 | grep $newshr
uxnbpr18:/opt/san_4145/drs/audit/ecc-0522> dfshares us-tameclr014450-tlsf01 | grep santeam
us-tameclr014450-tlsf01:/fs_dm4_01/santeam us-tameclr014450-tlsf01 - -
us-tameclr014450-tlsf01:/santeam us-tameclr014450-tlsf01 - -

*nas server*
# nas_quotas -t -report -fs $myfs
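The block arithmetic above can be scripted rather than worked out by hand. A minimal sketch, assuming a soft limit of roughly 80% of the hard limit (the set_tree_quota helper and its names are illustrative, not part of the Celerra CLI; the worked example above used 1048576:819200, i.e. 1 GB hard and 800 MB soft):

# Hypothetical helper: print the nas_quotas -edit command for a tree quota sized in GB.
# Usage: set_tree_quota <file_system> <tree_id> <size_gb>
set_tree_quota() {
    myfs=$1; myid=$2; mygb=$3
    hard=`expr $mygb \* 1024 \* 1024`      # GB -> KB blocks (1 GB = 1048576 KB)
    soft=`expr $hard \* 80 / 100`          # soft limit at ~80% of the hard limit
    echo "nas_quotas -edit -tree -fs $myfs -block ${hard}:${soft} $myid"
}
# set_tree_quota fs_dm4_01 1 1     ***echoes the command only; run it once reviewed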
New CIFS Share Using Quotas

Determine, based on the host IP, which NAS server interface to use. If possible, run a browse command from the host to verify it has network access to the NAS CIFS server. Then set a NAS server variable; in this example the CIFS server is running as a VDM on server_4 called vdm01.

*nas server*
# server_cifs server_4
# server_cifs server_2
# server_cifs server_4 | grep "CIFS service"     ***check if CIFS on VDM
# myvdm=vdm01

*windows server*
\\USTPA3CIFS4450     ***example of data mover NIC this CIFS server uses

Set the new share name and pool. Review available space in the pool.
# newshr=santest
# nas_pool -list
# mypl=clar_r5_performance
# nas_pool -size $mypl

Select the file system to use (review quotas assigned to the fs); this example is using cifs_dm4_01.
# nas_fs -list | grep cifs
# nas_quotas -t -list -fs cifs_dm4_01
# nas_fs -size cifs_dm4_01
# myfs=cifs_dm4_01

Create a new tree quota.
# server_export $myvdm | grep $newshr
# echo $myfs $newshr
# echo "nas_quotas -on -tree -fs $myfs -path /$newshr -comment XYZ"
# nas_quotas -on -tree -fs $myfs -path /$newshr -comment 'XYZ'
# nas_quotas -t -list -fs $myfs

Set hard and soft block limits; first get the tree node number. The blocks are in KB, hence this example sets a 1024 MB file system hard limit.
# nas_quotas -t -list -fs $myfs
# myid=2     ***this example only
# echo $myfs $myid
# nas_quotas -edit -tree -fs $myfs -block 1048576:819200 -comment 'test only' $myid
# nas_quotas -t -report -fs $myfs

Create the CIFS share. Note, the first VDM you created has its export.shares file in /nbsnas/server/vdm/vdm_1. You can validate this by reviewing the root file systems.
# nas_fs -list | grep cifs
2781 y 1 0 4576 cifs_dm4_01 v1
2895 y 1 0 4811 cifs_dm2_01 v2
# cd /nbsnas/server/vdm/vdm_1
# cp export.shares export.shares-`date +%Y-%m-%d`
# ls -l export.shares*
# vi export.shares
share "santest2" "/cifs_dm4_01/santest2" umask=022 maxusr=4294967295 comment="Internal CIFS Test Share"
# diff export.shares export.shares-`date +%Y-%m-%d`

Export the CIFS share.
# echo $newshr $myvdm $myfs
# echo "server_export $myvdm -Protocol cifs -name $newshr /$myfs/$newshr"
# server_export $myvdm -Protocol cifs -name $newshr /$myfs/$newshr
# server_export $myvdm | grep $newshr

Confirm the export is visible and re-check the share size before advising the user.

*windows*
\\USTPA3CIFS4450\'sharename'

*nas server*
# nas_quotas -t -report -fs $myfs

New NFS Share with Dedicated File System

Determine, based on the host IP, which NAS server interface to use. If possible, run a dfshares command from the host to verify it has network access to the NAS server. Then set a NAS server variable; in this example we are using "server_4".

*nas server*
# server_ifconfig ALL -all
# mydm=server_4

*unix server*
# dfshares ustpa3clr01-02     ***example of data mover NIC interface selected

Set the new share name, size and pool. Review available space in the pool.
# newshr=econtent_wfp
# mygb=5
# nas_pool -list
# mypl=symm_std
# nas_pool -size $mypl

Create the file system; first check the file system does not already exist.
# nas_fs -info $newshr
# echo $mydm $newshr $mygb $mypl
# echo "nas_fs -name $newshr -create size=${mygb}G pool=$mypl -auto_extend no -option slice=y"
# nas_fs -name $newshr -create size=${mygb}G pool=$mypl -auto_extend no -option slice=y

Create the mountpoint and mount the file system.
# server_mountpoint $mydm -create /$newshr
# server_mountpoint $mydm -list | grep $newshr
# server_mount $mydm $newshr /$newshr
# server_mount $mydm | grep $newshr

Create NFS share permissions. Note, if using server_4 the directory export file is /nbsnas/server/server_3.
# cd /nbsnas/server/server_3
# cp export export-`date +%Y-%m-%d`
# ls -l export*
# vi export
export "/econtent_wfp" root=10.26.120.0/23:10.26.121.0/23 access=10.26.120.0/23:10.26.121.0/23
# diff export export-`date +%Y-%m-%d`

Export the NFS share.
# echo $newshr $mydm
# server_export $mydm -Protocol nfs /$newshr
# server_export $mydm | grep $newshr

Confirm the export is visible and re-check the share size before advising the user.

*unix server*
# newshr=econtent_wfp
# dfshares ustpa3clr01-02 | grep $newshr
ustpa3clr01-02:/econtent_wfp ustpa3clr01-02 - -

*nas server*
# nas_fs -size $newshr
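The dedicated file system sequence above strings together four commands that are easy to wrap. A minimal sketch, assuming the same commands and options shown above (the create_nfs_fs_share helper and its checks are illustrative, not a Celerra tool):

# Hypothetical wrapper around the documented sequence: create fs, mountpoint, mount, export.
# Usage: create_nfs_fs_share <data_mover> <share> <size_gb> <pool>
create_nfs_fs_share() {
    mydm=$1; newshr=$2; mygb=$3; mypl=$4
    # Assumes nas_fs -info returns non-zero when the file system does not exist.
    nas_fs -info $newshr >/dev/null 2>&1 && { echo "$newshr already exists"; return 1; }
    nas_fs -name $newshr -create size=${mygb}G pool=$mypl -auto_extend no -option slice=y || return 1
    server_mountpoint $mydm -create /$newshr
    server_mount $mydm $newshr /$newshr
    server_export $mydm -Protocol nfs /$newshr
    server_export $mydm | grep $newshr
}
# create_nfs_fs_share server_4 econtent_wfp 5 symm_std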
ADD SHARE

Run df -k from the requesting server to get an existing NFS share. Use the existing share to determine the data mover.

*unix server*
# df -k
ustpa3clr01-02:/Stage_econtent_ind 20215136 1182008 19033128 6% /econtent/ind
ustpa3clr01-02:/Stage_econtent_km 10331802 1726514 860528 17% /econtent/km

*nas server*
# oldshr=Stage_econtent_km
# myfs=`server_mount ALL | grep $oldshr | awk '{print $1}'`
# mydm=`nas_fs -info $myfs | grep rw_servers | awk '{print $2}'`

Set the new share name, size and pool. Review available space in the pool.
# newshr=econtent_wfp
# mygb=5
# mypl=`nas_fs -info $myfs | grep pool | awk '{print $3}'`
# nas_pool -size $mypl

Create the file system; first check the file system does not already exist.
# nas_fs -info $newshr
# echo $mydm $newshr $mygb $mypl
# echo "nas_fs -name $newshr -create size=${mygb}G pool=$mypl -auto_extend no -option slice=y"
# nas_fs -name $newshr -create size=${mygb}G pool=$mypl -auto_extend no -option slice=y

Create the mountpoint and mount the file system.
# server_mountpoint $mydm -create /$newshr
# server_mountpoint $mydm -list | grep $newshr
# server_mount $mydm $newshr /$newshr
# server_mount $mydm | grep $newshr

Create NFS share permissions. Note, if using server_4 the directory export file is /nbsnas/server/server_3.
# echo $newshr $mydm $oldshr
# cd /nbsnas/server/server_3
# cp export export-`date +%Y-%m-%d`
# ls -l export*
# grep $oldshr export > t.t
# cat t.t
# sed "s/$oldshr/$newshr/" t.t > x.x
# cat x.x
# cat x.x >> export
# diff export export-`date +%Y-%m-%d`
export "/econtent_wfp" root=10.26.120.0/23:10.26.121.0/23 access=10.26.120.0/23:10.26.121.0/23
# rm t.t x.x

Export the NFS share.
# echo $newshr $mydm
# server_export $mydm -Protocol nfs /$newshr
# server_export $mydm | grep $newshr

Confirm the export is visible and re-check the share size before advising the user.

*unix server*
# newshr=econtent_wfp
# dfshares ustpa3clr01-02 | grep $newshr
ustpa3clr01-02:/econtent_wfp ustpa3clr01-02 - -

*nas server*
# nas_fs -size $newshr

INCREASE SHARE

Have the requestor provide df -k output in order to obtain the NFS server and share name.
ustpa3nsx01-nfs:/BatchImport 246G 51G 195G 21% /BatchImport     <---- increase to 500 m

Create some temporary variables - existing share, NFS alias name, NFS server name, pool name and increase amount.
# oldshr=BatchImport
# dmsnic=ustpa3nsx01-nfs
# mygb=354
# myfs=`server_mount ALL | grep $oldshr | awk '{print $1}'`
# mydm=`nas_fs -info $myfs | grep servers | grep rw | cut -d= -f2 | sed 's/^.//'`
# mypl=`nas_fs -info $myfs | grep pool | cut -d= -f2 | sed 's/^.//'`
# echo $myfs $dmsnic $mydm $mypl $mygb

Check the current size, pool space free and that the NFS alias name exists.
# nas_fs -size $myfs
# nas_pool -size $mypl
# server_ifconfig ALL -a | grep $dmsnic

Extend the file system and check.
# echo $myfs $dmsnic $mydm $mypl $mygb
# echo "nas_fs -xtend $myfs size=${mygb}G pool=$mypl"
# nas_fs -xtend $myfs size=${mygb}G pool=$mypl
# nas_fs -size $myfs
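Deriving the file system, data mover and pool from an existing share is the same pattern used in both ADD SHARE and INCREASE SHARE. A minimal sketch that bundles it (the show_share_facts helper is illustrative; the parsing mirrors the grep/awk/cut lines above and may need adjusting to your DART output):

# Hypothetical helper: given an existing share/file-system name, print fs, data mover and pool.
# Usage: show_share_facts <existing_share>
show_share_facts() {
    oldshr=$1
    myfs=`server_mount ALL | grep $oldshr | awk '{print $1}'`
    mydm=`nas_fs -info $myfs | grep rw_servers | awk '{print $2}'`
    mypl=`nas_fs -info $myfs | grep pool | cut -d= -f2 | sed 's/^.//'`
    echo "fs=$myfs mover=$mydm pool=$mypl"
    nas_fs -size $myfs
    nas_pool -size $mypl
}
# show_share_facts BatchImport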
BAD SECTORS

When the data mover detects an uncorrectable sector, the data mover panics and marks the file system as unmountable. Since Celerra code level 5.5.23 there is a utility called Revector which gets invoked when such a case occurs (call EMC and ask for it). You may have heard of the term Volcopy; Revector is basically a more sophisticated version of the manual Volcopy process. Revector will zero out the uncorrectable sectors and report the affected file(s) if there are any. The time taken to perform this task depends on the file system size.

Determine if there was a data mover panic.
# server_log server_2 | more
# server_log server_4 | more

Determine which file systems are unmounted.
# server_mount server_2 | grep -i un
fs02 on /root_vdm_2/fs02 uxfs,perm,rw,

Determine the list of volume IDs associated with the file system.
# nas_fs -l | grep -i fs02
45 y 1 0 254 fs02 v2
74 y 7 0 323 ckpt_fs02_schedule_ v2
121 y 7 0 323 ckpt_fs02_schedule_ v2
145 y 7 0 323 ckpt_fs02_schedule_ v2
168 y 7 0 323 ckpt_fs02_schedule_ v2
191 y 7 0 323 ckpt_fs02_schedule_ v2
214 y 7 0 323 ckpt_fs02_schedule_ v2
237 y 7 0 323 ckpt_fs02_schedule_ v2
260 y 7 0 323 ckpt_fs02_schedule_ v2
283 y 7 0 323 ckpt_fs02_schedule_ v2
306 y 7 0 323 ckpt_fs02_schedule_ v2
329 y 7 0 323 ckpt_fs02_schedule_ v2
352 y 7 0 323 ckpt_fs02_schedule_ v2
375 y 7 0 323 ckpt_fs02_schedule_ v2
419 y 7 0 323 ckpt_fs02_schedule_ v2
674 y 7 0 323 fs02_a v2
675 y 7 0 323 fs02_b v2
# nas_fs -i id=45 | grep rw_vdms
rw_servers= server_2
rw_vdms   = vdm02

Find the volume ID in the logs that detected the bad sectors. Use this to cross-reference (a second check) with the unmounted file system above.
$ server_log server_2 -s -f | grep -i sector
2008-05-17 01:52:39: VRPL: 4: Bad Sector PH registerSelf panicHandler
2008-05-17 01:52:39: STORAGE: 3: Volume:254 affected by bad sector block:2144603296 repair state 2
2008-05-17 01:52:43: ADMIN: 4: Command succeeded: log sectors=131072 74
2008-05-17 01:52:44: STORAGE: 3: Found bad sector 2144603296 on Vol:254 repair state 2
$ nas_volume -i id=254
id          = 254
name        = v254
acl         = 0
in_use      = True
type        = meta
volume_set  = v253,v807,v876,v1051
disks       = d96,d48,d102,d54,d328,d302,d99
clnt_filesys= fs02

Call EMC support and request a volcopy (or Revector) operation. This can take many hours to complete. From the control station you can monitor it via the command line. To determine which LUNs are affected run nas_disk; the output is in hex, hence 0065 is LUN 101. To see how many uncorrectable sectors an SP has logged, run the getlog command.
# .server_config server_2 -v "volcopy display"
# nas_disk -l | grep d96
96 y 819199 APM00045101629-0065 CLATA d96 1,2,3,4
# navicli -h 1629_spb getlog | grep -i uncorrectable

When the volcopy is complete, mount the file system; the data mover's .etc file records where it should be mounted.
# server_mount server_2 vdm02 fs02
# server_mount server_2 | grep fs02

Refresh checkpoints if needed.
# nas_fs -list | grep fs02
# fs_ckpt fs02 -l
# fs_ckpt ckpt_fs02_schedule_001 -refresh
....
# fs_ckpt ckpt_fs02_schedule_014 -refresh
# fs_ckpt fs02 -l

MAP STORAGE

Check the devices are not mapped and have no masking entries.
# mysid=1384
# mydev=1CED
# symdev -sid $mysid show $mydev | grep FA
# symmaskdb -sid $mysid -dev $mydev list assign

A protection feature in some versions of SYMAPI disallows control of devices of emulation type CELERRA_FBA. Hence modify the options file to allow mapping.
# vi /var/symapi/config/options
SYMAPI_ALLOW_CELERRA_DEV_CTRL=ENABLE

Set the emulation type to CELERRA_FBA and configure.
# symdev -sid 1384 show $mydev | grep Emulation
# cat t.txt
set device XXXX emulation=CELERRA_FBA;
# symconfigure -sid $mysid -f t.txt .... (preview, prepare, commit)
# symdev -sid $mysid show $mydev | grep Emulation

Map the unused disk to the FAs; there is no masking required.
# symcfg -sid $mysid list -fa 7a -p 1 -address -available
# cat fa_map.txt
map dev XXXX to dir 7a:1, lun=20A;
map dev XXXX to dir 7a:0, lun=20A;
map dev XXXX to dir 10a:0, lun=20A;
map dev XXXX to dir 10a:1, lun=20A;
# symconfigure -sid $mysid -f fa_map.txt .... (preview, prepare, commit)
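The "(preview, prepare, commit)" shorthand above expands to three separate symconfigure passes, the same verbs used explicitly in the UNMAP STORAGE section below. A sketch of running them in order:

# Run the change file through each phase in turn; stop if a phase fails.
# Note the commit phase may prompt for confirmation depending on your SYMCLI settings.
for action in preview prepare commit
do
    echo "symconfigure -sid $mysid -f fa_map.txt $action"
    symconfigure -sid $mysid -f fa_map.txt $action || break
done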
Probe to check the devices are visible; if so, configure them.
# server_devconfig ALL -p -s -a
# server_devconfig ALL -c -s -a

If you receive an error about symlocks, the quickest method to resolve it is rebooting the control station. First check the control station status, i.e. 10 and 11 mean up.
$ /nas/sbin/getreason
10 - slot_0 primary control station
11 - slot_1 secondary control station
5 - slot_2 contacted
5 - slot_3 contacted
5 - slot_4 contacted
5 - slot_5 contacted
$ su -
# reboot

Confirm the disk is available after the rescan task has completed.
# nas_disk -list | grep $mydev
1588 n 30712 000190101384-1CED STD d1588 1,2,3,4

UNMAP STORAGE

Verify that the disks are not in use.
# nas_disk -l

Permanently remove any standard devices (STDs) that were freed up.
# nas_disk -d

Unmask the volumes from the vcmdb.
# symmask -sid $mysid -wwn $myhba0 -dir 04a -p 1 -force -celerra remove devs 025D:025F
# symmask -sid $mysid -wwn $myhba1 -dir 13a -p 1 -force -celerra remove devs 025D:025F
# symmask -sid $mysid refresh

Write Disable the Celerra emulated volumes.
# for i in `cat hypers.txt`
do
echo $i
symdev -sid $mysid write_disable $i -sa $myfa0 -p $myfap0 -noprompt
symdev -sid $mysid write_disable $i -sa $myfa1 -p $myfap1 -noprompt
done

Unmap the Celerra emulated volumes from the FA paths.
# vi fa_unmap.txt
unmap dev 025D from dir 04A:1, emulation=CELERRA_FBA;
unmap dev 025E from dir 04A:1, emulation=CELERRA_FBA;
unmap dev 025F from dir 04A:1, emulation=CELERRA_FBA;
unmap dev 025D from dir 13A:1, emulation=CELERRA_FBA;
unmap dev 025E from dir 13A:1, emulation=CELERRA_FBA;
unmap dev 025F from dir 13A:1, emulation=CELERRA_FBA;
# symconfigure -sid $mysid -f fa_unmap.txt preview
# symconfigure -sid $mysid -f fa_unmap.txt prepare
# symconfigure -sid $mysid -f fa_unmap.txt commit

Change the CELERRA_FBA emulation to FBA emulation.
# vi em_type.txt
set device 025D emulation=fba;
set device 025E emulation=fba;
set device 025F emulation=fba;
# symconfigure -sid $mysid -f em_type.txt preview
# symconfigure -sid $mysid -f em_type.txt commit

WEBUI ISSUES

If the Web Interface doesn't allow you access, first clear the browser cookies then perform the following.
# rm -f /nas/tmp/*.fmt
# /nas/sbin/js_kill -f
# /nas/sbin/js_fresh_restart
# /nas/sbin/httpd -D HAVE_PERL -D HAVE_SSL -f /nas/http/conf/httpd.conf
# /nas/http/webui/etc/tomcat restart
# killall apl_task_mgr
# /nas/sbin/ch_stop

To stop/start the Jserver.
# cat /nas/jserver/logs/system_log
# /nas/sbin/js_shutdown
# /nas/sbin/js_kill
# /nas/sbin/js_fresh_restart

DELETE SHARE

Determine the mount status of the share.
# mysh=tempspace1
# mydm=`nas_fs -info $mysh | grep rw_servers | awk '{print $2}'`
# server_mount $mydm | grep $mysh

Disable client access to the Celerra file system.
# server_export $mydm -Protocol nfs -unexport -perm /$mysh
# server_export $mydm | grep $mysh

Unmount the file system if required.
# server_umount $mydm -perm $mysh
# server_mount $mydm | grep $mysh

Delete the Celerra file system.
# nas_fs -delete $mysh
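The DELETE SHARE steps can be wrapped so the unexport is verified before the unmount and delete. A minimal sketch (the remove_share helper is illustrative, not a Celerra command; it follows the echo-first habit used elsewhere on this page):

# Hypothetical helper: unexport, unmount and delete a file system, checking each step.
# Usage: remove_share <file_system>
remove_share() {
    mysh=$1
    mydm=`nas_fs -info $mysh | grep rw_servers | awk '{print $2}'`
    echo "server_export $mydm -Protocol nfs -unexport -perm /$mysh"
    server_export $mydm -Protocol nfs -unexport -perm /$mysh
    server_export $mydm | grep $mysh && { echo "still exported, stopping"; return 1; }
    server_umount $mydm -perm $mysh
    server_mount $mydm | grep $mysh && { echo "still mounted, stopping"; return 1; }
    nas_fs -delete $mysh
}
# remove_share tempspace1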
FAILOVER CHECK

First check the control station status, i.e. 10 and 11 mean up.
$ /nas/sbin/getreason
10 - slot_0 primary control station
11 - slot_1 secondary control station
5 - slot_2 contacted
5 - slot_3 contacted
5 - slot_4 contacted
5 - slot_5 contacted

Check the logs for the date/time.
# grep -i "has panicked" /nas/log/sys_log*
# grep HBA /nas/log/sys_log* | grep "Link Down"

Important: if a data mover fails over to a standby, the identities (WWN and IP) swap over. See the before and after view below for the NAS servers. If you run "fcp show" for "server_4" in a faulted state you are really seeing server_5's WWNs. Keep this in mind when reviewing switch port and zone member names. Server_4 carries server_5's WWNs while the faulted condition exists; you must know this when tracing back to the fabric and ports.
# nas_server -l     ***before fail over
id type acl slot groupID state name
1 1 1000 2 0 server_2
2 4 1000 3 0 server_3
3 1 1000 4 0 server_4
4 4 1000 5 0 server_5
# nas_server -l     ***after fail over
id type acl slot groupID state name
1 1 1000 2 0 server_2
2 4 1000 3 0 server_3
3 4 1000 4 2 server_4.faulted.server_5     ***no longer primary
4 1 1000 5 0 server_4     ***share location
# .server_config server_4 -v "fcp show"     ***WWNs displayed belong to server_5
# .server_config server_4.faulted.server_5 -v "fcp show"     ***WWNs displayed belong to server_4

Check the bind pending state. If the status is "Bind Pending" then a GBIC or cable is the most likely problem. The first part of our fail over problem was a failed GBIC.
# .server_config server_4.faulted.server_5 -v "fcp bind show"     ***prior to data mover GBIC replacement
Persistent Binding Table
Chain 0000: WWN 5006048accaafcf8 HBA 0 FA-09db Bind Pending
Chain 0016: WWN 5006048accaafcf6 HBA 0 FA-07db Bind Pending
Chain 0032: WWN 5006048accaafcf9 HBA 1 FA-10db Bound
Chain 0048: WWN 5006048accaafcf7 HBA 1 FA-08db Bound
Dynamic Binding Table
Chain 0000: WWN 0000000000000000 HBA 0 ID 0 Inx 00:81 Pid 0000 D_ID 000000 Non
Chain 0016: WWN 0000000000000000 HBA 0 ID 0 Inx 01:81 Pid 0016 D_ID 000000 Non
Chain 0032: WWN 5006048accaafcf9 HBA 1 ID 1 Inx 02:01 Pid 0032 D_ID 745713 Non
Chain 0048: WWN 5006048accaafcf7 HBA 1 ID 1 Inx 03:00 Pid 0048 D_ID 744f13 Sys
# .server_config server_4.faulted.server_5 -v "fcp bind show"     ***after data mover GBIC replacement
Persistent Binding Table
Chain 0000: WWN 5006048accaafcf8 HBA 0 FA-09db Bound
Chain 0016: WWN 5006048accaafcf6 HBA 0 FA-07db Bound
Chain 0032: WWN 5006048accaafcf9 HBA 1 FA-10db Bound
Chain 0048: WWN 5006048accaafcf7 HBA 1 FA-08db Bound
Dynamic Binding Table
Chain 0000: WWN 0000000000000000 HBA 0 ID 0 Inx 00:81 Pid 0000 D_ID 000000 Non
Chain 0016: WWN 0000000000000000 HBA 0 ID 0 Inx 01:81 Pid 0016 D_ID 000000 Non
Chain 0032: WWN 5006048accaafcf9 HBA 1 ID 1 Inx 02:01 Pid 0032 D_ID 745713 Non
Chain 0048: WWN 5006048accaafcf7 HBA 1 ID 1 Inx 03:00 Pid 0048 D_ID 744f13 Sys

If you look closely at the above output there is still a problem. Chain 0 is bound, however, in the dynamic table note that Chain 0 has no Sys devices. This prevents the data mover from failing back correctly and may result in a data mover panic. You can confirm this by checking for online status and probing the chain output.
# .server_config server_4.faulted.server_5 -v "fcp show"
FCP ONLINE HBA 0: S_ID 737601 WWN: 5006016030603957 DX2
FCP scsi-0: HBA 0: CHAINS 0 - 15 OFFLINE
FCP scsi-16: HBA 0: CHAINS 16 - 31 OFFLINE
FCP ONLINE HBA 1: S_ID 747601 WWN: 5006016130603957 DX2
FCP scsi-32: HBA 1: D_ID 745713 FA-10db: 5006048accaafcf9 Class 3
FCP scsi-48: HBA 1: D_ID 744f13 FA-08db: 5006048accaafcf7 Class 3
# server_devconfig server_4.faulted.server_5 -p -s -a
chain= 0, scsi-0 : no devices on chain
chain= 35, scsi-35 stor_id= 000187870195 celerra_id= 00018787019523D7
# symmask -sid 0195 list logins | grep -i 5006016030603957
Identifier        Type   Node Name  Port Name  FCID    Logged On  In Fabric
----------------  -----  ---------  ---------  ------  ---------  ---------
5006016030603957  Fibre  NULL       NULL       737601  No         Yes
5006016030603957  Fibre  NULL       NULL       737601  No         Yes

We confirmed the WWNs are not logged into the array, although they have logged into the switch with valid zone members.
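The checks used in this walkthrough can be strung together for quick triage of a faulted mover. A minimal sketch (the fcp_triage helper and its arguments are illustrative; it only greps the commands already shown above):

# Hypothetical triage: flag pending bindings, offline chains, and whether a WWN is logged into the array.
# Usage: fcp_triage <mover_name> <sym_sid> <hba_wwn>
fcp_triage() {
    mover=$1; sid=$2; wwn=$3
    .server_config $mover -v "fcp bind show" | grep "Bind Pending"
    .server_config $mover -v "fcp show" | grep -i OFFLINE
    symmask -sid $sid list logins | grep -i $wwn
}
# fcp_triage server_4.faulted.server_5 0195 5006016030603957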
Hence, the second part of our failure appears cable or port related. After blocking and unblocking the switch port the WWN logs into the array; note the chain output now shows Sys devices.
# .server_config server_4.faulted.server_5 -v "fcp show"
FCP ONLINE HBA 0: S_ID 737601 WWN: 5006016030603957 DX2
FCP scsi-0: HBA 0: D_ID 734f13 FA-09db: 5006048accaafcf8 Class 3
FCP scsi-16: HBA 0: D_ID 735713 FA-07db: 5006048accaafcf6 Class 3
FCP ONLINE HBA 1: S_ID 747601 WWN: 5006016130603957 DX2
FCP scsi-32: HBA 1: D_ID 745713 FA-10db: 5006048accaafcf9 Class 3
FCP scsi-48: HBA 1: D_ID 744f13 FA-08db: 5006048accaafcf7 Class 3
# .server_config server_4.faulted.server_5 -v "fcp bind show"
*** Persistent Binding Table ***
Chain 0000: WWN 5006048accaafcf8 HBA 0 FA-09db Bound
Chain 0016: WWN 5006048accaafcf6 HBA 0 FA-07db Bound
Chain 0032: WWN 5006048accaafcf9 HBA 1 FA-10db Bound
Chain 0048: WWN 5006048accaafcf7 HBA 1 FA-08db Bound
Existing CRC: d73c2401, Actual: d73c2401, CRC Matchs
*** Dynamic Binding Table ***
Chain 0000: WWN 5006048accaafcf8 HBA 0 ID 0 Inx 00:01 Pid 0000 D_ID 734f13 Sys
Chain 0016: WWN 5006048accaafcf6 HBA 0 ID 0 Inx 01:00 Pid 0016 D_ID 735713 Non
Chain 0032: WWN 5006048accaafcf9 HBA 1 ID 1 Inx 02:00 Pid 0032 D_ID 745713 Non
Chain 0048: WWN 5006048accaafcf7 HBA 1 ID 1 Inx 03:01 Pid 0048 D_ID 744f13 Sys

Chain 0 had been lost, which meant the control disks were not reachable. We need to recover them: first probe for the devices and, if that works, configure them as follows.
# server_devconfig server_4.faulted.server_5 -p -s -a
chain= 0, scsi-0 stor_id= 000187870195 celerra_id= 00018787019523D7
chain= 35, scsi-35 stor_id= 000187870195 celerra_id= 00018787019523D7
# server_devconfig server_4.faulted.server_5 -c -s -a
..done
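Before moving on to the failback, a quick check that no chain still reports empty can save a failed attempt. A minimal sketch (illustrative; it just greps the probe output shown above):

# Probe the faulted mover and flag any chain that still has no devices.
mover=server_4.faulted.server_5
if server_devconfig $mover -p -s -a | grep -i "no devices on chain"
then
    echo "at least one chain is still empty - do not fail back yet"
else
    echo "all probed chains report devices - ok to configure and fail back"
fi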
To fail the data mover back, first check the bound, online and chain status. Then issue the failover command.
# nas_server -l     ***after fail over
id type acl slot groupID state name
1 1 1000 2 0 server_2
2 4 1000 3 0 server_3
3 4 1000 4 2 server_4.faulted.server_5
4 1 1000 5 0 server_4
# .server_config server_4.faulted.server_5 -v "fcp bind show"
Persistent Binding Table
Chain 0000: WWN 5006048accaafcf8 HBA 0 FA-09db Bound
Chain 0016: WWN 5006048accaafcf6 HBA 0 FA-07db Bound
Chain 0032: WWN 5006048accaafcf9 HBA 1 FA-10db Bound
Chain 0048: WWN 5006048accaafcf7 HBA 1 FA-08db Bound
Existing CRC: d73c2401, Actual: d73c2401, CRC Matchs
Dynamic Binding Table
Chain 0000: WWN 5006048accaafcf8 HBA 0 ID 0 Inx 00:01 Pid 0000 D_ID 734f13 Sys
Chain 0016: WWN 5006048accaafcf6 HBA 0 ID 0 Inx 01:00 Pid 0016 D_ID 735713 Non
Chain 0032: WWN 5006048accaafcf9 HBA 1 ID 1 Inx 02:00 Pid 0032 D_ID 745713 Non
Chain 0048: WWN 5006048accaafcf7 HBA 1 ID 1 Inx 03:01 Pid 0048 D_ID 744f13 Sys
# .server_config server_4.faulted.server_5 -v "fcp show"
FCP ONLINE HBA 0: S_ID 737601 WWN: 5006016030603957 DX2
FCP scsi-0: HBA 0: D_ID 734f13 FA-09db: 5006048accaafcf8 Class 3
FCP scsi-16: HBA 0: D_ID 735713 FA-07db: 5006048accaafcf6 Class 3
FCP ONLINE HBA 1: S_ID 747601 WWN: 5006016130603957 DX2
FCP scsi-32: HBA 1: D_ID 745713 FA-10db: 5006048accaafcf9 Class 3
FCP scsi-48: HBA 1: D_ID 744f13 FA-08db: 5006048accaafcf7 Class 3
# server_standby server_4 -r mover
server_4 :
server_4 : going standby
server_4.faulted.server_5 : going active
replace in progress ...done
failover activity complete
commit in progress (not interruptible)...done
server_4 : renamed as server_5
server_4.faulted.server_5 : renamed as server_4
# nas_server -l
id type acl slot groupID state name
1 1 1000 2 0 server_2
2 4 1000 3 0 server_3
3 1 1000 4 0 server_4
4 4 1000 5 0 server_5

FSCK CHECK

To perform an fsck of a file system on the data mover, first determine which data mover the share is using. Only two fsck processes can run on a single Data Mover simultaneously.
# mysh=shared_app_bea_wshintprddms51_53
# mydm=`nas_fs -info $mysh | grep rw_servers | awk '{print $2}'`
# server_mount $mydm | grep $mysh
shared_app_bea_wshintprddms51_53 on /shared_app_bea_wshintprddms51_53 uxfs,perm,rw

Run the fsck. Keep in mind that users will not be able to access this file system until the fsck (file system check) is complete. nas_fsck unmounts the file system if it is mounted, runs the check, and then remounts it.
# mysh=shared_app_bea_wshintprddms51_53
# nas_fsck -start $mysh -mover server_2
# nas_fsck -list
# nas_fsck -info $mysh

SSL CERT

Recreate the SSL certificate on the Control Station.
# /nas/sbin/nas_config -ssl
# /nas/sbin/httpd -D HAVE_PERL -D HAVE_SSL -f /nas/http/conf/httpd.conf
# /nas/http/webui/etc/tomcat restart
# date

HEALTH CHECK

Run these commands to check NAS status.
# nas_checkup
# tail -200 /nas/log/sys_log
# server_log server_2
# server_log server_4
# /nas/sbin/getreason
# nas_server -list
# nas_server -info -all
# server_sysstat ALL
# server_mount server_2 | grep -i un
# server_mount server_4 | grep -i un
# server_export server_2
# server_export server_4
# nas_fs -list
# server_df server_2
# server_df server_4
# nas_pool -list
# nas_pool -size xxxx
# server_ifconfig $mydm -a
# server_ping server_2 -i ustpa3clr01-1-nfs-t2 10.26.131.130
# .server_config ALL -verbose 'fcp show'
# .server_config ALL -verbose 'fcp bind show'

Run an spcollect if needed.
# cd /nas/tools
# ./.get_spcollect
# cd /nas/var/log
# ls -l SP_COLLECT.zip
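The health check list can be bundled for routine runs. A minimal sketch covering a subset of the commands above (the check_nas helper, the mover list and the output path are illustrative; extend it with whichever checks you run regularly):

# Hypothetical daily check: control station state, unmounted file systems, recent data mover logs.
check_nas() {
    /nas/sbin/getreason
    nas_server -list
    for mover in server_2 server_4
    do
        echo "== unmounted file systems on $mover =="
        server_mount $mover | grep -i un
        echo "== last log lines on $mover =="
        server_log $mover | tail -20
    done
    nas_pool -list
}
# check_nas > /tmp/nas_check.`date +%Y-%m-%d`     ***illustrative output path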
SNAPSURE EXAMPLE

Create a file system, export it, mount it and create a file on the Unix server. See NEW SHARE on this page if you need help.
# df -k .
Filesystem kbytes used avail capacity Mounted on
ustpa3clr01-02:/santeam 1032560 103040 929520 10% /opt/san_ns704

Create a checkpoint (snapshot) of the file system; call it Monday.
# fs_ckpt santeam -name Monday -Create

Compare the size of the SavVol with the size of the file system.
# nas_fs -size Monday
# nas_fs -size santeam

Verify that the checkpoint file system was automatically mounted on the data mover. Export the checkpoint so that you can mount it.
# server_mount server_4 | egrep '(Monday|santeam)'
santeam on /santeam uxfs,perm,rw
Monday on /Monday ckpt,perm,ro
# server_export server_4 -P nfs -o access=10.26.102.18/23,root=10.26.102.18/23 /Monday
# server_export server_4 | egrep '(Monday|santeam)'
export "/santeam" access=10.26.102.18/23 root=10.26.102.18/23
export "/Monday" access=10.26.102.18/23 root=10.26.102.18/23

Mount the read-only checkpoint, check its contents and create some more files in the file system.
# /opt/san_ns704> ls -l
-rw------T 1 dspink001 other 104857600 Jan 14 03:35 testfile1
-rw------T 1 dspink001 other 104857600 Jan 14 06:23 testfile2
# /opt/san_ckpt> ls -l
drwxr-xr-x 2 root root 8192 Jan 14 03:17 lost+found
-rw------T 1 dspink001 other 104857600 Jan 14 03:35 testfile1

From the Celerra restore the checkpoint and view the file system contents.
$ su
# /nas/sbin/rootfs_ckpt Monday -R
# /opt/san_ns704> ls -l
-rw------T 1 dspink001 other 104857600 Jan 14 03:35 testfile1

Additional checkpoints are created each time a restore is performed. This is to prevent an accidental restore from overwriting needed data.
$ nas_fs -list | egrep '(Monday|santeam)'
2818 y 1 0 5171 santeam 3
2820 y 7 0 5174 Monday 3
2821 y 7 0 5174 santeam_ckpt1 3
$ server_export server_4 -P nfs -o access=10.26.102.18/23,root=10.26.102.18/23 /santeam_ckpt1
# /opt/san_ckpt1> ls -l
drwxr-xr-x 2 root root 8192 Jan 14 03:17 lost+found
-rw------T 1 dspink001 other 104857600 Jan 14 03:35 testfile1
-rw------T 1 dspink001 other 104857600 Jan 14 06:23 testfile2

Clean up the exports and file systems.
$ server_export server_4 -P nfs -unexport -perm /santeam_ckpt1
$ server_export server_4 | grep santeam
export "/santeam" access=10.26.102.18/23 root=10.26.102.18/23
$ server_umount server_4 santeam_ckpt1
$ server_mount server_4 | grep santeam
santeam on /santeam uxfs,perm,rw
santeam_ckpt1 on /santeam_ckpt1 ckpt,perm,ro,unmounted
$ server_umount server_4 -perm santeam_ckpt1
$ server_mount server_4 | grep santeam
santeam on /santeam uxfs,perm,rw
$ nas_fs -delete santeam_ckpt1
$ nas_fs -list | egrep '(Monday|santeam)'
2818 y 1 0 5171 santeam 3
2820 y 7 0 5174 Monday 3
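Building on the "Monday" example, checkpoints are often rotated by weekday. A minimal sketch (illustrative; it assumes the fs_ckpt -list output contains the checkpoint name, and refreshes the checkpoint if one of that name already exists):

# Create (or refresh) a checkpoint of a file system named after today's weekday.
myfs=santeam
ckpt=${myfs}_`date +%A`                  # e.g. santeam_Monday
if fs_ckpt $myfs -list | grep -q $ckpt
then
    fs_ckpt $ckpt -refresh               # reuse last week's checkpoint of the same name
else
    fs_ckpt $myfs -name $ckpt -Create
fi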
REPLICATOR EXAMPLE

Confirm the data mover interconnects are already configured. The nas_cel commands work for remotely linked Celerras.
source# nas_cel -list
id name owner mount_dev channel net_path CMU
0 ustpa3nsxclr0 0 10.26.58.225 00019010138407EC
1 target 0 10.12.126.127 0001901017480763
target# nas_cel -list
id name owner mount_dev channel net_path CMU
0 USNYCZNSXCLR0 0 10.12.126.127 0001901017480763
1 source 0 10.26.58.225 00019010138407EC

Setup Replication

Source Site - Create a file system, export it, mount it and create a file on the Unix server. See NEW SHARE on this page if you need help.
# df -k .
Filesystem kbytes used avail capacity Mounted on
ustpa3nsx04-nfs:/santeam 1032560 103040 929520 10% /opt/nas_2462
$ nas_fs -size santeam
total = 1008 avail = 907 used = 100 ( 9% ) (sizes in MB) ( blockcount = 2097152 )
volume: total = 1024 (sizes in MB) ( blockcount = 2097152 )

Target Site - Create the file system via the samesize option.
$ nas_fs -name santeam -type rawfs -create samesize=santeam:cel=source pool=GFS_Storage_pool
$ nas_fs -size santeam
total = 1024 (sizes in MB) ( blockcount = 2097152 )

Target Site - Create the mountpoint and mount the file system. Note, the rawfs has status unmounted.
$ server_mountpoint server_2 -create /santeam
$ server_mount server_2 -o ro santeam /santeam
$ server_mount server_2 | grep santeam
santeam on /santeam rawfs,perm,ro,

Source Site - Configure the checkpoint.
$ fs_ckpt santeam -Create
$ nas_fs -list | grep santeam
24751 y 1 0 83785 santeam 1
24826 y 7 0 83871 santeam_ckpt1 1

Source Site - Copy the file system contents to the remote site.
$ fs_copy -start santeam_ckpt1 santeam:cel=target -option convert=no,monitor=off

Source Site - Start replication.
$ fs_replicate -start santeam santeam:cel=target -option dto=300,dhwm=300
$ fs_replicate -info santeam

Source Site - Configure another checkpoint.
$ fs_ckpt santeam -Create
$ nas_fs -list | grep santeam
24751 y 1 0 83785 santeam 1
24826 y 7 0 83871 santeam_ckpt1 1
24886 y 7 0 83871 santeam_ckpt2 1
$ fs_ckpt santeam -list
id ckpt_name creation_time inuse full(mark) used
24826 santeam_ckpt1 01/14/2009-23:22:02-EST y 90% 25%
24886 santeam_ckpt2 01/15/2009-00:14:40-EST y 90% 25%

Source Site - Copy the checkpoint's incremental changes.
$ fs_copy -start santeam_ckpt2 santeam:cel=target -fromfs santeam_ckpt1
IP Copy remaining (%) 100..Done.
$ fs_replicate -info santeam

Test Replication

Source Site - Suspend replication; this may take a while.
$ fs_replicate -suspend santeam santeam:cel=target

Target Site - Convert the file system to uxfs.
$ nas_fs -Type uxfs santeam -Force
$ server_mount server_2 | grep santeam
santeam on /santeam uxfs,perm,ro
$ nas_fs -size santeam
total = 1008 avail = 907 used = 100 ( 9% ) (sizes in MB) ( blockcount = 2097152 )
volume: total = 1024 (sizes in MB) ( blockcount = 2097152 )
$ server_export server_2 -P nfs /santeam

Target Site - Create a consistency checkpoint.
$ fs_ckpt santeam -name santeam_consistency -Create

Target Site - Remount as RW.
$ server_mount server_2 -o rw santeam

Target Server - Test the file system.
# ssh us-nycznbpr001
# mount 10.22.5.36:/santeam /opt/nas_2462
# cd /opt/nas_2462
# ls
hello.txt lost+found testfile
# umount 10.22.5.36:/santeam

Undo Test Replication

Target Site - Remount all file systems as R/O.
$ server_export server_2 -Protocol nfs -unexport /santeam
$ server_mount server_2 -o ro santeam

Target Site - Roll back to the consistency state.
$ su
# /nas/sbin/rootfs_ckpt santeam_consistency -name santeam_restore -Restore -o automount=no

Target Site - Convert the file system back to rawfs.
# nas_fs -Type rawfs santeam -Force

Target Site - Unmount the consistency checkpoints and delete them.
$ server_umount server_2 -perm santeam_consistency
$ nas_fs -delete santeam_consistency
$ nas_fs -delete santeam_restore

Source Site - Restart replication; this may take a while.
$ fs_replicate -restart santeam santeam:cel=target -o dto=300,dhwm=300
$ fs_replicate -info santeam
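fs_copy and fs_replicate -restart can run for hours; a simple polling loop saves re-typing the info command. A minimal sketch (the ten-minute interval and log path are illustrative; stop it with Ctrl-C):

# Poll replication status every 10 minutes and append it to a log for later review.
myfs=santeam
while true
do
    date
    fs_replicate -info $myfs
    sleep 600
done >> /tmp/${myfs}_replicate_watch.log 2>&1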
Clean up Target Site.
$ server_umount server_2 -perm santeam
$ server_mount server_2 | grep santeam
$ fs_replicate -abort santeam
$ fs_replicate -list | grep santeam
$ nas_fs -delete santeam
$ nas_fs -list | grep santeam

Clean up Source Site.
$ fs_replicate -abort santeam
$ fs_replicate -list | grep santeam
$ server_umount server_2 -perm santeam_ckpt1
$ server_umount server_2 -perm santeam_ckpt2
$ server_mount server_2 | grep santeam
santeam on /santeam uxfs,perm,rw
$ nas_fs -delete santeam_ckpt1
$ nas_fs -delete santeam_ckpt2
$ nas_fs -list | grep santeam
24751 y 1 0 83785 santeam
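To spot sessions that have gone inactive across several replicated file systems, loop fs_replicate -info over them. A minimal sketch (the file system list is illustrative; build it from fs_replicate -list on your system):

# Review replication status for a hand-maintained list of file systems.
for fs in santeam UGPsapmnt
do
    echo "== $fs =="
    fs_replicate -info $fs
done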
Restarting an Inactive File System

Check that the delta set remains the same and is less than the current VRPL. Ensure the replicator share is inactive and that playback at the target is active. If so, restart replication. The script will abort the replication and create a new one with default options, so do not specify "-o .....".
# fs_replicate -list
# mysh=UGPsapmnt
# fs_replicate -info $mysh
# fs_replicate -restart $mysh $mysh:cel=target

SNMP

Enter the following command to check the events that are bound to sending traps from the Celerra.
# /nas/bin/nas_event -list -a trap

Set the trap configuration.
# cat /nas/site/trap.cfg
snmpmanager 10.26.26.170 ; communityname public
snmpmanager 10.26.26.171 ; communityname public

Start the SNMP trap daemon.
# /usr/sbin/snmptrapd -c /nas/sys/snmptrapd.conf -p 162 -u /var/run/snmptrapd.pid \
  -o /nas/site/my_eventlog_messages.log >/dev/null 2>&1 &

Send a test message.
# /nas/sbin/nas_snmptrap /nas/site/trap.cfg -m /nas/sys/emccelerra.mib -r 1 -f 64 -i 5 -s 7 -d "test SNMP traps"
where:
config_file_path = path of the trap configuration file
/nas/sys/emccelerra.mib = Celerra MIB file
trap_number = unique trap number for the event
facility_id = ID number of the facility generating the event
event_id = event ID number
severity_level = severity level of the event
description = description of the trap (up to 255 characters)

To determine which events trigger an action.
# nas_event -list -action trap

To see details of events.
# nas_event -list -component -info
# nas_event -list -component CS_PLATFORM -facility -info
# nas_event -list -component CS_PLATFORM -facility ConnectHome
# nas_event -list -component CS_PLATFORM -facility ConnectHome -id
# nas_message -info 91813642291

CIFS

Create a VDM on server_2 and confirm the root file system was created.
# nas_server -name vdm01 -type vdm -create server_2 -setstate loaded pool=symm_std
# server_mount server_2 | grep root_fs_vdm_vdm01

Start CIFS on the data mover.
# server_setup $mydm -P cifs -o start

Create a CIFS server on your VDM, selecting the NIC interface to use.
# server_cifs vdm01 -add compname=vdm01,domain=pwc.com,interface=cge0-1

Join the domain and check the status. You will need administrative access. Also check NTP is configured.
# server_cifs vdm01 -J compname=vdm01,domain=pwc.com,admin=administrator
# server_cifs vdm01

Set up and share the file system. Create a directory to hide .etc and lost+found.
# nas_fs -name $myfs -create size=1G pool=$mypl -o slice=y
# server_mountpoint vdm01 -create /$myfs
# server_mount vdm01 $myfs /$myfs
# server_mountpoint vdm01 -create /$myfs/dir
# server_export vdm01 -P cifs -name $share_name /$myfs/dir

Test the UNC path.
\\vdm01\$share_name
# /nas/sbin/rootnas_fs -info root_fs_vdm_vdm01

DATA MOVER

See the control station status, i.e. 10 and 11 mean up. There are 4 x Data Movers in slots 2, 3, 4, 5.
$ /nas/sbin/getreason
10 - slot_0 primary control station
11 - slot_1 secondary control station
5 - slot_2 contacted
5 - slot_3 contacted
5 - slot_4 contacted
5 - slot_5 contacted

Type 1 means this is a "primary" Data Mover, type 4 means a "standby" Data Mover. Note the slot # matches "server_#".
$ nas_server -l
id type acl slot groupID state name
1 1 0 2 0 server_2
2 4 0 3 0 server_3
3 1 0 4 0 server_4
4 4 0 5 0 server_5

See the standby per data mover.
$ nas_server -info -all | grep server
name      = server_2
standby   = server_3, policy=auto
name      = server_3
standbyfor= server_2
name      = server_4
standby   = server_5, policy=auto
name      = server_5
standbyfor= server_4

After a fail over, the type of the DM in slot 2 takes type=4, which signifies "standby", and the type of the DM in slot 3 takes type=1, which means "primary". The "server_2" name moves to slot 3.
$ nas_server -l
id type acl slot groupID state name
1 4 0 2 0 server_2.faulted.server_3
2 1 0 3 0 server_2
3 1 0 4 0 server_4
4 4 0 5 0 server_5
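A quick scripted check for the failed-over condition described above (illustrative; it only wraps the two commands already shown):

# Flag any data mover that is currently running on its standby.
if nas_server -l | grep -i faulted
then
    echo "faulted data mover present - see FAILOVER CHECK section"
else
    echo "no faulted data movers"
fi
/nas/sbin/getreason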
SENDMAIL

To customise the way Celerra email event notifications are sent.

Copy the default sendmail.mc file.
# cp /etc/mail/sendmail.mc ~nasadmin/

Find the line in sendmail.mc that reads:
dnl define(`SMART_HOST',`smtp.your.provider')

Using vi, change this line to:
define(`SMART_HOST',`uxsmpr02.nam.pwcinternal.com')

Create and install the sendmail.cf file.
# m4 sendmail.mc > sendmail.cf
# cp sendmail.cf /etc     ***in 5.5 code
# cp sendmail.cf /etc/mail     ***in 5.6 code and above

Restart the sendmail service and send a test message.
# /sbin/service sendmail restart
# echo test from `hostname` | mail -v -s "`hostname` test only" david.r.spink@us.pwc.com
# tail /var/log/maillog

CS NBSNAS

You may not be able to extend this file system; hence, to manage it, verify the space used by the Jserver.
# du -sh /nas/jserver/

If the command returns more than 500 MB used, then this needs to be addressed. The js_cleandb script is a tool that can be used to help free up the space.
# /nas/sbin/js_cleandb

Other options are a basic find and remove of large files.
# find ./ -size +5145738c -exec ls -l {} \;

NTP

Verify whether there is an NTP server in the environment.
# server_date server_2 timesvc stats ntp

As root, verify and if needed configure the Celerra for the correct local timezone.
# /usr/sbin/timeconfig

Make sure the ntpd service is stopped.
# /sbin/service ntpd status

Make a copy of the original file. Verify you can ping the new NTP server(s).
# cp /etc/ntp.conf /etc/ntp.conf.orig
# ping 10.26.12.62
# ping 10.26.65.22

Use vi to edit /etc/ntp.conf.
# server 127.127.1.0     # local clock
# fudge 127.127.1.0 stratum 10
server 10.26.12.62     # New NTP server
server 10.26.65.22     # New NTP server

Use vi to edit the /etc/ntp/step-tickers file to add the same NTP IP addresses.
10.26.12.62
10.26.65.22

Start NTP; once started it will slowly sync up.
# /sbin/service ntpd start
Synchronizing with time server: [OK]
Starting ntpd: [OK]

Make the setting persistent after a CS reboot.
# /sbin/chkconfig ntpd --list
# /sbin/chkconfig --level 345 ntpd on
# /sbin/chkconfig ntpd --list

Check the CS is now able to sync with the NTP server.
# /usr/sbin/ntpq -p

EMCOPY

The EMCOPY.EXE tool gives a user a way to copy a file or a directory (and included subdirectories) from and to an NTFS partition with security intact.
emcopy \\ustpa3tlsfs01\tax_atx \\USTPA3CIFS4457\cifs_dm4_02$\tax_atx\ /o /i /s /de /c /r:0 /w:0 /th 64 /purge >>tax_atx.txt
emcopy \\ustpa3tlsfs01\testproc_share \\USTPA3CIFS4457\cifs_dm4_02$\test\ /o /i /s /de /c /r:0 /w:0 /th 64 /purge >>testproc_share.txt
emcopy \\ustpa3tlsfs01\procurement \\USTPA3CIFS4457\cifs_dm4_02$\procurement\ /o /i /s /de /c /r:0 /w:0 /th 64 /purge >>procurement.txt
emcopy \\ustpa3tlsfs01\design_docs \\USTPA3CIFS4457\cifs_dm4_02$\design_docs\ /o /i /s /de /c /r:0 /w:0 /th 64 /purge >>design_docs.txt