{{tag>Scripting bash automate automation NetApp}}
====== NetApp Script for Automated Volume Delete ======
{{:scripting:netapp-logo.png?200 |}}
In this article, I will discuss NetApp filers and creating scripts to automate processes within them. As all system administrators know, the first rule of being an admin is to automate as much as possible. Well, that should be the second rule; the first rule should be to understand the process. With that understanding in place, you can create scripts to automate and schedule your repeated processes.
If you are not familiar with [[http://www.netapp.com/us/products/storage-systems/index.aspx | NetApp filers]], they are very nice enterprise storage devices (with a price tag to match). When you purchase one and pay for renewals, you get support. However, in my recent experience with their support, I discovered they don’t support scripting in any way. Personally, I don’t understand this approach. I’m a reasonable person; I don’t expect them to support and debug my scripts for me. However, I would expect them to be open to helping with the commands used in an automation project. For example, here I will be discussing the volume delete command run from a script, and support simply wouldn’t help as long as a script was involved. I had to tell them to forget the script and discuss the commands as if I were logged into the filer interactively. At that point, they helped enough that I could find a way to get my script going. Bottom line: they won’t help you if you say that you are working with a script.
==== Logging In ====
Now, let’s login to the filers and explore a bit!
The filers are interactive by nature. You can use the graphical / web-based tools, such as OnCommand System Manager (OCSM), or you can log in on the command line via SSH to administer your filers. Anyone who writes scripts knows the command line (or API) is how we’ll get our job done. In our case, we need to connect to the filer via SSH as shown below.
{{ scripting:netapp-login-ssh-2016-07-11-768x134.png }}
==== Command Basics ====
Now that we’re in, let’s explore some commands. Type a **question mark** (?) and press Enter to get a list of available commands.
nacluster::>?
up Go up one directory
cluster> Manage clusters
dashboard> (DEPRECATED)-Display dashboards
event> Manage system events
exit Quit the CLI session
export-policy Manage export policies and rules
history Show the history of commands for this CLI session
job> Manage jobs and job schedules
lun> Manage LUNs
man Display the on-line manual pages
metrocluster> Manage MetroCluster
network> Manage physical and virtual network connections
qos> QoS settings
redo Execute a previous command
rows Show/Set the rows for this CLI session
run Run interactive or non-interactive commands in the nodeshell
security> The security directory
set Display/Set CLI session settings
snapmirror> Manage SnapMirror
statistics> Display operational statistics
storage> Manage physical storage, including disks, aggregates, and failover
system> The system directory
top Go to the top-level directory
volume> Manage virtual storage, including volumes, snapshots, and mirrors
vserver> Manage Vservers
From here we can enter commands or get some help with them. For example, if you want to get some information about your system but don’t know how to use the **system** command, type the following.
nacluster::> system ?
chassis> Chassis health monitor directory
cluster-switch> cluster switch health monitor directory
controller> Controller health monitor directory
feature-usage> Display feature information
health> System Health Management and Diagnosis commands
license> Manage licenses
node> The node directory
script> Capture CLI session to a file for later upload.
Analogous to the unix 'script' command
service-processor> Display and configure the Service Processor
services> Manage system services
smtape> Manage SMTape operations
snmp> The snmp directory
timeout> Manage the timeout value for CLI sessions
Now we see we have some subcommands to use with the system command. Alternatively, you can simply type **system** and press Enter. This provides a list of subcommands without the descriptions.
nacluster::> system
chassis cluster-switch controller feature-usage
health license node script
service-processor services smtape snmp
timeout
==== Basic NetApp Scripting ====
Now that the basics of running the command interpreter are out of the way, let’s run some simple scripts. This can be tricky since the SSH session is inherently interactive. There are two ways to log in automatically. The first is to create SSH keys and add the public key to the filer’s root volume. This is the most secure method, but it can be a pain since you have to mount the root volume on a *nix machine and copy the id.pub file to a particular directory. If this is your method of choice, then go for it, or call NetApp for support on how to do this.
For me, the second method is easier. You will need to install the sshpass package, which should be in the standard repos. After you install it, it’s simply another shell command you prepend to your ssh command. We’ll look at how to use it while building our script.
Here, we’ll be using the bash scripting language since it’s available almost everywhere. We’ll start with the standard shebang and set some variables.
#!/bin/bash
# set values
NAUSER='nauser'
PASS='password'
IP='192.168.1.10'
Then we add a simple command like getting the CLI session timeout value.
CMD='system timeout show'
sshpass -p "$PASS" ssh -o StrictHostKeyChecking=no "$NAUSER@$IP" "$CMD"
Now save our basic script as cli-test.sh and change the permissions so the owner has execute permission (''chmod u+x cli-test.sh''). Finally, we run it and get the following output. Simple!
$ ./cli-test.sh
CLI session timeout: 30 minutes
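For reference, here is the script above assembled into one sketch. I’ve wrapped the SSH call in a small function and added a DRY_RUN switch so the command string can be sanity-checked without touching a filer; the ''na_cmd'' wrapper and the DRY_RUN guard are my own additions, not NetApp commands, and the credentials are the same placeholders used throughout this article.

```shell
#!/bin/bash
# cli-test.sh -- the pieces above assembled into one sketch.
# NAUSER/PASS/IP are placeholder credentials; replace with your own.
NAUSER='nauser'
PASS='password'
IP='192.168.1.10'

# Run one clustered ONTAP command on the filer over SSH.
# With DRY_RUN=1 the function only prints what it would run.
na_cmd() {
  local cmd="$1"
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "sshpass ssh ${NAUSER}@${IP} \"${cmd}\""
  else
    sshpass -p "$PASS" ssh -o StrictHostKeyChecking=no "${NAUSER}@${IP}" "$cmd"
  fi
}
```

Calling ''na_cmd 'system timeout show''' then behaves like the two lines above, and running with ''DRY_RUN=1'' lets you verify the command string before pointing the script at a real filer.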
==== Use Case ====
Before we begin the script, let’s look at what we want to automate and why.
My use case is that I take a backup from primary server A and restore it on secondary server B at a regular interval (e.g. nightly). As any seasoned admin will tell you, an automation plan should be put into place. Since our two servers are physically in different datacenters (DCs), it doesn’t make much sense to mount the share from one DC in the other. The latency would likely cause delays and errors, especially if the link between them drops.
Another option is to copy or synchronize the data from server A to server B. Depending on the link bandwidth between the DCs, this can be slow and fail if the link is dropped. One could use the rsync program and schedule for a certain interval. That would allow for dropped/recovered links.
Finally, since we have a NetApp device in each DC, we can take advantage of the SnapMirror (SM) technology. With SM configured, the source device creates a snapshot of the source volume and sends it to the destination device. As soon as the SM relationship is created, the destination volume is read-only; you cannot write to it. Many backup and restore programs do not need to write to a backup volume. However, some do, such as Zimbra Collaboration Server (ZCS).
With the SM in place, we have two options: break the mirror, or clone the volume. If the SM is broken, the destination is automatically remounted read-write, and you may then perform your restore operation(s). The drawback to this approach is that between the time the SM is broken and the time you “resume” or “resync” the volumes, there are no updates from the source. This may be acceptable in some situations. However, if server B is a “hot standby” and needs the latest data ASAP, this may not be a good plan. Another issue that can arise is that the SM may not get resumed or resync’d for one reason or another. Maybe the script failed; maybe someone forgot to check it and/or resume it manually. Depending on how diligent you are, the volume may not be updated for an entire weekend.
Lastly, another approach is to use the FlexClone (FC) feature of the NetApp devices. This allows the SM to stay synchronized at all times while a new volume is created based upon an SM snapshot already stored on the destination device. The second compelling reason for using FCs is that they are deduplicated volumes. This means that any given file on the clone volume is just a pointer to the file on the source volume, as long as the file has not been changed. If the file has been changed on either volume, then each has its own copy. What this means is that the clone takes up hardly any space. Yay!! Also, creating clones is a fast operation.
Now that we’ve discussed the advantages of using FlexClones, we’ll look at how to automate this task in a script.
==== NetApp FlexClone Scripting ====
We’ll reuse the same creds and the same sshpass command used above. We’ll just look at the relevant commands and create some functions to use in our scripts.
The process for creating and using a clone is:
- Create a clone
- Mount the clone (on the NetApp device)
- Mount the new volume on the remote machine (Windows, Linux, Unix, etc.)
Let’s look at how to script these processes. First up is step 1 (create a clone).
Let’s see what options we have for the clone command.
nacluster::> vol clone create ?
-vserver Vserver Name
[-flexclone] FlexClone Volume
[ -type {RW|DP} ] FlexClone Type (default: RW)
[-parent-volume|-b] FlexClone Parent Volume
[ -parent-snapshot ] FlexClone Parent Snapshot
[ -junction-path ] Junction Path
[ -junction-active {true|false} ] Junction Active (default: true)
[ -space-guarantee|-s {none|volume} ] Space Guarantee Style
[ -comment ] Comment
[ -foreground {true|false} ] Foreground Process (default: true)
[ -qos-policy-group ] QoS Policy Group Name
The “vol clone create” command sometimes requires that you name the snapshot to base the clone on. Oops. We need a way to find the latest snapshot (SS) in order to name it, so let’s figure out how to do that.
Below is the NetApp command to get a list of snapshots for the volume.
nacluster::> vol snapshot show -volume myvol
---Blocks---
Vserver Volume Snapshot Size Total% Used%
-------- -------- ------------------------------------- -------- ------ -----
vs1 myvol
daily.2016-07-09_0010 2.74GB 0% 0%
daily.2016-07-10_0010 3.11GB 0% 0%
daily.2016-07-11_0010 3.60GB 0% 0%
daily.2016-07-12_0010 3.63GB 0% 0%
daily.2016-07-13_0010 28.83MB 0% 0%
snapmirror.50f2e399-0576-11e5-a67d-00a098534c58_2147484721.2016-07-13_160000
200KB 0% 0%
snapmirror.50f2e399-0576-11e5-a67d-00a098534c58_2147484721.2016-07-13_180000
200KB 0% 0%
7 entries were displayed.
Let’s create a function based upon the above command that grabs the latest SM snapshot.
function getLatestSM()
{
CMD="vol snapshot show -volume vs1_my_myvol"
LAST_SM=$(sshpass -p "$PASS" ssh -o StrictHostKeyChecking=no "$NAUSER@$IP" "$CMD" | grep snapmirror | tail -1 | awk '{print $1}')
echo "Latest SM: $LAST_SM"
}
In the above command string, I run the “vol snapshot show” command to get the list of snapshots for that volume. Then I filter for only snapmirror snapshots. Next, I filter to the last one in the list. And finally, I grab just the name of the snapshot.
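To see the filter chain in isolation, you can feed it a canned copy of the listing instead of making a live SSH call. The sample below simply reuses a few lines of the "vol snapshot show" output shown earlier in this article.

```shell
# Stand-in for the live "vol snapshot show" output shown above,
# so the grep | tail | awk chain can be tested locally.
SAMPLE='daily.2016-07-12_0010 3.63GB 0% 0%
daily.2016-07-13_0010 28.83MB 0% 0%
snapmirror.50f2e399-0576-11e5-a67d-00a098534c58_2147484721.2016-07-13_160000
200KB 0% 0%
snapmirror.50f2e399-0576-11e5-a67d-00a098534c58_2147484721.2016-07-13_180000
200KB 0% 0%'

# Same chain as in getLatestSM: keep only snapmirror lines, take the
# last one, and print the first field (the snapshot name).
LAST_SM=$(echo "$SAMPLE" | grep snapmirror | tail -1 | awk '{print $1}')
echo "Latest SM: $LAST_SM"
```

Running this prints the most recent snapmirror snapshot name, which is exactly what getLatestSM stores in $LAST_SM.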
Now that we know the name of the snapshot, we can safely name the snapshot we want in the cloneMirror function.
function cloneMirror()
{
getLatestSM
echo "Creating the clone and mounting it..."
CMD="vol clone create myvol_mirror_clone -b vs1_myvol_mirror -s none -vserver ddc-vs1 -parent-snapshot $LAST_SM; vol mount -vserver ddc-vs1 -volume myvol_mirror_clone -junction-path /myvol_mirror_clone"
sshpass -p "$PASS" ssh -o StrictHostKeyChecking=no "$NAUSER@$IP" "$CMD"
}
In this function, I first call the getLatestSM function that we just created. Then I set up the “vol clone create” command. You can refer to the help list above for the options I used here. I’ll just mention that I use the “-parent-snapshot” option with the variable set by the getLatestSM function.
Next, I mount the volume on the NetApp device. Let’s see the help screen for the “vol mount” command so we can understand the options used in the above function.
nacluster::> vol mount ?
-vserver Vserver Name
[-volume] Volume Name
[-junction-path] Junction Path Of The Mounting Volume
[[-active] {true|false}] Activate Junction Path (default: true)
[ -policy-override {true|false} ] Override The Export Policy (default: false)
Lastly, we mount the volume on the Linux server and run our process.
==== Reverse Process ====
When we’re finished, we need to do a bit more than just reverse the above process.
- Unmount the volume from the Linux server
- Unmount the volume from the NetApp device
- Offline the volume
- Delete the volume
I combine the last three steps in one function.
Unmounting and offlining the volume on the NetApp device is pretty straightforward. However, deleting the volume is a bit tricky. The NetApp developers require that you be in “Advanced” mode before you can delete a volume without a confirmation prompt. Below are the options for the “set advanced” mode command. I discovered that I could use the “-confirmations off” option with this command to allow the automation of the “vol delete” command without a confirmation prompt.
nacluster::> set advanced ?
[ -confirmations {on|off} ] Confirmation Messages
[ -showallfields {true|false} ] Show All Fields
[ -showseparator ] Show Separator
[ -active-help {true|false} ] Active Help
[ -units {auto|raw|B|KB|MB|GB|TB|PB} ] Data Units
[ -rows ] Pagination Rows ('0' disables)
[ -vserver ] Default Vserver
[ -node ] Default Node
[ -stop-on-error {true|false} ] Stop On Error
Now we can code the function that performs the 3 steps to delete the volume.
function deleteClone()
{
echo "Unmounting, setting the vol offline, changing to advanced mode, then deleting the volume..."
# Offline the vol first, then change the "advanced" mode, then delete vol with -force to avoid confirmation
CMD='vol unmount -volume myvol_mirror_clone -vserver ddc-vs1; vol offline -vserver ddc-vs1 -volume myvol_mirror_clone; set advanced -confirmations off; vol del -vserver ddc-vs1 -volume myvol_mirror_clone -force'
sshpass -p "$PASS" ssh -o StrictHostKeyChecking=no "$NAUSER@$IP" "$CMD"
}
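With both functions in hand, the nightly cycle can be sketched as a small driver. The bodies below are echo stubs so the flow is visible and testable on its own; in the real script you would paste in getLatestSM, cloneMirror, and deleteClone from above, and restore_backup is a hypothetical placeholder for your own restore step.

```shell
#!/bin/bash
set -e   # abort on the first failed step so deleteClone never runs after a failure

# Echo stubs standing in for the real functions defined in this article;
# restore_backup is a hypothetical name for your site-specific restore job.
cloneMirror()    { echo "clone and mount the latest SM snapshot"; }
restore_backup() { echo "run the site-specific restore"; }
deleteClone()    { echo "unmount, offline, and delete the clone"; }

main() {
  cloneMirror
  restore_backup
  deleteClone
}
```

You could then call main from cron at your chosen interval; with set -e, a failed clone or restore leaves the clone volume in place for inspection rather than deleting it blindly.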
==== Conclusion ====
Creating a script to automate processes is not too difficult, though you might run into some roadblocks. If you know how to ask the right questions, you might be able to coax the support team into giving up enough information to complete your script.
We’ve looked at using the sshpass command, which is the cornerstone of automating any process with NetApp storage devices. We looked at plans we could use to automate our process and decided to use a FlexClone. Then we created some functions to use in our script.