Virtual Machines on Pinnacle/Karpinski

One of the new features introduced in the Pinnacle/Karpinski clusters is the ability to spin up virtual machines (VMs). Virtual machines allow users to:

  • run operating system versions other than the one installed on the Pinnacle compute nodes
  • have root access to the virtual machine, including the ability to install any software
  • mount the underlying Pinnacle Lustre file system directly on the VM (/home/$USER and /scratch)
  • suspend the virtual machine with its current memory state (including executing programs) and restart it at a later time (or in another job)
  • access the VM through a remote desktop

Because all virtual machines have to be created and started by root, we developed a set of scripts that users can execute, with the following functionality:

vm-clone.sh       - clone a VM template and save in user VM library
vm-delete.sh      - delete a VM from the user VM library
vm-list.sh        - list VMs in the user library
vm-get-ip.sh      - retrieve the IP address of a currently running VM
vm-log.sh         - show the startup log of a currently running VM
vm-bootup-info.sh - wait for a VM to finish booting and print its login information

User Virtual Machine Library

The user VM library is stored at /storage/$USER/.virtual-machines. This directory is owned by root and can only be modified by the user through the vm-*.sh scripts. Each VM is defined by two files in that directory:

<vm-name>.qcow2     - the hard drive file of the VM  
<vm-name>.xml       - the XML definition file of the VM

The vm-list.sh script will list the contents of the library:

pinnacle-l1:pwolinsk:~$ vm-list.sh 

pwolinsk's VMS                    STATE       VM IP              HOST 
======================================================================
centos7.6-desktop-pwolinsk        SHUT OFF   
centos7.6-lustre-pwolinsk         SHUT OFF   
centos7.6-lustre-pwolinsk-1       SHUT OFF   
library-pwolinsk                  SHUT OFF   
ubuntu-18.04-desktop-pwolinsk     SHUT OFF   
ubuntu-18.04-lustre-pwolinsk      RUNNING     172.16.254.127     c1329

Virtual machines are stored in /storage/pwolinsk/.virtual-machines.
Total storage on disk: 35G	total

pinnacle-l1:pwolinsk:~$
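The listing is meant for reading, but it can also be filtered in a script. The following is a minimal sketch, assuming the column layout shown above stays the same (VM name, state, IP address, host). The function name vm_running is hypothetical and is not one of the vm-*.sh scripts.

```shell
# Print "name ip host" for every VM that vm-list.sh reports as RUNNING.
# Reads the vm-list.sh output on standard input.
# Assumption: columns are laid out as in the listing above; "SHUT OFF"
# rows, the header, and the summary lines are skipped.
vm_running() {
    awk '$2 == "RUNNING" && NF >= 4 { print $1, $3, $4 }'
}

# Possible use on a login node:
#   vm-list.sh | vm_running
```

For the listing above this would print one line for ubuntu-18.04-lustre-pwolinsk with its IP address and host.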

Creating a new VM

VMs are created from VM templates using the vm-clone.sh script. To get a listing of the available templates, run vm-clone.sh without any arguments:

pinnacle-l1:pwolinsk:~$ vm-clone.sh 
Usage: vm-clone.sh <template-vm>
    where <template-vm> is one of:

     tmpl-centos7.6
     tmpl-centos7.6-desktop
     tmpl-centos7.6-lustre
     tmpl-ubuntu-18.04
     tmpl-ubuntu-18.04-desktop
     tmpl-ubuntu-18.04-lustre

pinnacle-l1:pwolinsk:~$ 

Currently we have a total of six templates covering two operating systems, CentOS 7.6 and Ubuntu 18.04. For each OS there are three variants:

  • tmpl-centos7.6, tmpl-ubuntu-18.04 - base-level packages without desktop or Lustre support, full root access
  • tmpl-centos7.6-desktop, tmpl-ubuntu-18.04-desktop - full root access, desktop support
  • tmpl-centos7.6-lustre, tmpl-ubuntu-18.04-lustre - no root access, local file system mounted

Additional VM templates will be added on request.

To create a VM, pass the name of a template to vm-clone.sh:

pinnacle-l1:pwolinsk:~$ vm-clone.sh tmpl-ubuntu-18.04
Cloning tmpl-ubuntu-18.04 for pwolinsk as ubuntu-18.04-pwolinsk.....
Found tmpl-ubuntu-18.04 defined on c1329.  Cloning....
Allocating 'ubuntu-18.04-pwolinsk.qcow2'                                                       |  10 GB  00:00:07     

Clone 'ubuntu-18.04-pwolinsk' created successfully.

pinnacle-l1:pwolinsk:~$ 
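The name of the clone matters later, because it is also used as the job name when the VM is started. The sketch below predicts that name from the template name. It is inferred only from the example output above ("tmpl-" removed, "-<user>" appended), not from the vm-clone.sh source, and it does not cover repeated clones of the same template, which appear in the library listing with an extra numeric suffix. The function name vm_clone_name is hypothetical.

```shell
# Predict the name of the first clone of a template.
# Arguments: template name, user name (defaults to $USER).
# Assumption: vm-clone.sh drops the "tmpl-" prefix and appends "-<user>",
# as seen in the example output; this is not taken from the script itself.
vm_clone_name() {
    template=$1
    user=${2:-$USER}
    printf '%s-%s\n' "${template#tmpl-}" "$user"
}

# Possible use:
#   vm_clone_name tmpl-ubuntu-18.04 pwolinsk
```

Running vm-list.sh after cloning remains the reliable way to confirm the actual name.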

Starting the VM

A special job queue named cloud72 has been set up on Pinnacle to run all user VM jobs. The name of the VM to be started has to be specified as the name of the job (for example with the -J option of srun). The VM is started in the job prolog and destroyed in the job epilog.

You can start the VM by starting an interactive job on a node in the cloud72 queue. In this example we use 4 cores for the VM and request a run time of 1 hour:

pinnacle-l1:pwolinsk:~$ srun -N1 --tasks-per-node=4 -p cloud72 -t 1:00:00 -J ubuntu-18.04-pwolinsk --pty vm-bootup-info.sh
Waiting for startup log (/scrfs/storage/pwolinsk/home/cloud-12350.log)...found.
Waiting for VM to finish booting.Success
***********************************************************
 You are running an interactive bash session on c1329    
 which is the host for your VM (ubuntu-18.04-pwolinsk).     
 Terminating this bash session will stop the VM.      
 It can be restarted at any time using this command   

 srun -p cloud72 -t 1:00:00 -J ubuntu-18.04-pwolinsk -c cloud --pty vm-bootup-info.sh

ubuntu-18.04-pwolinsk is Ready. "ssh ubuntu@172.16.254.127" password: ubuntu
***********************************************************
c1329:pwolinsk:~$ 
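Because the VM name, the core count, and the run time are the only parts of the start command that change, the command can be generated by a small helper. This is a sketch only: vm_start_cmd is a hypothetical function, and the options it prints simply mirror the srun example above. It prints the command rather than running it, so the command can be checked first.

```shell
# Print the srun command that starts a VM in the cloud72 queue.
# Arguments: VM name, number of cores (default 4), run time (default 1:00:00).
# The options are copied from the interactive example above; the VM name
# is passed as the job name with -J.
vm_start_cmd() {
    vm=$1
    cores=${2:-4}
    walltime=${3:-1:00:00}
    printf 'srun -N1 --tasks-per-node=%s -p cloud72 -t %s -J %s --pty vm-bootup-info.sh\n' \
        "$cores" "$walltime" "$vm"
}

# Possible use:
#   vm_start_cmd ubuntu-18.04-pwolinsk 4 1:00:00
```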

To log into the VM:

c1329:pwolinsk:~$ ssh ubuntu@172.16.254.127
Warning: Permanently added '172.16.254.127' (ECDSA) to the list of known hosts.
ubuntu@172.16.254.127's password: 
Welcome to Ubuntu 18.04.3 LTS (GNU/Linux 4.15.0-48-generic x86_64)

 * Documentation:  https://help.ubuntu.com
 * Management:     https://landscape.canonical.com
 * Support:        https://ubuntu.com/advantage

Last login: Wed Oct  2 14:58:31 2019
ubuntu@vm-ubuntu-18:~$ 
ubuntu@vm-ubuntu-18:~$ cat /proc/cpuinfo  |grep processor
processor	: 0
processor	: 1
processor	: 2
processor	: 3
ubuntu@vm-ubuntu-18:~$ 

Stopping the VM

Ending the interactive session on the cloud72 compute node (c1329 in this example) stops the job, and the job epilog script then destroys (stops) the VM.

ubuntu@vm-ubuntu-18:~$ exit
logout
Connection to 172.16.254.127 closed.
c1329:pwolinsk:~$ exit
exit

VM jobs are treated just like any other job in the queue. The cloud72 queue has a limit of 72 hours, so a VM cannot run indefinitely.
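A requested run time can be checked against the 72-hour limit before the job is submitted. The sketch below handles only the H:MM:SS form used in the examples on this page. The function name vm_time_ok is hypothetical, and the scheduler remains the authority on what it accepts.

```shell
# Succeed (exit status 0) if an H:MM:SS run time fits within 72 hours.
# Assumption: the time is given as hours:minutes:seconds, as in the
# examples on this page; any other form is rejected.
vm_time_ok() {
    echo "$1" | awk -F: '
        NF != 3 { exit 1 }
        { secs = $1 * 3600 + $2 * 60 + $3; exit (secs > 72 * 3600) }
    '
}

# Possible use:
#   vm_time_ok 80:00:00 || echo "run time exceeds the cloud72 limit"
```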

virtual_machines.txt · Last modified: 2020/01/27 20:31 by root