moving_data [2021/03/10 20:11] pwolinsk |
moving_data [2024/11/08 21:14] (current) pwolinsk |
==== Small files (<100MB) ==== | ==== Small files (<100MB) ==== |
| |
Three data transfer protocols are allowed to move data to the main storage on the AHPCC clusters: | Three data transfer protocols are supported to move data to and from the main storage on the AHPCC clusters: |
| |
* **''scp''** (secure copy) | * **''scp''** (secure copy) |
* **''sftp''** (secure ftp) | * **''sftp''** (secure ftp) |
* **''rsync''** | * **''rsync''** |
| |
| In addition the **''wget''** and **''curl''** commands are available to download data to your account using a public URL. |
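A minimal sketch of how such a download might look when run from your account on the cluster. The URL is a hypothetical placeholder, and the function names ''download_wget'' and ''download_curl'' are illustrative only; they are not part of the AHPCC setup.

```shell
#!/usr/bin/env bash
# Hypothetical public URL, used only for illustration.
url="https://example.org/datasets/sample.tar.gz"

# Download with wget; -c resumes a partially downloaded file.
download_wget() {
    wget -c "$1"
}

# Download with curl; -L follows redirects, -O keeps the remote
# file name, and "-C -" resumes a partial download.
download_curl() {
    curl -L -O -C - "$1"
}
```

Either ''download_wget "$url"'' or ''download_curl "$url"'' would then save the file into the current directory.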
| |
| |
=== Linux and MacOS === | === Linux and MacOS === |
To upload a data file from the current directory on your local desktop machine to your /storage directory on **pinnacle**: | To upload a data file from the current directory on your local desktop machine to your /storage directory on **pinnacle**: |
<code> | <code> |
pawel@localdesktop$# scp localfile.dat pwolinsk@pinnacle.uark.edu:/storage/pwolinsk/ | pawel@localdesktop$# scp localfile.dat pwolinsk@hpc-portal2.hpc.uark.edu:/storage/pwolinsk/ |
</code> | </code> |
To download a data file from your /storage directory on **pinnacle** to the current directory on your local desktop machine: | To download a data file from your /storage directory on **pinnacle** to the current directory on your local desktop machine: |
<code> | <code> |
pawel@localdesktop$# scp pwolinsk@pinnacle.uark.edu:/storage/pwolinsk/remotefile.dat . | pawel@localdesktop$# scp pwolinsk@hpc-portal2.hpc.uark.edu:/storage/pwolinsk/remotefile.dat . |
</code> | </code> |
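The same upload could be sketched with ''rsync'', which is listed above but not shown. This assumes the user name, host name and /storage path from the scp examples in the current revision; the helper names ''remote_dir'' and ''upload_rsync'' are hypothetical.

```shell
#!/usr/bin/env bash
# Values assumed from the scp examples above.
user="pwolinsk"
host="hpc-portal2.hpc.uark.edu"

# Print the remote destination in user@host:/path/ form.
remote_dir() {
    printf '%s@%s:/storage/%s/' "$user" "$host" "$user"
}

# Upload one file or directory.
# -a preserves permissions and timestamps, -v is verbose,
# -P is shorthand for --partial --progress, so an interrupted
# transfer can pick up from the partial file when re-run.
upload_rsync() {
    rsync -avP "$1" "$(remote_dir)"
}
```

Calling ''upload_rsync localfile.dat'' a second time after a dropped connection should skip data that already arrived, which is the main practical difference from plain ''scp''.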
| |
| |
<code> | <code> |
C:\Users\Pawel> c:\Users\Pawel\Downloads\pscp.exe filetoupload.txt pwolinsk@pinnacle.uark.edu: | C:\Users\Pawel> c:\Users\Pawel\Downloads\pscp.exe filetoupload.txt pwolinsk@hpc-portal2.hpc.uark.edu: |
</code> | </code> |
| |
The code above uses secure copy protocol to upload a file "filetoupload.txt" to the home directory of user pwolinsk on pinnacle.uark.edu. | The code above uses secure copy protocol to upload a file "filetoupload.txt" to the home directory of user pwolinsk on hpc-portal2.hpc.uark.edu. |
| |
Another popular Windows transfer client (GUI) is WinSCP: | Another popular Windows transfer client (GUI) is WinSCP: |
scp, sftp, and rsync have the advantage of being very simple and do not require any initial setup. However, large data sets often fail to transfer correctly using these protocols. GLOBUS does require some initial setup but is much more reliable. It has many features, such as splitting the transfer into multiple simultaneous streams, encrypting the data in flight, automatically retransmitting data on network failure/timeouts, and verifying data integrity after transfer. In addition, our installation of GLOBUS is connected to the **100Gb/s** network (while our cluster login nodes used for scp/sftp/rsync are on the 10Gb/s network). | scp, sftp, and rsync have the advantage of being very simple and do not require any initial setup. However, large data sets often fail to transfer correctly using these protocols. GLOBUS does require some initial setup but is much more reliable. It has many features, such as splitting the transfer into multiple simultaneous streams, encrypting the data in flight, automatically retransmitting data on network failure/timeouts, and verifying data integrity after transfer. In addition, our installation of GLOBUS is connected to the **100Gb/s** network (while our cluster login nodes used for scp/sftp/rsync are on the 10Gb/s network). |
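When a file is moved with scp, sftp or rsync rather than GLOBUS, the integrity check has to be done by hand. One possible sketch is to compare SHA-256 digests on both sides. It assumes ''sha256sum'' is installed locally (on MacOS, ''shasum -a 256'' is the usual substitute), and the helper name ''verify_sha256'' is hypothetical.

```shell
#!/usr/bin/env bash
# Return success only if the SHA-256 digest of a local file
# matches the expected hex digest.
verify_sha256() {
    local file="$1" expected="$2" actual
    # sha256sum prints "<digest>  <filename>"; keep the digest only.
    actual="$(sha256sum "$file" | awk '{print $1}')"
    [ "$actual" = "$expected" ]
}
```

The expected digest could be obtained by running ''sha256sum'' on the remote copy over ssh and then passing the first field of its output to ''verify_sha256'' together with the local file name.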
| |
The GLOBUS service moves data between GLOBUS Endpoints. Each Endpoint is a server process running on a machine which can send and receive data. One such endpoint, named **UARK-Pinnacle**, is set up on the Pinnacle cluster. It is a //public// endpoint (visible to all GLOBUS users), and accessible by anyone with an account on Pinnacle. To transfer data between your account on Pinnacle and your personal workstation/laptop you will need to set up a //private// GLOBUS endpoint, which is only visible and accessible by you. This requires the installation of the GLOBUS personal connect server on your workstation/laptop. | The GLOBUS service moves data between GLOBUS Endpoints. Each Endpoint is a server process running on a machine which can send and receive data. One such endpoint, named **UARK-Pinnacle-2024**, is set up on the Pinnacle cluster. It is a //public// endpoint (visible to all GLOBUS users), and accessible by anyone with an account on Pinnacle. To transfer data between your account on Pinnacle and your personal workstation/laptop you will need to set up a //private// GLOBUS endpoint, which is only visible and accessible by you. This requires the installation of the GLOBUS personal connect server on your workstation/laptop. |
| |
=== GLOBUS personal connect server === | === GLOBUS personal connect server === |
To transfer data between the //private// GLOBUS endpoint (which you just created) on your workstation/laptop and the //public// endpoint on Pinnacle we have to find and connect to both endpoints using the "File Manager" in the GLOBUS portal. | To transfer data between the //private// GLOBUS endpoint (which you just created) on your workstation/laptop and the //public// endpoint on Pinnacle we have to find and connect to both endpoints using the "File Manager" in the GLOBUS portal. |
- Click on "File Manager" in the vertical menu on the right in the GLOBUS portal | - Click on "File Manager" in the vertical menu on the right in the GLOBUS portal |
- Near the top of the page in the "Collection" text entry type in 'UARK-Pinnacle'. As you start typing you will see a list of endpoints below which match the search string you are entering. Eventually you should see 'UARK-Pinnacle' endpoint in the list. Select it. | - Near the top of the page in the "Collection" text entry type in 'UARK-Pinnacle-2024'. As you start typing you will see a list of endpoints below which match the search string you are entering. Eventually you should see 'UARK-Pinnacle-2024' endpoint in the list. Select it. |
- You will then be asked to authenticate to use the endpoint. Click "Continue" and you'll be redirected to an AHPCC themed login page. Use your AHPCC account user name and password to log in. | - You will then be asked to authenticate to use the endpoint. Click "Continue" and you'll be redirected to an AHPCC themed login page. Use your AHPCC account user name and password to log in. |
- After a successful login you will see a window with a listing of your home directory on Pinnacle. | - After a successful login you will see a window with a listing of your home directory on Pinnacle. |
| |
=== Connecting to your personal Endpoint on workstation/laptop === | === Connecting to your personal Endpoint on workstation/laptop === |
You are already connected to the //public// UARK-Pinnacle Endpoint and logged into your account on Pinnacle. To transfer files between your local workstation/laptop and UARK-Pinnacle endpoint, you will also need to find and open your //private// endpoint. If you already see both 'UARK-Pinnacle' and your private endpoint in the file manager in the GLOBUS portal you can drag and drop files between the windows in the file manager, which will start transfer of data. If not, you will have to find and connect to your //private// endpoint on your workstation/laptop: | You are already connected to the //public// UARK-Pinnacle Endpoint and logged into your account on Pinnacle. To transfer files between your local workstation/laptop and UARK-Pinnacle-2024 endpoint, you will also need to find and open your //private// endpoint. If you already see both 'UARK-Pinnacle-2024' and your private endpoint in the file manager in the GLOBUS portal you can drag and drop files between the windows in the file manager, which will start transfer of data. If not, you will have to find and connect to your //private// endpoint on your workstation/laptop: |
- At the top right of the File Manager in the GLOBUS portal, in the Panels section click on the middle icon symbolizing 2 windows side by side. | - At the top right of the File Manager in the GLOBUS portal, in the Panels section click on the middle icon symbolizing 2 windows side by side. |
- In the empty window in the "File Manager" in the "Collection" text box enter the name of your //private// GLOBUS endpoint you created. Click on the name of your endpoint once it shows up in the list below to connect. (**NOTE:** the GLOBUS connect personal server has to be started on your workstation/laptop to connect to it via the GLOBUS portal). | - In the empty window in the "File Manager" in the "Collection" text box enter the name of your //private// GLOBUS endpoint you created. Click on the name of your endpoint once it shows up in the list below to connect. (**NOTE:** the GLOBUS connect personal server has to be started on your workstation/laptop to connect to it via the GLOBUS portal). |
| |
With both endpoints connected you can now drag and drop files between the endpoints. | With both endpoints connected you can now drag and drop files between the endpoints. |