What is Cloud?


‘The Cloud’, or more simply ‘Cloud’, is a common term these days. In the context of STRIDES, and as it pertains to life science computing, we use the following definition:


The Cloud refers to computational infrastructure and services that are owned and hosted 'off site' by a third-party 'cloud services provider' (CSP), which allows other parties to purchase and use these resources and services on a 'pay as you go' basis. Examples of commercial CSPs include Amazon Web Services, Google Cloud Platform, Microsoft Azure, DigitalOcean, and Heroku.

Key features of the Cloud

  • Elastic - resources such as servers, or storage capacity, can be provisioned at will, used for some period of time and then disposed of
  • Pay per use - there are no upfront costs; fees are incurred only while a service is being used
  • Self-service and on demand - the customer can instantiate cloud resources whenever they need them
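
The elastic and pay-per-use features can be illustrated with a little arithmetic. The sketch below is only an illustration: the function name pay_per_use_cost and the hourly rate are hypothetical and do not reflect any CSP's actual pricing or billing rules.

```python
def pay_per_use_cost(hourly_rate_usd, hours_used, instance_count=1):
    """Cost of running `instance_count` servers for `hours_used` hours each."""
    if hourly_rate_usd < 0 or hours_used < 0 or instance_count < 0:
        raise ValueError("rate, hours and instance count must be non-negative")
    return hourly_rate_usd * hours_used * instance_count


# Hypothetical rate chosen for illustration; not a real CSP price.
HOURLY_RATE_USD = 0.25

# Elastic: provision 20 servers for a 6-hour analysis, then dispose of them.
burst_cost = pay_per_use_cost(HOURLY_RATE_USD, hours_used=6, instance_count=20)

# Pay per use: once the servers are disposed of, no further hours are billed.
idle_cost = pay_per_use_cost(HOURLY_RATE_USD, hours_used=0, instance_count=20)

print(f"burst: ${burst_cost:.2f}, idle: ${idle_cost:.2f}")
```

Under these assumed numbers, the cost depends only on how long the servers run, not on owning the hardware. Real bills usually also include storage and data transfer charges, which this sketch ignores.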

For the purposes of this document, we focus on the Cloud as an Infrastructure as a Service (IaaS) platform where organizations, or individual researchers, create an account, set up an appropriate payment mechanism and are then able to access and use these infrastructure components as desired.

In the Cloud

You will often hear people say they are using things ‘in the cloud’. This typically means they are using packaged services in the cloud, so-called ‘Software as a Service’ (SaaS) platforms such as Dropbox, OneDrive, Gmail, and similar tools. These commercial tools are almost certainly hosted on cloud infrastructure, but that is a different use case from the one covered in this document. Here we focus on the ‘lower level’ assembly of cloud building blocks (compute, storage, etc.) to build systems that address biomedical use cases.

Common Cloud computing components across the main CSPs

Component       | Description                                     | Amazon Web Services (AWS)   | Google Cloud Platform    | Microsoft Azure
----------------+-------------------------------------------------+-----------------------------+--------------------------+-----------------
Compute         | Servers of various sizes and capabilities       | Elastic Compute Cloud (EC2) | Compute Engine           | Virtual Machines
Disk Storage    | Disks that can be attached to compute instances | Elastic Block Store (EBS)   | Persistent Disk          | Disk Storage
Object Storage  | File storage                                    | Simple Storage Service (S3) | Cloud Storage            | Blob Storage
Archive Storage | Long-term archival file storage                 | Glacier                     | Cloud Storage - Coldline | Archive Storage
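
The same mapping can be kept as a small lookup structure when comparing providers. This is a minimal sketch: the service names come from the table above, while the dictionary name CLOUD_SERVICES, the short provider keys, and the helper equivalent_service are hypothetical names introduced for illustration.

```python
# Component -> provider -> service name, taken from the table above.
CLOUD_SERVICES = {
    "compute": {
        "aws": "Elastic Compute Cloud (EC2)",
        "google": "Compute Engine",
        "azure": "Virtual Machines",
    },
    "disk storage": {
        "aws": "Elastic Block Store (EBS)",
        "google": "Persistent Disk",
        "azure": "Disk Storage",
    },
    "object storage": {
        "aws": "Simple Storage Service (S3)",
        "google": "Cloud Storage",
        "azure": "Blob Storage",
    },
    "archive storage": {
        "aws": "Glacier",
        "google": "Cloud Storage - Coldline",
        "azure": "Archive Storage",
    },
}


def equivalent_service(component, provider):
    """Return the provider's service name for a component (case-insensitive)."""
    try:
        return CLOUD_SERVICES[component.lower()][provider.lower()]
    except KeyError:
        raise KeyError(
            f"no entry for component={component!r}, provider={provider!r}"
        ) from None


print(equivalent_service("Object Storage", "AWS"))
print(equivalent_service("Object Storage", "Azure"))
```

The two calls look up the object storage offering on AWS and on Azure. Service names change over time, so a table like this needs periodic review.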

Implications of the Cloud model

  • Data movement and importance of the network
  • Cost models (monthly, based on usage)
  • Risks/implications
    • Enough rope to hang yourself (need guardrails)
    • Ability to run up bills
    • Security -
    • Easy to get going; however, appropriate IT knowledge, sysadmin skills, development skills, etc. are still needed, plus specific cloud architecture and distributed systems skills
    • Cloud environments are always evolving so staying up to date can be challenging.
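
One simple form of guardrail against running up bills is to project month-end spend from spend to date and compare it with a budget. The sketch below assumes a straight-line projection and a hypothetical warning threshold of 80% of the budget. The function name check_budget and all figures are illustrative, and the spend figure would have to come from the CSP's own billing reports.

```python
def check_budget(spend_to_date_usd, day_of_month, days_in_month,
                 monthly_budget_usd, warn_fraction=0.8):
    """Classify projected month-end spend as 'ok', 'warn', or 'over'.

    Assumes spending continues at the average daily rate seen so far.
    """
    if not 1 <= day_of_month <= days_in_month:
        raise ValueError("day_of_month must be between 1 and days_in_month")
    projected = spend_to_date_usd / day_of_month * days_in_month
    if projected > monthly_budget_usd:
        return "over"
    if projected >= warn_fraction * monthly_budget_usd:
        return "warn"
    return "ok"


# Hypothetical figures: $400 spent by day 10 of a 30-day month, $1000 budget.
status = check_budget(400, day_of_month=10, days_in_month=30,
                      monthly_budget_usd=1000)
print(status)
```

With these assumed figures, the projected spend is $1200 against a $1000 budget, so the check reports "over" on day 10 rather than when the bill arrives at the end of the month.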