What is the Cloud?

The Cloud

In the context of STRIDES, and as it pertains to life science computing, we are using the following definition for the Cloud:

The Cloud refers to computational infrastructure and services that are owned and hosted off-site by a third-party cloud services provider (CSP), which lets other parties purchase and use these resources and services on a pay-as-you-go basis.

Examples of commercial CSPs include Amazon Web Services, Google Cloud Platform, Microsoft Azure, GitHub, and Slack, among others.

Key Features of the Cloud

  • Elastic: Resources such as servers or storage capacity can be provisioned at will, used for some period of time, and then released.
  • Pay-per-use: No upfront costs; fees are incurred only while a service is being used.
  • Self-service and on-demand: The customer can provision cloud resources whenever they need them.
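
As a rough illustration of the pay-per-use feature, the sketch below adds up a bill from metered usage. The resource names, quantities, and unit prices are hypothetical placeholders chosen for the arithmetic, not actual rates from any CSP.

```python
from dataclasses import dataclass


@dataclass
class Usage:
    """One metered resource: how much was used and the price per unit."""
    name: str
    quantity: float    # e.g. instance-hours or GB-months
    unit_price: float  # hypothetical price in USD per unit


def monthly_bill(usages):
    """Sum quantity * unit_price over all metered resources."""
    return sum(u.quantity * u.unit_price for u in usages)


# Hypothetical month: one server run for 40 hours, 500 GB stored all month.
usages = [
    Usage("compute (instance-hours)", quantity=40, unit_price=0.10),
    Usage("object storage (GB-months)", quantity=500, unit_price=0.02),
]

print(f"Estimated bill: ${monthly_bill(usages):.2f}")
```

With no usage the bill is zero, which is the point of the model: a server that has been released stops accruing charges.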

For the purposes of this initiative, we focus on the Cloud as an Infrastructure as a Service (IaaS) platform where organizations, or individual researchers, create an account, set up an appropriate payment mechanism and are then able to access and use these infrastructure components as desired.

In the Cloud

You will often hear people say they are using things ‘in the cloud.’ This typically means they are using packaged services in the cloud, called Software as a Service (SaaS) platforms, such as Dropbox, OneDrive, Gmail, and similar tools. These platforms are almost certainly hosted on cloud infrastructure. STRIDES, by contrast, focuses on the lower-level assembly of cloud building blocks (compute, storage, etc.) into systems that address biomedical use cases.

Common Cloud Computing Components Across the Main CSPs

Component       | Description                                     | Amazon Web Services          | Google Cloud Platform    | Microsoft Azure
Compute         | Servers of various sizes and capabilities       | Elastic Compute Cloud (EC2)  | Compute Engine           | Virtual Machines
Disk Storage    | Disks that can be attached to compute instances | Elastic Block Store (EBS)    | Persistent Disk          | Disk Storage
Object Storage  | File storage                                    | Simple Storage Service (S3)  | Cloud Storage            | Blob Storage
Archive Storage | Long-term archival file storage                 | Glacier                      | Cloud Storage - Coldline | Archive Storage
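
For readers who want to carry this mapping into a script, the table can be written as a small lookup structure. This is only a sketch: the dictionary keys ("compute", "aws", and so on) and the helper name `equivalent_service` are illustrative choices, and the service names are copied from the table rather than queried from any provider.

```python
# Service names copied from the component table; keys are illustrative labels.
SERVICES = {
    "compute": {
        "aws": "Elastic Compute Cloud (EC2)",
        "gcp": "Compute Engine",
        "azure": "Virtual Machines",
    },
    "disk storage": {
        "aws": "Elastic Block Store (EBS)",
        "gcp": "Persistent Disk",
        "azure": "Disk Storage",
    },
    "object storage": {
        "aws": "Simple Storage Service (S3)",
        "gcp": "Cloud Storage",
        "azure": "Blob Storage",
    },
    "archive storage": {
        "aws": "Glacier",
        "gcp": "Cloud Storage - Coldline",
        "azure": "Archive Storage",
    },
}


def equivalent_service(component, provider):
    """Return the named provider's service for a component, per the table."""
    return SERVICES[component.lower()][provider.lower()]


print(equivalent_service("Object Storage", "azure"))
```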

Implications of the Cloud Model

  • Data movement and the importance of the network
  • Cost models (billed monthly, based on usage)
  • Risks/Implications
    • Need guardrails
    • Ability to run up bills
    • Security
    • Easy to begin, but appropriate IT knowledge, sysadmin skills, and development skills are needed, plus specific cloud architecture and distributed-systems skills
    • Cloud environments are always evolving, so staying up to date can be challenging
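
To make the "guardrails" and "run up bills" points concrete, here is a minimal sketch of a budget check. It assumes spend-to-date is already known from a billing report; the function name `check_budget`, the 80% warning threshold, and the three status labels are hypothetical choices for illustration, not part of any CSP's tooling.

```python
def check_budget(spend_to_date, monthly_budget, warn_fraction=0.8):
    """Classify current spend against a monthly budget.

    Returns "ok", "warn" (at or above warn_fraction of the budget),
    or "stop" (budget reached or exceeded).
    """
    if monthly_budget <= 0:
        raise ValueError("monthly_budget must be positive")
    used = spend_to_date / monthly_budget
    if used >= 1.0:
        return "stop"
    if used >= warn_fraction:
        return "warn"
    return "ok"


# Hypothetical figures in USD against a 100 USD monthly budget.
for spend in (50, 80, 120):
    print(spend, check_budget(spend, monthly_budget=100))
```

A check like this is only useful if something acts on the result, for example by notifying the account owner or shutting down idle resources.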