Module 6

All about AWS Databases

Database considerations

Scalability

  • How much throughput is needed?

  • Will it scale?

Storage requirements

  • Will the database need to be sized in gigabytes, terabytes, or petabytes of data?

Data characteristics

  • What is your data model? What are your data access patterns?

  • Do you need low latency?

Durability

  • What level of data durability, availability, and recoverability is required?

  • Do regulatory obligations apply?

Relational and non-relational databases

Relational = SQL, tables with a fixed schema; Non-relational = objects or JSON documents with a flexible schema, usually picked for fast access at scale

Amazon Database Options

Relational Databases
Non-relational databases


Database Capacity Planning

  1. Analyze current storage capacity

  2. Predict capacity requirements

  3. Determine if horizontal scaling, vertical scaling, or a combination is needed
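A rough sketch of steps 1 and 2 in Python. It assumes storage grows by a fixed percentage each month, which is a simplification, and every figure below is made up for illustration. The function names are hypothetical, not part of any AWS tooling.

```python
import math


def project_storage_gb(current_gb, monthly_growth_rate, months):
    """Step 2: project storage forward, assuming compound monthly growth."""
    return current_gb * (1 + monthly_growth_rate) ** months


def months_until_full(current_gb, monthly_growth_rate, capacity_gb):
    """Whole months until the projection reaches the provisioned capacity.

    Returns 0 if already at capacity and None if storage is not growing.
    """
    if current_gb >= capacity_gb:
        return 0
    if monthly_growth_rate <= 0:
        return None
    months = math.log(capacity_gb / current_gb) / math.log(1 + monthly_growth_rate)
    return math.ceil(months)


# Step 1: current usage (illustrative numbers only).
current_gb = 500
growth = 0.05          # assumed 5% growth per month
provisioned_gb = 1000

in_a_year = project_storage_gb(current_gb, growth, 12)
runway = months_until_full(current_gb, growth, provisioned_gb)
print(f"Projected size in 12 months: {in_a_year:.0f} GB")
print(f"Months until {provisioned_gb} GB is reached: {runway}")
```

The runway number feeds step 3: a short runway on a single instance points toward scaling up (vertical), while growth that no single instance can absorb points toward scaling out (horizontal).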

Amazon Relational Database Service (RDS)

  • Is a managed relational database service to deploy and scale relational databases

  • Supports multiple database engines

  • Uses Amazon Elastic Block Store (Amazon EBS) volumes for database and log storage

Benefits of RDS

  • You don't need to provision your infrastructure or install and maintain database software

  • You can scale up or down the compute and memory resources powering your deployment

  • You can configure automated backups, database snapshots, and automatic host replacement

  • You can isolate your database in your own virtual network

RDS Architecture

Amazon Aurora (an Amazon RDS engine option)

  • Is a relational database management system (RDBMS) built for the cloud with full MySQL and PostgreSQL compatibility

  • Is managed by Amazon RDS

  • Provides high performance and availability at one-tenth of the cost of commercial databases

  • Delivers Multi-AZ deployments with Aurora Replicas

Can be serverless (Aurora Serverless), supports up to 15 read replicas, and is 100% compatible with MySQL and PostgreSQL

Aurora clusters

Amazon RDS use case:

A good fit when the data lives in tables with fixed properties (columns) and a strict format (schema)

RDS Best practices

Amazon RDS Proxy

  • Fully managed and highly available

  • More scalable: pools and shares database connections for improved application scaling

  • More resilient: reduces failover times for Aurora and Amazon RDS Multi-AZ databases by up to 66 percent

  • More secure: enforces IAM authentication and stores credentials in AWS Secrets Manager

Connection Pooling for Improved Scalability
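To make the pooling idea concrete, here is a toy pool in plain Python. This is not how RDS Proxy is implemented. It only illustrates the principle that a small, fixed set of open connections gets reused by many requests. `ConnectionPool` and `FakeConnection` are hypothetical names, and the fake connection stands in for a real database driver.

```python
import queue
from contextlib import contextmanager


class ConnectionPool:
    """Tiny fixed-size pool: connections are opened once and then reused."""

    def __init__(self, connect, size):
        self._idle = queue.Queue(maxsize=size)
        for _ in range(size):
            self._idle.put(connect())

    @contextmanager
    def connection(self, timeout=None):
        conn = self._idle.get(timeout=timeout)  # blocks while all connections are busy
        try:
            yield conn
        finally:
            self._idle.put(conn)  # hand it back instead of closing it


class FakeConnection:
    """Stand-in for a real database connection; counts how many were opened."""

    opened = 0

    def __init__(self):
        FakeConnection.opened += 1
        self.id = FakeConnection.opened

    def execute(self, sql):
        return f"conn {self.id} ran: {sql}"


pool = ConnectionPool(FakeConnection, size=2)
results = []
for i in range(10):
    with pool.connection() as conn:
        results.append(conn.execute(f"SELECT {i}"))

print(f"{len(results)} queries served by {FakeConnection.opened} connections")
```

Ten queries run, but the database only ever sees two connections. That is the scaling benefit: the number of application requests is decoupled from the number of database connections.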

We can store database user credentials in Secrets Manager. The application should be the only principal that can read the credentials. That way, leaked source code does not expose any credentials, because none are hard-coded in it
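A minimal sketch of reading credentials at runtime instead of hard-coding them. The function takes a client object so it runs without AWS access; in a real application that client would be `boto3.client("secretsmanager")`, whose `get_secret_value(SecretId=...)` call returns the secret in `SecretString`. The secret name `prod/app/db`, the stub client, and the assumption that the secret is a JSON object with `username` and `password` keys are all illustrative.

```python
import json


def load_db_credentials(secrets_client, secret_id):
    """Fetch a JSON secret and return (username, password).

    secrets_client is expected to behave like boto3.client("secretsmanager").
    The "username"/"password" keys are an assumption about the secret's layout.
    """
    response = secrets_client.get_secret_value(SecretId=secret_id)
    secret = json.loads(response["SecretString"])
    return secret["username"], secret["password"]


class StubSecretsClient:
    """Local stand-in so the sketch runs without AWS credentials."""

    def get_secret_value(self, SecretId):
        payload = {"username": "app_user", "password": "example-only"}
        return {"Name": SecretId, "SecretString": json.dumps(payload)}


username, password = load_db_credentials(StubSecretsClient(), "prod/app/db")
print(f"Connecting as {username}")  # never log the password itself
```

Nothing sensitive lives in the source file. Access is controlled by whichever IAM policy allows the application's role to read that one secret.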

Backing up data in RDS

Two options: automated backups and snapshots

  • Backup

    • Restore a database instance to a specific point in time

    • Daily during your backup window (transaction logs are captured every 5 minutes)

    • Retention defaults to 7 days when the instance is created in the console (1 day through the API or CLI) and can be set to up to 35 days. Backups are deleted automatically once the retention period expires

    • Cannot be shared (needs to be copied to a manual snapshot first)

  • Database Snapshots

    • Back up a database instance in a known state, and then restore it to that specific state.

    • User-initiated (as frequently as the user chooses)

    • Kept until the user explicitly deletes it

    • Can be shared (shared snapshots can be copied by other AWS accounts)
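A small sketch of what the backup numbers above mean for point-in-time restore. It is a simplification: it assumes the restorable window runs from "now minus the retention period" to "now minus the 5-minute log interval". RDS reports the real bounds itself, so treat this as arithmetic on the notes, not as the service's logic. Function names are hypothetical.

```python
from datetime import datetime, timedelta, timezone


def restorable_window(now, retention_days, log_interval_minutes=5):
    """Approximate (earliest, latest) restore points for automated backups."""
    if not 0 <= retention_days <= 35:
        raise ValueError("retention period must be between 0 and 35 days")
    earliest = now - timedelta(days=retention_days)
    latest = now - timedelta(minutes=log_interval_minutes)
    return earliest, latest


def can_restore_to(target, now, retention_days):
    """True if target falls inside the approximate restorable window."""
    earliest, latest = restorable_window(now, retention_days)
    return earliest <= target <= latest


now = datetime(2024, 1, 31, 12, 0, tzinfo=timezone.utc)
three_days_ago = now - timedelta(days=3)
ten_days_ago = now - timedelta(days=10)

print(can_restore_to(three_days_ago, now, retention_days=7))  # inside the window
print(can_restore_to(ten_days_ago, now, retention_days=7))    # older than retention
```

Anything older than the retention period needs a manual snapshot, which is why snapshots are kept until explicitly deleted.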

RDS encryption for backups

Amazon DynamoDB

  • Is a fully managed, serverless, NoSQL database

  • Supports key-value and document data models

  • Delivers single-digit millisecond performance and can automatically scale tables to adjust for capacity

  • Is used for developing applications and mission-critical workloads that prioritize speed, scalability, and data durability

Items look like JSON documents; each item is identified by a partition key and an optional sort key
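To show how the two keys work together, here is an in-memory toy model in Python. It is not the DynamoDB API and ignores everything about storage and distribution. It only mimics the access pattern: the partition key picks a group of items, and the sort key orders items inside that group. `TinyTable`, the `customer_id` and `order_date` attribute names, and the sample items are all made up.

```python
from collections import defaultdict


class TinyTable:
    """Toy model of a table with a composite primary key."""

    def __init__(self, partition_key, sort_key):
        self.partition_key = partition_key
        self.sort_key = sort_key
        self._partitions = defaultdict(dict)  # partition value -> {sort value: item}

    def put_item(self, item):
        pk, sk = item[self.partition_key], item[self.sort_key]
        self._partitions[pk][sk] = item  # same key pair overwrites the old item

    def get_item(self, pk, sk):
        return self._partitions.get(pk, {}).get(sk)

    def query(self, pk, sort_key_prefix=""):
        """All items in one partition, ordered by sort key, optionally filtered by prefix."""
        items = self._partitions.get(pk, {})
        return [items[sk] for sk in sorted(items) if sk.startswith(sort_key_prefix)]


orders = TinyTable(partition_key="customer_id", sort_key="order_date")
# Items only have to share the key attributes; other attributes can differ.
orders.put_item({"customer_id": "c1", "order_date": "2024-02-10", "total": 30})
orders.put_item({"customer_id": "c1", "order_date": "2024-01-05", "total": 12, "gift": True})
orders.put_item({"customer_id": "c2", "order_date": "2024-01-20", "total": 99})

for item in orders.query("c1"):
    print(item["order_date"], item["total"])
```

In DynamoDB's own API the two roles are declared as `HASH` (partition key) and `RANGE` (sort key) in the table's key schema.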

Use cases

Features

  • Serverless performance with limitless scalability

    • Secondary indexes provide flexibility on how to access your data.

    • Amazon DynamoDB Streams is ideal for an event-driven architecture.

    • Multi-Region, multi-active data replication with global tables

  • Built-in security and reliability

    • DynamoDB encrypts all customer data at rest by default.

    • Point-in-time recovery protects data from accidental write or delete operations.

Structure

Example

Multi-region Replication

Global tables provide a multi-Region, multi-active database for fast local read and write performance for global applications.

Best practices

Preventative

  • Use IAM roles to authenticate access

  • Use IAM policies for DynamoDB base authorization

  • Use IAM policy conditions for fine-grained access control

  • Use a VPC endpoint and policies to access DynamoDB

Detective

  • Use AWS CloudTrail to monitor usage of AWS managed KMS keys

  • Monitor DynamoDB operations by using CloudTrail

  • Monitor DynamoDB configuration with AWS Config

  • Monitor DynamoDB compliance with AWS Config rules
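As an illustration of "IAM policy conditions for fine-grained access control", the sketch below builds a policy document in Python that only lets a caller touch items whose partition key equals their own identity. It relies on the `dynamodb:LeadingKeys` condition key; the table ARN, the chosen actions, and the use of a Cognito identity variable are assumptions for the example, not something these notes prescribe.

```python
import json


def per_user_items_policy(table_arn):
    """Policy allowing access only to items whose partition key is the caller's ID."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": ["dynamodb:GetItem", "dynamodb:Query", "dynamodb:PutItem"],
                "Resource": table_arn,
                "Condition": {
                    # Every partition key value in the request must match the caller.
                    "ForAllValues:StringEquals": {
                        "dynamodb:LeadingKeys": ["${cognito-identity.amazonaws.com:sub}"]
                    }
                },
            }
        ],
    }


# Hypothetical table ARN, for illustration only.
policy = per_user_items_policy("arn:aws:dynamodb:us-east-1:123456789012:table/Orders")
print(json.dumps(policy, indent=2))
```

The base policy decides which actions and tables are allowed at all; the condition then narrows that down to individual items.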

Amazon Redshift

  • Is a fully managed, cloud-based data warehousing service designed to handle petabyte-scale analytics workloads

  • Achieves optimum query performance with columnar storage

  • Has an Amazon Redshift Serverless option

  • Is used for online analytical processing (OLAP)
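Why columnar storage helps analytics, as a toy comparison in Python. This says nothing about Redshift's actual on-disk format. It only shows the idea that an aggregate over one column can read that column alone instead of walking every full row. The sample data and the `to_columns` helper are made up.

```python
# Row-oriented layout: each record keeps all of its fields together.
rows = [
    {"order_id": 1, "region": "EU", "amount": 120.0},
    {"order_id": 2, "region": "US", "amount": 80.5},
    {"order_id": 3, "region": "EU", "amount": 42.0},
    {"order_id": 4, "region": "APAC", "amount": 10.0},
]


def to_columns(records):
    """Pivot a list of row dicts into one list per column."""
    columns = {}
    for record in records:
        for name, value in record.items():
            columns.setdefault(name, []).append(value)
    return columns


columns = to_columns(rows)

# Row store: the sum has to visit every record, even though it needs one field.
total_from_rows = sum(record["amount"] for record in rows)

# Column store: the sum reads a single contiguous list and skips the other columns.
total_from_columns = sum(columns["amount"])

print(total_from_rows, total_from_columns)
```

OLAP queries typically aggregate a few columns over many rows, which is the case the columnar layout favors. Transactional workloads that read and write whole records are usually better served by a row-oriented database such as RDS.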

AWS fully managed purpose-built database options

A ledger database keeps a complete, verifiable history of changes to data (e.g. credit card transactions)

Make sure that you...

  • Have suitable workloads

  • Have the correct data model

  • Look over features and benefits

  • Review common use cases

before choosing a database 🍩

AWS DMS

Migrate an on-premises database to an AWS database

  • Is a managed migration and replication service

  • Helps move existing database and analytics workloads to and within AWS

  • Supports most widely used commercial and open source databases

  • Replicates data on demand or on a schedule, so that changes from a source are carried over to the target

Tools to migrate between different database engines

AWS DMS replicates data from a database into a data lake
