Module 6

All about AWS Databases

Module 8 | AWS Cloud Foundations

Database considerations

Scalability: How much throughput is needed?

Storage requirements: Will the database need to be sized in gigabytes, terabytes, or petabytes of data?

Data characteristics: What is your data model? What are your data access patterns?

Durability: What level of data durability, availability, and recoverability is required?

Will it scale?

Do you need low latency?

Do regulatory obligations apply?

Relational and non-relational databases

Relational = SQL/tables; non-relational = objects/JSON documents, usually fast for simple key lookups
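A minimal sketch of the difference, using only the Python standard library. The `customers` table, the sample customer, and the document fields are made up for illustration; SQLite stands in for a relational engine and a plain dict stands in for a JSON document store.

```python
import json
import sqlite3

# Relational: a fixed schema declared up front, queried with SQL.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT NOT NULL, city TEXT)"
)
conn.execute(
    "INSERT INTO customers (id, name, city) VALUES (?, ?, ?)", (1, "Ana", "Lisbon")
)
row = conn.execute("SELECT name, city FROM customers WHERE id = ?", (1,)).fetchone()

# Non-relational: a self-describing document; fields can differ from item to item
# and nested data (the orders list) lives inside the document itself.
document = {"id": 1, "name": "Ana", "orders": [{"sku": "A-100", "qty": 2}]}
serialized = json.dumps(document)
```

The relational version needs a schema change to store orders; the document version just nests them.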

Amazon Database Options

Relational Databases

Non-relational databases

Database Capacity Planning

Analyze current storage capacity
Predict capacity requirements
Determine if horizontal scaling, vertical scaling, or a combination is needed
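The three steps above can be sketched as a small calculation. This is only an illustration: the compound-growth model, the 5 percent monthly growth rate, and the 1,000 GB per-instance limit are assumptions, not AWS figures.

```python
def project_storage_gb(current_gb: float, monthly_growth_rate: float, months: int) -> float:
    """Predict capacity by compounding the current size at a fixed monthly growth rate."""
    return current_gb * (1 + monthly_growth_rate) ** months


def suggest_scaling(projected_gb: float, max_instance_gb: float) -> str:
    """Suggest vertical scaling while a single instance can still hold the data."""
    return "vertical" if projected_gb <= max_instance_gb else "horizontal"


# Hypothetical numbers: 500 GB today, growing 5% per month, planned 12 months ahead.
projected = project_storage_gb(current_gb=500, monthly_growth_rate=0.05, months=12)
plan = suggest_scaling(projected, max_instance_gb=1000)
```

With these inputs the projection stays just under 900 GB, so a bigger instance (vertical scaling) would still be enough.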

Amazon Relational Database Service (RDS)

Is a managed relational database service to deploy and scale relational databases
Supports multiple database engines
Uses Amazon Elastic Block Store (Amazon EBS) volumes for database and log storage

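By default the database is NOT encrypted. You can only enable encryption while you're first provisioning/building the database. You can't turn it on for an existing instance afterwards; the workaround is to take a snapshot, make an encrypted copy of it, and restore a new instance from that copy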
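A hedged sketch of turning encryption on at creation time with boto3. `StorageEncrypted`, `KmsKeyId`, and `ManageMasterUserPassword` are parameters of `create_db_instance`; the identifier, engine, instance class, storage size, and username are placeholder values, and the helper function names are my own.

```python
from typing import Optional


def encrypted_instance_params(identifier: str, kms_key_id: Optional[str] = None) -> dict:
    """Build create_db_instance arguments with storage encryption switched on."""
    params = {
        "DBInstanceIdentifier": identifier,
        "Engine": "mysql",                  # placeholder engine
        "DBInstanceClass": "db.t3.micro",   # placeholder instance class
        "AllocatedStorage": 20,             # GiB, placeholder
        "MasterUsername": "admin",          # placeholder
        "ManageMasterUserPassword": True,   # let RDS keep the password in Secrets Manager
        "StorageEncrypted": True,           # must be set now; cannot be enabled later
    }
    if kms_key_id:
        params["KmsKeyId"] = kms_key_id     # otherwise the default KMS key for RDS is used
    return params


def create_encrypted_instance(identifier: str, rds_client=None):
    """Create the instance; boto3 is imported lazily so a stub client can be passed in."""
    if rds_client is None:
        import boto3
        rds_client = boto3.client("rds")
    return rds_client.create_db_instance(**encrypted_instance_params(identifier))
```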

Benefits of RDS

You don't need to provision your infrastructure or install and maintain database software
You can scale up or down the compute and memory resources powering your deployment
You can configure automated backups, database snapshots, and automatic host replacement
You can isolate your database in your own virtual network

RDS Architecture

Amazon Aurora (sub type of RDS)

Is a relational database management system (RDBMS) built for the cloud with full MySQL and PostgreSQL compatibility
Is managed by Amazon RDS
Provides high performance and availability at one-tenth of the cost of commercial databases (according to AWS)
Delivers Multi-AZ deployments with Aurora Replicas

Can be serverless (Aurora Serverless), supports up to 15 read replicas, and is fully compatible with MySQL and PostgreSQL

Aurora clusters

Amazon RDS use case:

We have tables, fixed properties and a strict format

RDS Best practices

Amazon RDS Proxy

Fully managed and highly available

More scalable

More resilient

More secure

Pools and shares database connections for improved application scaling

Reduces failover times for Aurora and Amazon RDS Multi-AZ databases by up to 66 percent

Enforces IAM authentication and stores credentials in AWS Secrets Manager
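To make the pooling idea concrete, here is a toy connection pool. It is not how RDS Proxy is implemented; it only shows the general technique of sharing a small, fixed set of connections among many requests. SQLite stands in for the database, and `ConnectionPool` is a made-up class.

```python
import queue
import sqlite3
from contextlib import contextmanager


class ConnectionPool:
    """Keeps a fixed number of open connections and lends them out one at a time."""

    def __init__(self, size: int, database: str = ":memory:"):
        self._idle = queue.Queue(maxsize=size)
        for _ in range(size):
            self._idle.put(sqlite3.connect(database, check_same_thread=False))

    @contextmanager
    def connection(self, timeout: float = 5.0):
        conn = self._idle.get(timeout=timeout)  # waits if every connection is in use
        try:
            yield conn
        finally:
            self._idle.put(conn)  # hand the connection back instead of closing it


pool = ConnectionPool(size=2)
seen = set()
for _ in range(10):  # ten "requests" share two connections
    with pool.connection() as conn:
        conn.execute("SELECT 1").fetchone()
        seen.add(id(conn))
```

Ten requests are served without ever opening more than two connections, which is the reason pooling helps when many application instances talk to one database.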

Connection Pooling for Improved Scalability

We can store database user credentials in Secrets Manager. The application should be the only principal that can read the credentials. This keeps credentials out of the source code, so a source code leak does not expose them
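A small sketch of reading credentials at runtime instead of hardcoding them. `get_secret_value` and the `SecretString` field are part of the Secrets Manager API in boto3; the `username`/`password` keys assume the secret is stored as JSON with those fields, and `load_db_credentials` is a hypothetical helper.

```python
import json


def load_db_credentials(secret_id: str, client=None) -> dict:
    """Fetch database credentials from Secrets Manager when the app starts."""
    if client is None:
        import boto3  # imported lazily so the sketch can be exercised with a stub client
        client = boto3.client("secretsmanager")
    response = client.get_secret_value(SecretId=secret_id)
    secret = json.loads(response["SecretString"])  # assumes a JSON-formatted secret
    return {"username": secret["username"], "password": secret["password"]}
```

The application's IAM role would need `secretsmanager:GetSecretValue` on that one secret and nothing broader.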

Backing up data in RDS

Two options: automated backups and database snapshots.

Automated backups
- Restore a database instance to a specific point in time
- Taken daily during your backup window (transaction logs are captured every 5 minutes)
- Retention defaults to 7 days and can be set to up to 35 days; backups are deleted automatically once the retention period ends
- Cannot be shared (must first be copied to a manual snapshot)

Database snapshots
- Back up a database instance in a known state, and then restore it to that specific state
- User-initiated (as frequently as the user chooses)
- Kept until the user explicitly deletes them
- Can be shared (shared snapshots can be copied by other AWS accounts)
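The point-in-time window implied by the backup settings above can be sketched like this. It is a simplification: the real earliest restorable time also depends on when backups were first enabled, and the function names are my own.

```python
from datetime import datetime, timedelta, timezone


def restorable_window(now: datetime, retention_days: int, log_interval_minutes: int = 5):
    """Return (earliest, latest) restore times for an automated backup setup."""
    if not 0 <= retention_days <= 35:
        raise ValueError("retention must be between 0 and 35 days")
    earliest = now - timedelta(days=retention_days)
    latest = now - timedelta(minutes=log_interval_minutes)  # last captured transaction log
    return earliest, latest


def can_restore_to(target: datetime, now: datetime, retention_days: int) -> bool:
    """Check whether a target time falls inside the restorable window."""
    if retention_days == 0:  # a retention of 0 turns automated backups off
        return False
    earliest, latest = restorable_window(now, retention_days)
    return earliest <= target <= latest


now = datetime(2024, 1, 31, 12, 0, tzinfo=timezone.utc)
three_days_ago_ok = can_restore_to(now - timedelta(days=3), now, retention_days=7)
ten_days_ago_ok = can_restore_to(now - timedelta(days=10), now, retention_days=7)
```

Anything older than the retention period needs a manual snapshot, because those are kept until deleted.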

RDS encryption for backups

Amazon DynamoDB

Is a fully managed, serverless, NoSQL database
Supports key-value and document data models
Delivers single-digit millisecond performance and can automatically scale tables to adjust for capacity
Is used for developing applications and mission-critical workloads that prioritize speed, scalability, and data durability

Items are JSON-like documents; every table has a partition key and, optionally, a sort key
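A toy in-memory model of how the two keys work together. This is not the DynamoDB API; `ToyTable`, `customer_id` (partition key), and `order_date` (sort key) are made-up names used only to show the idea: the partition key picks the group of items, and the sort key orders and filters items inside that group.

```python
from collections import defaultdict


class ToyTable:
    """Items grouped by partition key and addressed by sort key within each group."""

    def __init__(self):
        self._partitions = defaultdict(dict)  # partition key -> {sort key: item}

    def put_item(self, item: dict) -> None:
        # The (partition key, sort key) pair identifies one item; same pair overwrites.
        self._partitions[item["customer_id"]][item["order_date"]] = item

    def query(self, customer_id: str, date_from: str = "", date_to: str = "\uffff") -> list:
        # A query targets one partition and returns items in sort key order.
        partition = self._partitions.get(customer_id, {})
        return [partition[sk] for sk in sorted(partition) if date_from <= sk <= date_to]


table = ToyTable()
table.put_item({"customer_id": "c1", "order_date": "2024-01-15", "total": 30})
table.put_item({"customer_id": "c1", "order_date": "2024-03-02", "total": 45, "gift": True})
table.put_item({"customer_id": "c2", "order_date": "2024-02-10", "total": 12})
recent = table.query("c1", date_from="2024-02-01")
```

Note that the second item has an extra `gift` field; only the key attributes have to be present on every item.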

Use cases

Features

Serverless performance with limitless scalability
- Secondary indexes provide flexibility on how to access your data.
- Amazon DynamoDB Streams is ideal for an event-driven architecture.
- Multi-Region, multi-active data replication with global tables
Built-in security and reliability
- DynamoDB encrypts all customer data at rest by default.
- Point-in-time recovery protects data from accidental operations
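For the event-driven point above, a sketch of a Lambda-style handler that reads DynamoDB Streams records. The `Records`, `eventName`, and `dynamodb.Keys` fields follow the stream record format; the handler itself and the sample event are illustrative only.

```python
def handler(event, context=None):
    """Group changed item keys by the type of change (INSERT, MODIFY, REMOVE)."""
    summary = {"INSERT": [], "MODIFY": [], "REMOVE": []}
    for record in event.get("Records", []):
        keys = record["dynamodb"]["Keys"]
        # Stream values use DynamoDB JSON: each value is wrapped as {"S": "..."} etc.
        plain_keys = {name: next(iter(typed.values())) for name, typed in keys.items()}
        summary[record["eventName"]].append(plain_keys)
    return summary


# Hypothetical event with one insert and one delete.
sample_event = {
    "Records": [
        {"eventName": "INSERT", "dynamodb": {"Keys": {"customer_id": {"S": "c1"}}}},
        {"eventName": "REMOVE", "dynamodb": {"Keys": {"customer_id": {"S": "c2"}}}},
    ]
}
result = handler(sample_event)
```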

Structure

Example

Multi-region Replication

Global tables provide a multi-region, multi-active database for fast local read and write performance for global applications

Best practices

Preventative
- Use IAM roles to authenticate access
- Use IAM policies for DynamoDB base authorization
- Use IAM policy conditions for fine-grained access control
- Use a VPC endpoint and policies to access DynamoDB

Detective
- Use AWS CloudTrail to monitor AWS managed KMS key usage
- Monitor DynamoDB operations by using CloudTrail
- Monitor DynamoDB configuration with AWS Config
- Monitor DynamoDB compliance with AWS Config rules
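As an example of IAM policy conditions for fine-grained access control, the sketch below builds a policy that only allows access to items whose partition key equals the caller's identity. `dynamodb:LeadingKeys` is a real DynamoDB condition key; the table ARN, the account ID, the chosen actions, and the use of a Cognito identity variable are assumptions for illustration.

```python
import json


def per_user_items_policy(table_arn: str) -> dict:
    """Allow reads and writes only on items whose partition key is the caller's ID."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": ["dynamodb:GetItem", "dynamodb:Query", "dynamodb:PutItem"],
                "Resource": table_arn,
                "Condition": {
                    "ForAllValues:StringEquals": {
                        # Partition key must match the signed-in Cognito identity.
                        "dynamodb:LeadingKeys": ["${cognito-identity.amazonaws.com:sub}"]
                    }
                },
            }
        ],
    }


# Placeholder ARN with a dummy account ID.
policy = per_user_items_policy("arn:aws:dynamodb:us-east-1:123456789012:table/Orders")
policy_json = json.dumps(policy, indent=2)
```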

Amazon Redshift

Is a fully managed, cloud-based data warehousing service designed to handle petabyte-scale analytics workloads
Achieves optimum query performance with columnar storage
Has an Amazon Redshift Serverless option
Is used for online analytical processing (OLAP)
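Why columnar storage suits analytics can be shown with a toy example. This is not how Redshift stores data internally; it only contrasts the two layouts with made-up order rows.

```python
# Row layout: each record is kept together, so an aggregate walks every whole record.
rows = [
    {"order_id": 1, "region": "EU", "amount": 120.0},
    {"order_id": 2, "region": "US", "amount": 80.0},
    {"order_id": 3, "region": "EU", "amount": 50.5},
]

# Column layout: each column is kept together, so an aggregate reads one list only.
columns = {name: [row[name] for row in rows] for name in rows[0]}

total_from_rows = sum(row["amount"] for row in rows)  # touches all fields of each row
total_from_columns = sum(columns["amount"])           # touches only the amount column
```

Both totals are the same; the column layout just reads far less data when a table has many columns and a query needs only a few of them.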

AWS fully managed purpose-built database options

A ledger database can be used for keeping a complete, verifiable history of changes to data (e.g. credit card transactions)

Make sure that...

You have suitable workloads
You have the correct data model
You have looked over the features and benefits
You have reviewed common use cases

before choosing a database 🍩

AWS DMS

Migrate an on-prem database to an AWS database

Is a managed migration and replication service
Helps move existing database and analytics workloads to and within AWS
Supports most widely used commercial and open source databases
Replicates data on demand or on a schedule, capturing changes from a source

Tools to migrate between different database engines

AWS DMS replicates data from a database into a data lake
