Module 6
All about AWS Databases
Database considerations
How much throughput is needed?
Will the database need to be sized in gigabytes, terabytes, or petabytes of data?
What is your data model? What are your data access patterns?
What level of data durability, availability, and recoverability is required?
Will it scale?
Do you need low latency?
Do regulatory obligations apply?
Relational and non-relational databases

Amazon Database Options
Database Capacity Planning
Analyze current storage capacity
Predict capacity requirements
Determine if horizontal scaling, vertical scaling, or a combination is needed
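The three planning steps above can be made concrete with a small projection. This is a minimal sketch that assumes a constant monthly growth rate; the function names and the sample figures are illustrative and not taken from any AWS tool.

```python
import math


def project_storage_gb(current_gb: float, monthly_growth_rate: float, months: int) -> float:
    """Compound the current size forward at a fixed monthly growth rate."""
    return current_gb * (1 + monthly_growth_rate) ** months


def months_until_full(current_gb: float, capacity_gb: float, monthly_growth_rate: float):
    """Whole months until the projected size reaches or exceeds capacity.

    Returns 0 if the data already fills the capacity and None if the
    data is not growing.
    """
    if current_gb >= capacity_gb:
        return 0
    if monthly_growth_rate <= 0:
        return None
    ratio = capacity_gb / current_gb
    return math.ceil(math.log(ratio) / math.log(1 + monthly_growth_rate))


# Hypothetical figures: 500 GB today, growing 5 percent per month.
projected = project_storage_gb(500, 0.05, 12)
runway = months_until_full(500, 1000, 0.05)
print(f"Projected size in 12 months: {projected:.1f} GB")
print(f"Months until a 1000 GB volume is full: {runway}")
```

A short runway suggests scaling storage vertically soon, while a growth rate that outpaces any single instance points toward horizontal scaling.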

Amazon Relational Database Service (RDS) 

Is a managed relational database service to deploy and scale relational databases
Supports multiple database engines
Uses Amazon Elastic Block Store (Amazon EBS) volumes for database and log storage
By default, the database is NOT encrypted. You can enable encryption only when you first create the DB instance; you can't turn it on in place afterward. To encrypt an existing instance, copy a snapshot of it with encryption enabled and restore from the encrypted copy
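Because encryption has to be chosen at creation time, the flag belongs in the create request itself. The sketch below only builds the parameter dictionary for the boto3 RDS call `create_db_instance`; the identifier, engine, instance class, and storage size are hypothetical values chosen for illustration.

```python
def build_encrypted_instance_request(identifier, kms_key_id=None):
    """Build keyword arguments for the boto3 RDS create_db_instance call.

    All concrete values below are illustrative assumptions.
    """
    params = {
        "DBInstanceIdentifier": identifier,
        "Engine": "mysql",
        "DBInstanceClass": "db.t3.micro",
        "AllocatedStorage": 20,            # GiB
        "MasterUsername": "admin",
        "ManageMasterUserPassword": True,  # keep the password in Secrets Manager
        "StorageEncrypted": True,          # must be set now, not after creation
    }
    if kms_key_id:
        # Optional customer managed key; without it the default RDS key is used.
        params["KmsKeyId"] = kms_key_id
    return params


params = build_encrypted_instance_request("notes-demo-db")
print(params)

# To actually create the instance (requires boto3 and AWS credentials):
#   import boto3
#   boto3.client("rds").create_db_instance(**params)
```

Keeping the request as plain data makes it easy to review that `StorageEncrypted` is set before anything is provisioned.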
Benefits of RDS
You don't need to provision your infrastructure or install and maintain database software
You can scale up or down the compute and memory resources powering your deployment
You can configure automated backups, database snapshots, and automatic host replacement
You can isolate your database in your own virtual network
RDS Architecture


Amazon Aurora (sub type of RDS)
Is a relational database management system (RDBMS) built for the cloud with full MySQL and PostgreSQL compatibility
Is managed by Amazon RDS
Provides the performance and availability of commercial-grade databases at about one-tenth of the cost
Delivers Multi-AZ deployments with Aurora Replicas
Aurora clusters

Amazon RDS use case:

RDS Best practices

Amazon RDS Proxy
Fully managed and highly available
Pools and shares database connections for improved application scaling
Reduces failover times for Aurora and Amazon RDS Multi-AZ databases by up to 66 percent
Enforces IAM authentication and stores credentials in AWS Secrets Manager
Connection Pooling for Improved Scalability
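The idea behind pooling can be shown in a few lines: many requests share a small, fixed set of open connections instead of each opening its own. This is a conceptual sketch only, not how RDS Proxy is implemented; `FakeConnection` and `ConnectionPool` are hypothetical names standing in for a real driver and the proxy.

```python
import queue
from contextlib import contextmanager


class FakeConnection:
    """Stand-in for a real database connection (hypothetical)."""

    opened = 0  # counts how many connections were ever opened

    def __init__(self):
        FakeConnection.opened += 1
        self.id = FakeConnection.opened

    def execute(self, sql):
        return f"conn {self.id} ran: {sql}"


class ConnectionPool:
    """Hands out a fixed set of connections and takes them back after use."""

    def __init__(self, size):
        self._idle = queue.Queue(maxsize=size)
        for _ in range(size):
            self._idle.put(FakeConnection())

    @contextmanager
    def connection(self, timeout=1.0):
        # Blocks until a connection is free; raises queue.Empty on timeout.
        conn = self._idle.get(timeout=timeout)
        try:
            yield conn
        finally:
            self._idle.put(conn)  # return it to the pool for the next caller


pool = ConnectionPool(size=2)
results = []
for i in range(10):
    with pool.connection() as conn:
        results.append(conn.execute(f"SELECT {i}"))

print(f"{len(results)} queries served by {FakeConnection.opened} connections")
```

Ten queries run, but the database only ever sees two connections, which is the scaling benefit the proxy provides to applications that open many short-lived connections.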


Backing up data in RDS
Two options: automated backups and database snapshots
Backup
Restore a database instance to a specific point in time
Daily during your backup window (transaction logs are captured every 5 minutes)
The default retention period is 7 days when the instance is created in the console (1 day through the API or CLI), and it can be set to up to 35 days. Backups are deleted automatically once the retention period expires
Cannot be shared (needs to be copied to a manual snapshot first)
Database Snapshots
Back up a database instance in a known state, and then restore it to that specific state.
User-initiated (as frequently as the user chooses)
Kept until the user explicitly deletes it
Can be shared (shared snapshots can be copied by other AWS accounts)
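To see how the retention period and the five-minute transaction log capture bound point-in-time restore, here is a minimal sketch. It simplifies by treating the latest restorable time as exactly five minutes before now; the function names are illustrative, not an AWS API.

```python
from datetime import datetime, timedelta, timezone

MAX_RETENTION_DAYS = 35
LOG_INTERVAL = timedelta(minutes=5)  # transaction logs are captured every 5 minutes


def restorable_window(now, retention_days):
    """Return (earliest, latest) restore times for automated backups."""
    if not 1 <= retention_days <= MAX_RETENTION_DAYS:
        raise ValueError("retention_days must be between 1 and 35")
    earliest = now - timedelta(days=retention_days)
    latest = now - LOG_INTERVAL  # simplification: the newest captured log
    return earliest, latest


def can_restore_to(target, now, retention_days):
    """True if automated backups can restore the instance to `target`."""
    earliest, latest = restorable_window(now, retention_days)
    return earliest <= target <= latest


now = datetime(2024, 1, 31, 12, 0, tzinfo=timezone.utc)
print(can_restore_to(now - timedelta(days=3), now, retention_days=7))  # inside the window
print(can_restore_to(now - timedelta(days=8), now, retention_days=7))  # older than retention
```

A state older than the window can only be recovered if a manual snapshot was taken, since snapshots are kept until explicitly deleted.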
RDS encryption for backups

Amazon DynamoDB
Is a fully managed, serverless, NoSQL database
Supports key-value and document data models
Delivers millisecond performance and can automatically scale tables to adjust for capacity
Is used for applications and mission-critical workloads that prioritize speed, scalability, and data durability
Use cases

Features
Serverless performance with limitless scalability
Secondary indexes provide flexibility on how to access your data.
Amazon DynamoDB Streams is ideal for an event-driven architecture.
Multi-Region, multi-active data replication with global tables
Built-in security and reliability
DynamoDB encrypts all customer data at rest by default.
Point-in-time recovery protects data from accidental operations
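The key-value and document models, and the role of a secondary index, can be illustrated with an in-memory toy. This is a sketch under stated assumptions, not the DynamoDB API: `TinyTable`, the attribute names, and the sample orders are all hypothetical.

```python
from collections import defaultdict


class TinyTable:
    """Toy table keyed by (partition key, sort key) with one secondary index."""

    def __init__(self, partition_key, sort_key, index_key):
        self.partition_key = partition_key
        self.sort_key = sort_key
        self.index_key = index_key
        self._items = {}                # (pk, sk) -> item document
        self._index = defaultdict(set)  # index value -> set of (pk, sk)

    def put_item(self, item):
        key = (item[self.partition_key], item[self.sort_key])
        old = self._items.get(key)
        if old is not None and self.index_key in old:
            self._index[old[self.index_key]].discard(key)  # drop stale index entry
        self._items[key] = dict(item)
        if self.index_key in item:
            self._index[item[self.index_key]].add(key)

    def get_item(self, pk, sk):
        """Key-value lookup by the full primary key."""
        return self._items.get((pk, sk))

    def query(self, pk):
        """All items in one partition, ordered by sort key."""
        return [self._items[k] for k in sorted(self._items) if k[0] == pk]

    def query_index(self, value):
        """Alternate access path: look items up by the indexed attribute."""
        return [self._items[k] for k in sorted(self._index.get(value, ()))]


orders = TinyTable("customer_id", "order_date", index_key="status")
orders.put_item({"customer_id": "c1", "order_date": "2024-01-05", "status": "shipped",
                 "lines": [{"sku": "A1", "qty": 2}]})  # nested document attribute
orders.put_item({"customer_id": "c1", "order_date": "2024-02-10", "status": "open"})
orders.put_item({"customer_id": "c2", "order_date": "2024-01-20", "status": "open"})

print(orders.get_item("c1", "2024-01-05"))
print([o["order_date"] for o in orders.query("c1")])
print([o["customer_id"] for o in orders.query_index("open")])
```

The base table answers "orders for this customer", while the index answers "orders with this status" without scanning every item.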
Structure

Example

Multi-region Replication
Global tables provide a multi-region, multi-active database for fast local read and write performance for global applications
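With multi-active replication, two Regions can accept a write to the same item at nearly the same time. Global tables reconcile such conflicts with a last-writer-wins rule. The sketch below models that rule with plain dictionaries; the replica contents and timestamps are made up for illustration.

```python
def merge_replicas(*replicas):
    """Merge replica states, keeping the most recently written value per key.

    Each replica maps key -> (value, write_timestamp).
    """
    merged = {}
    for replica in replicas:
        for key, (value, written_at) in replica.items():
            if key not in merged or written_at > merged[key][1]:
                merged[key] = (value, written_at)
    return merged


# Hypothetical state in two Regions before replication catches up.
us_east = {"cart#42": ({"items": 1}, 100), "cart#43": ({"items": 5}, 120)}
eu_west = {"cart#42": ({"items": 2}, 105)}

converged = merge_replicas(us_east, eu_west)
print(converged)
```

Both Regions end up with the later write for `cart#42`, so applications that cannot tolerate a lost concurrent update should avoid writing the same item from multiple Regions.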

Best practices
Use IAM roles to authenticate access
Use AWS CloudTrail to monitor usage of AWS managed KMS keys
Use IAM policies for DynamoDB base authorization
Monitor DynamoDB operations by using CloudTrail
Use IAM policy conditions for fine-grained access control
Monitor DynamoDB configuration with AWS Config
Use a VPC endpoint and policies to access DynamoDB
Monitor DynamoDB compliance with AWS Config rules.
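Fine-grained access control with IAM policy conditions can look like the following sketch, which builds a policy document that limits a caller to items whose partition key equals their own identity. The table ARN and the chosen actions are hypothetical; `dynamodb:LeadingKeys` is the IAM condition key that DynamoDB evaluates against the partition key.

```python
import json


def leading_keys_policy(table_arn, identity_variable="${www.amazon.com:user_id}"):
    """Build an IAM policy that restricts access to the caller's own items.

    The identity variable depends on the identity provider in use; the
    default here is an assumption for illustration.
    """
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": ["dynamodb:GetItem", "dynamodb:Query", "dynamodb:PutItem"],
                "Resource": table_arn,
                "Condition": {
                    "ForAllValues:StringEquals": {
                        "dynamodb:LeadingKeys": [identity_variable]
                    }
                },
            }
        ],
    }


# Hypothetical table ARN.
policy = leading_keys_policy("arn:aws:dynamodb:us-east-1:123456789012:table/Orders")
print(json.dumps(policy, indent=2))
```

Attaching a policy like this to an IAM role keeps authorization in IAM rather than in application code.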
Amazon Redshift
Is a fully managed, cloud-based data warehousing service designed to handle petabyte-scale analytics workloads
Achieves optimum query performance with columnar storage
Has an Amazon Redshift Serverless option
Is used for online analytical processing (OLAP)
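Why columnar storage suits analytics can be seen by laying the same data out both ways. This is a conceptual sketch with invented sales rows, not a description of Redshift internals.

```python
# Row-oriented layout: each record is stored together.
rows = [
    {"order_id": 1, "region": "EU", "amount": 120.0},
    {"order_id": 2, "region": "US", "amount": 80.0},
    {"order_id": 3, "region": "EU", "amount": 45.5},
]

# Column-oriented layout: each attribute is stored together.
columns = {name: [row[name] for row in rows] for name in rows[0]}

# An aggregate over one attribute touches every record in the row layout...
total_from_rows = sum(row["amount"] for row in rows)

# ...but only a single contiguous list in the column layout.
total_from_columns = sum(columns["amount"])

print(total_from_rows, total_from_columns)
```

Analytical queries usually read a few columns across many rows, so reading only those columns means far less data is scanned.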
AWS fully managed purpose-built database options

Make sure that...
You have suitable workloads
Correct data model
Look over features and benefits
Review common use cases
before choosing a database 🍩
AWS DMS
Migrate an on-premises database to an AWS database
Is a managed migration and replication service
Helps move existing database and analytics workloads to and within AWS
Supports most widely used commercial and open source databases
Replicates changes from a source database on demand or on a schedule
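Migration with ongoing replication usually has two phases: an initial full load, then applying changes captured from the source. The sketch below models both with dictionaries; it is a conceptual illustration with hypothetical record shapes, not the AWS DMS API.

```python
def full_load(source):
    """Phase 1: copy every existing row from the source to a new target."""
    return {key: dict(row) for key, row in source.items()}


def apply_changes(target, changes):
    """Phase 2: replay captured changes, in order, against the target.

    Each change is (operation, key, row); row is None for deletes.
    """
    for operation, key, row in changes:
        if operation in ("insert", "update"):
            target[key] = dict(row)
        elif operation == "delete":
            target.pop(key, None)
        else:
            raise ValueError(f"unknown operation: {operation}")
    return target


# Hypothetical source table and the changes made while the load was running.
source = {1: {"name": "Ana"}, 2: {"name": "Ben"}}
captured = [
    ("update", 2, {"name": "Benjamin"}),
    ("insert", 3, {"name": "Chloe"}),
    ("delete", 1, None),
]

target = apply_changes(full_load(source), captured)
print(target)
```

Replaying changes in the order they were captured is what lets the target converge on the source while the source stays in use.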

Tools to migrate between different database engines


AWS DMS replicates data from a database into a data lake
