Infrastructure as Code (IaC) Using CloudFormation
FaxDroid’s infrastructure was built in AWS using CloudFormation. In the following sections I explain how the infrastructure was set up. At a high level there are three AWS accounts:
- CI Environment. Used for running automated tests and performing deployments. See Provisioning, CI Environment, Deployments and Automated Testing for more details.
- QA and Prod Accounts. These are two AWS accounts with identical infrastructure. QA is used for testing while Prod provides service to customers.
1. Route53 (Public): Route53 contains a public hosted zone for faxdroid.com. Anytime someone visits www.faxdroid.com, the name is resolved by Route53 and points to the CloudFront distribution.
2. CloudFront: The main CloudFront distribution that serves both front-end content and API calls to the server. See Angular Server Side Rendering with Lambda@Edge and CloudFront for details.
3. Lambda@Edge: See Angular Server Side Rendering with Lambda@Edge and CloudFront for details.
4. S3 Static Resources: Bucket for serving static front-end content. See Angular Server Side Rendering with Lambda@Edge and CloudFront for details.
5. WAF: All requests directed at the servers must first pass through the WAF (Web Application Firewall). Several checks occur here to protect FaxDroid:
- IPs are blocked based on IP blacklists managed by AWS.
- IPs are blocked based on my own custom blacklist.
- Only requests that match one of FaxDroid's publicly supported endpoints are allowed to pass. When you have a public-facing web service, various bots will constantly send requests trying to find potential weak spots. By whitelisting request patterns I’ve ensured that only supported requests can pass this point.
- IP rate limiting is also set up in the WAF to ensure the servers are not overloaded by DDoS attacks.
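The order of the WAF checks above can be illustrated with a small Python sketch. It is not how AWS WAF is implemented or configured. The IP address, the path patterns, the limit of 3 requests and the 60-second window are all hypothetical values chosen for the example, and `allow_request` is an illustrative name.

```python
import re
from collections import defaultdict, deque

# Hypothetical values for illustration only; none of these come from FaxDroid.
BLOCKED_IPS = {"203.0.113.7"}
ALLOWED_PATHS = [re.compile(r"^/api/fax/[0-9]+$"), re.compile(r"^/api/login$")]
RATE_LIMIT = 3           # max requests per IP inside one window
WINDOW_SECONDS = 60.0

_recent = defaultdict(deque)  # ip -> timestamps of recently accepted requests


def allow_request(ip, path, now):
    """Return True only if the request passes all three checks."""
    # 1. Blacklist check: reject known-bad IPs outright.
    if ip in BLOCKED_IPS:
        return False
    # 2. Whitelist check: the path must match a supported endpoint pattern.
    if not any(pattern.match(path) for pattern in ALLOWED_PATHS):
        return False
    # 3. Rate limit: drop timestamps that fell out of the window, then count.
    window = _recent[ip]
    while window and now - window[0] >= WINDOW_SECONDS:
        window.popleft()
    if len(window) >= RATE_LIMIT:
        return False
    window.append(now)
    return True
```

A probe such as `allow_request("198.51.100.1", "/wp-admin.php", now=1.0)` is rejected at the whitelist step without counting toward the rate limit, which mirrors the idea that unsupported request patterns never reach the servers.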
6. VPC (Main): A VPC containing all the server-side resources for FaxDroid.
7. Public Subnet: The main reason for creating public and private subnets is FaxDroid’s reliance on SIP trunks (see Creating a Fax Server Using Asterisk (an Open Source PBX)). FaxDroid uses third-party carriers to provide SIP trunks. The most common method of authenticating with a SIP trunk is whitelisting IP addresses. Without public/private subnets the instances would communicate directly with the SIP trunk through whatever dynamically allocated IP they have. By creating a public/private subnet pair with a NAT gateway, I was able to assign a static dedicated IP to the NAT gateway, so all outgoing communication originates from a single whitelisted IP address.
8. Private Subnet: See (7. Public Subnet)
9. AutoScaling Group (Application Servers): This is where FaxDroid’s server is located. A major benefit of using auto scaling groups is that when traffic to the servers increases, the number of instances can easily scale. It also allows for zero-downtime deployments, which is not possible with a traditional single-instance server.
10. ELB (Public): An ELB (Elastic Load Balancer) in the public subnet. The ELB handles all incoming connections. It also takes advantage of AWS Certificate Manager to enable HTTPS connections.
11. Certificate Manager: See (10. ELB)
12. NAT Gateway: See (7. Public Subnet)
13. RDS Aurora (MySQL): FaxDroid’s storage is backed by a MySQL server. The server is hosted on an RDS Aurora cluster. The service is fully managed by AWS and allows for zero down time horizontal scaling (for read operations), and vertical scaling (with some downtime) for write operations.
14. Redis (ElastiCache): A Redis ElastiCache database is also used for creating locks. Locks are used in places where two concurrent calls could leave data in an inconsistent state.
15. Route53 (Private): A Route53 private hosted zone. Private hosted zones are only resolvable from inside the VPC. Some services inside the VPC need to communicate with each other; for example, I wanted the Lambda to be able to call the server hosts. With a private Route53 record I can add a name such as endpoint.com and have it point to the ELB (Private). Regardless of which environment it runs in (QA, Prod, …), the Lambda makes its calls to the private ELB by sending requests to endpoint.com. Without the Route53 records I would most likely have to either expose these interfaces to the public (accessible directly through API calls to faxdroid.com, which would increase the attack surface of the application) or add logic to the source code to find the private ELB.
16. ELB (Private): An ELB in the private subnet. This was built for internal resources that need to access the server. See 15. Route53 (Private) for more details.
17. SQS Queue for Refunds: Anytime there is a failure in fax delivery a full refund is provided. When a fax fails, a message is sent to this queue with the transaction to refund. Messages are consumed asynchronously and a refund is issued for each transaction.
18. SQS Queue for Fax Retries: Faxes fail for various reasons (connection issues, busy line, …). Anytime there is a failure in delivery the fax is retried 4 times. This retry mechanism is built using SQS queues. If all 4 retries fail, a refund is provided (see 17. SQS Queue for Refunds).
19. CloudWatch Event Rule: Items 17 and 18 describe two SQS queues that are used for async operations. A CloudWatch event rule is set up to trigger a Lambda every minute. This Lambda then polls the SQS queues for messages and sends requests to the FaxDroid servers.
20. Lambda Polling SQS Queues: See 19. CloudWatch Event Rule for more details.
21. CI Account: A separate account is created for the CI environment. For details on how the CI environment works, please see Provisioning, CI Environment, Deployments and Automated Testing.
22. Jenkins: The CI environment is built using Jenkins installed on an EC2 instance. For details on how the CI environment works, please see Provisioning, CI Environment, Deployments and Automated Testing.
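The retry and refund flow described in items 17 to 20 can be sketched in plain Python. This is a simplified illustration, not FaxDroid's actual code. In-memory deques stand in for the two SQS queues, and one function plays the role of both the polling Lambda and the server. The message fields, the function names and the way the retry count travels with the message are assumptions. Only the limit of 4 retries comes from the text.

```python
from collections import deque

MAX_RETRIES = 4  # from the text: a failed fax is retried 4 times

retry_queue = deque()   # stand-in for the fax retry SQS queue
refund_queue = deque()  # stand-in for the refund SQS queue


def handle_failure(fax_id, retries_done):
    """Route a failed delivery: queue another retry, or a refund once retries run out."""
    if retries_done < MAX_RETRIES:
        retry_queue.append({"fax_id": fax_id, "retries_done": retries_done + 1})
        return "retry"
    refund_queue.append({"fax_id": fax_id})
    return "refund"


def poll_once(send_fax):
    """One scheduled run: take everything currently queued and re-attempt each fax."""
    pending = list(retry_queue)
    retry_queue.clear()
    for message in pending:
        # send_fax is a hypothetical callable returning True on successful delivery.
        if not send_fax(message["fax_id"]):
            handle_failure(message["fax_id"], message["retries_done"])
```

Calling `handle_failure("fax-1", 0)` after the first failed delivery and then running `poll_once` once per scheduled tick re-attempts the fax on each tick. If every attempt fails, the fourth retry moves the transaction to the refund queue and later ticks find nothing left to do.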
Description
Using Infrastructure as Code (IaC) to build out the AWS infrastructure needed to run FaxDroid:
- CloudFormation
- Route53
- CloudFront
- Lambda
- S3
- WAF
- Certificate Manager
- Elastic Load Balancer
- Relational Database Service (RDS)
- Networking (Subnets, Route Tables, NAT, VPCs)
- Auto Scaling Groups (ASG)
- ElastiCache Redis
- SQS
- CloudWatch