Infrastructure Repository Overview
Core AWS Infrastructure as Code
The infrastructure repository contains Path2Response’s AWS Cloud Development Kit (CDK) infrastructure code for managing core AWS resources across multiple environments. It provides centralized configuration for VPCs, EMR security, DNS architecture, and developer instance provisioning.
Purpose
The infrastructure repository serves as the foundation for P2R’s cloud infrastructure:
- VPC & Network Configuration - Centralized VPC definitions and subnet mappings across all AWS accounts
- EMR Security - Security configurations enforcing IMDSv2 for EMR clusters
- DNS Management - Public and private DNS zone architecture using Route53
- Developer Instances - Automated provisioning of EC2 instances for data scientists and developers
- Global Constants - Shared account IDs, region mappings, EFS volume configurations, and SSH key pairs
Target Users: Infrastructure Engineers, DevOps, Data Scientists, Developers
Architecture
Repository Structure
/infrastructure/
├── package.json                      # Root package configuration (v335.0.0-SNAPSHOT)
├── .gitignore                        # Git ignore patterns
├── README.md                         # Repository documentation
└── cdk/                              # AWS CDK infrastructure stacks
    ├── dns/                          # DNS zone management
    │   ├── public-zones/             # Public Route53 hosted zones
    │   ├── private-zones/            # Private Route53 hosted zones
    │   ├── resolver-rules/           # DNS resolver rules
    │   ├── shared-resources/         # Common DNS components
    │   └── docs/                     # DNS architecture documentation
    ├── emr/                          # EMR security configuration stack
    │   ├── bin/                      # CDK app entry point
    │   ├── lib/                      # Stack implementation
    │   └── test/                     # Unit tests
    ├── global/                       # Shared configuration namespace
    │   ├── account.ts                # AWS account definitions
    │   ├── vpc.ts                    # VPC and subnet mappings
    │   ├── efs.ts                    # EFS volume configurations
    │   └── index.ts                  # Module exports
    └── shared-services/              # Cross-cutting infrastructure
        └── dev-instances/            # Developer instance provisioning
            ├── bin/                  # CDK app entry point
            ├── lib/                  # Stack and constructs
            │   ├── launch-templates/ # EC2 launch template generation
            │   ├── home-dirs/        # EFS home directory setup
            │   ├── network/          # Network constructs
            │   ├── instance-management/ # Instance lifecycle
            │   └── utils/            # Utility functions
            ├── docs/                 # Architecture documentation
            └── scripts/              # Deployment scripts
Technology Stack
| Layer | Technologies |
|---|---|
| IaC Framework | AWS CDK 2.233 (TypeScript) |
| Language | TypeScript 5.9 |
| Runtime | Node.js 20, 22, or 24 |
| Testing | Jest |
| Compression | lzma-native (for userdata scripts) |
| Templating | Handlebars |
AWS Account Structure
The infrastructure supports a multi-account AWS Organization with Control Tower:
| Environment | Account ID | Profile | Purpose |
|---|---|---|---|
| p2r (Root) | 448838825215 | p2r_root | Path2Response main account, DNS root |
| prd | 531556151531 | p2r_prd | Production workloads |
| rc | 881797796941 | p2r_rc | Release Candidate testing |
| stg | 135821922267 | p2r_stg | Staging environment |
| dev | 190585684037 | p2r_dev | Development environment |
All accounts operate in the us-east-1 region.
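The account table can be read as a typed lookup from environment name to account ID and CLI profile. The following is a minimal sketch under that assumption: the object shape, `accountTable`, and `toCdkEnvironment` are hypothetical names for illustration and are not taken from the repository. The IDs and profiles are copied from the table.

```typescript
// Environment names as listed in the account table.
type EnvName = 'p2r' | 'prd' | 'rc' | 'stg' | 'dev';

interface AccountInfo {
  accountId: string; // 12-digit AWS account ID
  profile: string;   // AWS CLI profile used for deployment
}

// Values copied from the table; the shape of this object is an assumption.
const accountTable: Record<EnvName, AccountInfo> = {
  p2r: { accountId: '448838825215', profile: 'p2r_root' },
  prd: { accountId: '531556151531', profile: 'p2r_prd' },
  rc: { accountId: '881797796941', profile: 'p2r_rc' },
  stg: { accountId: '135821922267', profile: 'p2r_stg' },
  dev: { accountId: '190585684037', profile: 'p2r_dev' },
};

// Every account in the table operates in this region.
const DEFAULT_REGION = 'us-east-1';

// Hypothetical helper: builds the { account, region } pair that a CDK
// stack accepts as its `env` property.
function toCdkEnvironment(
  env: EnvName,
  region: string = DEFAULT_REGION,
): { account: string; region: string } {
  return { account: accountTable[env].accountId, region };
}
```

A stack could then receive `env: toCdkEnvironment('dev')` in its props, which keeps account IDs out of individual stack files.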
Core Components
1. Global Configuration (/cdk/global/)
Centralized TypeScript modules providing shared constants and lookup functions used across all CDK stacks.
Account Configuration (account.ts):
- AWS account IDs for each environment
- Region mappings
- getCdkEnvironment() helper for stack deployment
VPC Configuration (vpc.ts):
- Complete VPC and subnet definitions by environment
- Three VPC purposes: ControlTower, General, Emr
- Functions: findVpcIdByPurpose(), findVpcIdByName(), getKeyName()
- Region abbreviation mappings (e.g., us-east-1 -> use1)
EFS Configuration (efs.ts):
- Data volume IDs by environment (General, Ingest, IngestPrd)
- Home volume IDs by architecture (X86, ARM)
- Security group mappings per VPC
- Functions: lookupEfsDataVolumeId(), lookupEfsHomeVolumeId()
Example Usage:
import { accounts, getCdkEnvironment } from '../global/account';
import { findVpcIdByPurpose, VpcPurpose } from '../global/vpc';
import { lookupEfsHomeVolumeId } from '../global/efs';
const vpcId = findVpcIdByPurpose('dev', 'us-east-1', VpcPurpose.General);
const efsId = lookupEfsHomeVolumeId('p2r', 'us-east-1', 'X86');
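To make the lookup pattern concrete, here is a rough sketch of how a purpose-based VPC lookup and a region abbreviation table might be structured. It is not the repository's implementation: the data shape, the function names `findVpcIdByPurposeSketch` and `abbreviateRegion`, and the `vpc-EXAMPLE-*` IDs are all placeholders, since the real VPC IDs are not listed in this document.

```typescript
enum VpcPurpose {
  ControlTower = 'ControlTower',
  General = 'General',
  Emr = 'Emr',
}

interface VpcEntry {
  vpcName: string;
  vpcId: string;
  purpose: VpcPurpose;
}

// env -> region -> VPCs. The VPC IDs are placeholders, not real resources.
const vpcTable: Record<string, Record<string, VpcEntry[]>> = {
  dev: {
    'us-east-1': [
      { vpcName: 'aws-controltower-VPC', vpcId: 'vpc-EXAMPLE-controltower', purpose: VpcPurpose.ControlTower },
      { vpcName: 'Vpc4General', vpcId: 'vpc-EXAMPLE-general', purpose: VpcPurpose.General },
      { vpcName: 'Vpc4Emr', vpcId: 'vpc-EXAMPLE-emr', purpose: VpcPurpose.Emr },
    ],
  },
};

// Hypothetical stand-in for findVpcIdByPurpose(): fails loudly when no VPC
// matches, so a misconfigured stack stops at synthesis time.
function findVpcIdByPurposeSketch(env: string, region: string, purpose: VpcPurpose): string {
  const candidates = vpcTable[env]?.[region] ?? [];
  const match = candidates.find((v) => v.purpose === purpose);
  if (!match) {
    throw new Error(`No ${purpose} VPC defined for ${env}/${region}`);
  }
  return match.vpcId;
}

// Region abbreviations as an explicit table; only the mapping mentioned in
// this document is included.
const regionAbbreviations: Record<string, string> = {
  'us-east-1': 'use1',
};

function abbreviateRegion(region: string): string {
  const abbreviation = regionAbbreviations[region];
  if (!abbreviation) {
    throw new Error(`No abbreviation defined for region ${region}`);
  }
  return abbreviation;
}
```

Throwing on a missing entry, rather than returning undefined, is a design choice in this sketch: it surfaces a typo in an environment or region name before any resource is deployed.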
2. EMR Stack (/cdk/emr/)
Deploys EMR security configuration enforcing IMDSv2 (Instance Metadata Service v2) across all EMR clusters.
Stack: EmrStack
Key Configuration:
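import * as emr from 'aws-cdk-lib/aws-emr';

// Inside the EmrStack constructor: require IMDSv2 on clusters that use this configuration
new emr.CfnSecurityConfiguration(this, 'EmrSecurityConfiguration', {
  name: 'p2r-emr-security-config',
  securityConfiguration: {
    "InstanceMetadataServiceConfiguration": {
      "MinimumInstanceMetadataServiceVersion": 2,
      "HttpPutResponseHopLimit": 1
    }
  }
});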
Deployment:
cd cdk/emr
npm install
cdk deploy --profile p2r_dev # Deploy to development
cdk deploy --profile p2r_prd # Deploy to production
Purpose: Security hardening. Requiring IMDSv2 session tokens for metadata access mitigates SSRF attacks that try to read instance credentials from the metadata service.
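For readers unfamiliar with IMDSv2, the token requirement works as a two-step exchange: a PUT request obtains a short-lived session token, and every metadata GET must present that token in a header. The sketch below only builds the request descriptions so it can run anywhere; the helper names and the `HttpRequestSpec` type are illustrative, while the endpoint paths and header names are the standard IMDSv2 ones.

```typescript
// Link-local address of the EC2 instance metadata service.
const IMDS_BASE = 'http://169.254.169.254';

interface HttpRequestSpec {
  method: 'GET' | 'PUT';
  url: string;
  headers: Record<string, string>;
}

// Step 1: request a session token. IMDSv2 accepts a TTL between 1 second
// and 21600 seconds (6 hours).
function imdsTokenRequest(ttlSeconds: number): HttpRequestSpec {
  if (!Number.isInteger(ttlSeconds) || ttlSeconds < 1 || ttlSeconds > 21600) {
    throw new RangeError('IMDSv2 token TTL must be an integer between 1 and 21600 seconds');
  }
  return {
    method: 'PUT',
    url: `${IMDS_BASE}/latest/api/token`,
    headers: { 'X-aws-ec2-metadata-token-ttl-seconds': String(ttlSeconds) },
  };
}

// Step 2: read metadata, presenting the token from step 1.
function imdsMetadataRequest(token: string, metadataPath: string): HttpRequestSpec {
  return {
    method: 'GET',
    url: `${IMDS_BASE}/latest/meta-data/${metadataPath}`,
    headers: { 'X-aws-ec2-metadata-token': token },
  };
}
```

A typical SSRF flaw lets an attacker trigger a plain GET from the instance but not a PUT with a custom header, which is why the token step blocks it. The hop limit of 1 in the security configuration additionally keeps the token response from being forwarded beyond the instance itself.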
3. DNS Infrastructure (/cdk/dns/)
Manages both public and private DNS using AWS Route53, with a mirrored domain structure across environments.
DNS Architecture
Public DNS Hierarchy:
path2response.com (Root - P2R Account)
├── dev.path2response.com (Development Account)
├── stg.path2response.com (Staging Account)
├── rc.path2response.com (RC Account)
├── prd.path2response.com (Production Account)
└── common.path2response.com (P2R Account - shared services)
Private DNS Hierarchy:
path2response.internal (Each workload account)
├── dev.path2response.internal
├── stg.path2response.internal
├── rc.path2response.internal
└── prd.path2response.internal
Key Features
- Decentralized Management: Each AWS account manages its own subdomain
- Cross-Account Resolution: DNS resolver rules enable cross-account name resolution
- User-Friendly Production URLs: foo.path2response.com CNAMEs to foo.prd.path2response.com
- Mirrored Structure: Public and private DNS use consistent naming conventions
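The naming conventions above can be expressed as a few small functions. This is an illustrative sketch only: the function names are hypothetical and the actual CDK constructs in cdk/dns/ may organize this differently. The domain names come from the hierarchies shown above.

```typescript
type WorkloadEnv = 'dev' | 'stg' | 'rc' | 'prd';

const PUBLIC_ROOT = 'path2response.com';
const PRIVATE_ROOT = 'path2response.internal';

// Public zone delegated to a workload account, e.g. dev.path2response.com.
function publicZoneName(env: WorkloadEnv): string {
  return `${env}.${PUBLIC_ROOT}`;
}

// Private zone mirroring the public one, e.g. dev.path2response.internal.
function privateZoneName(env: WorkloadEnv): string {
  return `${env}.${PRIVATE_ROOT}`;
}

// User-friendly production alias: <label>.path2response.com is a CNAME
// pointing at <label>.prd.path2response.com.
function productionAlias(label: string): { recordName: string; cnameTarget: string } {
  return {
    recordName: `${label}.${PUBLIC_ROOT}`,
    cnameTarget: `${label}.${publicZoneName('prd')}`,
  };
}
```

For example, `productionAlias('foo')` yields the record name `foo.path2response.com` with target `foo.prd.path2response.com`, matching the example in the feature list.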
Directory Organization
| Directory | Purpose |
|---|---|
| public-zones/ | Public Route53 hosted zones per environment |
| private-zones/ | Private Route53 hosted zones per environment |
| resolver-rules/ | Inbound/outbound DNS resolver rules |
| shared-resources/ | Common constructs and utilities |
| docs/ | Architecture documentation |
4. Developer Instances Stack (/cdk/shared-services/dev-instances/)
Automated provisioning system for EC2 developer instances with persistent home directories and pre-configured environments.
Features
- Multiple Instance Types: 12 launch templates covering burstable, general-use, memory-optimized, and compute-optimized instances
- Architecture Support: Both X86 (AMD/Intel) and ARM (Graviton) instances
- Persistent Home Directories: EFS-mounted /home directories that survive instance termination
- Data Lake Access: Environment-specific EFS volumes mounted at /mnt/data
- Pre-configured Tools: Conda, Node.js, Python, and operations scripts
- Idle Detection: Automatic notification after 4 hours of idle time
- SSM Integration: AWS Systems Manager for patching and management
- Slack Notifications: Integration with Slack for instance events
Launch Template Types
| Template Family | Description |
|---|---|
| Burstable_X86_AMD | T3a instances for intermittent CPU needs |
| Burstable_X86_Intel | T3 instances for intermittent CPU needs |
| Burstable_Arm | T4g ARM instances for cost-efficient bursting |
| GeneralUse_X86_AMD | M7a balanced workloads |
| GeneralUse_X86_Intel | M7i balanced workloads |
| GeneralUse_Arm | M7g ARM balanced workloads |
| MemoryOptimized_X86_AMD | R7a for large in-memory datasets |
| MemoryOptimized_X86_Intel | R7i for large in-memory datasets |
| MemoryOptimized_Arm | R7g ARM for memory-intensive apps |
| ComputeOptimized_X86_AMD | C7a for compute-bound applications |
| ComputeOptimized_X86_Intel | C7i for compute-bound applications |
| ComputeOptimized_Arm | C6g ARM for batch processing |
Template Variants
Each template family has three variants:
- DataScience (dev02-style): General EFS data mount + EFS home
- Ingest (dev01-style): Ingest EFS data mount + EFS home
- Ingest-NoEfsHome: Ingest EFS data mount, no persistent home
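The 12 families are the product of four workload categories and three CPU architectures, and each family is then paired with the three variants. The sketch below enumerates that matrix; the type names and `planLaunchTemplates` are hypothetical, and the stack's real generation code in lib/launch-templates/ may be organized differently.

```typescript
const categories = ['Burstable', 'GeneralUse', 'MemoryOptimized', 'ComputeOptimized'] as const;
const architectures = ['X86_AMD', 'X86_Intel', 'Arm'] as const;

type Variant = 'DataScience' | 'Ingest' | 'Ingest-NoEfsHome';

interface VariantSpec {
  dataMount: 'General' | 'Ingest'; // which EFS data volume is mounted
  persistentHome: boolean;         // whether the home directory lives on EFS
}

// Variant behavior as described in the list above.
const variantSpecs: Record<Variant, VariantSpec> = {
  DataScience: { dataMount: 'General', persistentHome: true },
  Ingest: { dataMount: 'Ingest', persistentHome: true },
  'Ingest-NoEfsHome': { dataMount: 'Ingest', persistentHome: false },
};

interface TemplatePlan {
  family: string; // e.g. "Burstable_X86_AMD"
  variant: Variant;
  spec: VariantSpec;
}

// Enumerate every family/variant combination.
function planLaunchTemplates(): TemplatePlan[] {
  const plans: TemplatePlan[] = [];
  for (const category of categories) {
    for (const arch of architectures) {
      for (const variant of Object.keys(variantSpecs) as Variant[]) {
        plans.push({ family: `${category}_${arch}`, variant, spec: variantSpecs[variant] });
      }
    }
  }
  return plans;
}
```

Under these assumptions the matrix has 4 x 3 = 12 families and 12 x 3 = 36 family/variant combinations.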
Stack Resources
| Resource | Purpose |
|---|---|
| Security Groups | Control access from VPC CIDR blocks |
| IAM Role + Instance Profile | SSM, CloudWatch, S3, EFS permissions |
| S3 Config Bucket | Stores startup scripts and configuration |
| SNS Topic | Slack integration for notifications |
| SSM Parameters | Compressed startup scripts for each template |
| Launch Templates | Pre-configured EC2 launch specifications |
Userdata Script Pipeline
Startup scripts are processed through a Handlebars templating system and compressed with LZMA:
- Base userdata - Core instance setup (swap, EFS mounts, user creation)
- Transformers - Additional configuration layers:
  - withCondaInstaller - Miniforge/Conda setup
  - withNodeInstall - Node.js installation
  - withSophosGateway - Network routing through Sophos firewall
  - withOperationsRepo - Clone operations repository
  - withGeneralEfsDataMount / withIngestEfsDataMount - Data volume mounting
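The base-plus-transformers layering can be pictured as function composition over a script. The sketch below is a simplified model, not the repository's code: it treats the script as a list of lines, uses placeholder comment lines instead of real setup commands, and leaves out both the Handlebars rendering and the LZMA compression steps. The transformer names match those listed above, but their bodies here are invented for illustration.

```typescript
// A transformer takes the script so far and returns an extended copy.
type UserdataTransformer = (lines: readonly string[]) => string[];

// Helper that builds a transformer appending one placeholder section.
const appendSection = (title: string): UserdataTransformer =>
  (lines) => [...lines, `# --- ${title} ---`];

// Placeholder bodies; the real transformers add actual setup commands.
const withCondaInstaller = appendSection('Miniforge/Conda setup');
const withNodeInstall = appendSection('Node.js installation');
const withSophosGateway = appendSection('Route through Sophos firewall');
const withOperationsRepo = appendSection('Clone operations repository');
const withGeneralEfsDataMount = appendSection('General EFS data mount');
const withIngestEfsDataMount = appendSection('Ingest EFS data mount');

// Base userdata: core instance setup, again as placeholders.
const baseUserdata: readonly string[] = [
  '#!/bin/bash',
  '# swap setup',
  '# EFS mounts',
  '# user creation',
];

// Apply the transformers left to right and join into one script.
function buildUserdata(
  base: readonly string[],
  transformers: UserdataTransformer[],
): string {
  return transformers
    .reduce<readonly string[]>((acc, transform) => transform(acc), base)
    .join('\n');
}
```

A DataScience-style template might then be assembled as `buildUserdata(baseUserdata, [withCondaInstaller, withNodeInstall, withGeneralEfsDataMount])`, while an Ingest-style template would swap in `withIngestEfsDataMount`. Keeping each layer a pure function makes the order of layers explicit and easy to test.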
Deployment
cd cdk/shared-services/dev-instances
npm install
npm run build
npx cdk deploy --profile p2r_root
VPC Architecture
Development Environment
| VPC Name | CIDR | Purpose | Subnets |
|---|---|---|---|
| aws-controltower-VPC | 172.31.0.0/16 | Control Tower | 3 public, 3 private (us-east-1a/b/c) |
| Vpc4General | 10.129.0.0/16 | General workloads | 3 public, 3 private (us-east-1a/b/f) |
| Vpc4Emr | 172.19.0.0/16 | EMR clusters | 1 public, 1 private (us-east-1a) |
P2R (Root) Account
| VPC Name | CIDR | Purpose | Subnets |
|---|---|---|---|
| P2R- SAF Created VPC | 192.168.0.0/16 | General | 1 public, 1 private (us-east-1a) |
| Vpc4Emr | 192.19.0.0/16 | EMR clusters | 5 public, 5 private (us-east-1a/b/c/d/f) |
EFS Volumes
Data Volumes (by Environment)
| Environment | Purpose | Volume ID |
|---|---|---|
| dev | General | fs-03a174cc36df5c6fd |
| stg | General | fs-0c4d5253c243ba1bf |
| rc | General | fs-041218514437fec08 |
| prd | General | fs-00aa2dafd53a02602 |
| p2r | General | fs-0c4d5253c243ba1bf |
| p2r | Ingest | fs-b027b950 |
| p2r | IngestPrd | fs-ab36c148 |
Home Volumes (P2R Account)
| Architecture | Volume ID | Security Group |
|---|---|---|
| X86 | fs-06c9b837b86da16d9 | sg-03c95a564763eda59 |
| ARM | fs-0ec542736419440c8 | sg-03c95a564763eda59 |
Development Workflows
Deploying EMR Security Config
cd cdk/emr
npm install
npm run build
# Deploy to specific environment
cdk deploy --profile p2r_dev
cdk deploy --profile p2r_prd
# Preview changes
cdk diff --profile p2r_dev
Deploying Developer Instances Stack
cd cdk/shared-services/dev-instances
npm install
npm run build
# Synthesize CloudFormation template
npm run synth
# Preview changes
npm run diff
# Deploy
npm run deploy
Adding a New VPC
- Add VPC definition to /cdk/global/vpc.ts in the vpcs object
- Include all subnets with CIDR, availability zone, and scope (public/private)
- Add corresponding key pair to keypairs object if needed
- Update any dependent stacks
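Before adding a VPC definition, it can help to confirm that the new CIDR block does not overlap an existing one in the same account. The check below is a suggested aid and is not part of the repository as far as this document shows; the function names are hypothetical, and the existing CIDRs are those of the development environment table.

```typescript
// Convert "a.b.c.d/n" into an inclusive [start, end] range of 32-bit addresses.
function cidrRange(cidr: string): [number, number] {
  const [addr, prefixText] = cidr.split('/');
  const prefix = Number(prefixText);
  const octets = addr.split('.').map(Number);
  const validOctets =
    octets.length === 4 && octets.every((o) => Number.isInteger(o) && o >= 0 && o <= 255);
  if (!validOctets || !Number.isInteger(prefix) || prefix < 0 || prefix > 32) {
    throw new Error(`Invalid CIDR: ${cidr}`);
  }
  const ip = octets.reduce((acc, o) => acc * 256 + o, 0);
  const size = Math.pow(2, 32 - prefix);      // number of addresses in the block
  const start = Math.floor(ip / size) * size; // align to the block boundary
  return [start, start + size - 1];
}

// Two blocks overlap when their address ranges intersect.
function cidrsOverlap(a: string, b: string): boolean {
  const [aStart, aEnd] = cidrRange(a);
  const [bStart, bEnd] = cidrRange(b);
  return aStart <= bEnd && bStart <= aEnd;
}

// Existing development VPCs, taken from the table in this document.
const devVpcCidrs: Record<string, string> = {
  'aws-controltower-VPC': '172.31.0.0/16',
  Vpc4General: '10.129.0.0/16',
  Vpc4Emr: '172.19.0.0/16',
};

// Names of existing VPCs whose CIDR overlaps the candidate block.
function findCidrConflicts(candidate: string, existing: Record<string, string>): string[] {
  return Object.keys(existing).filter((vpcName) => cidrsOverlap(candidate, existing[vpcName]));
}
```

With this data, a candidate of `10.130.0.0/16` reports no conflicts, while `172.19.128.0/20` conflicts with `Vpc4Emr`.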
Adding a New EFS Volume
- Add volume configuration to /cdk/global/efs.ts
- For data volumes: add to efsDataVolumes object
- For home volumes: add to efsHomeVolumes with security group mappings
- Update launch templates if volume should be auto-mounted
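As a reference for what such an addition might look like, here is a sketch of the two volume maps and a lookup over them. The object shapes, the `Sketch` suffixed names, and the `lookupDataVolume` signature are assumptions for illustration; only the volume and security group IDs are taken from the EFS tables in this document.

```typescript
type EfsEnv = 'dev' | 'stg' | 'rc' | 'prd' | 'p2r';
type DataPurpose = 'General' | 'Ingest' | 'IngestPrd';
type HomeArch = 'X86' | 'ARM';

// Data volumes by environment and purpose (IDs from the Data Volumes table).
const efsDataVolumesSketch: Record<EfsEnv, Partial<Record<DataPurpose, string>>> = {
  dev: { General: 'fs-03a174cc36df5c6fd' },
  stg: { General: 'fs-0c4d5253c243ba1bf' },
  rc: { General: 'fs-041218514437fec08' },
  prd: { General: 'fs-00aa2dafd53a02602' },
  p2r: {
    General: 'fs-0c4d5253c243ba1bf',
    Ingest: 'fs-b027b950',
    IngestPrd: 'fs-ab36c148',
  },
};

// Home volumes by architecture (IDs from the Home Volumes table).
const efsHomeVolumesSketch: Record<HomeArch, { volumeId: string; securityGroupId: string }> = {
  X86: { volumeId: 'fs-06c9b837b86da16d9', securityGroupId: 'sg-03c95a564763eda59' },
  ARM: { volumeId: 'fs-0ec542736419440c8', securityGroupId: 'sg-03c95a564763eda59' },
};

// Hypothetical lookup: throws when the environment has no volume for the purpose.
function lookupDataVolume(env: EfsEnv, purpose: DataPurpose): string {
  const volumeId = efsDataVolumesSketch[env][purpose];
  if (!volumeId) {
    throw new Error(`No ${purpose} data volume defined for ${env}`);
  }
  return volumeId;
}
```

Adding a new data volume in this model is a one-line change to the map, and any launch template that calls the lookup picks it up without further edits.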
Key Integrations
| System | Integration |
|---|---|
| AWS Control Tower | Multi-account organization structure |
| AWS SSM | Instance patching and parameter storage |
| AWS Quick Setup | Patch policy management |
| Slack | Instance notifications (via Chatbot + SNS) |
| Sophos Firewall | Network gateway routing |
| Operations Repo | Scripts deployed to instances |
Related Documentation
- BERT Overview - Application platform using this infrastructure
- Project Inventory - Complete P2R codebase inventory
- Tools and Systems - Overview of all P2R tools
Important Notes
- Multi-Account Deployment - Use appropriate AWS CLI profile for each environment
- IMDSv2 Required - All instances enforce IMDSv2 for security
- EFS Persistence - Home directories survive instance termination
- Network Routing - The P2R account uses a Sophos gateway for NAT in the public subnet
- Idle Detection - Instances notify after 4 hours idle (development environment only)
- Ubuntu 22.04 - All launch templates use Ubuntu Jammy (2023-12-07 AMI)
- Operations Repo - Cloned and zipped during CDK synthesis for deployment
Source: infrastructure repository (README.md, cdk/global/, cdk/emr/, cdk/dns/, cdk/shared-services/dev-instances/)
Documentation created: 2026-01-24