Response Analysis (ARA) Overview
Automated batch processing system that measures direct mail campaign effectiveness by correlating promotional mailings with customer transaction data from the data cooperative.
LEGACY NOTICE: This system is a candidate for BERT migration. The current implementation uses Deno/TypeScript for orchestration with a Python processing engine (auto_ra). Future consolidation into the BERT platform is anticipated.
Purpose
Response Analysis (internally called “ARA” or simply “RA”) is Path2Response’s automated system for measuring the effectiveness of direct mail marketing campaigns. It answers the fundamental question: “Did the people we mailed to make purchases?”
The system:
- Processes thousands of promotions daily
- Correlates mailing records with post-mail transaction data
- Calculates response rates and campaign performance metrics
- Provides ongoing analysis until measurement windows close
- Generates reports consumed by the Dashboards application
Architecture
Directory Structure
cdk-backend/projects/response-analysis/
├── bin/                              # CLI entry points
│   └── ra                            # Main executable
├── docs/                             # mdbook documentation
│   └── src/
│       ├── architecture.md
│       ├── data-sources.md
│       ├── glossary.md
│       └── usage.md
├── lib/response-analysis/            # Source code
│   ├── response-analysis.main.ts     # CLI entry point and orchestration
│   ├── run-response-analysis.ts      # Core processing runner
│   ├── assemble-campaigns.ts         # Campaign/promotion assembly logic
│   ├── title-transactions.ts         # Transaction data handling
│   ├── manifest-writer.ts            # Output metadata generation
│   ├── memo.ts                       # Households memo integration
│   ├── utils.ts                      # Utility functions
│   ├── audit-query.main.ts           # Audit Query (aq) CLI tool
│   └── types/
│       └── account.ts                # TypeScript type definitions
├── README.md                         # Primary documentation
├── compile.sh                        # Linting/formatting script
└── update.sh                         # Dependency update script
Technology Stack
| Component | Technology | Purpose |
|---|---|---|
| Orchestration | Deno / TypeScript | Job scheduling, data coordination, CLI |
| Analysis Engine | Python (auto_ra) | Statistical processing, response calculation |
| Data Processing | Polars (Rust) | High-performance data manipulation |
| Large-scale ETL | AWS EMR | Transaction data extraction |
| Data Storage | AWS S3 | Households data, transaction cache |
| Results Storage | AWS EFS | Persistent output at /mnt/data/prod/ |
| Notifications | Slack | Error alerts and job status |
System Components
┌─────────────────────────────────────────────────────────┐
│                      Data Sources                       │
│                                                         │
│  ┌─────────────┐  ┌─────────────┐  ┌─────────────┐      │
│  │ Dashboards  │  │ Households  │  │ Transaction │      │
│  │ Audit Data  │  │    Data     │  │    Data     │      │
│  │    (S3)     │  │    (S3)     │  │  (S3/EMR)   │      │
│  └──────┬──────┘  └──────┬──────┘  └──────┬──────┘      │
└─────────┼────────────────┼────────────────┼─────────────┘
          │                │                │
          └────────────────┼────────────────┘
                           │
┌──────────────────────────▼──────────────────────────────┐
│            Response Analysis Controller (ra)            │
│                                                         │
│  • Orchestrates workflow     • Manages job queue        │
│  • Coordinates data sources  • Handles errors           │
│  • Generates reports         • Slack notifications      │
└──────────────────────────┬──────────────────────────────┘
                           │
┌──────────────────────────▼──────────────────────────────┐
│               auto_ra Processing (Python)               │
│                                                         │
│  • Statistical analysis      • Response calculation     │
│  • Polars data processing    • Result generation        │
│  • Serial execution (~3s/promotion)                     │
└──────────────────────────┬──────────────────────────────┘
                           │
┌──────────────────────────▼──────────────────────────────┐
│                  Results Storage (EFS)                  │
│                                                         │
│  /mnt/data/prod/<title>/ra/                             │
│  ├── campaigns/<campaign_id>/                           │
│  │   ├── .account.json.gz    (metadata)                 │
│  │   ├── .log.txt.gz         (execution log)            │
│  │   └── .config-*.json      (job config)               │
│  └── promotions/<order_key>/                            │
└─────────────────────────────────────────────────────────┘
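The storage layout in the diagram can be expressed as a small path builder. This is a minimal sketch, not code from the ra source: the function name raResultDir, its parameters, and the example values are hypothetical, and only the directory shape (/mnt/data/prod/<title>/ra/campaigns|promotions/<id>) comes from the diagram.

```typescript
// Hypothetical helper: builds a result directory following the diagrammed layout.
// Names here are illustrative and are not taken from the ra source code.
type JobKind = "campaigns" | "promotions";

function raResultDir(
  titleKey: string,
  kind: JobKind,
  id: string, // campaign id or order key
  prodRoot: string = "/mnt/data/prod",
): string {
  // Drop trailing slashes so the joined path has no empty segments.
  const root = prodRoot.replace(/\/+$/, "");
  return `${root}/${titleKey}/ra/${kind}/${id}`;
}

console.log(raResultDir("burrow", "promotions", "50984"));
// /mnt/data/prod/burrow/ra/promotions/50984
```

Keeping the root as a parameter makes it easy to point the same logic at a scratch directory during testing.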
Core Functionality
Processing Workflow
- Data Discovery
  - Locate current households file on S3
  - Download households metadata (title keys, transaction availability dates)
  - Fast-scan Dashboards audit for account files
- Campaign Assembly
  - Parse Account documents to find campaigns and promotions
  - Filter by date ranges (mailed within last year, data available after mail date)
  - Group promotions into campaigns or mark as standalone
  - Determine which jobs need to run (incomplete, not yet finalized)
- Transaction Preparation
  - Check S3 cache for title transaction data
  - If missing, trigger EMR job (title-transaction-counts), which adds 20+ minutes
  - Download and decompress transaction files
- Response Analysis Execution
  - For each runnable campaign/promotion:
    - Generate configuration JSON (pythonic snake_case naming)
    - Call auto_ra Python script with configuration
    - Capture stdout/stderr and job status
  - For campaigns with multiple promotions:
    - Process each promotion individually
    - Run combination step to aggregate results
- Results Storage
  - Write analysis results to EFS (/mnt/data/prod/<title>/ra/)
  - Generate metadata files (.account.json.gz, .log.txt.gz)
  - Prune old versions (retain 31 days)
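The retention rule in the Results Storage step could look roughly like the following. It is a sketch under stated assumptions: how ra identifies and timestamps a "version" is not described here, so the StoredVersion shape and the versionsToPrune function are hypothetical. Only the 31-day window comes from the workflow.

```typescript
// Assumption: each stored version carries a creation timestamp.
interface StoredVersion {
  path: string;
  createdAt: Date;
}

const RETENTION_DAYS = 31; // retention window stated in the workflow
const MILLIS_PER_DAY = 24 * 60 * 60 * 1000;

// Returns the versions that fall outside the retention window.
// A version created exactly 31 days ago is still kept.
function versionsToPrune(versions: StoredVersion[], now: Date): StoredVersion[] {
  const cutoff = now.getTime() - RETENTION_DAYS * MILLIS_PER_DAY;
  return versions.filter((v) => v.createdAt.getTime() < cutoff);
}

const now = new Date("2026-02-01T00:00:00Z");
const stale = versionsToPrune(
  [
    { path: "v-old", createdAt: new Date("2025-12-31T00:00:00Z") },
    { path: "v-recent", createdAt: new Date("2026-01-31T00:00:00Z") },
  ],
  now,
);
console.log(stale.map((v) => v.path)); // [ "v-old" ]
```

Separating the selection from the actual deletion keeps the rule easy to check with a dry run before anything is removed.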
Processing Logic
| Condition | Action |
|---|---|
| Mail date within the last year | Eligible for analysis |
| Transaction data available past the mail date | Can begin analysis |
| Transaction data available past the measurement date | Analysis considered final |
| Up to 30 days past the measurement date | Continue running to capture late transactions |
| Previous run successful + past cutoff | Skip (complete) |
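The rules in this table can be read as a single decision function. The sketch below is one possible interpretation, not the actual ra logic: the type and field names are hypothetical, "1 year" is approximated as 365 days, and the cutoff is assumed to be the measurement date plus 30 days compared against the current time.

```typescript
type RaDecision = "not-eligible" | "wait-for-data" | "run" | "run-final" | "skip-complete";

// Hypothetical view of one promotion's state; field names are illustrative.
interface PromotionState {
  mailDate: Date;
  measurementDate: Date;
  transactionsThrough: Date; // latest date covered by available transaction data
  previousRunSucceeded: boolean;
}

const DAY_MS = 24 * 60 * 60 * 1000;

function decide(p: PromotionState, now: Date): RaDecision {
  const mailed = p.mailDate.getTime();
  const measured = p.measurementDate.getTime();
  const dataThrough = p.transactionsThrough.getTime();
  const cutoff = measured + 30 * DAY_MS; // assumed late-transaction window

  if (now.getTime() - mailed >= 365 * DAY_MS) return "not-eligible";
  if (dataThrough <= mailed) return "wait-for-data";
  if (p.previousRunSucceeded && now.getTime() > cutoff) return "skip-complete";
  // Results are final once data extends past the measurement date,
  // but the job keeps running until the cutoff to pick up late transactions.
  return dataThrough > measured ? "run-final" : "run";
}

console.log(
  decide(
    {
      mailDate: new Date("2025-12-01T00:00:00Z"),
      measurementDate: new Date("2026-03-01T00:00:00Z"),
      transactionsThrough: new Date("2026-01-10T00:00:00Z"),
      previousRunSucceeded: false,
    },
    new Date("2026-01-24T00:00:00Z"),
  ),
); // run
```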
Key Terms
| Term | Definition |
|---|---|
| Opportunity | Dashboards/Salesforce term for what the business calls a “promotion” |
| Promotion | A specific mailing activity; basic unit of response analysis |
| Campaign | Group of related promotions analyzed together |
| Mail Date | When promotional materials were sent |
| Measurement Date | End of response measurement window |
| Title | A catalog or mailing program (e.g., “McGuckin Hardware”) |
CLI Usage
# Standard run
ra
# Preview without executing
ra --dry-run
# Filter by title
ra --title burrow --title cosabella
# Filter by order
ra --order 50984
# Rerun completed analyses
ra --assume-incomplete
# Verbose logging
ra -v
# Custom slack channel
ra --slack-channel @jsmith
# Disable slack
ra --no-slacking
Performance Characteristics
| Metric | Value |
|---|---|
| Startup overhead | ~30 seconds |
| Per-promotion processing | ~3 seconds (varies) |
| EMR data creation | 20+ minutes |
| Daily capacity | Thousands of promotions |
| Execution model | Serial (auto_ra is resource-intensive) |
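Because execution is serial, these figures combine into a simple back-of-envelope estimate of wall-clock time. The sketch below is only that: the function name is hypothetical, the per-promotion time varies in practice, and it assumes each uncached title adds one 20-minute EMR wait with no overlap.

```typescript
// Approximate figures from the table above.
const STARTUP_SECONDS = 30;
const SECONDS_PER_PROMOTION = 3;
const EMR_SECONDS = 20 * 60; // lower bound; the table says 20+ minutes

// Hypothetical estimator: serial processing, EMR waits assumed not to overlap.
function estimateRunSeconds(promotions: number, titlesNeedingEmr: number = 0): number {
  return STARTUP_SECONDS + promotions * SECONDS_PER_PROMOTION + titlesNeedingEmr * EMR_SECONDS;
}

console.log(estimateRunSeconds(2000) / 60); // 100.5 (minutes)
console.log(estimateRunSeconds(2000, 1) / 60); // 120.5 (minutes)
```

At roughly 3 seconds each, 2,000 promotions take about 100 minutes, so a single uncached title adds around 20% to the run.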
Integrations
Upstream Systems
| System | Integration |
|---|---|
| Dashboards | Consumes audit data for campaign/promotion metadata |
| Households File | Source of transaction data and customer records |
| Salesforce | Configuration parameters (suppression periods, etc.) |
Downstream Systems
| System | Integration |
|---|---|
| Dashboards | Consumes RA results for reporting |
| Client Reports | Campaign performance metrics |
AWS Services
| Service | Purpose |
|---|---|
| S3 | Households data, transaction cache, audit files |
| EMR | Large-scale transaction data extraction |
| EFS | Results storage (/mnt/data/prod/) |
| CloudWatch | Logging and monitoring |
Data Model
Account Document Structure
Account {
  AccountId: string          // "001Du000003bxG7IAI"
  Name: string               // "Ada Health"
  TitleKey: string           // "adahealth"
  Vertical: "Catalog" | "Nonprofit" | "DTC" | ...
  Campaigns: {
    [campaignId]: Campaign {
      Name: string
      StartDate, EndDate: string
      Opportunities: Opportunity[]
      RaJob: { Status, Config, Log, Error }
    }
  }
  Opportunities: {
    [opportunityKey]: Opportunity {
      Name: string
      MailDate, MeasurementDate: string
      OrderKey: string
      Order: { Fulfillments, Shipments }
      RaJob: { Status, Config, Log, Error }
    }
  }
  RaJob: { Complete: boolean, Success: boolean }
}
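To illustrate how this structure supports grouping promotions into campaigns or marking them standalone, here is a minimal sketch over a trimmed-down version of the document. The real logic lives in assemble-campaigns.ts and may differ. Matching on OrderKey, the assembleJobs name, and the AssembledJobs shape are all assumptions made for illustration.

```typescript
// Trimmed-down shapes based on the Account document structure above.
interface Opportunity {
  Name: string;
  MailDate: string;
  MeasurementDate: string;
  OrderKey: string;
}

interface Campaign {
  Name: string;
  Opportunities: Opportunity[];
}

interface Account {
  TitleKey: string;
  Campaigns: { [campaignId: string]: Campaign };
  Opportunities: { [opportunityKey: string]: Opportunity };
}

interface AssembledJobs {
  campaigns: { campaignId: string; orderKeys: string[] }[];
  standalone: string[]; // order keys not claimed by any campaign
}

// Hypothetical assembly step: promotions listed under a campaign are grouped,
// everything else is treated as standalone. OrderKey matching is an assumption.
function assembleJobs(account: Account): AssembledJobs {
  const claimed: { [orderKey: string]: boolean } = {};
  const campaigns = Object.keys(account.Campaigns).map((campaignId) => {
    const orderKeys = account.Campaigns[campaignId].Opportunities.map((o) => o.OrderKey);
    orderKeys.forEach((key) => {
      claimed[key] = true;
    });
    return { campaignId, orderKeys };
  });
  const standalone = Object.keys(account.Opportunities)
    .map((key) => account.Opportunities[key].OrderKey)
    .filter((orderKey) => !claimed[orderKey]);
  return { campaigns, standalone };
}
```

Given an account with one campaign containing order 100 and a second opportunity with order 200, this returns one campaign job for order 100 and a standalone list containing 200.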
Output Files
| File | Purpose |
|---|---|
| .account.json.gz | Account document with RaJob metadata appended |
| .log.txt.gz | Complete execution log |
| .config-promotion_*.json | Promotion configuration (snake_case) |
| .config-campaign_*.json | Campaign configuration (snake_case) |
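The configuration files use snake_case because they are consumed by the Python auto_ra script, while the Account document uses PascalCase. One way to bridge the two is a recursive key conversion, sketched below. The helper names are hypothetical and the example fields do not reflect the real auto_ra configuration schema, which is not documented here.

```typescript
type Json = string | number | boolean | null | Json[] | { [key: string]: Json };

// Converts "MailDate" or "mailDate" to "mail_date".
function toSnakeCase(key: string): string {
  return key.replace(/([a-z0-9])([A-Z])/g, "$1_$2").toLowerCase();
}

// Recursively rewrites object keys; values and array order are left untouched.
function snakeCaseKeys(value: Json): Json {
  if (Array.isArray(value)) {
    return value.map((item) => snakeCaseKeys(item));
  }
  if (value !== null && typeof value === "object") {
    const out: { [key: string]: Json } = {};
    for (const key of Object.keys(value)) {
      out[toSnakeCase(key)] = snakeCaseKeys(value[key]);
    }
    return out;
  }
  return value;
}

// Illustrative input only; not the real config schema.
const config = snakeCaseKeys({ MailDate: "2025-01-01", Order: { OrderKey: "50984" } });
console.log(JSON.stringify(config));
// {"mail_date":"2025-01-01","order":{"order_key":"50984"}}
```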
Development
Prerequisites
- auto_ra Python script installed
  - From ds-modeling package
  - Symlinked to ~/.local/bin/auto_ra
- title-transaction-counts EMR utility
  - Included in preselect-emr installer
- AWS CLI v2 (not Python venv version 1)
- Access permissions
  - S3: Read/write to households data
  - EFS: Read/write to /mnt/data/prod/*
  - Dashboards audit: Read access
Installation
# Standard installation
sudo cp ./bin/* /usr/local/bin/
sudo rsync -avL --delete ./lib/ra/ /usr/local/lib/ra/
# User installation
cp ./bin/* ~/.local/bin/
rsync -avL --delete ./lib/ra/ ~/.local/lib/ra/
# Verify
ra --help
Maintenance
# Format, lint, and check
./compile.sh
# Update dependencies
./update.sh
Environment Variables
| Variable | Purpose | Default |
|---|---|---|
| RA_ALT_SRC_DIR | Alternate production directory | /mnt/data/prod/ |
| RA_ALT_TARGET_DIR_NAME | Alternate RA output folder | ra |
| P2R_SLACK_URL | Slack webhook URL | (none) |
| NO_COLOR | Disable colored output | 0 |
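The two directory variables could be resolved against their defaults along the following lines. This is a sketch, not the ra implementation: resolveRaDirs and titleOutputDir are hypothetical names, treating an empty value as unset is an assumption, and the way the two values combine into a per-title path is inferred from the default layout /mnt/data/prod/<title>/ra/.

```typescript
// Environment passed in explicitly so the logic stays runtime-agnostic.
type Env = { [name: string]: string | undefined };

interface RaDirs {
  srcDir: string;
  targetDirName: string;
}

// Falls back to the documented defaults when a variable is unset or empty.
function resolveRaDirs(env: Env): RaDirs {
  return {
    srcDir: env["RA_ALT_SRC_DIR"] || "/mnt/data/prod/",
    targetDirName: env["RA_ALT_TARGET_DIR_NAME"] || "ra",
  };
}

// Assumed combination rule: <srcDir>/<title>/<targetDirName>/
function titleOutputDir(dirs: RaDirs, titleKey: string): string {
  const root = dirs.srcDir.replace(/\/+$/, "");
  return `${root}/${titleKey}/${dirs.targetDirName}/`;
}

console.log(titleOutputDir(resolveRaDirs({}), "burrow"));
// /mnt/data/prod/burrow/ra/
```

Under Deno the environment object can be obtained with Deno.env.toObject(). Overriding both variables points a test run at a scratch location without touching production output.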
Related Documentation
- Source repository: cdk-backend/projects/response-analysis/
- Full mdbook docs: projects/response-analysis/docs/
- auto_ra (Python): ds-modeling repository
- title-transaction-counts: preselect-emr installer
- Dashboards: Consumes RA results for client reporting
Audit Query Tool (aq)
The Response Analysis project also includes a utility for querying Dashboards audit data:
# Get suppression import-ids for PDDM/Swift hotline runs
aq pddm-suppress birkenstock
Returns the import-ids of shipments that need suppression, based on the suppression periods configured in Salesforce.
Source: cdk-backend/projects/response-analysis (README.md, docs/src/*.md, lib/response-analysis/*.ts)
Documentation created: 2026-01-24