Response Analysis (ARA) Overview

Automated batch processing system that measures direct mail campaign effectiveness by correlating promotional mailings with customer transaction data from the data cooperative.


LEGACY NOTICE: This system is a candidate for BERT migration. The current implementation uses Deno/TypeScript for orchestration with a Python processing engine (auto_ra). Future consolidation into the BERT platform is anticipated.


Purpose

Response Analysis (internally called “ARA” or simply “RA”) is Path2Response’s automated system for measuring the effectiveness of direct mail marketing campaigns. It answers the fundamental question: “Did the people we mailed to make purchases?”

The system:

  • Processes thousands of promotions daily
  • Correlates mailing records with post-mail transaction data
  • Calculates response rates and campaign performance metrics
  • Provides ongoing analysis until measurement windows close
  • Generates reports consumed by the Dashboards application
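
The core metric is straightforward: a response rate is the number of responding households divided by the number of pieces mailed. The sketch below is illustrative only; the actual statistics are produced by the auto_ra Python engine, and the field names here are assumptions.

// Illustrative only: the real statistics come from auto_ra (Python).
// Field names are assumptions, not the project's actual types.
interface PromotionCounts {
  orderKey: string;
  mailedHouseholds: number;     // pieces mailed
  respondingHouseholds: number; // households with matched post-mail-date transactions
  revenue: number;              // total matched revenue
}

function responseMetrics(p: PromotionCounts) {
  return {
    responseRate: p.respondingHouseholds / p.mailedHouseholds,
    revenuePerPiece: p.revenue / p.mailedHouseholds,
  };
}

// Example: 250 buyers out of 50,000 pieces mailed => 0.5% response rate
console.log(responseMetrics({
  orderKey: "50984",
  mailedHouseholds: 50_000,
  respondingHouseholds: 250,
  revenue: 31_250,
}));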

Architecture

Directory Structure

cdk-backend/projects/response-analysis/
├── bin/                          # CLI entry points
│   └── ra                        # Main executable
├── docs/                         # mdbook documentation
│   └── src/
│       ├── architecture.md
│       ├── data-sources.md
│       ├── glossary.md
│       └── usage.md
├── lib/response-analysis/        # Source code
│   ├── response-analysis.main.ts # CLI entry point and orchestration
│   ├── run-response-analysis.ts  # Core processing runner
│   ├── assemble-campaigns.ts     # Campaign/promotion assembly logic
│   ├── title-transactions.ts     # Transaction data handling
│   ├── manifest-writer.ts        # Output metadata generation
│   ├── memo.ts                   # Households memo integration
│   ├── utils.ts                  # Utility functions
│   ├── audit-query.main.ts       # Audit Query (aq) CLI tool
│   └── types/
│       └── account.ts            # TypeScript type definitions
├── README.md                     # Primary documentation
├── compile.sh                    # Linting/formatting script
└── update.sh                     # Dependency update script

Technology Stack

| Component | Technology | Purpose |
| --- | --- | --- |
| Orchestration | Deno / TypeScript | Job scheduling, data coordination, CLI |
| Analysis Engine | Python (auto_ra) | Statistical processing, response calculation |
| Data Processing | Polars (Rust) | High-performance data manipulation |
| Large-scale ETL | AWS EMR | Transaction data extraction |
| Data Storage | AWS S3 | Households data, transaction cache |
| Results Storage | AWS EFS | Persistent output at /mnt/data/prod/ |
| Notifications | Slack | Error alerts and job status |

System Components

                    ┌─────────────────────────────────────────────────────────┐
                    │                    Data Sources                         │
                    │                                                         │
                    │  ┌─────────────┐  ┌─────────────┐  ┌─────────────┐     │
                    │  │ Dashboards  │  │ Households  │  │ Transaction │     │
                    │  │ Audit Data  │  │    Data     │  │    Data     │     │
                    │  │    (S3)     │  │    (S3)     │  │  (S3/EMR)   │     │
                    │  └──────┬──────┘  └──────┬──────┘  └──────┬──────┘     │
                    └─────────┼────────────────┼────────────────┼────────────┘
                              │                │                │
                              └────────────────┼────────────────┘
                                               │
                    ┌──────────────────────────▼──────────────────────────────┐
                    │           Response Analysis Controller (ra)             │
                    │                                                         │
                    │  • Orchestrates workflow        • Manages job queue     │
                    │  • Coordinates data sources     • Handles errors        │
                    │  • Generates reports            • Slack notifications   │
                    └──────────────────────────┬──────────────────────────────┘
                                               │
                    ┌──────────────────────────▼──────────────────────────────┐
                    │              auto_ra Processing (Python)                │
                    │                                                         │
                    │  • Statistical analysis         • Response calculation  │
                    │  • Polars data processing       • Result generation     │
                    │  • Serial execution (~3s/promotion)                     │
                    └──────────────────────────┬──────────────────────────────┘
                                               │
                    ┌──────────────────────────▼──────────────────────────────┐
                    │                 Results Storage (EFS)                   │
                    │                                                         │
                    │  /mnt/data/prod/<title>/ra/                             │
                    │    ├── campaigns/<campaign_id>/                         │
                    │    │     ├── .account.json.gz    (metadata)             │
                    │    │     ├── .log.txt.gz         (execution log)        │
                    │    │     └── .config-*.json      (job config)           │
                    │    └── promotions/<order_key>/                          │
                    └─────────────────────────────────────────────────────────┘

Core Functionality

Processing Workflow

  1. Data Discovery

    • Locate current households file on S3
    • Download households metadata (title keys, transaction availability dates)
    • Fast-scan Dashboards audit for account files
  2. Campaign Assembly

    • Parse Account documents to find campaigns and promotions
    • Filter by date ranges (mailed within last year, data available after mail date)
    • Group promotions into campaigns or mark as standalone
    • Determine which jobs need to run (incomplete, not yet finalized)
  3. Transaction Preparation

    • Check S3 cache for title transaction data
    • If missing, trigger the EMR job (title-transaction-counts), which adds 20+ minutes
    • Download and decompress transaction files
  4. Response Analysis Execution

    • For each runnable campaign/promotion:
      • Generate configuration JSON (pythonic snake_case naming)
      • Call auto_ra Python script with configuration
      • Capture stdout/stderr and job status
    • For campaigns with multiple promotions:
      • Process each promotion individually
      • Run combination step to aggregate results
  5. Results Storage

    • Write analysis results to EFS (/mnt/data/prod/<title>/ra/)
    • Generate metadata files (.account.json.gz, .log.txt.gz)
    • Prune old versions (retain 31 days)
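
Taken together, the workflow is a serial loop over the jobs assembled in step 2. A minimal sketch in Deno/TypeScript, assuming a hypothetical job type and a --config flag for auto_ra (the actual invocation in run-response-analysis.ts may differ):

// Hypothetical sketch of the serial processing loop; type and flag names
// are assumptions, not the actual run-response-analysis.ts API.
interface RaJobSpec {
  title: string;
  orderKey: string;
  configPath: string; // snake_case JSON consumed by auto_ra
}

async function runAll(jobs: RaJobSpec[]): Promise<void> {
  for (const job of jobs) {
    // auto_ra is resource-intensive, so jobs run one at a time (serial).
    const cmd = new Deno.Command("auto_ra", {
      args: ["--config", job.configPath],
      stdout: "piped",
      stderr: "piped",
    });
    const { code, stdout, stderr } = await cmd.output();
    const log = new TextDecoder().decode(stdout) +
      new TextDecoder().decode(stderr);
    if (code !== 0) console.error(log); // surfaced via Slack in the real system
    console.log(`${job.title}/${job.orderKey}: exit ${code}`);
  }
}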

Processing Logic

| Condition | Action |
| --- | --- |
| Mail date < 1 year ago | Eligible for analysis |
| Transaction data > mail date | Can begin analysis |
| Transaction data > measurement date | Analysis considered final |
| Post-measurement + 30 days | Continue for late transactions |
| Previous run successful + past cutoff | Skip (complete) |
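
The conditions above reduce to a handful of date comparisons. A minimal sketch, with all field names assumed for illustration:

// Minimal sketch of the eligibility rules in the table above.
// Field names are assumptions, not the project's actual types.
interface JobState {
  mailDate: Date;
  measurementDate: Date;
  latestTransactionDate: Date; // newest transaction data available
  previousRunSucceeded: boolean;
}

const DAY_MS = 24 * 60 * 60 * 1000;

function shouldRun(s: JobState, now = new Date()): boolean {
  const withinYear = now.getTime() - s.mailDate.getTime() < 365 * DAY_MS;
  const dataAvailable = s.latestTransactionDate > s.mailDate;
  const cutoff = new Date(s.measurementDate.getTime() + 30 * DAY_MS);
  const pastCutoff = s.latestTransactionDate > cutoff;

  if (!withinYear || !dataAvailable) return false;        // too old, or no usable data yet
  if (s.previousRunSucceeded && pastCutoff) return false; // complete: skip
  return true; // run, or re-run to pick up late transactions
}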

Key Terms

| Term | Definition |
| --- | --- |
| Opportunity | Dashboards/Salesforce term for what the business calls a “promotion” |
| Promotion | A specific mailing activity; basic unit of response analysis |
| Campaign | Group of related promotions analyzed together |
| Mail Date | When promotional materials were sent |
| Measurement Date | End of response measurement window |
| Title | A catalog or mailing program (e.g., “McGuckin Hardware”) |

CLI Usage

# Standard run
ra

# Preview without executing
ra --dry-run

# Filter by title
ra --title burrow --title cosabella

# Filter by order
ra --order 50984

# Rerun completed analyses
ra --assume-incomplete

# Verbose logging
ra -v

# Custom slack channel
ra --slack-channel @jsmith

# Disable slack
ra --no-slacking

Performance Characteristics

| Metric | Value |
| --- | --- |
| Startup overhead | ~30 seconds |
| Per-promotion processing | ~3 seconds (varies) |
| EMR data creation | 20+ minutes |
| Daily capacity | Thousands of promotions |
| Execution model | Serial (auto_ra is resource-intensive) |
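
As a back-of-the-envelope estimate, assuming the ~3 s/promotion figure holds for a serial run:

// Rough runtime estimate for a serial run (numbers from the table above).
const startupSeconds = 30;
const perPromotionSeconds = 3;
const promotions = 5_000;

const totalHours = (startupSeconds + promotions * perPromotionSeconds) / 3600;
console.log(`~${totalHours.toFixed(1)} hours`); // ~4.2 hours for 5,000 promotions
// Plus 20+ minutes for each title whose transaction cache must first be built on EMR.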

Integrations

Upstream Systems

| System | Integration |
| --- | --- |
| Dashboards | Provides audit data used for campaign/promotion metadata |
| Households File | Source of transaction data and customer records |
| Salesforce | Configuration parameters (suppression periods, etc.) |

Downstream Systems

| System | Integration |
| --- | --- |
| Dashboards | Consumes RA results for reporting |
| Client Reports | Campaign performance metrics |

AWS Services

| Service | Purpose |
| --- | --- |
| S3 | Households data, transaction cache, audit files |
| EMR | Large-scale transaction data extraction |
| EFS | Results storage (/mnt/data/prod/) |
| CloudWatch | Logging and monitoring |

Data Model

Account Document Structure

Account {
  AccountId: string           // "001Du000003bxG7IAI"
  Name: string               // "Ada Health"
  TitleKey: string           // "adahealth"
  Vertical: "Catalog" | "Nonprofit" | "DTC" | ...
  Campaigns: {
    [campaignId]: Campaign {
      Name: string
      StartDate, EndDate: string
      Opportunities: Opportunity[]
      RaJob: { Status, Config, Log, Error }
    }
  }
  Opportunities: {
    [opportunityKey]: Opportunity {
      Name: string
      MailDate, MeasurementDate: string
      OrderKey: string
      Order: { Fulfillments, Shipments }
      RaJob: { Status, Config, Log, Error }
    }
  }
  RaJob: { Complete: boolean, Success: boolean }
}
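
For illustration, a gzipped Account document such as the .account.json.gz output can be read in Deno with the standard DecompressionStream; the path below is an example, not a real campaign directory.

// Illustrative reader for a gzipped Account document (path is an example).
async function readAccount(path: string): Promise<unknown> {
  const file = await Deno.open(path, { read: true });
  const decompressed = file.readable.pipeThrough(
    new DecompressionStream("gzip"),
  );
  return JSON.parse(await new Response(decompressed).text());
}

const account = await readAccount(
  "/mnt/data/prod/adahealth/ra/campaigns/example/.account.json.gz",
);
console.log((account as { TitleKey?: string }).TitleKey); // "adahealth"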

Output Files

| File | Purpose |
| --- | --- |
| .account.json.gz | Account document with RaJob metadata appended |
| .log.txt.gz | Complete execution log |
| .config-promotion_*.json | Promotion configuration (snake_case) |
| .config-campaign_*.json | Campaign configuration (snake_case) |
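
The .config-*.json files use pythonic snake_case keys, while the Account data above is PascalCase. A hedged sketch of that key conversion (the actual field mapping in the project may differ):

// Sketch of PascalCase -> snake_case key conversion for the .config-*.json
// files consumed by auto_ra. The example fields are illustrative.
function toSnakeCase(key: string): string {
  return key.replace(/([a-z0-9])([A-Z])/g, "$1_$2").toLowerCase();
}

function snakeCaseKeys(obj: Record<string, unknown>): Record<string, unknown> {
  return Object.fromEntries(
    Object.entries(obj).map(([k, v]) => [toSnakeCase(k), v]),
  );
}

console.log(snakeCaseKeys({ MailDate: "2025-03-01", OrderKey: "50984" }));
// { mail_date: "2025-03-01", order_key: "50984" }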

Development

Prerequisites

  1. auto_ra Python script installed

    • From ds-modeling package
    • Symlinked to ~/.local/bin/auto_ra
  2. title-transaction-counts EMR utility

    • Included in preselect-emr installer
  3. AWS CLI v2 (not Python venv version 1)

  4. Access permissions

    • S3: Read/write to households data
    • EFS: Read/write to /mnt/data/prod/*
    • Dashboards audit: Read access

Installation

# Standard installation
sudo cp ./bin/* /usr/local/bin/
sudo rsync -avL --delete ./lib/ra/ /usr/local/lib/ra/

# User installation
cp ./bin/* ~/.local/bin/
rsync -avL --delete ./lib/ra/ ~/.local/lib/ra/

# Verify
ra --help

Maintenance

# Format, lint, and check
./compile.sh

# Update dependencies
./update.sh

Environment Variables

| Variable | Purpose | Default |
| --- | --- | --- |
| RA_ALT_SRC_DIR | Alternate production directory | /mnt/data/prod/ |
| RA_ALT_TARGET_DIR_NAME | Alternate RA output folder | ra |
| P2R_SLACK_URL | Slack webhook URL | (none) |
| NO_COLOR | Disable colored output | 0 |
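
In Deno these are read with Deno.env.get; a minimal sketch of applying the documented defaults (the actual CLI's handling may differ):

// Minimal sketch: read the variables above and apply their documented defaults.
const srcDir = Deno.env.get("RA_ALT_SRC_DIR") ?? "/mnt/data/prod/";
const targetDirName = Deno.env.get("RA_ALT_TARGET_DIR_NAME") ?? "ra";
const slackUrl = Deno.env.get("P2R_SLACK_URL"); // undefined => no Slack notifications
const noColor = (Deno.env.get("NO_COLOR") ?? "0") !== "0"; // treated here as a simple flag

console.log({ srcDir, targetDirName, slackConfigured: Boolean(slackUrl), noColor });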

Related Resources

  • Source repository: cdk-backend/projects/response-analysis/
  • Full mdbook docs: projects/response-analysis/docs/
  • auto_ra (Python): ds-modeling repository
  • title-transaction-counts: preselect-emr installer
  • Dashboards: Consumes RA results for client reporting

Audit Query Tool (aq)

The Response Analysis project also includes a utility for querying Dashboards audit data:

# Get suppression import-ids for PDDM/Swift hotline runs
aq pddm-suppress birkenstock

Returns import-ids of shipments that need suppression based on Salesforce-configured suppression periods.
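
A hedged sketch of the underlying idea: shipments whose ship date falls inside the configured suppression window are flagged, and their import-ids are returned. All names below are assumptions, not the aq implementation.

// Illustrative suppression-window check; names are assumptions, not the aq API.
interface Shipment {
  importId: string;
  shipDate: Date;
}

function importIdsToSuppress(
  shipments: Shipment[],
  suppressionDays: number, // suppression period configured in Salesforce
  asOf = new Date(),
): string[] {
  const windowStart = new Date(asOf.getTime() - suppressionDays * 24 * 60 * 60 * 1000);
  return shipments
    .filter((s) => s.shipDate >= windowStart && s.shipDate <= asOf)
    .map((s) => s.importId);
}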


Source: cdk-backend/projects/response-analysis (README.md, docs/src/*.md, lib/response-analysis/*.ts)

Documentation created: 2026-01-24