Data Processing Agreement
Last Updated: February 24, 2026
This Data Processing Agreement ("DPA") forms part of the Terms of Service between Helix Systems LLC ("Processor," "we," "us," "our"), a limited liability company organized under the laws of the United States, and the customer ("Controller," "you") using Helix Extract services. In the event of any conflict between this DPA and the Terms of Service, this DPA shall control with respect to data protection matters.
1. Definitions
- Personal Data: Any information relating to an identified or identifiable natural person processed under this DPA.
- Processing: Any operation performed on Personal Data, including collection, storage, use, transmission, and deletion.
- Sub-processor: Any third-party entity engaged by Processor to process Personal Data on Processor's behalf in the course of providing the Service.
- Data Subject: The individual whose Personal Data is processed.
- Controller: The entity that determines the purposes and means of Processing Personal Data (you, the Customer).
- Processor: The entity that processes Personal Data on behalf of the Controller (Helix Systems LLC).
- Service: The Helix Extract platform, API, and Chrome browser extension provided by Helix Systems LLC.
- Document Data: The contents of documents (PDFs, DOCX files, images) submitted to the Service for extraction.
- Account Data: Personal Data collected to create and manage a user account, including name, email address, authentication credentials, billing identifiers, and usage records.
- GDPR: Regulation (EU) 2016/679 (General Data Protection Regulation), including as adopted into UK law.
- CCPA: California Consumer Privacy Act of 2018, as amended by the CPRA.
- SCCs: Standard Contractual Clauses adopted by the European Commission for international transfers of personal data.
2. Scope and Roles
This DPA applies when you use Helix Extract to process Personal Data. The parties acknowledge that:
- You are the Controller of Personal Data you submit to or through the Service.
- We are the Processor of that Personal Data, acting only on your instructions as set out in this DPA and the Terms of Service.
- Where we independently determine the purposes and means of processing (e.g., for security, fraud prevention, or compliance with legal obligations), we may act as a Controller for those limited purposes.
This DPA covers two distinct categories of Personal Data:
- Document Data: Content within documents submitted for extraction. This data is processed transiently and not retained after processing is complete (see Section 4.1).
- Account Data: Profile, authentication, billing, and usage data stored in our database to operate the Service (see Section 4.2).
3. Instructions for Processing
We process Personal Data only on your documented instructions. By entering into this DPA and using the Service, you instruct us to process Personal Data for the following purposes:
- Performing optical character recognition (OCR) and AI-based field extraction on documents you submit.
- Returning structured extracted data to the Chrome extension for population of web forms.
- Creating and managing your account, including authentication, licensing, and token balance tracking.
- Processing payments and maintaining billing records via our payment processor.
- Sending transactional email notifications (email verification, password reset, billing) via AWS SES.
- Maintaining audit logs and security records as required by applicable law and our security program.
If we are required by applicable law to process Personal Data for a purpose not covered by your instructions, we will inform you before processing unless prohibited from doing so by law.
4. Nature of Processing and Data Retention
4.1 Document Data (Transient)
Documents submitted for extraction (PDF, DOCX, images) are processed as follows:
- Base64 uploads: Document content is decoded and processed in application memory. It is never written to persistent storage and is discarded upon completion of the extraction request.
- S3-staged uploads: For multi-page or large documents, the document is temporarily stored in encrypted AWS S3 (KMS-managed AES-256 encryption) during processing. The document is automatically deleted from S3 immediately after extraction completes, and in all cases within a maximum of 24 hours.
- AI processing: Document text and extracted blocks are sent to AWS Bedrock (Claude models) and AWS Textract for OCR and field extraction. These services process data in-context only; your document content is not retained by AWS Bedrock or AWS Textract beyond the duration of the API call, and is not used to train AI models.
- No secondary use: Document Data is never used for AI model training, analytics, product improvement, or any purpose beyond completing the requested extraction.
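As an illustration of the base64 path described above, the following is a minimal sketch of decoding and processing an upload entirely in memory. The function name and the `extractor` callback are hypothetical stand-ins for the OCR and field-extraction step; they are assumptions made for this example and do not describe the actual Service code.

```python
import base64
import binascii
from typing import Callable


def extract_from_base64(payload: str, extractor: Callable[[bytes], dict]) -> dict:
    """Decode a base64 upload and run extraction without touching disk.

    `extractor` is a hypothetical callback standing in for the OCR and
    field-extraction step; it is an assumption of this sketch.
    """
    try:
        # validate=True rejects characters outside the base64 alphabet
        document = base64.b64decode(payload, validate=True)
    except (binascii.Error, ValueError) as exc:
        raise ValueError("upload is not valid base64") from exc
    try:
        return extractor(document)
    finally:
        # Drop the only reference so the decoded bytes can be reclaimed;
        # nothing in this function writes to persistent storage.
        del document
```

The decoded bytes exist only as a local variable for the duration of the call, which is the property the base64 bullet relies on.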
4.2 Account Data (Persistent)
The following Account Data is stored persistently in our AWS RDS (PostgreSQL) database for the duration of your account:
- Identity: Email address, first name, last name, account creation date, last login date.
- Authentication: Bcrypt-hashed password (for email/password accounts) or OAuth provider name and provider-assigned user ID (for Google OAuth accounts). Plaintext passwords are never stored.
- Email verification: Verification token and expiry (discarded after successful verification).
- Password reset: Reset token and expiry (discarded after use).
- Licensing & billing: Stripe customer ID, Stripe subscription ID, license tier, subscription status, token balance, and usage quota.
- Consent records: Timestamps of Privacy Policy and Terms of Service acceptance.
- JWT revocation list: SHA-256 hashes of issued JWT token identifiers, issue and expiry timestamps (retained until token expiry).
- Audit logs: Action type, resource accessed, success/failure indicator, IP address, user-agent string, and timestamp (retained for security and compliance purposes).
- Token transactions: Transaction type, token amount, balance after transaction, extraction ID (opaque reference), and Stripe payment intent ID (for purchases).
Account Data is retained for as long as your account is active. Upon account deletion or termination, Account Data is deleted or anonymized within 30 days, except where retention is required by law (e.g., financial records, tax compliance).
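The JWT revocation list entry described above can be sketched as follows. This is a simplified illustration under stated assumptions: the `RevocationEntry` type and both helper functions are hypothetical names, and the real schema may differ. Only the SHA-256 hash of the token identifier is kept, and entries are dropped once the token has expired.

```python
import hashlib
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class RevocationEntry:
    jti_hash: str        # SHA-256 hex digest of the token identifier
    issued_at: datetime
    expires_at: datetime


def make_entry(jti: str, issued_at: datetime, expires_at: datetime) -> RevocationEntry:
    """Store a hash of the token identifier rather than the identifier itself."""
    digest = hashlib.sha256(jti.encode("utf-8")).hexdigest()
    return RevocationEntry(digest, issued_at, expires_at)


def prune_expired(entries: list[RevocationEntry], now: datetime) -> list[RevocationEntry]:
    """Keep only entries whose token has not yet expired."""
    return [entry for entry in entries if entry.expires_at > now]
```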
4.3 Chrome Extension Local Storage
The Helix Extract Chrome extension stores the following data locally in your browser using the Chrome storage API:
- Authentication JWT token (to authenticate API requests).
- User preferences and extension settings.
The extension uses the following Chrome permissions:
- activeTab / scripting: To read the DOM of the currently active browser tab for detection of form fields, and to inject content scripts that populate form fields with extracted data. The extension only accesses the active tab when you explicitly initiate an extraction.
- sidePanel: To display the Helix Extract side panel UI within the browser.
- storage: To persist authentication tokens and user preferences locally.
- identity: To support Google OAuth sign-in.
The extension does not transmit browser history, visited URLs, or tab content to our servers unless you explicitly initiate a document extraction. Form field labels visible in the active tab are transmitted solely to enable the extraction mapping feature.
5. Purpose Limitation and No Training Use
We expressly commit that:
- Personal Data (including Document Data) processed under this DPA will not be used to train, fine-tune, evaluate, or improve any AI or machine learning model, including AWS Bedrock foundation models or any model operated by Helix Systems LLC.
- Personal Data will not be sold, rented, or disclosed to third parties except as set out in Section 6 (Sub-processors) or as required by law.
- We will not process Personal Data for any purpose other than those described in Section 3 of this DPA without your prior written consent.
6. Sub-processors
We engage the following sub-processors to provide the Service. All sub-processors are bound by data processing agreements that provide at least equivalent data protection to this DPA.
| Sub-processor | Service / AWS Service | Purpose | Data Processed | Location |
|---|---|---|---|---|
| Amazon Web Services (AWS) — Bedrock | AWS Bedrock (Claude models) | AI field extraction; maps document text to form field values | Document text (transient, in-context only; not retained or used for training) | us-east-1 |
| Amazon Web Services (AWS) — Textract | AWS Textract | OCR; extracts text blocks and bounding boxes from document images | Document images rendered from submitted files (transient) | us-east-1 |
| Amazon Web Services (AWS) — S3 | AWS S3 | Temporary document staging for multi-page extractions; extension binary distribution | Document content (deleted immediately after extraction, max 24 h) | us-east-1 |
| Amazon Web Services (AWS) — RDS | AWS RDS (PostgreSQL) | Persistent storage of user accounts, licenses, billing records, audit logs, and token transactions | Account Data (email, name, hashed credentials, Stripe IDs, audit logs — see Section 4.2) | us-east-1 |
| Amazon Web Services (AWS) — SES | AWS Simple Email Service | Transactional email delivery (verification, password reset, billing notifications) | Email address, message content | us-east-1 |
| Amazon Web Services (AWS) — Secrets Manager | AWS Secrets Manager | Secure storage of application secrets (JWT signing keys, OAuth credentials, DB credentials) | Application secrets (not Personal Data directly) | us-east-1 |
| Amazon Web Services (AWS) — KMS | AWS Key Management Service | Encryption key management for S3, RDS, and Secrets Manager | Cryptographic key material (not Personal Data directly) | us-east-1 |
| Amazon Web Services (AWS) — CloudWatch | AWS CloudWatch | Structured application logging, error alerting, and performance monitoring | Application logs (may include IP addresses and request metadata; no document content) | us-east-1 |
| Amazon Web Services (AWS) — ECS / ECR | AWS ECS (Fargate) & ECR | Container hosting and image registry for the Extract API | Runtime environment; processes all data in-transit through the API | us-east-1 |
| Amazon Web Services (AWS) — WAFv2 / ALB | AWS WAF & Application Load Balancer | Web application firewall (OWASP rule sets), DDoS protection, and HTTPS termination | Network traffic metadata (IP addresses, request headers); no payload inspection of document content | us-east-1 |
| Stripe, Inc. | Stripe Payments | Payment processing, subscription management, billing portal | Payment card data, billing address (document content is never transmitted to Stripe) | United States |
| Google LLC | Google OAuth 2.0 | Optional sign-in via Google account (users who choose Google OAuth) | Google account ID, email address (only when user selects "Sign in with Google") | United States |
| Vercel, Inc. | Vercel (Next.js hosting) | Hosting and CDN delivery of the marketing website and web application frontend | IP addresses, browser metadata for web requests; no Personal Data stored by Vercel | United States / Global CDN |
6.1 Sub-processor Change Notification
We will notify you at least 30 days before adding a new sub-processor or making a material change to an existing sub-processor that processes Document Data or core Account Data. Notification will be provided by:
- Email to the address associated with your account, and/or
- An in-app notice within the Helix Extract dashboard.
You may object to a new sub-processor within 14 days of notification. If you object and we cannot accommodate your objection, you may terminate your account and receive a pro-rata refund of any prepaid fees covering the unused portion of your subscription term.
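For illustration only, the timeline and refund arithmetic in this section can be sketched as below. The function names are hypothetical, and day-based proration is an assumption of this sketch; the section does not specify how the pro-rata amount is computed.

```python
from datetime import date, timedelta
from decimal import Decimal, ROUND_HALF_UP

NOTICE_DAYS = 30     # minimum notice before a sub-processor change
OBJECTION_DAYS = 14  # objection window, counted from notification


def earliest_change_date(notified_on: date) -> date:
    """Earliest date the change may take effect after notification."""
    return notified_on + timedelta(days=NOTICE_DAYS)


def objection_deadline(notified_on: date) -> date:
    """Last day on which an objection may be raised."""
    return notified_on + timedelta(days=OBJECTION_DAYS)


def pro_rata_refund(prepaid: Decimal, term_start: date, term_end: date,
                    terminated_on: date) -> Decimal:
    """Refund proportional to unused days (day-based proration is assumed)."""
    total_days = (term_end - term_start).days
    if total_days <= 0:
        return Decimal("0.00")
    # Clamp so terminations outside the term never yield a negative
    # refund or a refund larger than the prepaid amount.
    unused_days = max(0, min(total_days, (term_end - terminated_on).days))
    refund = prepaid * Decimal(unused_days) / Decimal(total_days)
    return refund.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)
```

Under these assumptions, a notification sent on 1 March 2026 gives an objection deadline of 15 March 2026 and an earliest change date of 31 March 2026.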
7. Security Measures
7.1 Technical Measures
- Encryption in transit: All data is transmitted over HTTPS with TLS 1.2 minimum (TLS 1.3 preferred). No plaintext HTTP connections are accepted.
- Encryption at rest: AWS RDS databases, S3 buckets, and Secrets Manager are encrypted at rest using AWS KMS-managed keys (AES-256). Key rotation is enabled.
- Network isolation: All services are deployed within an AWS Virtual Private Cloud (VPC). The database subnet has no internet gateway. Access between services is controlled by security groups with least-privilege rules.
- Web Application Firewall: AWS WAFv2 is deployed on all ALBs with AWS Managed Rule Groups (Core Rule Set, Known Bad Inputs). DDoS protection is provided by AWS Shield Standard.
- Authentication: RS256 JWT tokens (signed with keys stored in AWS Secrets Manager) are required for all API access. Tokens are tracked in the database for revocation. Passwords are hashed using bcrypt.
- Secrets management: Application secrets (JWT keys, OAuth credentials, database credentials) are stored in AWS Secrets Manager, not in environment variables or code.
- Rate limiting: IP-based sliding-window rate limiting is enforced on all API endpoints to mitigate abuse.
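The IP-based sliding-window rate limiting mentioned above can be illustrated with the following minimal in-process sketch. The class name is hypothetical and the in-memory storage is an assumption; a production deployment running several containers would likely need a shared store. Appendix A.2 states the production setting as 10 requests per 60 seconds.

```python
import time
from collections import defaultdict, deque
from typing import Optional


class SlidingWindowLimiter:
    """Allow at most `limit` requests per IP within any `window_seconds` span."""

    def __init__(self, limit: int, window_seconds: float) -> None:
        self.limit = limit
        self.window = window_seconds
        self._hits: dict[str, deque[float]] = defaultdict(deque)

    def allow(self, ip: str, now: Optional[float] = None) -> bool:
        # `now` can be injected for testing; otherwise use a monotonic clock
        now = time.monotonic() if now is None else now
        hits = self._hits[ip]
        # Evict timestamps that have slid out of the window
        while hits and now - hits[0] >= self.window:
            hits.popleft()
        if len(hits) >= self.limit:
            return False
        hits.append(now)
        return True
```

A limiter matching the stated production setting would be constructed as `SlidingWindowLimiter(limit=10, window_seconds=60)`.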
7.2 Organizational Measures
- Access controls: Role-based access control (IAM) restricts AWS service permissions to the minimum necessary. Document content in transit is never accessible to Helix staff via the normal application path.
- Audit logging: All significant user actions (authentication, extractions, payments, admin operations) are recorded in an audit log table with IP address, user-agent, and outcome.
- Structured logging: Application logs are written to AWS CloudWatch with structured JSON format. Logs do not contain document content.
- Dependency management: Regular security updates and dependency vulnerability scanning.
- Incident response: Documented incident response procedures; see Section 11.
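As a sketch of the structured logging measure above, the following shows one way a JSON log line could be built while excluding fields that might carry document content. The function name and the blocked field names are hypothetical and chosen only for this example.

```python
import json
from datetime import datetime, timezone

# Hypothetical field names that could carry document content
BLOCKED_FIELDS = {"document", "document_text", "extracted_fields"}


def build_log_record(event: str, **fields: object) -> str:
    """Serialize a log event as one JSON line, dropping blocked fields."""
    safe = {key: value for key, value in fields.items() if key not in BLOCKED_FIELDS}
    record = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "event": event,
        **safe,
    }
    # default=str keeps the call from failing on non-JSON-native values
    return json.dumps(record, sort_keys=True, default=str)
```

Filtering by field name is only one possible safeguard; it assumes that document content is never placed in a field outside the blocked set.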
8. Confidentiality
We ensure that all Helix Systems personnel authorized to process Personal Data are bound by appropriate confidentiality obligations, whether by employment contract, contractor agreement, or equivalent binding instrument. These obligations survive the termination of the employment or contracting relationship.
Personal Data will not be disclosed to any third party except: (a) as described in the sub-processor list in Section 6; (b) as required by applicable law, regulation, or lawful court order (in which case we will notify you to the extent permitted by law); or (c) with your prior written consent.
9. Your Rights and Obligations
9.1 Controller Responsibilities
You are responsible for:
- Ensuring you have a lawful basis to process Personal Data through the Service (e.g., consent, legitimate interest, or contract).
- Providing any required notices to, and obtaining any required consents from, Data Subjects whose data is contained in documents submitted for extraction.
- Not submitting Special Category Personal Data or other sensitive Personal Data (health, biometric, financial account numbers, government-issued IDs, etc.) unless you have implemented appropriate safeguards and have obtained all required permissions from Data Subjects.
- Complying with applicable data protection laws in your jurisdiction, including GDPR (if applicable) and CCPA (if applicable).
9.2 Your Rights
You have the right to:
- Audit and inspection: Request, no more than once per 12-month period, a written summary of our security practices and sub-processor list. You may also request that we complete a reasonable security questionnaire. If you require a formal on-site or third-party audit, we will cooperate with such audit at your reasonable expense and upon reasonable advance notice.
- Data Subject Request assistance: Instruct us to assist with Data Subject access, rectification, erasure, or portability requests as required by applicable law (see Section 10).
- Termination and deletion: Terminate this DPA and your account at any time; see Section 12 for deletion commitments.
- Sub-processor objection: Object to new sub-processors as described in Section 6.1.
10. Data Subject Requests
If we receive a request directly from a Data Subject regarding their Personal Data processed under this DPA:
- We will promptly notify you of the request (unless legally prohibited from doing so).
- We will not respond to the Data Subject directly without your authorization, except as required by law.
- We will provide you with reasonable assistance to fulfill the request within applicable timelines.
Document Data: Because Document Data is not persistently stored after processing, we cannot retrieve or delete specific Document Data from past extraction sessions. You should communicate this limitation to Data Subjects as appropriate.
Account Data: We can assist with access, rectification, and deletion of Account Data (name, email, etc.) upon your written instruction or by the account holder directly through account settings.
11. Security Incident Response
In the event of a confirmed security incident affecting your Personal Data:
- We will notify you within 72 hours of becoming aware of a Personal Data breach (as defined under GDPR Art. 4(12) or equivalent applicable law).
- Our initial notification will include: the nature of the breach, categories and approximate number of Data Subjects and records affected, likely consequences, and measures taken or proposed.
- We will promptly provide additional information as it becomes available.
- We will cooperate with your incident response and remediation efforts, and document all incidents and actions taken.
You are responsible for determining whether your applicable data protection law requires you to notify a supervisory authority or affected Data Subjects, and for making such notifications.
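For illustration, the 72-hour notification window can be computed as in the sketch below. The function names are hypothetical, and treating the moment of awareness as a timezone-aware timestamp normalized to UTC is an assumption of this example.

```python
from datetime import datetime, timedelta, timezone

NOTIFICATION_WINDOW = timedelta(hours=72)


def notification_deadline(aware_at: datetime) -> datetime:
    """Latest time by which the Controller should be notified, in UTC."""
    if aware_at.tzinfo is None:
        raise ValueError("aware_at must be timezone-aware")
    return aware_at.astimezone(timezone.utc) + NOTIFICATION_WINDOW


def hours_remaining(aware_at: datetime, now: datetime) -> float:
    """Hours left before the deadline; negative once it has passed."""
    return (notification_deadline(aware_at) - now).total_seconds() / 3600
```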
12. Term and Termination
This DPA is effective when you begin using Helix Extract and continues until your account is terminated or we cease providing the Service.
Upon termination of your account or the Terms of Service:
- Document Data: Already deleted upon extraction completion, or within 24 hours at most for S3-staged uploads (see Section 4.1); no additional action is required.
- Account Data: We will delete or anonymize all Account Data within 30 days of termination, except where retention is required by applicable law (e.g., tax and financial records, which may be retained for up to 7 years in compliance with US tax law).
- Upon your written request submitted within 30 days of termination, we will provide confirmation that deletion has been completed.
Sections 5 (Purpose Limitation), 8 (Confidentiality), and 11 (Incident Response) of this DPA survive termination to the extent they relate to Personal Data processed prior to termination.
13. International Data Transfers
All Personal Data processed under this DPA is stored and processed in the United States (AWS us-east-1, N. Virginia). This applies to both Document Data and Account Data.
For transfers of Personal Data from the European Economic Area (EEA), the United Kingdom, or Switzerland to the United States, we rely on:
- Standard Contractual Clauses (SCCs): The SCCs annexed to Commission Implementing Decision (EU) 2021/914 (Module Two: Controller to Processor), incorporated herein by reference. By entering into this DPA, both parties are deemed to have executed the applicable SCCs.
- UK Addendum: For transfers from the UK, the International Data Transfer Addendum to the EU SCCs issued by the UK Information Commissioner's Office (the "UK Addendum") applies.
- AWS sub-processor compliance: AWS processes data under its own GDPR-compliant Data Processing Addendum, including SCCs for international transfers, available at aws.amazon.com/compliance/gdpr-center/.
If you are subject to other data transfer restrictions (e.g., Swiss DSG or Canadian PIPEDA), please contact us at privacy@discoverhelix.com to discuss appropriate transfer mechanisms.
14. GDPR and CCPA Compliance
14.1 GDPR (EU/UK)
To the extent our processing of Personal Data is subject to the GDPR or UK GDPR:
- We act as a Processor (Art. 28 GDPR) and process Personal Data only pursuant to your documented instructions.
- We will provide reasonable assistance to help you fulfill your obligations under GDPR Arts. 32–36 (security, breach notification, DPIA, prior consultation).
- We will delete or return all Personal Data to you after the end of the provision of Services relating to processing, unless storage is required by Union or Member State law.
- We will make available to you all information necessary to demonstrate compliance with GDPR Art. 28 obligations.
14.2 CCPA / CPRA (California)
To the extent we process Personal Information of California residents as a Service Provider (as defined under the CCPA):
- We will not sell or share Personal Information as those terms are defined under the CCPA.
- We will not retain, use, or disclose Personal Information for any purpose other than providing the Service as specified in this DPA.
- We will not combine Personal Information received from you with Personal Information received from other sources except as permitted under the CCPA.
- We certify that we understand and will comply with the restrictions in this section.
15. Liability
Each party's liability under this DPA, whether in contract, tort, or otherwise, is subject to the limitations and exclusions set out in our Terms of Service, except that neither party limits its liability for: (a) death or personal injury caused by its negligence; (b) fraud or willful misconduct; or (c) any liability that cannot be excluded or limited under applicable data protection law.
16. Modifications
We may update this DPA to reflect changes in our processing practices, sub-processors, or legal requirements. Material changes will be communicated via email to the address associated with your account at least 30 days before they take effect. Continued use of the Service after the effective date constitutes acceptance of the updated DPA. If you do not accept a material change, you may terminate your account prior to the effective date.
17. Contact
For DPA-related inquiries, requests, or to exercise your rights:
Company: Helix Systems LLC
Privacy contact: privacy@discoverhelix.com
We aim to respond to all data protection inquiries within 5 business days.
Appendix A: Technical and Organizational Measures
The following measures are implemented as of the Last Updated date above. We may update these measures, provided that updates do not materially reduce the overall level of protection.
A.1 Infrastructure Security
- VPC isolation: All services run within an AWS Virtual Private Cloud with separate public, application, and database subnets. The database subnet has no direct internet access.
- Security groups: Least-privilege inbound/outbound rules. The database accepts connections only from the application layer security group.
- WAF: AWS WAFv2 with the AWS Managed Rules Core Rule Set (CRS) and Known Bad Inputs managed rule groups, deployed on all Application Load Balancers.
- DDoS protection: AWS Shield Standard on all ALBs.
- KMS encryption: Customer-managed KMS keys with automatic annual rotation for S3, RDS, and Secrets Manager encryption.
A.2 Application Security
- Authentication: RS256-signed JWTs, with signing keys stored in AWS Secrets Manager; bcrypt password hashing (adaptive work factor); token revocation list maintained in the database.
- Authorization: License-based and role-based access control enforced on all API endpoints; developer bypass flag isolated to internal accounts.
- Input validation: Request body validation and sanitization on all API inputs.
- HTTPS only: TLS 1.2 minimum, TLS 1.3 preferred, enforced on all ALB listeners.
- Rate limiting: IP-based sliding-window rate limiting on all endpoints (10 requests/60 s in production).
- CORS: Exact-origin allowlist; no wildcard origins.
- Content Security Policy: Enforced on the Chrome extension via manifest CSP.
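The exact-origin CORS allowlist listed above can be sketched as follows. The origins shown are placeholders invented for this example, and the function name is hypothetical. The point of the sketch is that the request origin must match an allowlisted value exactly, with no wildcard or suffix matching.

```python
from typing import Optional

# Placeholder origins for illustration only; not the Service's real origins
ALLOWED_ORIGINS = frozenset({
    "https://app.example.com",
    "chrome-extension://abcdefghijklmnopabcdefghijklmnop",
})


def cors_headers(origin: Optional[str]) -> dict[str, str]:
    """Return CORS response headers only for an exact allowlist match."""
    if origin is None or origin not in ALLOWED_ORIGINS:
        return {}
    return {
        # Echo the matched origin rather than "*"
        "Access-Control-Allow-Origin": origin,
        # Tell caches the response varies by the Origin request header
        "Vary": "Origin",
    }
```

Because membership is tested against the full origin string, a lookalike such as `https://app.example.com.evil.test` receives no CORS headers.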
A.3 Operational Security
- Logging and monitoring: AWS CloudWatch structured logging and alerting; audit logs stored in the database.
- Dependency management: Regular security updates and vulnerability scanning of application dependencies.
- Container security: Application runs in AWS ECS Fargate (serverless containers); no persistent compute instances.
- Incident response: Documented incident detection, escalation, and notification procedures.
A.4 Data Security
- Document Data lifecycle: On the base64 path, document content is never written to persistent storage. S3-staged documents are deleted immediately after extraction and in all cases within 24 hours.
- No document retention: No extraction results or document content is stored in the database; only opaque extraction IDs and token usage counts are recorded.
- Encryption at rest: AES-256 (KMS-managed) for all persistent storage (RDS, S3, Secrets Manager).
- Encryption in transit: TLS for all service-to-service communication within the VPC and for all external API calls.
Appendix B: Categories of Personal Data and Data Subjects
Categories of Data Subjects
- Helix Extract account holders (employees or contractors of the Controller)
- Individuals whose Personal Data appears in documents submitted for extraction (if any)
Categories of Personal Data
| Category | Examples | Storage |
|---|---|---|
| Identity data | First name, last name, email address | Persistent (Account Data) |
| Authentication data | Bcrypt password hash, Google OAuth ID | Persistent (Account Data) |
| Billing data | Stripe customer ID, Stripe subscription ID, payment intent references | Persistent (Account Data) |
| Usage data | Token balance, transaction history, extraction counts | Persistent (Account Data) |
| Security data | IP address, user-agent, JWT token hashes, audit log entries | Persistent (Account Data) |
| Document content | Any Personal Data appearing in submitted PDF, DOCX, or image files | Transient only (deleted after extraction) |
By using Helix Extract, you acknowledge and agree to this Data Processing Agreement. This DPA is incorporated into and forms part of the Terms of Service.