Regex Task¶
Overview¶
The Regex Task uses regular expressions to extract, validate, replace, or match patterns in text. Use it for complex data extraction, input validation, text cleaning, or pattern-based transformations.
When to use this task:
- Extract specific patterns (emails, phones, URLs)
- Validate input formats
- Clean and normalize text
- Parse structured data from unstructured text
- Find and replace patterns
- Split text by complex patterns
- Extract multiple matches
- Data scraping and parsing
Key Features:
- Full regex pattern support
- Multiple extraction modes
- Capture groups
- Find and replace
- Match validation
- Global and case-insensitive matching
- Extract all matches
- Comprehensive output fields
Quick Start¶
- Add Regex task
- Select operation (Extract/Replace/Match)
- Input text to process
- Write regex pattern
- Test with sample data
- Save
Simple Example:
Operation: Extract
Input: {{task_49001_message}}
Pattern: [a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}
Output: {{task_10001_match}} = extracted email
Operations¶
Extract (Most Common)¶
Extract data matching pattern.
Configuration:
Output:
- {{task_10001_match}} - First match
- {{task_10001_matches_all}} - All matches (comma-separated)
- {{task_10001_found}} - true/false
- {{task_10001_count}} - Number of matches
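To see how these four outputs relate to one another, here is a minimal JavaScript sketch that computes equivalent values with standard regex calls. The helper name `extractFields` and its return shape are illustrative assumptions, not BaseCloud's actual implementation.

```javascript
// Hypothetical helper: mimics the four Extract outputs for illustration only.
function extractFields(text, pattern) {
  // matchAll requires the global flag, so add it if the caller left it off.
  const flags = pattern.flags.includes('g') ? pattern.flags : pattern.flags + 'g';
  const all = [...text.matchAll(new RegExp(pattern.source, flags))].map(m => m[0]);
  return {
    match: all.length > 0 ? all[0] : '', // first match, or empty when nothing matched
    matches_all: all.join(', '),         // comma-separated list of every match
    found: all.length > 0,
    count: all.length
  };
}

const emailPattern = /[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}/;
const extracted = extractFields(
  'Contact us at support@example.com or sales@company.org',
  emailPattern
);
console.log(extracted);
```

Checking `found` before using `match` avoids passing an empty string to later steps.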
Replace¶
Find pattern and replace with text.
Configuration:
Operation: Replace
Input: {{task_49001_text}}
Pattern: \d{3}-\d{3}-\d{4}
Replace With: [PHONE REDACTED]
Output:
- {{task_10001_result}} - Text with replacements
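The same replacement can be sketched in plain JavaScript. The sample note is made up, and the sketch uses the `g` flag to replace every occurrence. Whether the task replaces all occurrences by default is not stated here, so confirm it with test data.

```javascript
// Illustrative only: plain JavaScript equivalent of the Replace configuration.
const phonePattern = /\d{3}-\d{3}-\d{4}/g; // g flag: replace every occurrence
const note = 'Call 555-123-4567 or 555-987-6543 after 5pm';
const redacted = note.replace(phonePattern, '[PHONE REDACTED]');
console.log(redacted);
// Call [PHONE REDACTED] or [PHONE REDACTED] after 5pm
```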
Match/Validate¶
Check if text matches pattern.
Configuration:
Operation: Match
Input: {{task_49001_email}}
Pattern: ^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$
Output:
- {{task_10001_matches}} - true/false
- {{task_10001_found}} - true/false
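The `^` and `$` anchors are what turn this pattern into a validator rather than a search. A short JavaScript sketch of the difference, using a made-up sample string:

```javascript
// Anchored: the whole input must be an email address.
const anchored = /^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$/;
// Unanchored: an email address anywhere in the input is enough.
const unanchored = /[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}/;

const candidate = 'email me: jane@example.com please';

console.log(anchored.test('jane@example.com')); // true
console.log(anchored.test(candidate));          // false
console.log(unanchored.test(candidate));        // true
```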
Common Patterns¶
Email Addresses¶
Pattern:
Example:
Input: "Contact us at support@example.com or sales@company.org"
Output: {{task_10001_match}} → "support@example.com"
Output: {{task_10001_matches_all}} → "support@example.com, sales@company.org"
Phone Numbers¶
US Format (123-456-7890):
International (+1-123-456-7890):
Example:
Input: "Call me at 555-123-4567 or +1-555-987-6543"
Output: {{task_10001_matches_all}} → "555-123-4567, +1-555-987-6543"
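As a sketch of patterns that would produce the output above: the US pattern below is the one used in the Replace section, while the optional country-code prefix is an assumption for illustration and may differ from the pattern originally shown here.

```javascript
const phoneText = 'Call me at 555-123-4567 or +1-555-987-6543';

// US format only: the country code is ignored.
const usOnly = /\d{3}-\d{3}-\d{4}/g;
// Assumed variant: optional "+<1-3 digits>-" prefix in a non-capturing group.
const withCountryCode = /(?:\+\d{1,3}-)?\d{3}-\d{3}-\d{4}/g;

const usMatches = phoneText.match(usOnly);
const intlMatches = phoneText.match(withCountryCode);

console.log(usMatches.join(', '));   // 555-123-4567, 555-987-6543
console.log(intlMatches.join(', ')); // 555-123-4567, +1-555-987-6543
```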
URLs¶
Pattern:
Example:
Input: "Visit https://example.com or http://test.org/page"
Output: {{task_10001_matches_all}} → "https://example.com, http://test.org/page"
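One simple pattern that yields the output above is `https?://` followed by any run of non-whitespace characters. This is an assumed pattern for illustration, and it deliberately trades precision for simplicity: it will also capture trailing punctuation, which the sketch strips afterwards.

```javascript
// Assumed pattern: protocol, then everything up to the next whitespace.
const urlPattern = /https?:\/\/[^\s]+/g;

const urls = 'Visit https://example.com or http://test.org/page'.match(urlPattern);
console.log(urls.join(', ')); // https://example.com, http://test.org/page

// A URL at the end of a sentence picks up the trailing period.
const raw = 'See https://example.com.'.match(urlPattern)[0];
const cleaned = raw.replace(/[.,;:!?)]+$/, ''); // drop trailing punctuation
console.log(cleaned); // https://example.com
```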
Numbers¶
Integers:
Decimals:
Currency:
Example:
Input: "Total: $1,234.56 and shipping: $45.00"
Pattern: \$\d+(?:,\d{3})*(?:\.\d{2})?
Output: {{task_10001_matches_all}} → "$1,234.56, $45.00"
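The three number types can be sketched as follows. The currency pattern is the one from the example above. The integer and decimal patterns are assumptions for illustration, and the sample inputs for them are made up.

```javascript
const integerPattern = /\b\d+\b/g;                       // assumed: whole numbers
const decimalPattern = /\d+\.\d+/g;                      // assumed: digits, dot, digits
const currencyPattern = /\$\d+(?:,\d{3})*(?:\.\d{2})?/g; // from the example above

const integers = '3 apples, 12 pears'.match(integerPattern);
const decimals = 'Weights: 2.5 kg and 10.75 kg'.match(decimalPattern);
const amounts = 'Total: $1,234.56 and shipping: $45.00'.match(currencyPattern);

// Extracted amounts are strings; strip "$" and "," before doing arithmetic.
const total = amounts
  .map(a => parseFloat(a.replace(/[$,]/g, '')))
  .reduce((sum, n) => sum + n, 0);

console.log(integers, decimals, amounts, total.toFixed(2));
```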
Dates¶
MM/DD/YYYY:
YYYY-MM-DD:
Example:
Input: "Event on 02/08/2026 or 2026-02-08"
Pattern: \d{4}-\d{2}-\d{2}
Output: {{task_10001_match}} → "2026-02-08"
Usernames/IDs¶
Pattern:
Example:
Input: "Mentioned @john_doe and @jane_smith in the thread"
Output: {{task_10001_matches_all}} → "@john_doe, @jane_smith"
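A plain `@\w+` pattern produces the output above, but it also matches the domain part of email addresses. The sketch below shows both the simple form and a stricter variant using a negative lookbehind. Both patterns are assumptions for illustration, and the sample text adds an email address to the original input.

```javascript
const thread = 'Mentioned @john_doe and @jane_smith, mail bob@example.com';

// Simple form: also picks up "@example" from the email address.
const simple = thread.match(/@\w+/g);

// Stricter form: reject "@" when preceded by a word character or a dot.
const strict = thread.match(/(?<![\w.])@\w+/g);

console.log(simple.join(', ')); // @john_doe, @jane_smith, @example
console.log(strict.join(', ')); // @john_doe, @jane_smith
```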
Order/Invoice Numbers¶
Pattern:
Example:
Input: "Orders ORD-5560 and INV-12345 are ready"
Output: {{task_10001_matches_all}} → "ORD-5560, INV-12345"
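A sketch of one pattern that gives this output: a short uppercase prefix, a hyphen, and digits. The prefix length of 2 to 4 letters is an assumption; tighten it to your own numbering scheme.

```javascript
// Assumed format: 2-4 uppercase letters, a hyphen, then one or more digits.
const referencePattern = /\b[A-Z]{2,4}-\d+\b/g;

const references = 'Orders ORD-5560 and INV-12345 are ready'.match(referencePattern);
console.log(references.join(', ')); // ORD-5560, INV-12345

// Split each reference into its prefix and number for routing or lookup.
const parsed = references.map(ref => {
  const [prefix, number] = ref.split('-');
  return { prefix, number };
});
console.log(parsed);
```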
Capture Groups¶
Extract specific parts of matches.
Example - Extract Name and Email:
Pattern: ([A-Za-z\s]+?)\s*<([^>]+)>
Input: "John Doe <john@example.com>"
Outputs:
{{task_10001_match}} → "John Doe <john@example.com>" (full match)
{{task_10001_group_1}} → "John Doe" (first group)
{{task_10001_group_2}} → "john@example.com" (second group)
Example - Extract Domain:
Pattern: @([a-zA-Z0-9.-]+\.[a-zA-Z]{2,})
Input: "support@example.com"
Output:
{{task_10001_group_1}} → "example.com"
Example - Parse Order Details:
Pattern: Order\s+(\w+-\d+)\s+for\s+\$(\d+\.\d{2})
Input: "Order ORD-5560 for $129.99 shipped"
Outputs:
{{task_10001_group_1}} → "ORD-5560"
{{task_10001_group_2}} → "129.99"
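In JavaScript, the numbered groups correspond to indexes on the match array: index 0 is the full match and index 1 onward are the groups. A minimal sketch using the order pattern above. The variable names are illustrative.

```javascript
const orderPattern = /Order\s+(\w+-\d+)\s+for\s+\$(\d+\.\d{2})/;
const orderMatch = 'Order ORD-5560 for $129.99 shipped'.match(orderPattern);

// match() returns null when nothing matches, so guard before reading groups.
const order = orderMatch
  ? {
      full: orderMatch[0],              // entire matched text
      id: orderMatch[1],                // first group
      total: parseFloat(orderMatch[2])  // second group, converted to a number
    }
  : null;

console.log(order);
```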
Real-World Examples¶
Example 1: Lead Email Domain Classification¶
Workflow:
1. Form Submission - Lead form
2. Regex - Extract email domain
3. Code Task - Check if business email
4. If Task - Route based on email type
Extract Domain:
Operation: Extract
Input: {{task_49001_email}}
Pattern: @([a-zA-Z0-9.-]+\.[a-zA-Z]{2,})
Output: {{task_10001_group_1}} → "example.com"
Check Business Email (Code Task):
// Fall back to an empty string when the regex found no domain.
const domain = input.task_10001_group_1 || '';
const freeProviders = ['gmail.com', 'yahoo.com', 'hotmail.com', 'outlook.com', 'aol.com'];
// Anything not on the free-provider list is treated as a business address.
const isBusiness = !freeProviders.includes(domain.toLowerCase());
return {
  is_business_email: isBusiness,
  domain: domain,
  classification: isBusiness ? 'Business' : 'Personal'
};
Route:
If {{task_42001_is_business_email}} = true:
→ High priority sales path
Else:
→ Standard nurture path
Example 2: Extract and Validate Order Information¶
Workflow:
1. Inbound Email - Order confirmation email
2. Regex - Extract order number
3. Regex - Extract order total
4. Regex - Extract tracking number
5. MySQL Query - Insert order data
Extract Order Number:
Operation: Extract
Input: {{task_50001_body}}
Pattern: Order\s+#?(\w+-\d+)
Output: {{task_10001_group_1}} → "ORD-5560"
Extract Total:
Operation: Extract
Input: {{task_50001_body}}
Pattern: Total:\s*\$(\d+(?:,\d{3})*\.\d{2})
Output: {{task_18002_group_1}} → "1,234.56"
Extract Tracking:
Operation: Extract
Input: {{task_50001_body}}
Pattern: Tracking:\s*([A-Z0-9]{10,})
Output: {{task_18003_group_1}} → "1Z999AA10123456784"
MySQL Query:
Parameters:
Example 3: Phone Number Validation and Cleaning¶
Workflow:
1. Form Submission - Contact form
2. Regex - Validate phone format
3. If Task - Check if valid
4. Regex - Extract digits only
5. Phone Formatter - Format to E.164
Validate Format:
Operation: Match
Input: {{task_49001_phone}}
Pattern: ^[\+]?[(]?[0-9]{3}[)]?[-\s\.]?[0-9]{3}[-\s\.]?[0-9]{4,6}$
Output: {{task_10001_matches}} → true/false
If Valid:
Condition: {{task_10001_matches}} = true
True: Continue processing
False: Email admin about invalid phone
Extract Digits:
Operation: Extract
Input: {{task_49001_phone}}
Pattern: \d+
Extract: All digits
Output: {{task_18002_matches_all}} → "5551234567"
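Note that `\d+` matches each run of digits separately, so a formatted number yields several matches rather than one. If the output arrives comma-separated (as described for `matches_all` elsewhere in this document), the pieces still need joining. A sketch of two ways to get a digits-only string in a Code task, with a made-up sample input:

```javascript
const rawPhone = '(555) 123-4567';

// Option 1: collect every digit run, then join the runs together.
const digitRuns = rawPhone.match(/\d+/g) || []; // [] when there are no digits
const joined = digitRuns.join('');

// Option 2: delete everything that is not a digit.
const stripped = rawPhone.replace(/\D/g, '');

console.log(digitRuns); // [ '555', '123', '4567' ]
console.log(joined);    // 5551234567
console.log(stripped);  // 5551234567
```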
Example 4: Parse Structured Email Content¶
Workflow:
1. Inbound Email - Vendor invoice email
2. Regex - Extract invoice date
3. Regex - Extract line items
4. Loop - Process each item
5. MySQL Query - Insert items
Extract Invoice Date:
Operation: Extract
Input: {{task_50001_body}}
Pattern: Invoice\s+Date:\s*(\d{2}/\d{2}/\d{4})
Output: {{task_10001_group_1}} → "02/08/2026"
Extract Line Items:
Operation: Extract (All Matches)
Input: {{task_50001_body}}
Pattern: (\w+)\s+Qty:\s*(\d+)\s+Price:\s*\$(\d+\.\d{2})
Flags: Global
Outputs (for each match):
{{task_18002_group_1}} → Product name
{{task_18002_group_2}} → Quantity
{{task_18002_group_3}} → Price
Code Task (Parse All Items):
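// Collect every "Name Qty: N Price: $X.XX" line item from the email body.
const body = input.task_50001_body || '';
// The g flag lets exec() advance through successive matches.
const pattern = /(\w+)\s+Qty:\s*(\d+)\s+Price:\s*\$(\d+\.\d{2})/g;
const items = [];
let match;
while ((match = pattern.exec(body)) !== null) {
  items.push({
    product: match[1],
    quantity: parseInt(match[2], 10), // explicit radix
    price: parseFloat(match[3])
  });
}
return { items_json: JSON.stringify(items) };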
Example 5: Redact Sensitive Information¶
Workflow:
1. CRM Trigger - Note added
2. Regex - Redact SSN
3. Regex - Redact credit cards
4. Regex - Redact phone numbers
5. Edit Client - Update with clean note
Redact SSN:
Operation: Replace
Input: {{task_47001_note}}
Pattern: \b\d{3}-\d{2}-\d{4}\b
Replace With: [SSN REDACTED]
Output: {{task_10001_result}}
Redact Credit Cards:
Operation: Replace
Input: {{task_10001_result}}
Pattern: \b\d{4}[\s-]?\d{4}[\s-]?\d{4}[\s-]?\d{4}\b
Replace With: [CARD REDACTED]
Output: {{task_18002_result}}
Redact Phones:
Operation: Replace
Input: {{task_18002_result}}
Pattern: \b\d{3}[-.]?\d{3}[-.]?\d{4}\b
Replace With: [PHONE REDACTED]
Output: {{task_18003_result}}
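The three Replace steps form a chain in which each step takes the previous step's result as input. A JavaScript sketch of the same chain, using the patterns above with the `g` flag added. The sample note is made up.

```javascript
// Each rule is applied in order to the output of the previous one.
const redactionRules = [
  { pattern: /\b\d{3}-\d{2}-\d{4}\b/g, label: '[SSN REDACTED]' },
  { pattern: /\b\d{4}[\s-]?\d{4}[\s-]?\d{4}[\s-]?\d{4}\b/g, label: '[CARD REDACTED]' },
  { pattern: /\b\d{3}[-.]?\d{3}[-.]?\d{4}\b/g, label: '[PHONE REDACTED]' }
];

const crmNote = 'SSN 123-45-6789, card 4111 1111 1111 1111, phone 555-123-4567';

const cleanNote = redactionRules.reduce(
  (text, rule) => text.replace(rule.pattern, rule.label),
  crmNote
);

console.log(cleanNote);
// SSN [SSN REDACTED], card [CARD REDACTED], phone [PHONE REDACTED]
```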
Update Note:
Advanced Patterns¶
Lookahead and Lookbehind¶
Extract text between markers:
Extract price without currency symbol:
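Lookarounds assert what comes before or after a match without including it in the result. The patterns and sample inputs below are assumptions for illustration (square brackets as markers, `$` as the currency symbol). Lookbehind needs an ES2018-capable engine.

```javascript
// Text between markers: preceded by "[" and followed by "]", brackets excluded.
const betweenBrackets = /(?<=\[)[^\]]+(?=\])/;
const status = 'Status: [APPROVED] by admin'.match(betweenBrackets)[0];

// Price without the currency symbol: digits that directly follow "$".
const priceOnly = /(?<=\$)\d+(?:\.\d{2})?/;
const price = 'Price: $49.99'.match(priceOnly)[0];

console.log(status); // APPROVED
console.log(price);  // 49.99
```

Because the markers are not part of the match, the full-match output already holds the clean value and no capture group is needed.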
Non-Capturing Groups¶
Extract domain without protocol:
Pattern: (?:https?://)([a-z0-9.-]+)
Input: "https://example.com"
Output: {{task_10001_group_1}} → "example.com"
Alternation (OR)¶
Match multiple patterns:
Pattern: (Mr|Mrs|Ms|Dr)\.?\s+([A-Za-z]+)
Input: "Dr. Smith and Ms. Jones"
Outputs multiple matches with titles and names
Greedy vs Non-Greedy¶
Greedy (default):
Non-Greedy:
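A short JavaScript sketch of the difference, using a made-up HTML fragment: the greedy quantifier runs to the last possible `>`, while the non-greedy one stops at the first.

```javascript
const html = '<b>bold</b> and <i>italic</i>';

// Greedy: .+ grabs as much as possible, so the match spans the whole string.
const greedy = html.match(/<.+>/)[0];

// Non-greedy: .+? grabs as little as possible, so the match is the first tag.
const lazy = html.match(/<.+?>/)[0];

console.log(greedy); // <b>bold</b> and <i>italic</i>
console.log(lazy);   // <b>
```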
Flags and Options¶
Case Insensitive¶
Match regardless of case:
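In JavaScript syntax this is the `i` flag. A minimal sketch with a made-up subject line:

```javascript
const subject = 'URGENT: Server Down';

const caseSensitive = /urgent/.test(subject);    // lowercase pattern, uppercase text
const caseInsensitive = /urgent/i.test(subject); // i flag ignores case

console.log(caseSensitive);   // false
console.log(caseInsensitive); // true
```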
Global¶
Find all matches (not just first):
Pattern: \d+
Flags: Global (g)
Input: "1 2 3 4 5"
Output: {{task_10001_matches_all}} → "1, 2, 3, 4, 5"
Multiline¶
^ and $ match line starts/ends:
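A sketch of the effect in JavaScript, using a made-up three-line input: without the `m` flag the anchors refer to the whole input, and with it they apply to each line.

```javascript
const lines = 'alpha\nbeta\ngamma';

// Without m: ^ and $ only match at the start and end of the whole input.
const wholeInput = lines.match(/^\w+$/g);

// With m: ^ and $ also match at every line break.
const perLine = lines.match(/^\w+$/gm);

console.log(wholeInput); // null
console.log(perLine);    // [ 'alpha', 'beta', 'gamma' ]
```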
Best Practices¶
Pattern Design¶
- Be specific - Narrow patterns reduce false matches
- Test thoroughly - Use regex testers (regex101.com)
- Escape special characters - . * + ? [ ] ( ) { } ^ $ | \
- Use character classes - \d for digits, \w for word chars
- Anchor when validating - Use ^ and $ for full string match
Performance¶
- Avoid catastrophic backtracking - Be careful with nested quantifiers
- Use specific patterns - \d{3} is better than .+ for 3 digits
- Limit scope - Extract small text sections first
- Cache patterns - Reuse same regex in Variable task
- Don't overuse - Simple string operations might be faster
Maintainability¶
- Comment complex patterns - Document what pattern does
- Break into steps - Multiple simple regex > one complex
- Test edge cases - Empty strings, special characters
- Provide examples - Document expected inputs/outputs
- Version patterns - Track changes to regex patterns
Data Quality¶
- Validate before extract - Check if text exists
- Handle no matches - Provide defaults
- Trim whitespace - Clean extracted data
- Validate extractions - Verify format of extracted data
- Log failures - Track when patterns don't match
Troubleshooting¶
Pattern Not Matching¶
Check:
1. Escape special characters (\. \$ \* etc.)
2. Case sensitivity needed?
3. Anchors correct (^ and $)?
4. Pattern tested on actual data?
Debug: Test pattern at regex101.com with sample input
Too Many/Wrong Matches¶
Issue: Extracting unwanted text
Solutions:
- Make pattern more specific
- Use anchors (^ $)
- Use non-greedy quantifiers (? after + or *)
- Add negative lookaheads
Capture Groups Not Working¶
Issue: {{task_10001_group_1}} is empty
Check:
- Using parentheses () for groups?
- Pattern actually matching?
- Accessing correct group number?
Example:
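A sketch of the most common causes in JavaScript terms: a pattern with no parentheses, or with only a non-capturing group, matches successfully but produces no group 1. The patterns here are simplified for illustration.

```javascript
const address = 'support@example.com';

// No parentheses: the pattern matches, but there is no group 1.
const noGroup = address.match(/@[a-z.]+/);
// Non-capturing group (?:...): still no group 1.
const nonCapturing = address.match(/@(?:[a-z.]+)/);
// Capturing parentheses: group 1 holds the domain.
const capturing = address.match(/@([a-z.]+)/);

console.log(noGroup[0], noGroup[1]);           // @example.com undefined
console.log(nonCapturing[0], nonCapturing[1]); // @example.com undefined
console.log(capturing[0], capturing[1]);       // @example.com example.com
```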
Performance Issues¶
Issue: Regex takes too long
Causes:
- Catastrophic backtracking
- Very long input text
- Overly complex pattern
Solutions:
- Simplify pattern
- Use more specific character classes
- Process smaller chunks
- Use Code task for complex parsing
Special Characters Breaking Pattern¶
Issue: Pattern fails with special input
Solution: Escape special regex characters:
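When the text to match comes from user input or another task, it can be escaped programmatically in a Code task before being used as a pattern. The helper name `escapeRegex` is an illustrative assumption. The replacement idiom itself is a widely used JavaScript technique.

```javascript
// Prefix every regex metacharacter with a backslash so it is matched literally.
function escapeRegex(literal) {
  return literal.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

const userText = '$9.99 (sale)';
const escaped = escapeRegex(userText);
console.log(escaped); // \$9\.99 \(sale\)

// The escaped string is now safe to use as a literal pattern.
const literalPattern = new RegExp(escaped);
const foundLiteral = literalPattern.test('Now $9.99 (sale) today');
console.log(foundLiteral); // true
```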
Frequently Asked Questions¶
What regex flavor does BaseCloud use?¶
JavaScript regex (ECMAScript). Most common patterns are supported.
Can I test patterns before deploying?¶
Yes, use regex101.com with JavaScript flavor selected.
How many capture groups can I use?¶
Up to 9 groups ({{task_ID_group_1}} through {{task_ID_group_9}})
Can the replacement text use variables?¶
No, the replacement is static text. Use a Code task for dynamic replacements.
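As a sketch of what a dynamic replacement in a Code task might look like, JavaScript's `replace` accepts a function that builds the replacement from each match. The function name `maskCards` and the masking format are illustrative assumptions.

```javascript
// Hypothetical helper: mask card numbers but keep the last four digits.
function maskCards(text) {
  const cardPattern = /\b(?:\d{4}[\s-]?){3}(\d{4})\b/g;
  // The callback receives the full match and then each capture group.
  return text.replace(cardPattern, (fullMatch, lastFour) => '**** **** **** ' + lastFour);
}

const masked = maskCards('Card 4111 1111 1111 1111 on file');
console.log(masked); // Card **** **** **** 1111 on file
```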
What if no match found?¶
Output fields are empty. Check {{task_10001_found}} = false.
Can I extract all matches separately?¶
Yes, use Global flag. Access via {{task_10001_matches_all}} (comma-separated).
How to match across multiple lines?¶
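Use the DOTALL flag (s), if available, so that . also matches newlines. The Multiline flag (m) only changes ^ and $ to match at line starts/ends; it does not let . cross lines.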
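A JavaScript sketch of the behavior, with made-up START/END markers. If the `s` flag turns out to be unavailable, the `[\s\S]` character class is a common substitute for "any character including newlines".

```javascript
const block = 'START\nline one\nline two\nEND';

// Without s: "." stops at newlines, so nothing matches across lines.
const withoutS = block.match(/START(.*?)END/);

// With s (DOTALL): "." also matches newline characters.
const withS = block.match(/START(.*?)END/s);

// Flag-free alternative: [\s\S] matches any character at all.
const portable = block.match(/START([\s\S]*?)END/);

console.log(withoutS);                       // null
console.log(JSON.stringify(withS[1]));       // "\nline one\nline two\n"
console.log(portable[1] === withS[1]);       // true
```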
Can regex parse HTML/XML?¶
Not recommended. Use dedicated parser or Code task with DOM methods.
How to make pattern optional?¶
Use ? quantifier: https? matches "http" or "https"
Related Tasks¶
- Code Task - Complex text processing
- Formatter Task - Simple text transformations
- If Task - Conditional logic based on regex results
- Variable Task - Store regex results
- Phone Formatter - Phone-specific patterns