Secured AI - Protecting You in the AI Age

PII/PHI Detection Guide

Understanding sensitive data types and detection methods. This guide covers the 40+ data types that require protection in AI workflows, along with detection strategies and risk classification.

35 min read | Reference guide | For security teams

Sensitive Data Categories

Six primary categories of sensitive data that require protection in AI workflows.

high risk
Personal Identifiers
Data that directly identifies an individual

Examples:

Full name, Social Security Number, Driver's license, Passport number
Detection: Pattern matching + context analysis
high risk
Contact Information
Data used to contact or locate an individual

Examples:

Email address, Phone number, Physical address, IP address
Detection: Regex patterns + format validation
high risk
Financial Data
Banking, payment, and financial account information

Examples:

Credit card numbers, Bank account, Routing numbers, Financial records
Detection: Luhn algorithm + pattern matching
high risk
Health Information (PHI)
Medical records and health-related data

Examples:

Medical records, Prescriptions, Lab results, Insurance IDs
Detection: Healthcare terminology + identifier patterns
high risk
Authentication Data
Credentials and access tokens

Examples:

Passwords, API keys, OAuth tokens, SSH keys
Detection: Entropy analysis + pattern matching
medium risk
Location Data
Geographic and location information

Examples:

GPS coordinates, Street addresses, Zip codes, Geolocation
Detection: Geographic format patterns
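
Two of the checks named in the categories above, the Luhn algorithm for card numbers and entropy analysis for credentials, can be sketched in a few lines. This is a minimal illustration, not the detection logic of any particular product: the function names `luhn_valid` and `shannon_entropy` and the 13 to 19 digit length bound are assumptions made for the example.

```python
import math
from collections import Counter


def luhn_valid(candidate: str) -> bool:
    """Return True if the digits in `candidate` pass the Luhn checksum."""
    digits = [int(ch) for ch in candidate if ch.isdigit()]
    # Assumption: only 13 to 19 digit strings are treated as card candidates.
    if len(digits) < 13 or len(digits) > 19:
        return False
    total = 0
    # Walk right to left, doubling every second digit.
    for position, digit in enumerate(reversed(digits)):
        if position % 2 == 1:
            digit *= 2
            if digit > 9:
                digit -= 9
        total += digit
    return total % 10 == 0


def shannon_entropy(token: str) -> float:
    """Bits of entropy per character, from the token's character frequencies."""
    if not token:
        return 0.0
    counts = Counter(token)
    length = len(token)
    return -sum((n / length) * math.log2(n / length) for n in counts.values())
```

A Luhn pass only says a digit string is well formed, so it is usually paired with a pattern match, as the Financial Data category notes. Likewise, high entropy hints that a token may be a generated secret, but any cutoff value would need tuning against real traffic.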

PII/PHI Types Reference

Representative sensitive data types, showing whether regex and ML detection apply to each.

Data Type      | Category   | Risk   | Regex | ML
SSN            | Identifier | high   | Yes   | Yes
Full Name      | Identifier | high   | -     | Yes
Email          | Contact    | high   | Yes   | -
Phone          | Contact    | medium | Yes   | -
Credit Card    | Financial  | high   | Yes   | -
Bank Account   | Financial  | high   | Yes   | Yes
Address        | Location   | medium | -     | Yes
Date of Birth  | Identifier | medium | Yes   | Yes
IP Address     | Technical  | medium | Yes   | -
Medical Record | PHI        | high   | Yes   | Yes
API Key        | Credential | high   | Yes   | Yes
Password       | Credential | high   | -     | Yes
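
One way to make the table above usable by a detection pipeline is to encode it as a small registry. The sketch below transcribes the twelve rows as they appear in the table; the `DataType` class and the helper names `detectors_for` and `high_risk_names` are hypothetical and exist only for this illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class DataType:
    name: str
    category: str
    risk: str
    regex: bool  # "Yes" in the Regex column
    ml: bool     # "Yes" in the ML column


# Rows transcribed from the reference table.
DATA_TYPES = [
    DataType("SSN", "Identifier", "high", True, True),
    DataType("Full Name", "Identifier", "high", False, True),
    DataType("Email", "Contact", "high", True, False),
    DataType("Phone", "Contact", "medium", True, False),
    DataType("Credit Card", "Financial", "high", True, False),
    DataType("Bank Account", "Financial", "high", True, True),
    DataType("Address", "Location", "medium", False, True),
    DataType("Date of Birth", "Identifier", "medium", True, True),
    DataType("IP Address", "Technical", "medium", True, False),
    DataType("Medical Record", "PHI", "high", True, True),
    DataType("API Key", "Credential", "high", True, True),
    DataType("Password", "Credential", "high", False, True),
]


def detectors_for(name: str) -> List[str]:
    """Return the detection methods the table marks for a data type."""
    for entry in DATA_TYPES:
        if entry.name.lower() == name.lower():
            methods = []
            if entry.regex:
                methods.append("regex")
            if entry.ml:
                methods.append("ml")
            return methods
    raise KeyError(f"unknown data type: {name}")


def high_risk_names() -> List[str]:
    """Names of all data types the table rates as high risk."""
    return [entry.name for entry in DATA_TYPES if entry.risk == "high"]
```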

Detection Methods

Understanding different approaches to PII/PHI detection.

Pattern Matching (Regex)
Rules-based detection using regular expressions for structured data formats.

Strengths

  • High precision for structured data
  • Fast execution
  • Deterministic, predictable results

Limitations

  • No understanding of surrounding context
  • Cannot detect unstructured PII such as names in free text
  • Patterns need ongoing maintenance

Best for: SSN, credit cards, email, phone numbers
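
As a rough illustration of rules-based detection, the sketch below scans text with three regular expressions. The patterns are simplified assumptions for US-style formats (for example, the SSN pattern checks only the `NNN-NN-NNNN` shape and does not validate number ranges), and the names `PATTERNS` and `find_structured_pii` are made up for this example.

```python
import re

# Simplified patterns; real deployments would need broader format coverage.
PATTERNS = {
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "EMAIL": re.compile(r"\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}\b"),
    "PHONE": re.compile(
        r"(?<!\d)(?:\(\d{3}\)\s?|\d{3}[-.\s])\d{3}[-.\s]\d{4}(?!\d)"
    ),
}


def find_structured_pii(text):
    """Return (label, matched_text, start, end) tuples ordered by position."""
    findings = []
    for label, pattern in PATTERNS.items():
        for match in pattern.finditer(text):
            findings.append((label, match.group(), match.start(), match.end()))
    return sorted(findings, key=lambda finding: finding[2])
```

Calling `find_structured_pii("Reach Jane at jane.doe@example.com or 555-867-5309; SSN 123-45-6789.")` reports the email, phone number, and SSN, but not the name "Jane", which is the limitation listed above.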

Machine Learning (NER)
Named entity recognition models trained to identify PII in unstructured text.

Strengths

  • Handles context and variations
  • Detects unstructured PII
  • Can improve over time with retraining

Limitations

  • Requires training data
  • May produce false positives
  • Computationally intensive

Best for: Names, addresses, free-form text

Hybrid Approach
Combines regex patterns with ML models for comprehensive detection.

Strengths

  • Combines the strengths of both methods
  • Higher recall and precision
  • Handles edge cases

Limitations

  • More complex to implement
  • Requires tuning
  • Higher latency

Best for: Enterprise-grade protection
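
A hybrid detector can be sketched as a regex pass followed by a model pass, with the results merged. In the example below the "model" is a hypothetical stand-in (`toy_name_model`, itself a simple rule) so the code stays self-contained; a real system would plug an NER model into the same slot. The `Finding` type, the fixed scores, and the `min_score` default are all assumptions for illustration.

```python
import re
from typing import Callable, List, NamedTuple


class Finding(NamedTuple):
    label: str
    start: int
    end: int
    score: float
    source: str  # "regex" or "ml"


SSN_RE = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")
TITLE_NAME_RE = re.compile(r"\b(?:Mr|Ms|Dr)\.\s+([A-Z][a-z]+\s+[A-Z][a-z]+)")


def regex_pass(text: str) -> List[Finding]:
    """Structured matches are treated as full-confidence findings."""
    return [
        Finding("SSN", m.start(), m.end(), 1.0, "regex")
        for m in SSN_RE.finditer(text)
    ]


def toy_name_model(text: str) -> List[Finding]:
    """Hypothetical stand-in for an NER model: two capitalized words after a title."""
    return [
        Finding("NAME", m.start(1), m.end(1), 0.8, "ml")
        for m in TITLE_NAME_RE.finditer(text)
    ]


def overlaps(a: Finding, b: Finding) -> bool:
    return a.start < b.end and b.start < a.end


def hybrid_detect(
    text: str,
    model: Callable[[str], List[Finding]] = toy_name_model,
    min_score: float = 0.5,
) -> List[Finding]:
    """Keep all regex findings, then add model findings that clear the threshold."""
    findings = regex_pass(text)
    for candidate in model(text):
        if candidate.score < min_score:
            continue
        # Skip candidates that overlap anything already kept.
        if any(overlaps(candidate, kept) for kept in findings):
            continue
        findings.append(candidate)
    return sorted(findings, key=lambda f: f.start)
```

Running both passes on every input is where the extra latency comes from, and `min_score` is one of the values that would need tuning.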

Risk Classification Matrix

How to prioritize protection based on data sensitivity.

High Risk
Data that can directly lead to identity theft, financial loss, or regulatory violations.
SSN, Credit Card, Bank Account, Medical Records, Passwords, API Keys
Always mask before AI transmission
Medium Risk
Data that could contribute to identification when combined with other information.
Phone Number, Email Address, Date of Birth, IP Address, Employee ID
Mask based on context and policy
Lower Risk
Data with limited sensitivity but still requiring consideration in aggregate.
First Name Only, City/State, Job Title, Organization Name
Monitor and log for audit
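
The matrix above maps naturally onto a lookup from data type to handling action. The sketch below is one possible encoding; the `Action` enum, the dictionary names, and the choice to treat unknown data types as high risk are assumptions, not part of the guide.

```python
from enum import Enum


class Action(Enum):
    MASK = "always mask before AI transmission"
    MASK_PER_POLICY = "mask based on context and policy"
    LOG_ONLY = "monitor and log for audit"


# Tiers transcribed from the risk classification matrix.
RISK_TIER = {
    "SSN": "high",
    "Credit Card": "high",
    "Bank Account": "high",
    "Medical Records": "high",
    "Passwords": "high",
    "API Keys": "high",
    "Phone Number": "medium",
    "Email Address": "medium",
    "Date of Birth": "medium",
    "IP Address": "medium",
    "Employee ID": "medium",
    "First Name Only": "lower",
    "City/State": "lower",
    "Job Title": "lower",
    "Organization Name": "lower",
}

TIER_ACTION = {
    "high": Action.MASK,
    "medium": Action.MASK_PER_POLICY,
    "lower": Action.LOG_ONLY,
}


def action_for(data_type: str, default_tier: str = "high") -> Action:
    """Look up the handling action; unknown types fall back to `default_tier`."""
    tier = RISK_TIER.get(data_type, default_tier)
    return TIER_ACTION[tier]
```

Defaulting unknown types to the strictest tier is a conservative design choice for the example; a different default may suit other policies.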

Implementation Steps

How to implement PII/PHI detection in your organization.

1. Data Discovery
Identify where sensitive data exists in your AI workflows
  • Audit current AI tool usage
  • Map data flows to AI systems
  • Identify data sources
2. Classification
Categorize data by type and risk level
  • Apply data taxonomy
  • Assign risk levels
  • Document data lineage
3. Detection Configuration
Configure detection rules for your data types
  • Enable built-in detectors
  • Create custom patterns
  • Set confidence thresholds
4. Protection Policies
Define how each data type should be handled
  • Set masking rules
  • Configure reveal permissions
  • Establish exceptions
5. Monitoring
Continuously monitor and improve detection accuracy
  • Review detection logs
  • Tune false positives
  • Update patterns
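
Steps 3 and 4 above (confidence thresholds and masking rules) can be illustrated with a small masking routine. This is a sketch under stated assumptions: detections arrive as non-overlapping character spans with a confidence score, masked values are replaced by a bracketed label, and the names `Detection`, `mask_text`, and the 0.7 default threshold are hypothetical.

```python
from typing import List, NamedTuple


class Detection(NamedTuple):
    label: str
    start: int
    end: int
    confidence: float


def mask_text(text: str, detections: List[Detection], threshold: float = 0.7) -> str:
    """Replace each detection at or above `threshold` with a [LABEL] placeholder.

    Assumes the detections do not overlap.
    """
    kept = [d for d in detections if d.confidence >= threshold]
    # Replace from the end of the text so earlier offsets stay valid.
    for d in sorted(kept, key=lambda d: d.start, reverse=True):
        text = text[:d.start] + f"[{d.label}]" + text[d.end:]
    return text
```

Lowering the threshold masks more aggressively at the cost of more false positives, which is the trade-off reviewed in the monitoring step.
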
Automate PII/PHI Detection
Secured AI detects 40+ sensitive data types with high accuracy, protecting your data before it reaches any AI system.