Encryption, hashing, and de-identification are three distinct technical privacy controls that protect personal data from unauthorized disclosure. They differ in reversibility, mechanism, and appropriate use case.
**Encryption** is a reversible transformation that converts plaintext into ciphertext using a cryptographic key. Only an authorized party with the correct key can recover the original data. Examples: AES-256-GCM for data at rest; TLS 1.3 for data in transit.
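**Hashing** is a one-way, deterministic transformation that converts input data into a fixed-length digest. The transformation cannot feasibly be inverted, but the same input always produces the same output, which enables verification without storing the original. Password-hashing functions additionally mix in a per-password salt and a tunable work factor, so their output is repeatable only for a given salt and parameter set. Examples: Argon2id and bcrypt for password storage; SHA-256 for data integrity checks.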
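The two uses of hashing described above can be sketched with the Python standard library alone. This is a minimal illustration, not a recommended production setup. The function names (`sha256_digest`, `hash_password`, `verify_password`) and the default iteration count are assumptions made for this example. PBKDF2-HMAC-SHA256 stands in for Argon2id and bcrypt only because it ships with the standard library; the other two require third-party packages.

```python
import hashlib
import hmac
import os


def sha256_digest(data: bytes) -> str:
    """Deterministic, fixed-length digest suitable for integrity checks."""
    return hashlib.sha256(data).hexdigest()


def hash_password(password: str, *, iterations: int = 600_000) -> tuple[bytes, bytes]:
    """Derive a salted password hash; returns (salt, derived_key).

    PBKDF2 is a stand-in here for Argon2id/bcrypt (an assumption of this sketch).
    """
    salt = os.urandom(16)  # fresh random salt for every password
    derived = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, iterations)
    return salt, derived


def verify_password(password: str, salt: bytes, expected: bytes,
                    *, iterations: int = 600_000) -> bool:
    """Recompute the hash with the stored salt and compare in constant time."""
    candidate = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, iterations)
    return hmac.compare_digest(candidate, expected)


if __name__ == "__main__":
    # Integrity check: identical input gives an identical 64-hex-character digest.
    print(sha256_digest(b"quarterly-report.pdf contents"))

    # Password storage: only the salt and derived key are stored, never the password.
    salt, stored = hash_password("correct horse battery staple")
    print(verify_password("correct horse battery staple", salt, stored))
```

The salt makes two users with the same password store different values, and the iteration count makes each guess deliberately expensive. A bare SHA-256 digest has neither property, which is why it suits integrity checks rather than password storage.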
**De-identification** removes or transforms data elements so that individuals cannot reasonably be identified from the resulting dataset. Techniques range from rule-based (HIPAA Safe Harbor: remove 18 defined identifiers) to statistical (k-anonymity, l-diversity, t-closeness, differential privacy) to substitution-based (tokenization, pseudonymization).
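As a rough illustration of the statistical end of that range, the sketch below measures k-anonymity: the size of the smallest group of records that share the same quasi-identifier values. The records, field names, and the `generalize` rule (truncate ZIP to three digits, bucket age by decade) are hypothetical and chosen only for this example. Real generalization hierarchies depend on the dataset and the applicable rules.

```python
from collections import Counter


def k_anonymity(records, quasi_identifiers):
    """Return k: the size of the smallest group sharing identical quasi-identifier values."""
    if not records:
        return 0
    groups = Counter(tuple(r[q] for q in quasi_identifiers) for r in records)
    return min(groups.values())


def generalize(record):
    """Hypothetical generalization: keep 3 ZIP digits, replace age with a decade range."""
    decade = (record["age"] // 10) * 10
    return {
        **record,
        "zip": record["zip"][:3] + "**",
        "age": f"{decade}-{decade + 9}",
    }


# Illustrative records only; not real data.
records = [
    {"zip": "02138", "age": 34, "diagnosis": "asthma"},
    {"zip": "02139", "age": 37, "diagnosis": "diabetes"},
    {"zip": "02141", "age": 52, "diagnosis": "asthma"},
    {"zip": "02142", "age": 58, "diagnosis": "flu"},
]

if __name__ == "__main__":
    quasi = ["zip", "age"]
    print(k_anonymity(records, quasi))                           # every record is unique
    print(k_anonymity([generalize(r) for r in records], quasi))  # groups of two after generalization
```

Generalization raises k at the cost of precision, which is the utility trade-off that any de-identification technique has to balance against re-identification risk.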
These controls form the technical layer of a privacy architecture. They reduce data sensitivity at storage, transit, and use — and are selected and layered based on whether the original data must remain recoverable, which regulatory obligations apply, and what re-identification risk is acceptable.
Where it stops · what it isn't
- Encryption IS a reversible control requiring active key management. It IS NOT a substitute for access controls or data minimization.
- Hashing IS appropriate for one-way verification (passwords, integrity checks). It IS NOT appropriate as a standalone control for structured PII fields (email, SSN, ZIP) where the small, predictable input space enables rainbow-table and dictionary attacks.
- De-identification IS a transformation intended to reduce re-identification risk to an acceptable threshold. It IS NOT guaranteed anonymization: if re-identification is feasible using auxiliary data, GDPR and equivalent regulations may still apply to the dataset.
- Tokenization and format-preserving encryption (FPE) ARE reversible techniques (substitution and encryption, respectively) that preserve data format for legacy-system compatibility. They ARE NOT irreversible de-identification: the original data remains recoverable via a token vault or decryption key.
- Pseudonymization (replacing direct identifiers with pseudonyms) IS a privacy control recognized by GDPR Article 4(5). It IS NOT full anonymization: pseudonymized data remains personal data under GDPR.
- Key management IS a critical, separate dependency of encryption. Weak or absent key management negates encryption's protection regardless of algorithm strength.
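The limit on hashing structured PII fields can be made concrete. The sketch below recovers a five-digit ZIP code from its unsalted SHA-256 digest by enumerating all 100,000 candidates, then shows that the same enumeration fails against a keyed HMAC when the attacker lacks the key. The helper names are hypothetical, and using HMAC-SHA256 as a keyed pseudonym is an assumption of this example rather than a technique named in the list above. The keyed output is still pseudonymized personal data, and its protection depends entirely on how the key is managed.

```python
import hashlib
import hmac
import secrets


def unsalted_hash(value: str) -> str:
    """Plain SHA-256 of a field value, as a naive 'anonymization' step might compute it."""
    return hashlib.sha256(value.encode("utf-8")).hexdigest()


def reverse_zip_hash(target_digest: str):
    """Try every five-digit string; return the matching ZIP or None."""
    for n in range(100_000):
        candidate = f"{n:05d}"
        if unsalted_hash(candidate) == target_digest:
            return candidate
    return None


def keyed_pseudonym(value: str, key: bytes) -> str:
    """HMAC-SHA256 of the value under a secret key (illustrative pseudonymization)."""
    return hmac.new(key, value.encode("utf-8"), hashlib.sha256).hexdigest()


# The unsalted digest of a ZIP code falls to exhaustive search in well under a second.
leaked_digest = unsalted_hash("90210")
recovered = reverse_zip_hash(leaked_digest)

# With a secret key, the same search over plain hashes finds nothing.
key = secrets.token_bytes(32)
pseudonym = keyed_pseudonym("90210", key)
recovered_from_pseudonym = reverse_zip_hash(pseudonym)

if __name__ == "__main__":
    print(recovered)
    print(recovered_from_pseudonym)
```

An SSN has a larger input space (at most one billion values), but it is still small enough to enumerate on commodity hardware, so the same reasoning applies.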
Connected concepts in the graph
Every cubelet sits in a knowledge graph. Here's what this one connects to.
- PART OF: Privacy Architecture - Technical Privacy Controls (ISACA CDPSE Domain 2)
- REQUIRES: Key Management (rotation, escrow, HSM/KMS infrastructure); Data Classification and Sensitivity Labeling
- ENABLES: Privacy-by-Design Architecture; Regulatory Compliance (GDPR Article 32, HIPAA Security Rule, PCI-DSS, CCPA); Privacy-Enhancing Technologies (PETs): federated learning, differential privacy, synthetic data
- RELATED TO: Access Controls and Identity Management (Technical Privacy Controls); Data Masking and Anonymization
- CONSTRAINS: Data Utility for Analytics and AI/ML Pipelines