Obfuscation: steganography, tokenization, data masking

1.4 Cryptographic solutions

📘CompTIA Security+ SY0-701


Definition:
Obfuscation is the process of making data or code harder to read or understand. The goal is security and privacy: sensitive information stays protected from unauthorized use even if someone obtains a copy of it.

Think of it as “hiding or disguising” information so attackers or unauthorized users cannot easily understand or misuse it.

The Security+ objectives cover three obfuscation techniques:

  1. Steganography
  2. Tokenization
  3. Data masking

1. Steganography

Definition:
Steganography is hiding data within another file so that the presence of the data is not obvious. Unlike encryption, where the data is scrambled but still visible as scrambled content, steganography hides the very existence of the data.

Key Points for Exam:

  • Used to secretly transmit information.
  • The hidden data can be embedded in:
    • Images (e.g., hiding a secret message inside a .png)
    • Audio files (e.g., hiding data in a .mp3 file)
    • Video files
  • Often combined with encryption for extra security.
  • Hidden data is harder to detect than encrypted data, because an observer may not even know it exists.
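The embedding idea can be illustrated with a minimal sketch of least-significant-bit (LSB) hiding, one common steganography technique. It assumes the carrier is a plain byte string standing in for raw pixel data; real image formats such as .png need decoding first, which is omitted here. The function names `embed` and `extract` are hypothetical and not taken from any library.

```python
def embed(carrier: bytes, message: bytes) -> bytes:
    """Hide message in the least significant bit of each carrier byte."""
    # A 4-byte length prefix tells extract() where the message ends.
    payload = len(message).to_bytes(4, "big") + message
    # Break the payload into individual bits, most significant bit first.
    bits = [(byte >> shift) & 1 for byte in payload for shift in range(7, -1, -1)]
    if len(bits) > len(carrier):
        raise ValueError("carrier too small for message")
    out = bytearray(carrier)
    for i, bit in enumerate(bits):
        # Clear the lowest bit, then set it to the payload bit.
        out[i] = (out[i] & 0xFE) | bit
    return bytes(out)


def extract(carrier: bytes) -> bytes:
    """Recover a message previously hidden by embed()."""
    def read_bytes(start_bit: int, count: int) -> bytes:
        value = bytearray()
        for b in range(count):
            byte = 0
            for k in range(8):
                byte = (byte << 1) | (carrier[start_bit + b * 8 + k] & 1)
            value.append(byte)
        return bytes(value)

    length = int.from_bytes(read_bytes(0, 4), "big")
    return read_bytes(32, length)


# Stand-in for raw pixel data (assumption: 512 bytes of arbitrary values).
pixels = bytes(range(256)) * 2
stego = embed(pixels, b"meet at noon")
print(extract(stego))  # b'meet at noon'
```

Each carrier byte changes by at most 1, which is why the altered file looks the same to a casual observer. The message itself is not scrambled, so in practice it would be encrypted before embedding.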

Example in IT context:

  • A system administrator hides sensitive configuration data inside an image file on a server to avoid exposure in logs or accidental leaks.

Exam Tip:

  • Know the difference: encryption = scrambles content so it cannot be read; steganography = hides the fact that the content exists.

2. Tokenization

Definition:
Tokenization replaces sensitive data with a non-sensitive placeholder (token) that has no meaningful value outside the system. The original data is stored securely in a separate system called a token vault.

Key Points for Exam:

  • Tokens cannot be reversed without the token vault.
  • Protects sensitive information like:
    • Credit card numbers
    • Social Security numbers
    • Personal health information
  • Reduces the risk if the system is breached, because attackers only get meaningless tokens.
  • Often used in payment systems and cloud applications.
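The vault lookup described above can be sketched as follows. This is a simplified in-memory illustration, not a production design: the class name `TokenVault`, the `tok_` prefix, and the token length are all assumptions, and a real vault would be a separate, hardened, access-controlled service.

```python
import secrets


class TokenVault:
    """Hypothetical in-memory token vault (illustration only)."""

    def __init__(self) -> None:
        self._token_to_value: dict[str, str] = {}
        self._value_to_token: dict[str, str] = {}

    def tokenize(self, value: str) -> str:
        # Reuse the existing token so one value always maps to one token.
        if value in self._value_to_token:
            return self._value_to_token[value]
        # The token is random, so it has no mathematical link to the value.
        token = "tok_" + secrets.token_hex(8)
        while token in self._token_to_value:  # extremely unlikely collision
            token = "tok_" + secrets.token_hex(8)
        self._token_to_value[token] = value
        self._value_to_token[value] = token
        return token

    def detokenize(self, token: str) -> str:
        # Only a lookup in the vault can recover the original value.
        return self._token_to_value[token]


vault = TokenVault()
card = "4111111111111111"  # well-known test card number, not a real account
token = vault.tokenize(card)
print(token)                    # random each run, e.g. tok_ followed by 16 hex digits
print(vault.detokenize(token))  # 4111111111111111
```

An application database would store only `token`. Without access to the vault's mapping, the token reveals nothing about the card number.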

Example in IT context:

  • A payment system replaces a user’s real credit card number with a token in the database. If hackers access the database, they only get the tokens, not real card numbers.

Exam Tip:

  • Tokenization is not encryption. Encrypted data can be mathematically reversed with the right key, but a token has no mathematical relationship to the original value and can only be resolved through a vault lookup.

3. Data Masking

Definition:
Data masking hides or obscures specific data while keeping its structure and data type intact. The masked values may look realistic, but they are not real, and unlike tokenization there is typically no vault that maps them back to the originals.

Key Points for Exam:

  • Often used in testing or development environments to protect sensitive production data.
  • Methods of masking:
    • Substitution: Replace real data with fake but realistic data (e.g., replacing real names with “User1”, “User2”)
    • Shuffling: Reorder existing values within a column so they no longer match their original records, while keeping the value format
    • Redaction: Hide parts of data (e.g., showing only the last 4 digits of a credit card)
  • Keeps systems functional without exposing sensitive data.

Example in IT context:

  • Developers testing an HR application see employee records, but real names and salaries are replaced with fake values, so no real personal data is exposed.

Exam Tip:

  • Masking = protecting sensitive data in non-production environments while keeping the data usable.

Key Differences Between These Techniques

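| Technique     | Purpose                     | Reversibility                            | Typical Use Case             |
|---------------|-----------------------------|------------------------------------------|------------------------------|
| Steganography | Hide existence of data      | Reversible if the hiding method is known | Covert communication         |
| Tokenization  | Replace sensitive data      | Not reversible without the vault         | Payment data, PHI            |
| Data Masking  | Hide/obscure sensitive data | Not reversible (fake data)               | Testing, training, reporting |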

Exam-Focused Summary

  1. Obfuscation = hiding data to protect it.
  2. Steganography = hide the data in other files.
  3. Tokenization = replace sensitive data with tokens stored elsewhere.
  4. Data masking = replace sensitive data with fake or obscured data while keeping format.
  5. Remember when and why each is used, as questions often focus on use cases and differences.
