How to Redact a PDF for GDPR Compliance
Redacting a PDF for GDPR is not the same as blacking out text for convenience. You must remove or irreversibly obscure personal data so it cannot be recovered, while still disclosing everything the data subject is entitled to see. You must also protect third parties: names, opinions, and identifiers belonging to anyone other than the requester usually need to be redacted before you release documents under a Subject Access Request or other disclosure.
Start with the legal purpose of the document
Before you touch a PDF, clarify why you are redacting. Under access-request rules, the data subject is generally entitled to their own personal data — not everyone else's. Emails, HR files, and case notes often mix the requester's data with colleagues' names, medical details, or legal advice. Your job is to disclose the requester's information while redacting identifiers and sensitive content relating to others, unless a narrow exception applies (and you should document that decision).
For internal investigations, tribunal bundles, or regulator correspondence, the same principle applies: only disclose what the law and your policy allow, and redact the rest with a clear, consistent method.
What to redact in practice
Typical redactions for GDPR-aligned disclosure include:
- Third-party names, direct identifiers, and contact details
- National Insurance numbers, staff IDs, and other unique references
- Opinions or assessments about people other than the data subject
- Legally privileged or confidential material where an exemption applies
- Any data that would unfairly harm another person if disclosed
Redaction must be permanent in the delivered file: "hidden" text or metadata left in a released PDF has led to enforcement action for other organisations. Always verify the exported file before you send it: confirm that text cannot be selected, copied, or searched from behind the black boxes, and that the document metadata does not contain anything you removed.
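One way to make that verification repeatable is a small leak check over the text of the exported file. The sketch below is illustrative only: it assumes you have already extracted the text of each page of the exported PDF with a tool of your choice, and the names `find_leaks`, `exported_pages`, and the sample terms are hypothetical. It flags any redacted term that still appears in the extracted text.

```python
import re


def normalise(text: str) -> str:
    """Lower-case and collapse whitespace so a line break cannot hide a match."""
    return re.sub(r"\s+", " ", text).strip().lower()


def find_leaks(pages: list[str], redacted_terms: list[str]) -> list[tuple[int, str]]:
    """Return (page_number, term) for each redacted term still present in the page text."""
    leaks = []
    for page_number, page_text in enumerate(pages, start=1):
        haystack = normalise(page_text)
        for term in redacted_terms:
            needle = normalise(term)
            if needle and needle in haystack:
                leaks.append((page_number, term))
    return leaks


# Hypothetical text extracted from the exported PDF, one string per page.
exported_pages = [
    "Meeting notes. Attendees: [REDACTED] and the requester.",
    "Follow-up from Jane\nDoe regarding the complaint.",
]

# Terms that were supposed to be removed (assumed sample values).
print(find_leaks(exported_pages, ["Jane Doe", "AB123456C"]))
# prints [(2, 'Jane Doe')]
```

A check like this only covers text that can be extracted. It does not inspect images, scanned pages, or metadata, so it supplements manual review rather than replacing it.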
A sensible workflow
First, inventory which documents you will disclose. Second, on Pro and Team, use AI-powered detection to flag names, addresses, IDs, and other candidates you might miss by eye (on the free tier, mark them manually). Third, have a human review every page, because automation will not catch every edge case. Fourth, apply redactions and export a final PDF. Fifth, keep a short internal record of what was withheld and why (without storing unnecessary copies of unredacted files).
When you use AI detection on Pro and Team, record that processing step and its legal basis in your Article 30 record of processing activities, alongside your review and export steps. See our security page for architectural details you can reference in your DPIA.
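The internal record from the fifth step can be as simple as a CSV log. The following is a minimal sketch, not a prescribed format: the `WithheldItem` fields, file names, and reviewer name are all assumptions chosen for illustration. It records the category and reason for each redaction, not the redacted value itself, so the log does not become another copy of the withheld data.

```python
import csv
import io
from dataclasses import asdict, dataclass, fields


@dataclass
class WithheldItem:
    document: str   # file name or reference of the disclosed document
    page: int       # page on which the redaction was applied
    category: str   # type of content withheld, e.g. "third-party name"
    reason: str     # exemption or rationale relied on
    reviewer: str   # person who approved the redaction


def write_log(items: list[WithheldItem], out) -> None:
    """Write one CSV row per withheld item to a file-like object."""
    writer = csv.DictWriter(out, fieldnames=[f.name for f in fields(WithheldItem)])
    writer.writeheader()
    for item in items:
        writer.writerow(asdict(item))


# Hypothetical entries for a single disclosure pack.
items = [
    WithheldItem("hr-file.pdf", 3, "third-party name",
                 "Identifies a colleague other than the requester", "A. Reviewer"),
    WithheldItem("hr-file.pdf", 7, "legal advice",
                 "Legal professional privilege", "A. Reviewer"),
]

buffer = io.StringIO()
write_log(items, buffer)
print(buffer.getvalue())
```

In practice you would pass an open file instead of the in-memory buffer and store the log under the same retention rules as the rest of the request file.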
Common mistakes to avoid
- Using opaque boxes without removing underlying text (recoverable with copy-paste)
- Forgetting headers, footers, watermarks, and comment threads
- Missing handwritten notes, stamps, and scanned signatures
- Inconsistent treatment of the same third party across documents
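The last mistake in the list above is easy to check mechanically if you track redaction decisions per document. The sketch below assumes a hypothetical structure in which each document maps internal third-party labels (such as `TP-1`) to a treatment of either "redacted" or "disclosed". It reports any third party who was treated differently across the pack.

```python
from collections import defaultdict


def find_inconsistencies(
    decisions: dict[str, dict[str, str]],
) -> dict[str, dict[str, list[str]]]:
    """Return the third parties that received more than one treatment.

    `decisions` maps document name -> {third-party label: treatment}.
    The result maps each inconsistent label -> {treatment: [documents]}.
    """
    by_party: dict[str, dict[str, list[str]]] = defaultdict(lambda: defaultdict(list))
    for document, treatments in decisions.items():
        for party, treatment in treatments.items():
            by_party[party][treatment].append(document)
    return {
        party: dict(treatments)
        for party, treatments in by_party.items()
        if len(treatments) > 1
    }


# Hypothetical decisions for a two-document pack.
decisions = {
    "email-01.pdf": {"TP-1": "redacted", "TP-2": "redacted"},
    "case-notes.pdf": {"TP-1": "disclosed", "TP-2": "redacted"},
}

print(find_inconsistencies(decisions))
# prints {'TP-1': {'redacted': ['email-01.pdf'], 'disclosed': ['case-notes.pdf']}}
```

An inconsistency is not always an error, since an exception may apply to one document and not another, but each flagged case deserves a documented reason.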
Frequently asked questions
Do I need to redact third parties in a DSAR/SAR response?
Usually yes. The data subject is generally entitled to their own personal data, not to other people's. Colleagues' names, customer details, and similar identifiers often must be redacted unless a specific legal exception applies. Your DPO or counsel should confirm for each pack.
Is hiding text behind a black box enough for GDPR?
Not if the underlying text can still be copied or searched. Redaction for disclosure should be permanent in the file you hand over. Always verify the exported PDF does not leak hidden text or metadata you intended to remove.
Can Ghost redact my PDF with strong privacy controls?
Yes. On Pro and Team, AI-powered detection analyses documents and suggests redactions. The free tier is a manual redaction tool. See our security page for how document processing works.