Remove metadata from a Word document before filing

A Word document's DOCX container holds two metadata files: docProps/core.xml (title, creator, last-modified-by, revision count, timestamps) and docProps/app.xml (company name, manager, template). Values from these files survive a "Save As PDF" in most applications, so the resulting PDF can inherit fields like the name of the attorney who drafted it and the firm's internal template name.
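To see what those two files hold for a given document, a few lines of standard-library Python are enough. The sketch below is illustrative only and is not PrepFile's code; the function name read_docx_properties and the dictionary keys are made up for this example.

```python
import zipfile
import xml.etree.ElementTree as ET

# XML namespaces used by the two property parts of a DOCX package.
NS = {
    "cp": "http://schemas.openxmlformats.org/package/2006/metadata/core-properties",
    "dc": "http://purl.org/dc/elements/1.1/",
    "dcterms": "http://purl.org/dc/terms/",
    "ep": "http://schemas.openxmlformats.org/officeDocument/2006/extended-properties",
}

# Friendly key -> element name, per part. The keys are arbitrary labels.
CORE_FIELDS = {
    "title": "dc:title",
    "creator": "dc:creator",
    "last_modified_by": "cp:lastModifiedBy",
    "revision": "cp:revision",
    "created": "dcterms:created",
    "modified": "dcterms:modified",
}
APP_FIELDS = {
    "company": "ep:Company",
    "manager": "ep:Manager",
    "template": "ep:Template",
}


def read_docx_properties(docx):
    """Return the non-empty metadata fields of a DOCX (path or file object)."""
    found = {}
    with zipfile.ZipFile(docx) as zf:
        names = set(zf.namelist())
        for part, fields in (("docProps/core.xml", CORE_FIELDS),
                             ("docProps/app.xml", APP_FIELDS)):
            if part not in names:
                continue  # either part may be absent from the package
            root = ET.fromstring(zf.read(part))
            for key, path in fields.items():
                el = root.find(path, NS)
                if el is not None and el.text:
                    found[key] = el.text
    return found
```

Calling read_docx_properties("brief.docx") (the filename is a placeholder) returns a plain dictionary, which makes it easy to spot an unexpected author or company name before the file leaves the office.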

Beyond static metadata, DOCX files can carry tracked changes (insertions and deletions recorded in word/document.xml) and a word/comments.xml part with reviewer notes. If you export to PDF after all tracked changes have been accepted or rejected, the PDF body is clean. But sending a DOCX that still contains tracked changes exposes the full revision history.
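A quick way to check a file for both is to count the w:ins and w:del elements in word/document.xml and look for a comments part in the package. This is a minimal sketch using only the Python standard library; scan_revisions is a hypothetical name, and the counts are raw marker counts, not a count of distinct edits.

```python
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML main namespace (the "w:" prefix in document.xml).
W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def scan_revisions(docx):
    """Count tracked-change markers and detect a comments part in a DOCX."""
    with zipfile.ZipFile(docx) as zf:
        names = set(zf.namelist())
        root = ET.fromstring(zf.read("word/document.xml"))
    return {
        # iter() walks the whole tree, so nested markers are counted too.
        "insertions": sum(1 for _ in root.iter(f"{{{W}}}ins")),
        "deletions": sum(1 for _ in root.iter(f"{{{W}}}del")),
        "has_comments": "word/comments.xml" in names,
    }
```

A non-zero count means the DOCX still holds revision history and should go back through Word before it is sent to anyone.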

PrepFile's metadata scrubber unzips the DOCX container, reads both properties files, scans document.xml for tracked change markers (w:ins and w:del tags), checks for comments.xml, produces a report, clears the author/company/revision fields, and removes comments.xml. Tracked changes are flagged in the report but not removed — only accepting or rejecting them in Word does that correctly.
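The field-clearing step can be approximated as follows. This is a rough sketch under stated assumptions, not PrepFile's implementation: the names scrub_properties, clear_fields and TARGETS are invented here, resetting the revision to "1" rather than leaving it empty is this example's choice, and the file is assumed to use the conventional namespace prefixes. Comment removal is left out on purpose, because the comments part is also referenced from the package relationships, the content-types list and the document body, so dropping the file alone is not enough.

```python
import zipfile
import xml.etree.ElementTree as ET

CP = "http://schemas.openxmlformats.org/package/2006/metadata/core-properties"
DC = "http://purl.org/dc/elements/1.1/"
EP = "http://schemas.openxmlformats.org/officeDocument/2006/extended-properties"

# Keep the conventional prefixes on output. core.xml carries attribute values
# such as xsi:type="dcterms:W3CDTF" that only resolve if the prefix survives.
for _prefix, _uri in {
    "cp": CP,
    "dc": DC,
    "dcterms": "http://purl.org/dc/terms/",
    "xsi": "http://www.w3.org/2001/XMLSchema-instance",
    "vt": "http://schemas.openxmlformats.org/officeDocument/2006/docPropsVTypes",
    "": EP,  # app.xml uses this as its default namespace
}.items():
    ET.register_namespace(_prefix, _uri)

# Part name -> {element tag: replacement text}. Assumption of this sketch:
# the revision counter is reset to "1" instead of being emptied.
TARGETS = {
    "docProps/core.xml": {
        f"{{{DC}}}creator": "",
        f"{{{CP}}}lastModifiedBy": "",
        f"{{{CP}}}revision": "1",
    },
    "docProps/app.xml": {
        f"{{{EP}}}Company": "",
        f"{{{EP}}}Manager": "",
    },
}


def clear_fields(xml_bytes, replacements):
    """Overwrite the text of the listed elements and re-serialize the part."""
    root = ET.fromstring(xml_bytes)
    for tag, value in replacements.items():
        for el in root.iter(tag):
            el.text = value
    return ET.tostring(root, encoding="utf-8", xml_declaration=True)


def scrub_properties(src, dst):
    """Copy the package from src to dst with identifying fields cleared."""
    with zipfile.ZipFile(src) as zin, \
            zipfile.ZipFile(dst, "w", zipfile.ZIP_DEFLATED) as zout:
        for item in zin.infolist():  # preserves the original part order
            data = zin.read(item.filename)
            if item.filename in TARGETS:
                data = clear_fields(data, TARGETS[item.filename])
            zout.writestr(item.filename, data)
```

The sketch writes a new file rather than editing in place, so the original stays untouched if anything goes wrong. Both arguments can be paths or file objects.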

Open the tools — free, no upload

Pre-filing DOCX checklist

  1. Drop the DOCX into the Metadata Scrubber tool.
  2. Read the report: note whether tracked changes or a company name appeared.
  3. Open the cleaned DOCX in Word. If tracked changes were found, accept or reject them (Review → Accept All), then re-export to PDF.
  4. Run the exported PDF through the E-Filing Checker to confirm page size and searchability.

Questions

Why doesn't the scrubber remove tracked changes?

Tracked changes are embedded in the main document body XML in a way that requires understanding the document structure to resolve correctly. Blindly deleting the markup tags can leave malformed XML or the wrong text in the body. Accept or reject all changes in Word first, then strip metadata from the resulting clean document.

Does "Save As PDF" in Word carry the metadata over?

Yes. The PDF Info dictionary is populated from the DOCX properties unless you explicitly clear them before export. Scrub the DOCX first, or scrub the resulting PDF with PrepFile's PDF mode.

What is the revision number?

Word increments an internal revision counter every time the document is saved. A high number (e.g. 147) signals extensive revision history. Courts don't read it, but opposing counsel reviewing a produced DOCX could note it.
