Defending collection processes in court

Forensically sound data collection refers to the process by which data is collected for ediscovery without any changes to the data or its metadata. Collection itself is part of the second stage of the ediscovery process, in which data marked for preservation is collected into a repository for later processing, review, analysis, and production. Collection also reinforces the earlier stage of preservation: once collected, data remains available for later stages of discovery and is protected from inadvertent deletion or modification.

To be forensically sound, a data collection process must be defensible, meaning that it is consistent, repeatable, and well documented. A forensically sound data collection process should be accompanied by an audit trail describing every step taken to collect electronically stored information (ESI). The collected data should also be authenticated, meaning there is proof that it is the same data the litigant used, unchanged from its original state. In short, the entire data collection process should be correct and explainable so that it can withstand scrutiny in a court of law.
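As a rough illustration of how an audit trail and authentication can work together, the sketch below copies a single file into a repository, hashes it before and after the copy, and appends a record to a log. The function names, the JSON Lines log format, and the `collector` field are assumptions made for this example, not a prescribed standard; SHA-256 is used here as one commonly accepted hash. Note that reading a file to hash it may itself update its last-accessed time on some file systems, which is one reason real collections typically rely on purpose-built tools.

```python
import hashlib
import json
import shutil
from datetime import datetime, timezone
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def collect_file(source: Path, repository: Path, audit_log: Path, collector: str) -> dict:
    """Copy one file into the repository and append an audit record (hypothetical format)."""
    repository.mkdir(parents=True, exist_ok=True)
    source_hash = sha256_of(source)
    target = repository / source.name
    shutil.copy2(source, target)  # copy2 also attempts to preserve timestamps
    record = {
        "collected_at": datetime.now(timezone.utc).isoformat(),
        "collector": collector,
        "source": str(source),
        "target": str(target),
        "source_sha256": source_hash,
        "target_sha256": sha256_of(target),
    }
    # Matching digests are the evidence that the copy is unchanged.
    record["verified"] = record["source_sha256"] == record["target_sha256"]
    with open(audit_log, "a", encoding="utf-8") as log:
        log.write(json.dumps(record) + "\n")
    return record
```

Each call adds one line to the log, so the log can later be read back to show who collected which file, when, and whether the copy's hash matched the original.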

One common method of forensically sound data collection is forensic imaging of a subject drive or storage device. Terminology varies, but a logical copy of a folder or drive generally includes only the accessible files. This is the copy that a regular user would make; it doesn’t capture deleted or hidden files. Physical imaging is more rigorous: it produces a bitstream, or bit-by-bit, copy of an entire drive, including any deleted or hidden files that a logical copy would miss.
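The idea of a bitstream copy can be sketched in a few lines: read the source as raw bytes from start to finish, write every byte to an image file, and hash the stream along the way. This is only a simplified illustration. In practice the source would be a raw device accessed through a write blocker with dedicated imaging tools; here an ordinary file stands in for the drive, and the function names are hypothetical.

```python
import hashlib
from pathlib import Path


def image_bitstream(source: Path, image: Path, chunk_size: int = 1 << 20) -> tuple[str, int]:
    """Copy every byte of source into image; return (sha256 hex digest, byte count)."""
    digest = hashlib.sha256()
    total = 0
    with open(source, "rb") as src, open(image, "wb") as dst:
        while True:
            chunk = src.read(chunk_size)
            if not chunk:
                break
            dst.write(chunk)
            digest.update(chunk)  # hash exactly the bytes written to the image
            total += len(chunk)
    return digest.hexdigest(), total


def verify_image(image: Path, expected_sha256: str, chunk_size: int = 1 << 20) -> bool:
    """Re-hash the image and compare it with the digest recorded at acquisition."""
    digest = hashlib.sha256()
    with open(image, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest() == expected_sha256
```

Because the copy works on raw bytes rather than on files listed by the file system, nothing is skipped on the grounds that it is deleted or hidden; the recorded digest then lets anyone confirm later that the image has not changed.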

In small, straightforward cases, self-collection by the owner of the data may be appropriate and even preferable to a more expensive forensically sound data collection. But merely accessing a file can be sufficient to modify its metadata, which may raise doubts about the validity of the production.

Therefore, forensically sound data collection is preferred in high-stakes cases or those where a party may be accused of spoliation. Additionally, complex ESI—including website information, encrypted data, and archived data—should be forensically collected by the IT department or a specialized vendor or partner to ensure that data and metadata are not inadvertently modified.
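To see how easily routine handling alters metadata, the following sketch backdates a file's modification time and then copies it two ways using Python's standard library. `shutil.copy` copies content and permission bits, so the copy receives a fresh modification time; `shutil.copy2` additionally attempts to carry the timestamps over. The helper name and the sample date are assumptions for the demonstration, and even `copy2` cannot preserve all metadata on every platform (for example, owner, ACLs, or creation time), so it is not a substitute for a forensic tool.

```python
import os
import shutil
import tempfile
from pathlib import Path


def compare_copy_methods(original_mtime: float = 1_000_000_000.0) -> dict:
    """Copy a file two ways and report the modification time each copy ends up with."""
    workdir = Path(tempfile.mkdtemp())
    source = workdir / "source.txt"
    source.write_text("draft agreement")
    # Backdate the access and modification times (1e9 seconds is September 2001).
    os.utime(source, (original_mtime, original_mtime))

    plain = workdir / "plain_copy.txt"
    preserved = workdir / "preserved_copy.txt"
    shutil.copy(source, plain)       # content and permission bits only
    shutil.copy2(source, preserved)  # also attempts to copy timestamps

    return {
        "source": source.stat().st_mtime,
        "plain_copy": plain.stat().st_mtime,
        "preserved_copy": preserved.stat().st_mtime,
    }


if __name__ == "__main__":
    for name, mtime in compare_copy_methods().items():
        print(f"{name}: {mtime}")
```

The plain copy reports the time of the copy rather than the original date, which is the kind of discrepancy that can raise questions about a production.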

For an example of how not to conduct a forensically sound data collection, see Leidig v. BuzzFeed, Inc., No. 16 Civ. 542 (VM) (GWG) (S.D.N.Y. Dec. 19, 2017). In that case, the plaintiffs produced screenshots of websites and other documents with incorrect or missing metadata. The plaintiffs’ witness admitted that he “inadvertently changed or deleted the metadata” for some files when he tried to move them to a hard drive for production. The court imposed sanctions on the plaintiffs for their “amateurish collection” efforts.

Glossary definition

Forensically sound data collection refers to the process by which ESI is collected for ediscovery without any alteration or destruction of either the data or its metadata. To be forensically sound, the collection process must be defensible: consistent, repeatable, well documented, and authenticated.