Processing Specifications

How We Ensure Data Integrity with Unique Control Numbers

Below is an overview of our standard specifications for processing Electronically Stored Information (ESI). These standards may change from time to time as technology advances and workflows improve. If you have any questions or concerns about the specifications defined here, please contact our professional services team at pm@everestdiscovery.com.

Numbering Settings

Numbering Type:

  • Documents will receive a unique identifier during data processing. This unique identifier is referred to as the “Control Number”.
  • By default, the Control Number is prefixed with “REL” followed by a 7-digit, zero-padded number (e.g., REL0000001).
  • The Control Number will be applied to each record with the next available number for that prefix.
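The default format described above can be illustrated with a short Python sketch. This is only an illustration under stated assumptions, not the processing engine's implementation; the function name `control_numbers` and its parameters are hypothetical.

```python
from itertools import count

def control_numbers(prefix="REL", digits=7, start=1):
    """Yield Control Numbers for one prefix: REL0000001, REL0000002, ..."""
    for n in count(start):
        # Zero-pad the running number to the configured width.
        yield f"{prefix}{n:0{digits}d}"

generator = control_numbers()
first_three = [next(generator) for _ in range(3)]
print(first_three)  # ['REL0000001', 'REL0000002', 'REL0000003']
```

Each call to `next` hands out the next available number for that prefix, which mirrors how each record receives the next free Control Number.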

Parent/Child Numbering

  • Child records always receive sequential Control Numbers immediately following their parent record. The only exception is when a retried exception file is published to a workspace: in that case, the retried children receive the parent’s Control Number with a suffix.
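To make the ordering concrete, here is a minimal Python sketch of parent/child numbering. The function names, the record layout, and in particular the suffix format used for retried children (`_0001`) are assumptions made for illustration; the actual suffix format is not specified in this document.

```python
def number_families(families, prefix="REL", digits=7, start=1):
    """Number each parent, then its children, with consecutive Control Numbers."""
    numbered, n = [], start
    for parent, children in families:
        for name in [parent, *children]:
            numbered.append((f"{prefix}{n:0{digits}d}", name))
            n += 1
    return numbered

def number_retried_child(parent_control_number, index, delimiter="_"):
    # Hypothetical suffix format: the parent's number plus a delimiter and index.
    return f"{parent_control_number}{delimiter}{index:04d}"

families = [("email1.msg", ["a.docx", "b.xlsx"]), ("memo.pdf", [])]
numbered = number_families(families)
for control_number, name in numbered:
    print(control_number, name)

retried = number_retried_child("REL0000001", 1)
print(retried)  # REL0000001_0001
```

In this sketch `email1.msg` receives REL0000001, its two attachments receive REL0000002 and REL0000003, and the next parent continues at REL0000004.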

Global Deduplication

Global deduplication is applied on promotion to review by default. This means that only one copy of each parent record recognized during data processing is promoted to each workspace (along with its associated attachments).

Deduplication in Relativity is applied only to Level 1 non-container parent files. If a child file (Level 2+) has the same processing duplicate hash as a parent file or another child file, it is not deduplicated and is still published to Relativity. This is done to preserve family integrity.
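The parent-level behavior can be sketched in Python as follows. This is a simplified illustration, not Relativity's implementation: the record layout (dictionaries with `name` and `hash` keys) and the function name `global_dedupe` are assumptions made for the example.

```python
def global_dedupe(families):
    """Keep the first family seen for each Level 1 parent hash.

    Child hashes are never compared, so families stay intact.
    """
    seen, kept = set(), []
    for family in families:
        parent_hash = family["parent"]["hash"]
        if parent_hash in seen:
            continue  # duplicate parent: the whole family is suppressed
        seen.add(parent_hash)
        kept.append(family)
    return kept

families = [
    {"parent": {"name": "a.msg", "hash": "h1"},
     "children": [{"name": "x.pdf", "hash": "h9"}]},
    {"parent": {"name": "a_copy.msg", "hash": "h1"},
     "children": [{"name": "x.pdf", "hash": "h9"}]},
    {"parent": {"name": "b.msg", "hash": "h2"},
     "children": [{"name": "x.pdf", "hash": "h9"}]},
]
kept = global_dedupe(families)
kept_names = [family["parent"]["name"] for family in kept]
print(kept_names)  # ['a.msg', 'b.msg']
```

Note that `b.msg` keeps its attachment even though an attachment with the same hash was already published under `a.msg`, because only parent hashes are compared.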

Please contact our professional services team should you have a need to apply deduplication on a custodial level, or not at all.

Timezone

All files are processed in Coordinated Universal Time (UTC) by default. The time zone setting determines how dates and times are displayed on a processed document. It also sets the default time zone on the processing data sources that you create and then associate with a processing set. The default time zone is applied from the processing profile during the discovery stage.

Applying a standard time zone can help to normalize data sets when custodians reside in different regions. Please contact our professional services team should a need exist to change the default time zone.
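As a small illustration of why normalization helps, the Python sketch below converts two local timestamps from custodians in different regions to UTC. The function name `normalize_to_utc` and the fixed UTC offsets are assumptions for the example; real data would also need daylight saving rules, which fixed offsets do not capture.

```python
from datetime import datetime, timedelta, timezone

def normalize_to_utc(local_time, utc_offset_hours):
    """Attach a fixed UTC offset to a naive local timestamp and convert it to UTC."""
    source = timezone(timedelta(hours=utc_offset_hours))
    return local_time.replace(tzinfo=source).astimezone(timezone.utc)

# The same email, recorded at 9:00 in New York (UTC-5) and 14:00 in London (UTC+0).
sent_new_york = normalize_to_utc(datetime(2023, 1, 15, 9, 0), -5)
sent_london = normalize_to_utc(datetime(2023, 1, 15, 14, 0), 0)

print(sent_new_york.isoformat())  # 2023-01-15T14:00:00+00:00
print(sent_new_york == sent_london)  # True
```

Once both values are expressed in UTC, the two records sort and compare as the same moment in time.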

Embedded Objects

All child files (attachments, embedded objects, images, and other non-parent files) recognized during discovery are extracted during data processing with the exception of the following:

  • Microsoft Embedded Images
  • Email inline Images

Some objects from specific file types may not be extractable during data processing. Please contact our team should you have any concerns, or wish to change the embedded object extraction behavior.

Extraction Settings

Email files will be output as MSG files for all Outlook, Lotus Notes, and Bloomberg file types. Text will be extracted from Excel, PowerPoint, and Word documents using the file’s native application. By default, OCR will be performed in English on records from which no text could be extracted.

DeNIST

All files found on the National Institute of Standards and Technology (NIST) list are removed prior to processing to ensure they are not promoted to review. Relativity makes new versions of the NIST list available shortly after the National Software Reference Library (NSRL) releases them quarterly, so the list will change over time.

By default, DeNIST’ing will not break any parent/child groups, regardless of whether the files are on the NIST list. Please let our professional services team know if there is a need to disable the DeNIST functionality during processing.
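One plausible reading of this default is that only loose files (files without a family) are removed when they appear on the NIST list, while families are always kept whole. The Python sketch below illustrates that reading only; the record layout and the function name `denist` are assumptions, and the hash values are placeholders.

```python
def denist(families, nist_hashes):
    """Remove loose files found on the NIST list; never split a family."""
    kept = []
    for family in families:
        is_loose = not family["children"]
        if is_loose and family["parent"]["hash"] in nist_hashes:
            continue  # loose system file on the NIST list: removed
        kept.append(family)  # families are kept intact, whatever their hashes
    return kept

nist_hashes = {"n1", "n2"}  # placeholder hashes standing in for the NIST list
families = [
    {"parent": {"name": "notepad.exe", "hash": "n1"}, "children": []},
    {"parent": {"name": "report.msg", "hash": "h1"},
     "children": [{"name": "logo.dll", "hash": "n2"}]},
    {"parent": {"name": "memo.docx", "hash": "h2"}, "children": []},
]
remaining = [family["parent"]["name"] for family in denist(families, nist_hashes)]
print(remaining)  # ['report.msg', 'memo.docx']
```

Here `logo.dll` survives even though its hash is on the list, because removing it would break the `report.msg` family.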

Metadata Fields

The following metadata fields are mapped to our ECA and Review workspaces by default and will be extracted where available. Please let us know should you have any questions on the below:

| Name | Field Type | Is Relational |
| --- | --- | --- |
| MD5 Hash | Fixed-Length Text | Yes |
| Family Group | Fixed-Length Text | Yes |
| Conversation Index | Long Text | No |
| Created Date/Time | Date | No |
| Last Modified Date/Time | Date | No |
| Email Received Date/Time | Date | No |
| Email Sent Date/Time | Date | No |
| Delivery Receipt Requested | Yes/No | No |
| Control Number Beg Attach | Fixed-Length Text | No |
| Control Number End Attach | Fixed-Length Text | No |
| File Extension | Fixed-Length Text | No |
| Email BCC | Long Text | No |
| Email CC | Long Text | No |
| Email From | Fixed-Length Text | No |
| Email To | Long Text | No |
| File Name | Fixed-Length Text | No |
| File Size | Decimal | No |
| Number of Attachments | Whole Number | No |
| Sort Date/Time | Date | No |
| Conversation Family | Fixed-Length Text | No |
| Attachment List | Long Text | No |
| Last Printed Date/Time | Date | No |
| File Type | Fixed-Length Text | No |
| Extracted Text Size in KB | Decimal | No |
| Email Subject | Long Text | No |
| Lotus Notes Other Folders | Long Text | No |
| All Custodians | Multiple Object | No |
| All Paths/Locations | Long Text | No |
| Attachment Document IDs | Long Text | No |
| Conversation | Long Text | No |
| Created Date | Long Text | No |
| Created Time | Long Text | No |
| DeDuped Custodians | Multiple Object | No |
| DeDuped Paths | Long Text | No |
| Title | Long Text | No |
| Author | Fixed-Length Text | No |
| Email BCC (SMTP Address) | Long Text | No |
| Email CC (SMTP Address) | Long Text | No |
| Child MD5 Hash Values | Long Text | No |
| Child SHA1 Hash Values | Long Text | No |
| Child SHA256 Hash Values | Long Text | No |
| Comments | Long Text | No |
| Company | Fixed-Length Text | No |
| Contains Embedded Files | Yes/No | No |
| Last Saved Date/Time | Date | No |
| Document Subject | Long Text | No |
| Unprocessable | Yes/No | No |
| Unified Title | Long Text | No |
| Track Changes | Yes/No | No |
| Email Created Date/Time | Date | No |
| Email Last Modified Date/Time | Date | No |
| Image Taken Date/Time | Date | No |
| Last Accessed Date/Time | Date | No |
| Meeting End Date/Time | Date | No |
| Meeting Start Date/Time | Date | No |
| Primary Date/Time | Date | No |
| Email Store Name | Fixed-Length Text | No |
| Last Saved By | Fixed-Length Text | No |
| MS Office Document Manager | Fixed-Length Text | No |
| MS Office Revision Number | Fixed-Length Text | No |
| Message ID | Fixed-Length Text | No |
| Original Author Name | Fixed-Length Text | No |
| Email Original Author | Fixed-Length Text | No |
| Original File Extension | Fixed-Length Text | No |
| Parent Document ID | Fixed-Length Text | No |
| SHA1 Hash | Fixed-Length Text | No |
| SHA256 Hash | Fixed-Length Text | No |
| Email Sender Name | Fixed-Length Text | No |
| Email Recipient Domains (BCC) | Multiple Object | No |
| Email Recipient Domains (CC) | Multiple Object | No |
| Email Recipient Domains (To) | Multiple Object | No |
| Email Sender Domain | Multiple Object | No |
| Email Format | Single Choice | No |
| Email Sensitivity | Single Choice | No |
| Importance | Single Choice | No |
| Media Type | Single Choice | No |
| Message Class | Single Choice | No |
| Message Type | Single Choice | No |
| Outlook Flag Status | Single Choice | No |
| Password Protected | Single Choice | No |
| Record Type | Single Choice | No |
| Text Extraction Method | Single Choice | No |
| Email Recipient Count | Whole Number | No |
| Email Has Attachments | Yes/No | No |
| Email Modified Flag | Yes/No | No |
| Email Sent Flag | Yes/No | No |
| Email Unread | Yes/No | No |
| Excel Hidden Columns | Yes/No | No |
| Excel Hidden Rows | Yes/No | No |
| Excel Hidden Worksheets | Yes/No | No |
| Excel Pivot Tables | Yes/No | No |
| Has Hidden Data | Yes/No | No |
| Has OCR Text | Yes/No | No |
| Is Embedded | Yes/No | No |
| Is Parent | Yes/No | No |
| PowerPoint Hidden Slides | Yes/No | No |
| Email Read Receipt Requested | Yes/No | No |
| Speaker Notes | Yes/No | No |
| Suspect File Extension | Yes/No | No |
| Document Title | Long Text | No |
| Email Categories | Long Text | No |
| Email Entry ID | Long Text | No |
| Email Folder Path | Long Text | No |
| Email In Reply To ID | Long Text | No |
| Email From (SMTP Address) | Fixed-Length Text | No |
| Keywords | Long Text | No |
| Last Accessed Date | Long Text | No |
| Last Accessed Time | Long Text | No |
| Last Modified Date | Long Text | No |
| Last Modified Time | Long Text | No |
| Last Printed Date | Long Text | No |
| Last Printed Time | Long Text | No |
| Last Saved Date | Long Text | No |
| Last Saved Time | Long Text | No |
| Meeting End Date | Long Text | No |
| Meeting End Time | Long Text | No |
| Meeting Start Date | Long Text | No |
| Meeting Start Time | Long Text | No |
| Message Header | Long Text | No |
| Native File | Long Text | No |
| Other Metadata | Long Text | No |
| Email Received Date | Long Text | No |
| Email Received Time | Long Text | No |
| Email Recipient Name (To) | Long Text | No |
| Email Sent Date | Long Text | No |
| Email Sent Time | Long Text | No |
| Source Path | Long Text | No |
| Email To (SMTP Address) | Long Text | No |
| Password | Long Text | No |

Search Index

A dtSearch index will be created in the ECA or Review workspace and populated with the Extracted/OCR text generated during data processing. This Extracted/OCR text is the only text that will be searchable in the workspace. If you need more than the extracted text included in the dtSearch index, please reach out to our professional services team.

Disclaimer

This documentation is not an all-inclusive list of all settings and options available during data processing, but does detail many common processing specifications for ESI. Please reach out to our professional services team should you require any additional information related to the processing specifications for your project.