Glossary
The following glossary provides general definitions for many e-discovery terms. It is not intended to provide complete technical or legal definitions, which are provided by other professional organizations.
An additional glossary is provided by The Sedona Conference®.
Active Data
Data that is directly available to the operating system and/or application software.
Active Records
Records residing in storage that is currently being used in day-to-day processes.
Amended Federal Rules of Civil Procedure
See "Federal Rules of Civil Procedure"
Analysis
The process of determining relevancy of electronic discovery materials through evaluation based on the variables of the case.
Application or Application Software
See "Software Application."
Archival data
Digital information that is retained for long term storage, not immediately available and often stored on removable media.
Archive
A copy of data on a computer drive or on a portion of a drive, maintained for historical reference.
Attachment
A memorandum, letter, spreadsheet or any other electronic document appended to another document or e-mail.
Attribute
A data characteristic that identifies it, such as type, length or location.
Audit Log / Audit Trail
A chronological record of users' behavior: when they logged in, time engaged in specific activities, attempted security breaches.
Author
A person or position who originated a document. Sometimes software can automatically capture the author.
Backup
A copy of active data, intended for use in restoration of data.
Backup Tape
Magnetic tape used to store copies of ESI for restoration or recovery purposes.
Batch file
Instructions defined within a file used to instruct a computer program to perform a function or series of functions.
Bates number
Sequential numbering used to track documents and images in data sets. Each page has a unique number.
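As a minimal sketch of the numbering idea, the following assigns one sequential identifier per page. The prefix "ABC", the six-digit zero padding and the function name are illustrative assumptions, not a required format.

```python
def bates_numbers(prefix, start, page_count, width=6):
    """Return one sequential, zero-padded identifier per page."""
    return [f"{prefix}{n:0{width}d}" for n in range(start, start + page_count)]

# Hypothetical three-page document starting at number 1.
labels = bates_numbers("ABC", 1, 3)
print(labels)  # ['ABC000001', 'ABC000002', 'ABC000003']
```

The next document in the set would start at 4, so that no page number is reused.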
Bit/Bit Map
The smallest unit of computer data. There are 8 bits in a byte.
Boolean Search
Boolean refers to a system of logic developed by the 19th-century mathematician George Boole. In Boolean searching, an "and" operator between two words results in a search for documents containing both of the words. An "or" operator between two words creates a search for documents containing either of the target words. A "not" operator between two words creates a search result containing the first word but excluding the second.
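The three operators can be sketched with set operations. The tiny document collection and the helper name below are made up for illustration; a real search engine would use an index rather than scanning text.

```python
# Hypothetical document collection: ID -> text.
docs = {
    "doc1": "merger agreement draft",
    "doc2": "merger press release",
    "doc3": "lunch agreement",
}

def containing(word):
    """Return the set of document IDs whose text contains the word."""
    return {doc_id for doc_id, text in docs.items() if word in text.split()}

both = containing("merger") & containing("agreement")        # "and"
either = containing("merger") | containing("agreement")      # "or"
first_only = containing("merger") - containing("agreement")  # "not"

print(sorted(both))        # ['doc1']
print(sorted(either))      # ['doc1', 'doc2', 'doc3']
print(sorted(first_only))  # ['doc2']
```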
Business Risk Management
A structured approach to managing uncertainty related to a threat, through a sequence of human activities including risk assessment, development of strategies to manage the risk and mitigation of the risk using managerial resources.
Byte
The basic measurement of computer data; consists of 8 bits.
Cache
A form of high-speed memory used to temporarily store frequently accessed information. Once the information is stored, it can be retrieved quickly from memory rather than from the hard drive.
Case De-Duplication
Retains only single copies of documents per case. For example, if an identical document resides with Mr. A, Mr. B and Mr. C, only the first occurrence of the file will be saved (Mr. A's). Contrast with Custodian De-Duplication and Production De-Duplication.
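A minimal sketch of the example above, assuming that "identical" means the file contents hash to the same value. The custodians, contents and function name are hypothetical.

```python
import hashlib

# Hypothetical collection: (custodian, file contents) pairs in collection order.
collection = [
    ("Mr. A", b"Q3 forecast"),
    ("Mr. B", b"Q3 forecast"),
    ("Mr. C", b"Q3 forecast"),
    ("Mr. B", b"Board minutes"),
]

def case_dedupe(items):
    """Keep only the first occurrence of each distinct document in the case."""
    seen = set()
    kept = []
    for custodian, content in items:
        digest = hashlib.sha256(content).hexdigest()
        if digest not in seen:
            seen.add(digest)
            kept.append((custodian, content))
    return kept

case_kept = case_dedupe(collection)
print(case_kept)  # [('Mr. A', b'Q3 forecast'), ('Mr. B', b'Board minutes')]
```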
Chain of Custody
Documentation regarding possession, movement and location of evidence from the time it was obtained to the time it is presented in court.
Chain of Custody Procedure
Procedure that specifies how evidence is to be moved from location to location to preserve its integrity and prove to the court that the evidence has not been altered.
Clawback Agreement
An agreement that sets forth procedures to protect against waiver of privilege due to inadvertent production of documents or data.
Client
Computer system that requests services from another computer system.
Cluster
In operating systems that use a file allocation table (FAT) architecture, the smallest unit of storage space required for data written to a drive. Also called an allocation unit.
Collection
The process of gathering electronically stored information.
Compound document
A file that combines more than one document into one by embedding objects or linked data. Data may be from different applications.
Compression
A technology for storing data in fewer bits, making data smaller so that less disk space is needed to represent the same information. Compression programs like WinZip and UNIX compress are valuable to network users because they save both time and bandwidth. Data compression is also widely used in backup utilities, spreadsheet applications and database management systems.
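To illustrate the idea, the sketch below uses Python's standard zlib module (DEFLATE compression) rather than WinZip or UNIX compress. The sample data is invented and deliberately repetitive, so it compresses well.

```python
import zlib

# Hypothetical, highly repetitive sample data.
original = b"electronically stored information " * 100

compressed = zlib.compress(original)
restored = zlib.decompress(compressed)

# The compressed form is smaller, and decompression recovers the exact bytes.
print(len(original), len(compressed))
print(restored == original)  # True
```

How much space is saved depends on the data: text with many repeated patterns shrinks a lot, while already-compressed files shrink very little.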
Computer
Includes, but is not limited to, network servers, desktops, laptops, notebook computers, employees’ home computers, mainframes, the PDAs of [party name] and its employees (personal digital assistants, such as PalmPilot, Blackberry and other such handheld computing devices), digital cell phones, smartphones and pagers.
Computer forensics
Specialized techniques to recover, authenticate and analyze electronic data.
Concept search
Analyzing conceptual groups of words in a document to understand the true meaning, rather than searching only for a word (keyword).
Container file
One file that contains multiple documents and document types. Requires decompression or ripping to process.
Contextual search
Searching surrounding text to analyze the context in which a word is used.
Corporate Information Governance
See "Information Governance"
Corporate Investigations
Criminal, regulatory, securities and/or other investigations pertaining to the activities and/or electronically stored information of one or more corporations.
Cost Sharing
See “Cost Shifting”
Cost Shifting
Shifting the cost or a portion of the cost of production of inaccessible electronically stored documents to the requesting party.
Culling
Removing a document prior to production or review; generally reduces the volume of data that is produced or reviewed.
Custodian
See "Data Custodian"
Custodian De-Duplication
Culls a document if multiple copies of that document reside within the same custodian's data set. For example, if Mr. A and Mr. B each have a copy of a specific document and Mr. C has two copies, the system will maintain one copy each for Mr. A, Mr. B and Mr. C. Contrast with case de-duplication and production de-duplication.
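The example above can be sketched as follows, assuming that copies are matched by a hash of their contents. The data and function name are hypothetical.

```python
import hashlib

# Hypothetical collection: Mr. C holds two copies of the same document.
collection = [
    ("Mr. A", b"Q3 forecast"),
    ("Mr. B", b"Q3 forecast"),
    ("Mr. C", b"Q3 forecast"),
    ("Mr. C", b"Q3 forecast"),
]

def custodian_dedupe(items):
    """Keep one copy of each distinct document per custodian."""
    seen = set()
    kept = []
    for custodian, content in items:
        key = (custodian, hashlib.sha256(content).hexdigest())
        if key not in seen:
            seen.add(key)
            kept.append((custodian, content))
    return kept

custodian_kept = custodian_dedupe(collection)
print([custodian for custodian, _ in custodian_kept])  # ['Mr. A', 'Mr. B', 'Mr. C']
```

The only difference from de-duplicating across the whole case is that the custodian is part of the key, so the same document may survive once for each person who held it.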
Customer-Added Metadata
Data or work product created by a user while reviewing a document. For example, annotation text of a document or subjective coding information. Contrast with Vendor-Added Metadata.
Data
Any and all electronically stored information (ESI) on media that may be accessed by a computer.
Data Categorization
Categorization and sorting of ESI.
Data Custodian
Person having administrative control of a document; for example, the data custodian of an e-mail is the owner of the mailbox which contains the e-mail.
Data Extraction
Retrieving data from documents.
Data Formats
The organization of information for display, storage or printing. Data is maintained in certain common formats so that it can be used by various programs, which may only work with data in a particular format. This term is commonly used in the industry when asking another person about the state in which particular information exists. For example, "What format is it in, PDF or HTML?"
Data Mining
The process of culling data to extract ESI for production.
Database
A set of data elements, usually stored in one location and made available to more than one user.
De-Duplication
The process of identifying (or, for some vendors, actually removing) additional copies of identical documents in a document collection. There are three types of de-duplication: case, custodian and production.
De-NIST
Screening files against the NIST list of computer file types. Separates those files generated by a user from those generated by a system.
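A minimal sketch of the screening step, assuming files are matched against the reference list by a hash of their contents. The reference set, file names and contents below are invented placeholders and are not real NIST data.

```python
import hashlib

def sha256_of(content):
    """Return the SHA-256 hex digest of a file's contents."""
    return hashlib.sha256(content).hexdigest()

# Hypothetical stand-in for a reference list of known system-file hashes.
known_system_hashes = {sha256_of(b"operating system library bytes")}

# Hypothetical collected files: name -> contents.
files = {
    "system.dll": b"operating system library bytes",
    "memo.doc": b"user-authored memo",
}

# Files whose hash is not on the reference list are treated as user-generated.
user_files = [
    name for name, content in files.items()
    if sha256_of(content) not in known_system_hashes
]
print(user_files)  # ['memo.doc']
```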
Deleted File
A file with disk space that has been designated as available for reuse. Although a user may "erase" or "delete" a file, what is really erased is a reference to that file in a table on the hard disk. Unless overwritten with new data, a "deleted" file can be as intact on the disk as any "active" file you would see in a directory listing.
Digital Camera
A camera that stores still or moving pictures in a digital format (JPEG, GIF, etc.).
Digital Certificate
A means of providing heightened security for the access of a website or a specific document. Digital certificates are electronic records that contain keys used to decrypt information, especially information sent over a public network like the Internet. Digital certificates must be applied for and granted by a Certificate Authority (CA).
Discovery
The process of identifying, preserving, collecting, processing, reviewing and producing evidence for legal review.
Discovery Compliance
Complying with the federal, state and local regulations around electronic discovery (e.g. Federal Rules of Civil Procedure).
Discovery Conference
See "Meet and Confer"
Discovery Cost Allocation
The distribution of the costs incurred by organizations who are compelled to produce ESI.
Discovery Response
Activities performed in response to a request for discovery.
Discovery Response Plan
A plan developed to guide the activities to be taken in response to a request for discovery. A Discovery Response Plan may be developed reactively for a specific request or may be developed proactively within highly litigious organizations to help mitigate the cost and risk of discovery with preemptive planning.
Discovery Response Strategy
A strategic plan developed to guide the response to a request for discovery.
Discovery Response Team
A team of individuals assembled to coordinate and execute a Discovery Response Plan. A discovery response team may include members from legal, IT, business management and other resources from within an organization, as well as legal consulting vendors and outside counsel.
Document
Includes but is not limited to any ESI preserved on magnetic or optical storage media as an “active” file or files (readily readable by one or more computer applications or forensics software); any “deleted” but recoverable electronic files on said media; any electronic file fragments (files that have been deleted and partially overwritten with new data); and slack (data fragments stored randomly from random access memory on a hard drive during the normal operation of a computer [RAM slack] or residual data left on the hard drive after new data has overwritten some but not all of previously stored data).
Document Metadata
Data stored in the document about the document. Often this data is not immediately viewable in the software application used to create/edit the document, but often can be accessed via a "Properties" view. Contrast with File System Metadata and E-Mail Metadata.
E-Discovery
See "Electronic Discovery"
E-mail Metadata
Data stored in the e-mail about the e-mail. Often this data is not even viewable in the e-mail client application used to create the e-mail. The amount of e-mail metadata available for a particular e-mail varies greatly depending on the e-mail system. Contrast with file system metadata and document metadata.
ED
See "Electronic Discovery"
EDD
Electronic Data Discovery. See Electronic Discovery.
eDiscovery
See "Electronic Discovery"
EDRM
See "Electronic Discovery Reference Model"
EDRM Metrics
The EDRM Metrics project is designed to provide a standard approach and generally accepted language for measuring the full range of electronic discovery activities. The Metrics project follows the electronic discovery process described in the Electronic Discovery Reference Model: identification, preservation, collection, processing, review, analysis and production. For each stage of the process, the Metrics project will offer guidelines for how to measure associated costs, time and volumes.
EDRM XML
The EDRM XML project is designed to provide a standard, generally accepted XML schema to facilitate the movement of electronically stored information (ESI) from one step of the electronic discovery process to the next, from one software program to the next and from one organization to the next.
Electronic Data Discovery
See "Electronic Discovery"
Electronic Discovery
The process of identifying, preserving, collecting, processing, reviewing and producing ESI for legal review.
Electronic Discovery Company
A company that provides e-discovery services.
Electronic Discovery Guidelines
Guidelines, rules and other procedures developed to define activities, processes and procedures in the course of preparing for or responding to electronic discovery.
Electronic Discovery Industry
The sum of productive enterprises and organizations involved in developing products and/or delivering services relating to the legal and/or other requirements of electronic discovery.
Electronic Discovery Laws
Statutes and/or precedents outlining the legal requirements of Electronic Discovery.
Electronic Discovery Model
See "Electronic Discovery Reference Model"
Electronic Discovery Reference Model
The Electronic Discovery Reference Model is a framework to describe the phases of activities around Electronic Discovery (www.EDRM.net).
Electronic Discovery Rules
Rules outlining the legal requirements, processes and procedures of Electronic Discovery.
Electronic Discovery Services
Services to support the preparation for and response to litigation or investigation.
Electronic Discovery Software
Software to support the preparation for and response to litigation or investigation.
Electronic Discovery Vendor
Vendors that provide software and/or services for e-discovery.
Electronic Evidence
Electronically stored information (ESI).
Embedded Metadata
Text, numbers, content, data or information that is directly or indirectly input into a Native File by a user and which is not typically visible to the user viewing the output or display of the Native File on screen or as a print-out.
Encryption
A technology that renders the contents of a file unintelligible to anyone not authorized to read it. Encryption is used to protect information as it moves from one computer to another and is an increasingly common way of sending credit card numbers over the Internet when conducting e-commerce transactions.
Enterprise Document Management
Planning, organizing, controlling and directing activities that oversee the life-cycle of information across the enterprise.
Enterprise Risk Management
Enterprise risk management (ERM) includes the methods and processes used by organizations to manage risks (or seize opportunities) related to the achievement of their objectives.
ERM
See "Enterprise Risk Management"
ESI
See "Electronically Stored Information"
Evidence Production
See “Production”
Evidence Review
To evaluate the collected ESI for relevance and privilege.
Federal Rules of Civil Procedure
Rules that govern the conduct of all civil actions brought in Federal district courts.
Federal Rules of Evidence
Rules that govern the introduction of evidence in proceedings, both civil and criminal, in Federal courts.
File
An element of data storage in a file system. A collection of data or information that has a name, called the filename. Almost all information stored in a computer must be in a file. There are many different types of files: data files, text files, program files, directory files and more.
File System
The system that an operating system or program uses to organize and keep track of files. For example, a hierarchical file system is one that uses directories to organize files into a tree structure. Types of file systems include file allocation table (FAT) and Windows® NT file system (NTFS).
File System Metadata
Data that can be obtained or extracted about a file from the file system storing the file. Contrast with document metadata and e-mail metadata.
Filename
The name of a file. All files have names. Different operating systems impose different restrictions on filenames. Most operating systems, for example, prohibit the use of certain characters in a filename and impose a limit on the length of a filename. In addition, many systems, including DOS and UNIX, allow a filename extension that consists of one or more characters following the proper filename. The filename extension usually indicates what type of file it is.
Filename Extension
In DOS and some other operating systems, one or several letters at the end of a filename. Filename extensions usually follow a period (dot) and indicate the type of information stored in the file. For example, in the filename LETTER.DOC, the extension is DOC, which indicates that the file is a word processing file.
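As a small illustration, Python's standard os.path.splitext separates a filename from its extension. The lookup table of file types below is a hypothetical example and is far from complete.

```python
import os.path

# Hypothetical mapping from extension to a description of the file type.
KNOWN_TYPES = {
    ".doc": "word processing file",
    ".pdf": "Portable Document Format file",
}

def describe(filename):
    """Return a description based on the filename extension, ignoring case."""
    _, extension = os.path.splitext(filename)
    return KNOWN_TYPES.get(extension.lower(), "unknown")

name, extension = os.path.splitext("LETTER.DOC")
print(name, extension)         # LETTER .DOC
print(describe("LETTER.DOC"))  # word processing file
```

An extension is only a naming convention: renaming a file changes its extension without changing its contents.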
Flash Drive
A portable USB storage device that can hold between 256 megabytes and 4 gigabytes of ESI.
Forensics
See "Computer Forensics"
Form 35 (FRCP)
Report of the parties' planning meeting in which the parties jointly propose the agreed upon discovery plan.
FRCP
See "Federal Rules of Civil Procedure"
FRE
See "Federal Rules of Evidence"
Hard Drive
The primary hardware that a computer uses to store information, typically magnetized media on rotating discs.
Hart Scott Rodino
The Hart-Scott-Rodino Antitrust Improvements Act of 1976 requires organizations to report specific information at the start of the merger process and to report on additional data points as part of a second request for information.
Hash/Hash Coding
An algorithm that computes a value from a document's content, like a digital fingerprint. Identical documents produce the same value and, in practice, different documents produce different values.
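A minimal sketch of the fingerprint idea using SHA-256 from Python's standard hashlib module. The choice of SHA-256 and the sample contents are assumptions for illustration; e-discovery tools may use other hash functions.

```python
import hashlib

# Hypothetical document contents.
first = hashlib.sha256(b"contract v1").hexdigest()
same = hashlib.sha256(b"contract v1").hexdigest()
changed = hashlib.sha256(b"contract v2").hexdigest()

print(first == same)     # True: identical content gives an identical hash
print(first == changed)  # False: a one-character change gives a different hash
print(len(first))        # 64 hexadecimal characters
```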
Help Features/Documentation
Instructions that assist a user on how to set up and use a product including but not limited to software, manuals and instruction files.
HSR
See "Hart Scott Rodino"
HTML
HyperText Markup Language, a language that uses tags to structure text into headings, paragraphs, lists and links. It tells a Web browser how to display text and images.
Identification
A phase of the electronic discovery process involving the identification of all relevant sources of electronically stored information.
IM
See "Instant Messaging"
Imaged Copy
A "mirror image" bit-by-bit copy of a hard drive, i.e. a complete replication of the physical drive regardless of how the drive is organized or whether the image created contains meaningful data in whole or in part. From an imaged copy of a hard drive it is possible to reconstruct the entire contents and organization of the source drive from which it was taken.
Information Governance
The organizational structures and processes that ensure an accountability framework for use by IT that also support an organization’s legal objectives and strategies.
Information Management
Information management (IM) is the collection and management of information from one or more sources and the distribution of that information to one or more audiences.
Input Device
Any object which allows a user to communicate with a computer by entering information or issuing commands (e.g. keyboard, mouse or joystick).
Instant Messaging
Using non-e-mail internet or network-based software or services to send and/or receive text messages between one or more computers and/or mobile devices. Messages are often, but not always, stored in log files.
Legal Document Management
The policies, procedures, planning and other activities around the storage and processing of documents that may be needed for legal matters.
Legal Hold
The process of distributing a "hold" notice to potential and/or probable custodians of ESI -- with instructions outlining the preservation and collection of said data.
Litigation Management
The business activities around preparing for and/or responding to litigation.
Litigation Preparation
The strategic planning and/or activities around preparing for litigation.
Litigation Readiness Consulting
Consultative services to help guide an organization in preparation for litigation.
Litigation Response Consulting
Consultative services to help guide an organization in its response to litigation.
Litigation Support
Personnel or resources that help one or more organizations prepare for and respond to litigation or investigation.
Litigation Support Services
Services to support the preparation for and response to litigation or investigation.
Magnetic or Optical Storage Media
Including, but not limited to, hard drives (also known as “hard disks”), backup tapes, CD-ROMs, DVD-ROMs, JAZ and Zip drives, smart cards, memory sticks, digital jukeboxes and floppy disks.
Mailbox
An area in memory or on a storage device where e-mail is placed. In e-mail systems, each user has a private mailbox. When the user receives e-mail, the mail system automatically puts it in the mailbox. The mail system allows you to scan mail that is in your mailbox, copy it to a file, delete it, print it or forward it to another user. The mailbox format used by Microsoft Exchange® e-mail systems is PST, while Lotus Notes® uses NSF files.
Meet and Confer (FRCP Rule 26(f))
A rule within the FRCP that requires parties to meet prior to a scheduling conference in Federal court to discuss and agree upon discovery of information and evidence relevant to the case.
Memory
Internal storage areas in the computer. The term “memory” identifies data storage that comes in the form of chips and the word “storage” is used for memory that exists on tapes or disks. Moreover, the term “memory” is usually used as shorthand for physical memory, which refers to the actual chips capable of holding data. Some computers also use virtual memory, which expands physical memory onto a hard disk. See the definitions for two types of physical memory: RAM and ROM.
Metadata
Data about data. In data processing, metadata provides information about a document or other data managed within an application or environment. There are five types of metadata: file system, document, e-mail, vendor-added and customer-added.
Native File/Format
The source document, as collected from the source computer or server, before any conversion or processing of the document.
Native Format Review (also "Native Review")
Reviewing ESI using the software used to create it originally. For example, using Microsoft Word in the review process to open/review a .DOC (MS Word Document format) file.
Near de-duplication
The elimination of documents with "near duplicate" similarities, e.g. a document that was sent to multiple custodians.
Network
A group of connected computers that allows people to share information and equipment (e.g. local area network (LAN), wide area network (WAN), metropolitan area network (MAN), storage area network (SAN), peer-to-peer network or client-server network).
Network Operating System
Software that directs the overall activity of networked computers.
NIST
National Institute of Standards and Technology
OCR
Optical Character Recognition, a method of translating printed text and images into a form that a computer can manipulate (into ASCII codes, for example). An OCR system enables you to scan a printed document directly into a computer file.
Onsite Electronic Discovery
Electronic data discovery services performed at the facility where ESI is stored.
Operating System
Software that directs the overall activity of a computer (e.g. MS-DOS ®, Windows ®, Linux ®, etcetera).
Overwrite
To copy new data over existing data. Overwritten data cannot be retrieved.
PDF
Portable Document Format - a file format developed by Adobe Systems. PDF captures formatting information from a variety of desktop publishing applications, making it possible to send formatted documents and have them appear on the recipient's monitor or printer as they were intended. To view a file in PDF format, you need Adobe Acrobat Reader, a free application distributed by Adobe Systems.
Presentation
Presentation of the preserved, collected, processed, reviewed, analyzed and produced ESI at a legal proceeding.
Preservation
The process of retaining and protecting all relevant evidence from destruction or deletion.
Privilege Data Set
A set of documents that are deemed responsive or relevant but are withheld on the grounds of privilege (work product or attorney-client).
Processing
Capturing an electronic data image or a representation of the image, generally in native format, entering it into a computer system and processing and/or manipulating it so that it can be exported into a review application.
Production
To electronically deliver ESI to a variety of recipients or for use in other systems.
Production De-Duplication
Culling of a document if multiple copies of that document reside within the same production set. For example, if two identical documents are both marked responsive, non-privileged, production de-duplication ensures that only one of those documents is produced. Contrast with Case De-Duplication and Custodian De-Duplication.
Proximity Search
Process that searches for words or phrases within a prescribed distance of another word or phrase.
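A minimal sketch for single words, assuming distance is counted in whitespace-separated words. The function name and sample sentence are made up, and phrase matching is left out.

```python
def within_distance(text, first, second, max_words):
    """Return True if the two words occur within max_words of each other."""
    words = text.lower().split()
    first_positions = [i for i, word in enumerate(words) if word == first]
    second_positions = [i for i, word in enumerate(words) if word == second]
    return any(
        abs(i - j) <= max_words
        for i in first_positions
        for j in second_positions
    )

text = "the merger agreement was signed before the audit began"
print(within_distance(text, "merger", "signed", 3))  # True: 3 words apart
print(within_distance(text, "merger", "audit", 3))   # False: 6 words apart
```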
PST
A Microsoft Outlook e-mail store.
Quality control
Process undertaken to ensure that the service rendered is of the quality set forth in the SLA or fulfills the requirements with sufficiently high quality.
Quick peek
ESI is made available to opposing party before being reviewed for privilege, confidentiality or privacy. Strict guidelines are required to prevent waiver.
RAM
Random Access Memory - the hardware inside a computer that retains memory on a short-term basis and stores information while the user utilizes the computer.
Record Custodian
Person responsible for the storage and protection of records throughout the records retention period.
Record Retention Policy
Policy for setting procedures around managing the lifecycle of records, from creation to maintenance to disposition.
Record Retention Schedule
A formalized plan for the management of records, identifying how long records should be kept, when they should be archived and when they can be destroyed.
Records Management
Planning, organizing, controlling and directing activities that oversee the lifecycle of information.
Repository for electronic records
A device on which electronic records and record metadata is stored.
Restore
The transfer of data from a backup tape or other medium to an on-line system, typically to recover ESI lost due to disaster or system failure.
Review
Examination of potentially relevant data sets, or ESI, for relevancy, privilege and confidentiality in advance of production.
ROM
Read Only Memory - the hardware in a computer that can be read but not written to. ROM contains the programming that allows a computer to boot up each time the user turns it on and it contains essential system programs that neither the user nor the computer can erase.
Rule 16 (FRCP)
Pretrial conference – FRCP Rule 16 may provide a party with an opportunity to discuss settlement without giving the appearance of having initiated the conversation.
Rule 26 (FRCP)
General provisions governing discovery; duty of disclosure.
Rule 37 (FRCP)
FRCP 37(e), formerly 37(f), provides a safe harbor when data is lost or overwritten in the normal course of business.
Rule 502 (FRE)
The proposed Federal Evidence Rule 502 is intended to reduce the risk of forfeiting the attorney-client privilege or work product protection “so that parties need not scrutinize production of documents to the same extent as they do now.”
Rules of Civil Procedure
See “Federal Rules of Civil Procedure”
Rules of Evidence
See “Federal Rules of Evidence”
Sampling
Testing a database or ESI to determine the frequency of relevant information.
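A minimal sketch of estimating how often relevant documents occur by reviewing a random sample. The population, its relevance flags, the sample size and the fixed seed are all invented for illustration.

```python
import random

# Hypothetical population of 1,000 documents; True marks a relevant document.
population = [True] * 200 + [False] * 800

# A fixed seed makes the draw repeatable for this illustration.
rng = random.Random(42)
sample = rng.sample(population, 100)

# The share of relevant documents in the sample estimates the share overall.
estimated_rate = sum(sample) / len(sample)
print(f"Estimated share of relevant documents: {estimated_rate:.0%}")
```

A larger sample generally gives an estimate closer to the true rate, at the cost of more review time.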
Second Request
Under the Hart-Scott-Rodino Antitrust Improvements Act of 1976, certain mergers and acquisitions are subject to pre-merger antitrust review by the DOJ and FTC. In some cases the DOJ or FTC will issue requests for "additional information and documentary material relevant to the proposed acquisition.” This is typically referred to as a second request.
Seven-Factor Test
Seven Factor Zubulake (Zubulake I, 217 F.R.D. at 322) Test for the cost of producing data from inaccessible sources; factors are listed in descending order of importance:
1. The extent to which the request is specifically tailored to discover relevant information;
2. The availability of such information from other sources;
3. The total costs of production compared to the amount in controversy;
4. The total costs of production, compared to the resources available to each party;
5. The relative ability of each party to control costs and its incentive to do so;
6. The importance of the issues at stake in the litigation; and
7. The relative benefits to the parties of obtaining the information.
Slack
The unused bytes left over when the actual size of a file is subtracted from the space allocated to it in clusters. Also described as the data fragments stored randomly on a hard drive during the normal operation of a computer or the residual data left on the hard drive after new data has overwritten some of the previously stored data.
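The size calculation can be sketched as follows. The 4,096-byte cluster size is an assumption for illustration; actual cluster sizes depend on the file system and how the drive was formatted.

```python
def slack_bytes(file_size, cluster_size=4096):
    """Return allocated space minus actual file size, in bytes."""
    # Files occupy whole clusters, so round the cluster count up.
    clusters = (file_size + cluster_size - 1) // cluster_size
    return clusters * cluster_size - file_size

print(slack_bytes(10000))  # 2288: three clusters (12,288 bytes) hold 10,000 bytes
print(slack_bytes(4096))   # 0: the file fills its single cluster exactly
```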
Software
Any set of instructions stored on computer-readable media that tells a computer what to do. Includes operating systems and software applications.
Software Application
A program that instructs a computer to perform a specific set of instructions or execute a process. Some software applications are user-driven, like Microsoft Word or Notepad, while others are system-driven like the Windows system clock or automatic virus scanning programs.
Spoliation
The destruction of records or record properties, such as metadata, that may be relevant to litigation or governmental investigation.
States Rules (of Civil Procedure)
Rules promulgated at the state level governing the procedural rules to be followed in state courts. Many states are adopting e-discovery rules, similar to the FRCP Rule 26(f), as proposed by the National Conference of Commissioners on Uniform State Laws.
Storage Device
Any device that a computer uses to store information.
Storage Media
Any removable device that stores ESI. See Magnetic or Optical Storage Media.
Structured data
Data that has a structured format, such as a database.
Substantive Metadata
Data that reflects the substantive changes made to the document by the user. For example, it may include the text of actual changes to a document. While no generalization is universally applicable, system metadata is less likely to involve issues of work product and/or privilege.
System Metadata
Data that is automatically generated by a computer system. For example, system metadata includes information such as author, date and time of creation and the date a document was modified.
Tape Drive
A hardware device used to store data on a magnetic tape. Tape drives are usually used to back up large quantities of data due to their large capacity and low cost relative to other data storage options.
TIFF
Tagged Image File Format - a graphic file format used for storing still-image bitmaps. TIFFs are stored in tagged fields and programs use the tags to accept or ignore fields, depending on the application.
Unstructured data
ESI that does not have a data structure or has a structure that is not easily readable by computer.
USB
Universal Serial Bus. See Flash Drive.
Vendor-Added Metadata
Data created and maintained by the electronic discovery vendor as a result of processing the document. While some vendor-added metadata has direct value to customers, much of it is used for process reporting, chain of custody and data accountability. Contrast with Customer-Added Metadata.
XML
Extensible Markup Language
Zubulake
Five landmark decisions on e-discovery addressing when to shift the cost of electronic discovery to the requesting party; when a company needs to begin preserving electronic evidence and what electronic evidence must be preserved; and what steps must be taken to preserve it and the consequences of failing to adequately preserve it.