collection

Orange Legal Technologies’ collection services can help you rapidly and accurately acquire potentially relevant electronically stored information (ESI) for audits, investigations, and litigation.

Our Traditional Collection Services include:

  • Fixed Storage Collection (Manual+Active Data Copy/Forensic Imaging)
  • Portable Storage Collection (Manual Copy/Forensic Imaging)
  • Backup Tape Restoration
  • Automated Network Discovery of Devices/Repositories and Data
  • Remote Collection With Legal Hold

Our Digital Forensics Services include:

We support these services with:

  • Certified Collection Experts
  • Collection and Planning Consulting
  • Defensible Chain of Custody Procedures and Processes

To determine how we can help you with your specific collection requests, contact us and we can immediately help you translate your requests into action.

Collection

What is Collection?

The collection of data, which can be simply defined as the acquisition of potentially relevant electronically stored information (ESI), is an important part of audit, investigation, and litigation processes.

What is Electronically Stored Information?

While not specifically defined in the FRCP, electronically stored information, or ESI, is defined in the November 2006 issue of The Third Branch (Newsletter of the Federal Courts) simply as “…all information in computers”.

What is Information?

From a technology perspective, information is defined as the summarization of data. Technically, data are raw facts and figures that are processed into information, such as summaries and totals. But since information can also be the raw data for the next job or person, the two terms cannot be precisely defined, and both are used interchangeably.

What Is Data?

  • Factual information, especially information organized for analysis or used to reason or make decisions.
  • Computer Science. Numerical or other information represented in a form suitable for processing by computer.

Collection Consideration

Based on the interchangeability of definitions, ESI may be referenced as “data” in the remainder of this section.

Collection Considerations In Preparing For Cases Involving ESI

  1. What is the scope of the data in question?
  2. What is the structure of the data?
  3. What is the format of the data?
  4. What is the state of the data?
  5. How does one “Connect” to the data?
  6. How does one get to Active State data?
  7. How does one maintain the Static State data?
  8. How much data will be acted upon?
  9. Is the data encrypted?
  10. What capabilities will be needed to display information?
  11. How will data reports and/or files be provided to requestor?
  12. How will the data be stored after being acted upon?

In going through these twelve questions and the corresponding notes provided below, Orange Legal Technologies will help construct a planning framework from which to consider your case-specific collection requirements for ESI.

Data Scope (What is the scope of the data in question?)

  • Entity Scope – Entities that may have had individuals involved in the creation, review, and/or response of data that may contain relevant information for the matter at hand.
  • Custodian Scope – Individuals who may have been involved in the creation, review, and/or response of data that may contain relevant information for the matter at hand.
  • Data Steward Scope – Individuals who have Information Technology management responsibilities for the entities and individuals determined to be relevant to the matter at hand and/or individuals who maintain access rights to the applications and equipment used by these entities and organizations.
  • Geographical Scope – The geographical locales of the entities and individuals that may have been involved in the creation, review, and/or response of communications and/or documents relevant to the matter at hand as well as the locales of the equipment used to support creation, transmission, review, and storage of these communications and/or documents.
  • Time Frame Scope – The period of time in which relevant information may have been created, reviewed, and/or responded to for the matter at hand.
  • Volume Scope – The estimated volume of data that may contain relevant information for the matter at hand.

Data Structure (What is the structure of the data?)

  • Unstructured – Unstructured data (or unstructured information) refers to masses of (usually) computerized information in which every bit of information does not have an assigned format and significance.   Examples of “unstructured data” may include audio, video and unstructured text such as the body of an email or word processor document.  Unstructured data represents approximately 85% of enterprise data.
  • Structured – Structured data (or structured information) refers to masses of (usually) computerized information in which every bit of information has an assigned format and significance. Examples of “structured data” may include databases such as SQL or Access. Structured data represents approximately 15% of enterprise data.
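
The distinction can be illustrated with a minimal Python sketch. The table name, column names, and sample values are hypothetical and chosen only for illustration; they do not describe any particular collection tool. Structured data can be queried by field, while unstructured data has to be searched as free text.

```python
import sqlite3

# Structured data: every value sits in a named column with an assigned meaning.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE messages (sender TEXT, sent_date TEXT, subject TEXT)")
conn.execute(
    "INSERT INTO messages VALUES (?, ?, ?)",
    ("alice@example.com", "2008-03-14", "Quarterly audit"),
)
# A field-level query is possible because the format is assigned in advance.
structured_hits = conn.execute(
    "SELECT subject FROM messages WHERE sender = ?", ("alice@example.com",)
).fetchall()

# Unstructured data: similar facts appear in free text with no assigned fields,
# so the only general approach is a text search over the whole body.
email_body = "Hi Bob, Alice here. Attaching the quarterly audit notes from March 14."
unstructured_hit = "quarterly audit" in email_body.lower()

print(structured_hits, unstructured_hit)
```

In practice this difference tends to affect how data is searched and exported during collection, which is one reason the structure question appears early in the planning list.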

Data Format (What is the format of the data?)

  • Still Image – Images that convey their meaning in visual terms, e.g. pictorial images, photographs, posters, graphs, diagrams, documentary architectural drawings.  Formats for such images may be bitmapped (sometimes called raster), vector, or some combination of the two.  A bitmapped image is an array of dots (usually called pixels, from picture elements, when referring to screen display), the type of image produced by a digital camera or a scanner.  Vector images are made up of scalable objects—lines, curves, and shapes—defined in mathematical terms, often with typographic insertions.
  • Sound - Media-independent sound content that can be broken into two format sub-categories. The first sub-category consists of formats that represent recorded sound, often called waveform sound. Such formats are employed for applications like popular music recordings, recorded books, and digital oral histories. The second sub-category consists of formats that provide data to support dynamic construction of sound through combinations of software and hardware. Such software includes sequencers and trackers that use data that controls when individual sound elements should start and stop, attributes such as volume and pitch, and other effects that should be applied to the sound elements. The sound elements may be short sections of waveform sound (sometimes called samples or loops) or data elements that characterize a sound so that a synthesizer (which may be in software or hardware) or sound generator (usually hardware) can produce the actual sound. The data are brought together when the file is played, i.e., the sounds are generated in a dynamic manner at runtime. This second sub-category is sometimes called structured audio.
  • Moving Image – A variety of media-independent digital moving image formats and their implementations. Some formats, e.g., QuickTime and MPEG-4, allow for a very wide range of implementations compared to, say, MPEG-2, an encoding format whose possible implementations are relatively more constrained.
  • Textual – Content works consisting primarily of text.
  • Web Archive – Content in formats that might hold the results of a crawl of a Web site or set of Web sites, a dynamic action resulting from the use of a software package that calls up Web pages and captures them in the form disseminated to users.
  • Generic – Content in widely acceptable generic formats to include but not limited to specifications for wrappers (e.g., RIFF and ISO_BMFF), bundling formats (e.g., METS and AES-31), and encodings (e.g., UTF-8 and IEEE 754-1985).
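
One common way to get a first indication of a file's format is to inspect its leading bytes. The sketch below is an illustrative assumption rather than a description of Orange Legal Technologies' process; the `SIGNATURES` table and the `guess_format` helper are hypothetical names, and the list covers only a handful of well-known signatures.

```python
# Leading byte signatures for a few well-known formats (not exhaustive).
SIGNATURES = {
    b"\x89PNG\r\n\x1a\n": "PNG still image",
    b"%PDF-": "PDF document",
    b"RIFF": "RIFF wrapper (e.g. WAV, AVI)",
    b"II*\x00": "TIFF image (little-endian)",
    b"MM\x00*": "TIFF image (big-endian)",
}


def guess_format(header: bytes) -> str:
    """Return a label for the first matching signature, or 'unknown'."""
    for magic, label in SIGNATURES.items():
        if header.startswith(magic):
            return label
    return "unknown"


# Example: the first bytes of a WAV file begin with the RIFF wrapper tag.
print(guess_format(b"RIFF\x24\x00\x00\x00WAVE"))
print(guess_format(b"%PDF-1.4"))
print(guess_format(b"plain text"))
```

A signature match is only a hint; a wrapper such as RIFF can carry several kinds of content, so the label identifies the container rather than the content category.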

Data State (What is the state of the data?)

  • Active State: Active Data is information residing on the hard drives or optical drives of computer systems that is readily visible to the operating system and/or application software with which it was created and is immediately accessible to users without deletion, modification or reconstruction.
  • Static State – Static Data (or Archival Data) is information that is not directly accessible to the user of a computer system but that the organization maintains for long-term storage and record keeping purposes. Static data may be written to removable media such as a CD, magneto-optical media, tape or other electronic storage device, or may be maintained on system hard drives in compressed formats.
  • Residual State: Residual Data (sometimes referred to as “Ambient Data”) refers to data that is not active on a computer system. Residual data includes (1) data found on media free space; (2) data found in file slack space; and (3) data within files that has functionally been deleted in that it is not visible using the application with which the file was created, without use of undelete or special data recovery techniques.

Data Network (How does one “Connect” to the data?)

  • Non-Networked: Data resides on a device that is not connected to any other computers.
  • Personal Area Network (PAN): A personal area network (PAN) is a computer network used for communication among computer devices close to one person. Some examples of devices that may be used in a PAN are printers, fax machines, telephones, PDAs, or scanners. The reach of a PAN is typically within about 20-30 feet (approximately 6-9 meters). PANs can be used for communication among the individual devices (intrapersonal communication), or for connecting to a higher level network and the Internet (an uplink).
  • Local Area Network (LAN): A network covering a small geographic area, like a home, office, or building. Current LANs are most likely to be based on Ethernet technology.
  • Campus Area Network (CAN): A network that connects two or more LANs but that is limited to a specific and contiguous geographical area such as a college campus, industrial complex, or a military base. A CAN may be considered a type of MAN (metropolitan area network), but is generally limited to an area that is smaller than a typical MAN.
  • Metro Area Network (MAN): A Metropolitan Area Network is a network that connects two or more Local Area Networks or Campus Area Networks together but does not extend beyond the boundaries of the immediate town, city, or metropolitan area. Multiple routers, switches & hubs are connected to create a MAN.
  • Wide Area Network (WAN): A WAN is a data communications network that covers a relatively broad geographic area (e.g. from one city to another or from one country to another) and that often uses transmission facilities provided by common carriers, such as telephone companies.
  • Internetwork: Two or more networks or network segments connected using devices that operate at layer 3 (the ‘network’ layer) of the OSI Basic Reference Model, such as a router. Any interconnection among or between public, private, commercial, industrial, or governmental networks may also be defined as an internetwork. In modern practice, the interconnected networks use the Internet Protocol. There are at least three variants of internetwork, depending on who administers and who participates in them:
    • Intranet: An intranet is a set of interconnected networks that use the Internet Protocol and IP-based tools such as web browsers, and that are under the control of a single administrative entity. That administrative entity closes the intranet to the rest of the world and allows access only to specific users. Most commonly, an intranet is the internal network of a company or other enterprise.
    • Extranet: An extranet is a network or internetwork that is limited in scope to a single organization or entity but which also has limited connections to the networks of one or more other usually, but not necessarily, trusted organizations or entities (e.g. a company’s customers may be given access to some part of its intranet creating in this way an extranet, while at the same time the customers may not be considered ‘trusted’ from a security standpoint). Technically, an extranet may also be categorized as a CAN, MAN, WAN, or other type of network, although, by definition, an extranet cannot consist of a single LAN; it must have at least one connection with an external network.
    • The Internet: A specific internetwork, consisting of a worldwide interconnection of governmental, academic, public, and private networks based upon the Advanced Research Projects Agency Network (ARPANET) developed by ARPA of the U.S. Department of Defense. It is also home to the World Wide Web (WWW) and is referred to as the ‘Internet’ with a capital ‘I’ to distinguish it from other generic internetworks.

Intranets and extranets may or may not have connections to the Internet. If connected to the Internet, the intranet or extranet is normally protected from being accessed from the Internet without proper authorization. The Internet itself is not considered to be a part of the intranet or extranet, although the Internet may serve as a portal for access to portions of an extranet.

Data Storage Network (How does one get to Active State data?)

  • Direct Attached Storage (DAS): Direct-attached storage (DAS) refers to a digital storage system directly attached to a server or workstation, without a storage network in between. It is a retronym, mainly used to differentiate non-networked storage from SAN and NAS.
  • Network-Attached Storage (NAS): Network-attached storage (NAS) is file-level computer data storage connected to a computer network, providing data access to heterogeneous network clients.
  • Storage Area Network (SAN): A storage area network (SAN) is an architecture to attach remote computer storage devices (such as disk arrays, tape libraries and optical jukeboxes) to servers in such a way that, to the operating system, the devices appear as locally attached.

Data Storage Media (How does one maintain the Static State data?)

  • Semiconductor Based Storage Media (Memory Cards, USB Flash Drives, PDAs, Digital Audio Players, Digital Cameras, Mobile Phones, Copiers)
  • Magnetic Based Storage Media (Floppy Disk, Hard Disk, Magnetic Tape)
  • Optical and Magneto Optical Storage Media (CD, CD-ROM, DVD, BD-R, BD-RE, HD DVD, CD-R, DVD-R, DVD+R, CD-RW, DVD-RW, DVD+RW, DVD-RAM, UDO)

Data Volume (How much data will be acted upon?)

  • Uncompressed – Data not having undergone a process of transformation from one representation to another, smaller representation from which the original, or a close approximation to it, can be recovered.
  • Compressed – Data having undergone a process of transformation from one representation to another, smaller representation from which the original, or a close approximation to it, can be recovered. Compressed data is typically characterized by the complexity of the algorithm used and the amount of compression achieved.
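
The effect of compression on data volume can be illustrated with a minimal Python sketch using the standard `zlib` module. The sample text and variable names are hypothetical, and `zlib` is only one of many possible algorithms; it is lossless, so the original is recovered exactly rather than as a close approximation.

```python
import zlib

# Hypothetical sample: highly repetitive text compresses well.
original = b"Potentially relevant ESI. " * 200

# Level 9 requests the most compression zlib offers, at the cost of more work.
compressed = zlib.compress(original, 9)
restored = zlib.decompress(compressed)

# Ratio of compressed size to original size (smaller means more compression).
ratio = len(compressed) / len(original)

print(len(original), len(compressed), round(ratio, 3))
print(restored == original)
```

Because the achievable ratio depends heavily on the content, volume estimates based on compressed sizes are best treated as approximate until the data has been expanded.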

Data Encryption (Is the data encrypted?)

  • Data Not-Encrypted – Data not having undergone a procedure that renders the contents of a computer message or file unintelligible to anyone not authorized to read it.
  • Data Encrypted – Data having undergone a procedure that renders the contents of a computer message or file unintelligible to anyone not authorized to read it. The data is encoded mathematically with a string of characters called a data encryption key.

Data Code Format (What capabilities will be needed to display information?)

  • Unicode Support – Unicode provides a unique number for every character, no matter what the platform, no matter what the program, no matter what the language.
  • Non-Unicode Support – The data code format does not provide a unique number for every character regardless of platform, program, or language.
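
A short Python sketch shows what this means in practice. The sample string is hypothetical; the example uses UTF-8 as the Unicode encoding and ASCII as a stand-in for a non-Unicode code format.

```python
# Hypothetical sample containing one accented Latin and one CJK character.
sample = "é東"

# Unicode assigns each character a unique number (its code point).
code_points = [ord(ch) for ch in sample]

# UTF-8 can encode every code point and decode it back without loss.
utf8_bytes = sample.encode("utf-8")
round_trip = utf8_bytes.decode("utf-8")

# A non-Unicode code format such as ASCII cannot represent these characters.
try:
    sample.encode("ascii")
    ascii_ok = True
except UnicodeEncodeError:
    ascii_ok = False

print(code_points, len(utf8_bytes), round_trip == sample, ascii_ok)
```

When collected data includes multilingual content, tools without Unicode support may fail to display or search such characters correctly.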

Data Output (How will data reports and/or files be provided to requestor?)

  • Custom Reports (Based On Task)
  • Native Files
  • TIFF Files
  • PDF Files
  • Load Files (Specifics Provided By Requestor)
  • Custom Files (Specifics Provided By Requestor)

Data Storage Requirements (How will the data be stored after being acted upon?)

  • Hot – Data is stored in an active state and is immediately accessible to end users.
  • Warm – Data is stored in an active state but is not immediately accessible to end users.
  • Cold – Data is stored in a static state.
  • Destruct – Data is destroyed.

As one begins to understand these collection considerations, one can begin to assign economic values (time/money) to the potential approaches for getting the data and making it available to all parties involved in a specific matter. Ranging from extremely general and subjective at one end of the spectrum to very specific and objective at the other, this economic value can also serve as the basis for discussing, from a position of understanding, whether ESI is accessible or not reasonably accessible from a case-specific legal perspective.

To learn more, contact us.
