
How to Simplify 3 Common Types of e-Discovery and Document Review

Stan Pierson

Data formats often vary between types of disputes and investigations, and those differences shape e-discovery protocols. Certain industries, such as insurance or pharmaceuticals, are more frequently subject to their own unique flavors of litigation. There are also investigations, both internal and governmental, to take into account.

In today’s corporate landscape, data diversity means no two cases are the same. However, some reasonable predictions can be made based on the type of action or company at play—especially when it comes to the use of analytics—to support faster case strategizing. Ahead of your next document review project, consider these best practices to help you better support your client and case team’s unique circumstances.

Investigations: Ripe for Analytics

Data Considerations

Although any company may be subject to investigations, both internal and external, the banking industry is facing increased regulatory scrutiny, and investigations are not uncommon.

Many investigators—particularly those focusing on potential fraud or insider trading, for example—need to take a deep look at the conversations that drive criminal behavior. These might include email threads or chat logs. Often, code words are used in place of the players’ true intentions, making manual, first-pass review feel a bit like reading a whole other language.

Other investigations that are particularly relevant to the banking industry may involve spreadsheets full of important calculations, rates, or transactions. For example, investigating accounting practices may require close scrutiny of quarterly budget reports in Excel.

How to Make e-Discovery Easier

For emails and chat logs, it’s important to:

  • Use email threading. Even in a small data set, email threading can significantly reduce the time spent on review by better organizing your data and setting reviewers up for a more intuitive experience.
  • Leverage keyword expansion. Even if the conversations you’ve collected are brief, your e-discovery software can learn from the concepts they discuss—and use the relationships between those concepts to reveal code words that transform an otherwise innocuous-looking document into your smoking gun.
  • Create clusters. Another way to find the subtle language cues that might be an undercurrent in your data is through clustering. This process requires no user input; the system will automatically organize your data into conceptually related groups so you can dig into what matters most—and start identifying obviously irrelevant material—from the get-go.
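To make the threading idea concrete, here is a minimal sketch that groups messages into threads by following reply links back to a root message. The field names (`id`, `in_reply_to`) and the sample messages are hypothetical, and real e-discovery software draws on much richer signals than reply headers alone, so treat this as a simplified illustration rather than how any particular product works.

```python
from collections import defaultdict


def build_threads(messages):
    """Group messages into threads by walking reply links to a root message."""
    # Map each message ID to the ID it replies to (None for a new conversation).
    parent = {m["id"]: m.get("in_reply_to") for m in messages}

    def root_of(msg_id):
        seen = set()
        # Follow reply links until the parent is missing or unknown.
        # The `seen` set guards against accidental cycles in the data.
        while parent.get(msg_id) in parent and msg_id not in seen:
            seen.add(msg_id)
            msg_id = parent[msg_id]
        return msg_id

    threads = defaultdict(list)
    for m in messages:
        threads[root_of(m["id"])].append(m["id"])
    return dict(threads)


# Hypothetical sample data: one three-message thread and one standalone email.
messages = [
    {"id": "m1", "in_reply_to": None},
    {"id": "m2", "in_reply_to": "m1"},
    {"id": "m3", "in_reply_to": "m2"},
    {"id": "m4", "in_reply_to": None},
]
threads = build_threads(messages)
```

With the conversation grouped under its root message, a reviewer can read `m1`, `m2`, and `m3` together instead of encountering them scattered through a queue.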

For spreadsheets full of numerical values, conceptual analytics won’t provide much insight. However, you can:

  • Identify near-duplicates. You can save your team from rereading content by flagging documents that are almost exactly the same as other records in your data set. This also makes for an excellent quality control tool to ensure you aren’t missing any relevant content in your production.
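One common way to approximate near-duplicate detection is to compare overlapping word sequences ("shingles") with Jaccard similarity. The sketch below assumes the text has already been extracted from each file; the function names, the sample documents, and the 0.7 threshold are illustrative choices, not settings from any specific review platform.

```python
def shingles(text, size=3):
    """Return the set of overlapping word sequences (shingles) in the text."""
    words = text.lower().split()
    if len(words) < size:
        return {tuple(words)}
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}


def jaccard(a, b):
    """Jaccard similarity: size of the intersection over size of the union."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def near_duplicates(docs, threshold=0.8):
    """Return (id, id, score) for every pair scoring at or above the threshold."""
    sets = {doc_id: shingles(text) for doc_id, text in docs.items()}
    ids = sorted(sets)
    pairs = []
    for i, first in enumerate(ids):
        for second in ids[i + 1:]:
            score = jaccard(sets[first], sets[second])
            if score >= threshold:
                pairs.append((first, second, round(score, 2)))
    return pairs


# Hypothetical extracted text: two budget reports differing in one figure.
docs = {
    "a": "Q1 budget report revenue 100 expenses 80 net 20",
    "b": "Q1 budget report revenue 100 expenses 80 net 21",
    "c": "meeting notes from tuesday lunch",
}
matches = near_duplicates(docs, threshold=0.7)
```

Comparing every pair is fine for a small example, but it scales quadratically; larger data sets generally call for an approximate technique such as MinHash.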

Patent or Construction Matters: Careful Collection Is Key

Data Considerations

In a tech-forward world where intellectual property is king, more companies are facing intellectual property disputes. In the technology space and the energy industry, for example, patent litigation can be a frequent reality. Additionally, construction law can yield litigation over unfinished projects or poorly implemented building designs.

Both of these types of litigation typically involve a tricky data type: drawings and images.

For example, construction matters may rest on blueprints and CAD drawings—or even scanned sketches—created by the engineers and architects behind a disputed project.

Similarly, patent disputes could involve the original designs behind a groundbreaking invention.

How to Make e-Discovery Easier

Gathering insights with this type of data requires a well-trained eye and expertise in the relevant field during document review. To simplify e-discovery, your team can:

  • Engage in smarter collections. If you know CAD drawings will be at the heart of the matter at hand, consider drilling into that type of data during a targeted collection. Pulling excess documents you’ll just set aside after a first-pass cull can waste precious time that will be required for close review of those drawings.
  • Perform a native review. This type of data may be best reviewed in its native format, rather than as a TIFF image or a PDF, because the data may require several layers of information or interaction that can’t translate well into a simple image. Whether you’re requesting or sending a production, ensure that consideration is included in your production standards.
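A targeted collection can start with something as simple as filtering a file listing by type. The sketch below keeps only paths with CAD-style extensions; the extension set and the sample paths are assumptions for illustration, and a real collection would be scoped with the custodians and the formats actually in use on the matter.

```python
from pathlib import PurePath

# Assumed extensions for common CAD formats; adjust for the matter at hand.
CAD_EXTENSIONS = {".dwg", ".dxf", ".dgn"}


def select_cad_files(paths, extensions=CAD_EXTENSIONS):
    """Keep only paths whose file extension (case-insensitive) is targeted."""
    return [p for p in paths if PurePath(p).suffix.lower() in extensions]


# Hypothetical file listing from a project share.
listing = ["plans/site.DWG", "notes/memo.docx", "plans/floor.dxf"]
cad_files = select_cad_files(listing)
```

Because the filter returns the original paths untouched, the files can then be collected and reviewed in their native format.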

Medical Cases: Contracts and Forms Galore

Data Considerations

Litigation within the healthcare and pharmaceutical industries may involve malpractice cases of a fairly small size, or disputes over clinical trials on a much larger scale.

In cases involving clinical trials, there may be a very large number of similar documents, such as the same form filled out by hundreds or even thousands of participants.

Additionally, pharmaceutical matters—because they may involve the efficacy of a particular drug over many decades of research and use—can involve many historical documents that were originally created on paper. This typically means a lot of OCRed text.

How to Make e-Discovery Easier

Even large-scale medical matters can be simpler to tackle with a few key tools.

  • Optimize your OCR results. Whether you’re performing OCR yourself or receiving extracted text, use sampling to QC the results and take the necessary action to resolve any incomplete or inaccurate data.
  • Try concept searching. Say you need to find, amidst a data set involving 1,000 copies of the same form, the subset of trial participants who indicated a medical history of any type of allergy in their answers. You can easily search with blocks of text that you know are responsive to yield all of the relevant forms, even if participants all answered the same question in a slightly different way.
  • Consider technology-assisted review (TAR). If your text is of good quality and discusses concepts the computer can analyze, and you have enough data to work with, TAR can be an ideal way to leverage your experts’ efforts to accelerate review while working through your data set accurately.
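The OCR sampling step can be sketched as a random sample plus a rough readability score. In the example below, the token pattern, the 0.8 cutoff, and the sample documents are all assumptions made for illustration; the score is a crude heuristic for spotting garbled text, not a substitute for having a person look at the sampled pages.

```python
import random
import re

# Tokens that look like plain words or numbers, with optional end punctuation.
TOKEN_PATTERN = re.compile(r"^[A-Za-z]+[.,;:]?$|^\d+([.,]\d+)*[.,;:]?$")


def readable_ratio(text):
    """Share of whitespace-separated tokens that look like words or numbers."""
    tokens = text.split()
    if not tokens:
        return 0.0
    return sum(1 for t in tokens if TOKEN_PATTERN.match(t)) / len(tokens)


def sample_for_qc(docs, sample_size, min_ratio=0.8, seed=0):
    """Randomly sample documents and return the IDs scoring below min_ratio."""
    rng = random.Random(seed)  # fixed seed so the sample is reproducible
    chosen = rng.sample(sorted(docs), min(sample_size, len(docs)))
    return sorted(d for d in chosen if readable_ratio(docs[d]) < min_ratio)


# Hypothetical OCR output: one clean page and one badly garbled page.
ocr_text = {
    "d1": "Patient reported no allergies.",
    "d2": "P@t1ent rep0rted n0 a11ergies",
}
flagged = sample_for_qc(ocr_text, sample_size=2)
```

Documents that land in `flagged` are candidates for a closer look or for re-running OCR before you build an analytics index on top of them.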

Always Scrutinize Your Data

Whether you’re working on a matter like these or something else entirely, it’s critical to take a good look at your data and understand what it’s likely to contain even before document review begins. Build good habits so you can execute defensible, effective protocols for every project. For example:

  • Use early case assessment (ECA) to understand the anatomy of your data set, from file types to date ranges.
  • Before building an analytics index, understand how the amount of text and conceptual richness might impact your strategy.
  • When new data arrives in the middle of a review project, take this close look again to evaluate its content. It can be dangerous to dump it all at the end of your reviewers’ queues without accounting for how the new information might affect work that is already in progress.
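As a small illustration of that first look at a data set, the sketch below summarizes file types and the date range for a list of records. The `(path, date)` record shape and the sample values are hypothetical; an ECA tool would report far more, but the same summary can be rerun whenever new data arrives.

```python
from collections import Counter
from datetime import date
from pathlib import PurePath


def profile(records):
    """Summarize file types and the date range for (path, date) records."""
    types = Counter(
        PurePath(path).suffix.lower() or "(none)" for path, _ in records
    )
    dates = [d for _, d in records]
    return {
        "file_types": dict(types),
        "earliest": min(dates) if dates else None,
        "latest": max(dates) if dates else None,
    }


# Hypothetical records: two emails and one spreadsheet.
records = [
    ("a.msg", date(2019, 1, 5)),
    ("b.xlsx", date(2020, 3, 1)),
    ("c.MSG", date(2018, 7, 9)),
]
summary = profile(records)
```

Comparing the summary for a new delivery against the original one is a quick way to notice unexpected file types or dates outside the range you planned for.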

