Not Using Active Learning? You're Falling Behind

Sam Bock

As technology-assisted review has racked up points for ease of use, judicial approval, and, above all, time and cost savings, its benefits are becoming difficult to ignore.

Let’s cut to the chase: If there’s technology available to cut review time and prioritize the voluminous data in your projects automatically—with plenty of judicial approval—why wouldn’t you use it?

RelativityOne Certified Partner Complete Discovery Source (CDS) hosted a panel discussion in London late last year to discuss the use of TAR—particularly its active learning workflow—amongst modern legal teams. One of their biggest takeaways? Active learning can and should be applied to the majority of cases.

We caught up with some of their panelists (Mark Anderson of CDS and Jeffrey Shapiro, e-discovery manager at Clifford Chance) to ask a few follow-up questions on the subject. Check out their insights below.

Sam: During the panel, you mentioned that active learning is a good option for most cases. Why is that?

Mark Anderson: At CDS we see active learning as almost a no-brainer for most matters. The goal of document review is almost always to find documents relevant to the matter, so there are few reasons not to use a tool that automatically prioritizes documents most likely to contain relevant information for the reviewers.

With active learning, small cases can see wins very quickly. This is most noticeable for cases with a low richness (percentage of relevant documents in the case): for example, if only 500 documents in your 10,000-document case are relevant, then active learning can help you complete the document review in a day. Not only does this speed up the review, but it can remove the need to use keywords.

That being said, active learning works equally well on large document sets, where a much larger population of documents can be removed from review by moving relevant documents to the front of the review queue. Review may still take some time (depending on the size and richness of the case), but case teams are likely to see a huge return in time and cost savings with the software.

Certain cases are not good candidates for active learning, most notably those with a large amount of audio/video data, chat logs, SMS messages, or building plans. Analyzing the document types and understanding the case in depth are key to the success of an active learning project. CDS typically recommends using active learning for any documents with sufficient text for analysis, alongside a traditional linear review for unsuitable documents.
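To make the richness example above concrete, here is a minimal Python sketch. The function names and both toy queues are assumptions made for illustration; in particular, the "prioritized" queue is a made-up ranking, not the output of any real active learning model.

```python
import math


def richness(relevant: int, total: int) -> float:
    """Share of documents in the case that are relevant."""
    return relevant / total


def docs_reviewed_to_reach_recall(queue, target_recall):
    """Count documents reviewed, in queue order, until the target share
    of all relevant documents has been found.

    `queue` is a list of booleans in review order (True = relevant).
    """
    total_relevant = sum(queue)
    if total_relevant == 0:
        return 0
    # Small tolerance guards against floating-point noise before rounding up.
    needed = math.ceil(target_recall * total_relevant - 1e-9)
    found = 0
    for position, is_relevant in enumerate(queue, start=1):
        found += is_relevant
        if found >= needed:
            return position
    return len(queue)


# The case from the example: 10,000 documents, 500 of them relevant.
print(richness(500, 10_000))  # 0.05, i.e. 5% richness

# Linear review: relevant documents spread evenly (every 20th document).
linear_queue = [i % 20 == 19 for i in range(10_000)]

# Hypothetical prioritized queue: 400 relevant documents land in the
# first 1,000 positions, the other 100 are scattered through the rest.
prioritized_queue = (
    [i % 10 < 4 for i in range(1_000)]
    + [i % 90 == 89 for i in range(9_000)]
)

print(docs_reviewed_to_reach_recall(linear_queue, 0.8))       # 8000
print(docs_reviewed_to_reach_recall(prioritized_queue, 0.8))  # 994
```

In this toy setup, finding 80 percent of the relevant documents takes 8,000 documents in linear order but fewer than 1,000 in the prioritized order. The actual gap on a real matter depends on how well the model ranks the documents.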

How has the legal culture on proportionality changed in recent years?

Jeff Shapiro: The UK High Court first approved the use of TAR in 2016 with the Pyrrho¹ and Brown² decisions, where the courts noted the use of TAR was proportionate and consistent with the overriding objective. In the ensuing two-and-a-half years, the judiciary has recognised that:

  1. TAR is no longer novel;
  2. The court does not need to approve the use of TAR, even in a contested application;
  3. The party seeking to use TAR must be aware of its disclosure obligation and decide how best to meet that obligation; and,
  4. It is essential for the parties to engage meaningfully on a proposal to use TAR at an early stage in the disclosure process.³

Outside of judicial opinions, the Disclosure Pilot for the Business and Property Courts in England and Wales commences on 1st January 2019. The Pilot makes specific references to using technology, including analytics and technology-assisted review, to help keep disclosure reasonable and proportionate in light of the overriding objective.⁴

How does active learning work in conjunction with other analytics features, such as email threading or clustering?

Mark: Other technologies can and should certainly be utilized alongside active learning. An active learning project can be kickstarted by utilizing analytics technologies such as clustering and similar document detection to locate and identify key documents. This enables the review team to get to the key data quickly and build a model that allows other relevant documents to be identified and prioritized to the front of the queue. Other technologies such as email threading can also be utilized to cull data, i.e., to remove earlier emails in a thread. This will further reduce the number of documents for review, thus decreasing the time and cost associated with reviewing additional data.
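As a rough illustration of culling with email threading, the Python sketch below keeps only the most recent email in each thread. The `Email` record, its field names, and the sample data are hypothetical, and the sketch assumes the latest message in a thread quotes everything that came before it, which does not hold for threads that branch.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class Email:
    doc_id: str
    thread_id: str   # assumed to be supplied by a threading tool
    sent: datetime


def latest_per_thread(emails):
    """Keep one email per thread: the one with the latest sent time."""
    latest = {}
    for email in emails:
        current = latest.get(email.thread_id)
        if current is None or email.sent > current.sent:
            latest[email.thread_id] = email
    return sorted(latest.values(), key=lambda e: e.doc_id)


emails = [
    Email("E1", "T1", datetime(2018, 3, 1, 9, 0)),
    Email("E2", "T1", datetime(2018, 3, 1, 9, 30)),
    Email("E3", "T1", datetime(2018, 3, 1, 10, 15)),
    Email("E4", "T2", datetime(2018, 3, 2, 14, 0)),
]

kept = latest_per_thread(emails)
print([e.doc_id for e in kept])  # ['E3', 'E4']
```

Here four emails shrink to two documents for review. A production threading tool would also need to handle branches and attachments that appear only on earlier messages.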

What are your tips for teams who are new to TAR and want to give active learning a try?

Jeff: Legal teams can use TAR in a variety of ways. Some of the most common methods include: prioritised review; quality control; review cut-off; and review of the other side's disclosure.

  1. Prioritised review: In prioritised review, the legal team reviews documents in a similar fashion to how they have in the past; the only difference is that TAR is working in the background to move predicted relevant documents up the queue so that the team sees them sooner than they otherwise would have.
  2. Quality control: With TAR, a legal team can quickly find documents that were predicted relevant but that a reviewer coded as not relevant, and vice versa. Similarly, TAR can help check other coding decisions, such as privilege calls.
  3. Review cut-off: This entails using TAR to help determine when a legal team has found a sufficient percentage of relevant documents within its data and further review would be disproportionate.
  4. Review of the other side's disclosure: A legal team can use the TAR learning model from its own data and apply it to the other side's disclosure to make predictions about what is relevant in the other side's documents.
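Two of the uses above, quality control and review cut-off, come down to simple comparisons and arithmetic. The Python sketch below is an illustration only: the `Doc` record, the 0.5 score threshold, and all of the numbers are assumptions, and the recall figure is a point estimate from a random sample of unreviewed documents, with no confidence interval.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Doc:
    doc_id: str
    score: float                    # predicted relevance, assumed 0.0 to 1.0
    coded_relevant: Optional[bool]  # reviewer decision; None = not reviewed


def qc_conflicts(docs, threshold=0.5):
    """IDs of reviewed documents whose coding disagrees with the prediction."""
    return [
        d.doc_id
        for d in docs
        if d.coded_relevant is not None
        and (d.score >= threshold) != d.coded_relevant
    ]


def estimated_recall(found_relevant, unreviewed_count, sample_size, sample_relevant):
    """Estimate the share of relevant documents already found, based on a
    random sample drawn from the documents that were not reviewed."""
    missed_rate = sample_relevant / sample_size
    estimated_missed = missed_rate * unreviewed_count
    return found_relevant / (found_relevant + estimated_missed)


docs = [
    Doc("D1", 0.92, False),  # predicted relevant, coded not relevant
    Doc("D2", 0.88, True),
    Doc("D3", 0.10, True),   # predicted not relevant, coded relevant
    Doc("D4", 0.05, False),
    Doc("D5", 0.70, None),   # not yet reviewed, so skipped
]
print(qc_conflicts(docs))  # ['D1', 'D3']

# 460 relevant documents found so far; 8,000 documents left unreviewed;
# a random sample of 400 unreviewed documents turned up 2 relevant ones.
print(round(estimated_recall(460, 8_000, 400, 2), 2))  # 0.92
```

Whether an estimated recall of 92 percent is enough to stop is a proportionality judgment for the legal team, not something the arithmetic decides.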

Mark: At CDS we feel our best advice is to just jump in and try it. Use it on your next project or trial it on a small project, but start utilizing this functionality sooner rather than later. There is little difference in workflow between a traditional linear review and an active learning review, so your reviewers can begin the project with very little additional training. You should quickly be able to see the advantages of active learning: your reviewers will spend less time reviewing irrelevant material and your case will conclude more quickly and accurately. If your e-discovery partner is not recommending active learning to you, they should be able to explain why not.

What is in store for the future of active learning, in your opinion?

Mark: Active learning has been, and will continue to be, heavily developed over the coming years. In the near future I can see an expansion in the types of data suitable for active learning, and an increase in combining existing technologies with active learning (e.g., sentiment analysis and image recognition). I also see the potential for creating case profiles which can be used on future active learning cases. In other words, if you have a new case regarding fraud, a fraud profile from a previous case can be applied to kickstart the review to find similar data. Once the review starts, the active learning model begins to build based on the specific information in the current case, so you begin reviewing relevant data sooner.

1 Pyrrho Investments Ltd v MWB Property Ltd [2016] EWHC 256 (Ch)
2 Brown v BCA Trading [2016] EWHC 1464 (Ch)
3 Tchenguiz v Grant Thornton UK LLP, unreported Case Management Conference on 5 October 2017, and Triumph Controls UK Limited v Primus International Holding Co. [2018] EWHC 176 (TCC)
4 Practice Direction [ ], Disclosure Pilot for the Business and Property Courts, par. 3.2(3), 9.6(3), available at (last visited 5 December 2018); Draft Disclosure Review Document – Appendix 2, p. 16-17, available at (last visited 5 December 2018)

Sam Bock is a member of the marketing team at Relativity, and serves as editor of The Relativity Blog.