Editor's Note: The second in a two-part series, this article reflects extensive collaboration between author Brittany Roush and members of Relativity's legal team, including Beth Kallet-Neuman and Mark Bussey.
All this discussion of privacy, ethical obligations, and the changing landscape of artificial intelligence leads into another point of consideration: the enduring nature of technology, which is part and parcel of the concept of defensibility. As AI is integrated into litigation, investigation, and associated workflows, users and technologists must consider the full spectrum of privacy concerns they may encounter with AI and determine whether the technology they are using will meet those standards over time. Without careful consideration, a litigation team may find themselves in a position where they’ve used AI that is no longer permitted or useful for their particular use case, calling into question their evidence identification and procedures.
This is not a new issue for those in the legal industry. Early technology-assisted review (TAR) solutions met similar scrutiny; it took many years for the use of TAR in a matter to be considered a standard, defensible workflow. However, TAR is just one type of AI. These days, the use of AI has proliferated to the point where “AI” has become a catch-all, umbrella term, and attorneys must now understand increasingly specific concepts—like large language models, machine learning, classification models, clustering models, narrow AI versus super AI, and so much more—to keep up with what it means to use AI in legal applications.
With the EU AI Act looming on the horizon, attorneys who work within or process data in the EU will also have to understand the risk level that each model poses to their clients and the subjects of their matters. In the legal industry, many of the models attorneys use will be, by their nature, medium- and high-risk, necessitating the development of specific, fit-for-purpose models tailored for e-discovery and investigations.
Transparency plays a crucial role in this new AI ecosystem. “Black box” AI, or AI that gives a user no visibility into how a decision or output is derived, may not be appropriate for use in many legal matters, given the high-risk nature of these use cases.
AI vendors must provide their customers with training and educational materials so that, if defensibility is called into question, an attorney can competently defend the use of AI on their matter. Those materials should highlight details on data privacy, bias reduction, the purpose of the model, the kind of training data used, and how risks are mitigated. In many ways, attorneys and AI vendors must create a symbiotic relationship to ensure AI’s endurance in the industry. After all, there’s no true value to technology that is merely a flash in the pan.
Sentiment Analysis and GDPR
Relativity has written extensively on the development and release of sentiment analysis in RelativityOne, and the feature has been well received by our customers. However, we also know that the use of sentiment analysis is considered a “question mark” for customers who process personal data subject to privacy regulations in the UK or EU (particularly those who fall under the purview of the GDPR).
It’s a reasonable concern. If you’re a data privacy wonk like me (ahem, nerd), it’s a wonderful and interesting issue because fundamentally, it opens the question of whether people have a right to expect privacy about their innermost thoughts and feelings when they’re expressed electronically—especially if it’s related to a legal matter. Furthermore, does an investigator or litigator have a right to automatically classify and store a person’s emotional state of being at any given point in time?
Emotions are, by nature, fleeting, representing a snapshot in time (which is why RelativityOne ranks sentiment on a sentence-by-sentence level—to accurately capture the ephemeral, transient nature of the emotions expressed in a document). In marketing sentiment models, there is very little reason to store a person’s name or other identifying information along with their sentiment. The sentiment itself is sufficient for marketing analysis, and the individual is almost irrelevant. But that’s not the case in litigation or an investigation; the individual is, by necessity, tied to the sentiment.
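To make the sentence-by-sentence idea concrete, here is a minimal, hypothetical sketch of sentence-level sentiment scoring. It is not Relativity's model—RelativityOne uses a trained classifier, not a word list—but it illustrates why scoring each sentence independently preserves emotional shifts within a document instead of averaging them away. The lexicon, function names, and scoring rule are all illustrative assumptions.

```python
import re

# Toy lexicon for illustration only; a production system would use a
# trained classification model rather than keyword matching.
POSITIVE = {"great", "happy", "excellent", "glad"}
NEGATIVE = {"angry", "terrible", "worried", "upset"}

def split_sentences(text):
    """Naive sentence splitter on terminal punctuation."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]

def score_sentence(sentence):
    """Crude sentiment score: positive word count minus negative word count."""
    words = re.findall(r"[a-z']+", sentence.lower())
    return sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)

def sentence_sentiments(text):
    """Score each sentence independently, so shifting emotions within one
    document are captured sentence by sentence rather than averaged away."""
    return [(s, score_sentence(s)) for s in split_sentences(text)]

doc = "I'm glad the deal closed. But I'm worried about the audit."
for sentence, score in sentence_sentiments(doc):
    print(score, sentence)
```

Note that a whole-document average of this example would come out neutral (+1 and −1 cancel), hiding exactly the kind of emotional snapshot the sentence-level approach is designed to surface.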
Legal challenges to sentiment analysis under the GDPR have focused on models specifically designed for marketing, leaving the more sensitive, high-risk use cases in an area of uncertainty.
Because I’m not a lawyer, I’ve invited my friend and colleague Mark Bussey, Relativity's senior counsel for data protection and commercial transactions, to provide his thoughts on this legal dilemma:
Whilst a specific legal framework relating to artificial intelligence is still in development (for example, the draft EU AI Act), there already is a significant volume of guidance and legislation that needs to be adhered to. The “specific framework” changes depending on the context (for example: the data sets used, the purpose of the tool). This is unlikely to change even when the law is more “developed” in this area. Even looking at the likely future state, as with other laws (for example, on the subject of privacy, comparing the GDPR to the CCPA), it is unlikely that there will be any globally harmonious laws; we will likely still see differing approaches to the implementation, regulation, and use of AI depending upon context and territory. As such, understanding the context is going to be crucial to remaining compliant when using such tools.
Whilst the issue of ethical, legal, and appropriate engagement with AI and its outputs is a question that extends well beyond data protection, privacy is understandably one area that is top of mind for many AI users given the volume of data required to train the models (some of which may be personal data in order to permit for effective training).
To navigate and remain compliant with the various legal requirements, it is crucial for software businesses to understand their data sets, implement appropriate data governance (including a set of enforced AI Principles), and remain transparent regarding their product and its intended use. As I’ve alluded to above, however, this is a quickly developing area and as such, “horizon scanning”—continuing to monitor legal and practice developments—is going to be crucial for all of us.
What Does This Mean for the Future of AI in Legal Technology?
Despite the current hurdles and looming changes in legislation, and the many open questions facing the use of AI (generative AI in particular), it’s clear that the industry has a great appetite and desire for more AI to supplement and support workflows.
For those in the legal industry, particularly attorneys, this means arming themselves with knowledge and exercising caution when it comes to implementing and adopting new technology. As a few key takeaways:
- When using any AI, but especially generative AI, it’s important to understand how data is collected, stored, and ultimately used in the training of current and future models.
- Privileged, confidential, and proprietary information has no place in chatbots that are not owned and maintained by the user’s company. Security policies should take these potential avenues for exposure and risk into consideration, and employees should be trained on acceptable use policies.
- Understanding AI requires transparency from AI vendors. Black box AI will be inappropriate for some use cases, especially if the use case is high-risk (such as sentiment analysis in an investigation).
- When evaluating AI, look for endurance and keep an eye on changing case law; in particular, copyright law and global data privacy laws can drastically impact the use of AI in the legal industry and should be watched closely.
Ultimately, I’d like to leave you with the knowledge that AI principles aren’t just for legal tech companies; law firms and others responsible for managing sensitive client data can and should adopt their own AI principles for the use of AI in their matters. Responsible AI is a community effort, and we’re in this together.