The Power and Possibilities of Data Science

McKenna Brown

Since the term “data scientist” was popularized in 2008 by D.J. Patil, then at LinkedIn, and Jeff Hammerbacher, then at Facebook, the industry has seen explosive growth.

Not only have job opportunities for data scientists cropped up everywhere, but the role has transformed the work life of millions of people who benefit from data scientists’ innovations. Tasks that were once laboriously performed by people have become automated, freeing us humans in legal, financial, and corporate industries (and many others) to focus on more important and, well, human work.

So how did we get here, and what’s next for this growing industry? Late last year, leaders from Relativity and Text IQ, a Relativity company, gathered to talk about just that.

In a Coffee + Chat session presented by Relativity’s talent team, Apoorv Agarwal, Aron Ahmadia, and Peter Haller discussed the origins of data science, where they see the industry going in the next few years, and what about artificial intelligence makes them most excited.

What Is Data Science?

“I think of data science as fundamentally people who love data and who believe that data can be used and leveraged to solve problems,” said Aron, director of data science at Relativity. In a previous role he worked with the U.S. Department of Defense, helping to disentangle networks of sex traffickers—and using data science to identify them.

“It turns out that most of the time, when you're solving data science problems, 99 percent of the work is getting the data in front of you,” Aron said, and the other one percent is “just connecting the dots.”

And a well-functioning data science organization is not made up of just data scientists, either. Aron emphasized that a great team will have lots of different types of roles coming together to solve problems using data.

The Role of AI

“I remember advising a company doing sentiment analysis maybe 10 years ago. They were trying to commercialize sentiment analysis, and it was really hard,” said Apoorv, CEO of Text IQ. “Because people didn't believe that a machine could come in and understand emotion. That's so inherently human.”

In recent years, and after successful showcases of AI’s potential, people have become more comfortable with it, which makes commercializing AI an easier prospect.

“If a customer can trust a self-driving car to drive them around, they better trust AI to help them find bad actors,” Apoorv said.

Peter, product lead for Trace, acknowledged the companies swinging for the fences on AI, pursuing projects like sentiment analysis, speech-to-text, and entity extraction. But what excites him most, he said, is a bit smaller.

“What's really interesting is focusing on a very narrow problem that you need to solve—one that you can see a customer struggling with on a daily basis—and saying that it’s AI that is going to solve this very unique problem,” Peter said.

Preserving Privacy

One such problem, Peter said, is the vast volume of communications that compliance teams must sift through to meet regulatory requirements in industries like the financial sector. The Trace team is using AI to narrow the alerts that compliance team members actually have to review down to those that reflect true misconduct.

“It's an important piece of how we can use AI to ultimately reduce and protect individuals, reduce the amount of communications that other people are looking at, and ultimately increase the privacy over this type of content,” Peter said.

Apoorv cited an exponential increase in data breaches during the COVID-19 pandemic, when many employees began working from home, as an untapped opportunity for data science. Enterprises need a way to find and protect the personal information of both their customers and their employees.

He also spoke about the many shapes and forms of unstructured data that need to be found and protected.

“The way we find personal information from forms or images looks very different from the way we find personal information from spreadsheets or emails or other kinds of documents,” he said.

An image of a driver’s license or Social Security card may be considered an unstructured document, but Apoorv said even these have an inherent structure that can be extracted. Once that underlying structure is identified, people could use it to build all sorts of applications we haven’t even thought of, he added.

What’s Next

Prompted by an audience question, the panelists closed out the talk by discussing their views on the future of data science.

Aron spoke about the recent history of boom-and-bust cycles in AI, mentioning a common fear that funding and excitement will dry up and we will enter an “AI winter.” But he wasn’t so sure that’s where we are headed.

“These are all things we didn't believe that we could crack in any reasonable way, and we've made tremendous progress on them,” Aron said. “Natural language processing of text is just one example, and there are a lot of others that still need innovation, like images and audio and video.”

Apoorv shared his optimism, saying he believes we live in an era where whatever can be automated will be automated. Anticipating a common retort, he said he didn’t think this would lead to a loss of jobs, either.

“There are enough hard problems in the world to be solved where human input would still be needed,” he said. “I think machines would help us take away a lot of the mundane things that we are doing today—things which, quite honestly, make us more robotic than human.

“My personal belief is it actually brings us closer to humanity and pushes us to do what we do best, which is being creative, or having empathy, or managing people—what humans are meant to do,” he added.

McKenna Brown is a member of the marketing team at Relativity, specializing in content development.