Is AI the new ‘Snake Oil’?

By Bill Kammer

At Legalweek New York 2019, every vendor seemed to tout their product’s artificial intelligence (AI). AI, sometimes called machine intelligence, is the intelligence demonstrated by computers in contrast to natural human intelligence.

We have all experienced AI in modern life: Netflix recommendations, Amazon’s and Spotify’s suggestions, and LinkedIn’s and Facebook’s prods. Who hasn’t heard an IBM Watson ad? These examples may be recent, but AI has been at work in the legal world for a long while, in both legal research and electronic discovery.

At Legalweek, vendors were selling AI not only for research and discovery but also for brief analysis, contract review and data analytics (such as examinations of every case decided by a particular judge). Yet many salespeople could not describe how AI enhanced their products, nor could they explain the design of the algorithms that powered their tools.

Except for those at the largest firms, most lawyers are hard-pressed to evaluate the relative merits of competing products that tout their AI. Commentator Robert Ambrogi recently reported that the number of AI companies selling to lawyers had increased by 65% in the previous year. Not only are these products competing; most of them also lack a track record.

To a certain extent, U.S. law librarians may come to the rescue. The descriptive title of a presentation at their association’s 2018 convention asserted that “Every Algorithm Has a POV.” The underlying study reported the results of 50 carefully structured legal searches performed with the tools we use: Westlaw, Lexis Advance, Fastcase and several others. It compared the top 10 results from each database and found that, on average, 40% of the returned results were unique to the database that produced them. While not a total disaster, those findings illustrate that we may trust those everyday tools, but we should verify their results.
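
To see what that 40% figure measures, consider a toy version of the comparison. The databases and case names below are invented and the lists abbreviated; the point is only the arithmetic: for each database, count how many of its top results appear in no other database’s list.

    # Toy version of the librarians' overlap measurement. The databases and
    # case names are hypothetical; the real study compared the top 10 results
    # of 50 structured searches across Westlaw, Lexis Advance, Fastcase and others.
    top_results = {
        "Database A": {"Case 1", "Case 2", "Case 3", "Case 4", "Case 5"},
        "Database B": {"Case 1", "Case 2", "Case 6", "Case 7", "Case 8"},
        "Database C": {"Case 1", "Case 3", "Case 6", "Case 9", "Case 10"},
    }

    for name, results in top_results.items():
        # Union of every other database's results.
        others = set().union(*(r for n, r in top_results.items() if n != name))
        unique = results - others
        print(f"{name}: {len(unique) / len(results):.0%} unique to this database")

In this contrived example, each database returns 40% of its results uniquely, mirroring the study’s average: the same search, run through different tools, surfaces substantially different law.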

Deeper analysis of the “point of view” issue requires consideration of the algorithms behind these databases and tools. Different teams of humans developed the different tools, so each legal database makes different decisions about how to relate and analyze the search terms chosen. Programmers may decide that one term is more important than another and should then be associated with a third term. They can also train a machine progressively, teaching it with preferred or ranked facts and associations; it will then return results from the database using the criteria it learned along the way. But machines can be misled if the initial knowledge used to fire up their artificial intelligence was inaccurate or reflected only the situation known at start-up.
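
A minimal sketch can make that concrete. The term weights and expansions below are hypothetical design choices standing in for the decisions each vendor’s team makes; they do not reflect any actual product’s algorithm. Change the tables, and the same query ranks the same cases differently.

    # Hypothetical ranking sketch: the programmers' "point of view" lives in
    # these two tables, which here are invented for illustration.
    QUERY_EXPANSION = {"negligence": ["duty", "breach"]}   # relate one term to others
    TERM_WEIGHTS = {"negligence": 2.0, "duty": 1.0, "breach": 0.5}

    def score(document: str, query_terms: list[str]) -> float:
        """Sum the weights of the query terms (plus their expansions) found in the text."""
        terms = list(query_terms)
        for t in query_terms:
            terms.extend(QUERY_EXPANSION.get(t, []))
        text = document.lower()
        return sum(TERM_WEIGHTS.get(t, 1.0) for t in terms if t in text)

    cases = {
        "Case A": "The court found a breach of the duty of care.",
        "Case B": "Negligence requires duty, breach, causation and damages.",
    }
    # A different weight table would produce a different ordering.
    print(sorted(cases, key=lambda c: score(cases[c], ["negligence"]), reverse=True))

Two teams making equally defensible choices about those tables will rank the same law differently, which is exactly why the databases’ top-10 lists diverge.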

More recently, the law librarians tested data analytics tools marketed to lawyers. The names of these platforms should be familiar: the study evaluated seven, including Bloomberg Law, Lex Machina, LexisNexis Context and Westlaw Edge. Twenty-seven librarians from law firms and academic law libraries worked on the project, each testing two platforms for a month against a common set of 18 questions. One question asked how many times a certain law firm had appeared before a certain judge in the District of Delaware. The exact answer was 13, but not a single platform produced it. Again, “trust but verify.”

Lawyers are not the only ones who should be concerned about AI’s outputs; even technology companies make AI mistakes. Earlier this year, published reports recalled an effort Amazon began in 2014 to develop a recruiting tool that reviewed resumes and ranked each on a five-point scale. Early on, Amazon realized the tool was not rating candidates for technology and software positions in a gender-neutral manner. Though the company sought more diversity, it had trained the tool on 10 years of resumes from previous hires. Few of those hires had been women, so the tool learned to prefer male applicants with the same experiences and professional relationships as previous, successful male applicants. For instance, it learned to penalize resumes containing the word “women’s” and downgraded graduates of two all-women’s colleges. Amazon gave up on the effort last year and disbanded the development team.
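
The mechanism is easy to reproduce in miniature. The toy “resumes” below are invented, and the scoring is a crude log-odds weight rather than Amazon’s actual model, but the failure mode is the same: a word that never appears among past hires acquires a negative weight.

    import math
    from collections import Counter

    # Invented training data, skewed the way the reports described:
    # past hires were overwhelmingly male.
    hired = ["captain chess club", "chess club president", "football team captain"]
    rejected = ["women's chess club captain", "women's debate team"]

    hired_words = Counter(w for r in hired for w in r.split())
    rejected_words = Counter(w for r in rejected for w in r.split())

    def weight(word: str) -> float:
        # Smoothed log-odds that a word signals a past hire.
        return math.log((hired_words[word] + 1) / (rejected_words[word] + 1))

    for word in ["captain", "women's"]:
        print(word, round(weight(word), 2))
    # captain  0.41 -- common among past hires, so rewarded
    # women's -1.1 -- absent from past hires, so penalized

The model did not set out to discriminate; it simply learned the pattern in the history it was fed.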

What can we do to avoid disastrous consequences such as a wrong answer, a missed case or citation, or an overlooked relevant and important document? Most lawyers can’t afford multiple AI-infused tools in each category that might allow machine verification of results. At a minimum, we should demand a “proof of concept” period before committing to a purchase. Then, when using a tool, we should apply common-sense analysis to search results, including suspicion when an expected result is missing. That approach will probably satisfy the standard of care, and fortunately, in the world of electronic discovery, it should also satisfy the need to conduct a reasonable, good-faith effort to collect and produce the relevant information.

In the final analysis, we will have to learn on the job what works for us and what we can afford to obtain and use. There is an interesting journey ahead.

Bill Kammer is a Partner with Solomon Ward Seidenwurm & Smith, LLP.