Assessing AI Is Not Easy, But It's Now Imperative

Eric Sydell, PhD, is the co-founder and CEO of Vero AI and the co-author of the book Decoding Talent.

If you ask 10 people what artificial intelligence (AI) is, you will get 10 different answers. And most of them will be vague and centered around the idea of computers that can think like humans, or smart, evil machines that might do humans harm.

So what exactly is AI? In many ways, it’s just statistics by another name. While traditional statistics deals with numerical data, AI techniques can help us process messy, unstructured, qualitative data such as videos, imagery and text. In this sense, AI can help us make sense of the vast amounts of data around us, and of the relationships between data points, so that we understand our world in much deeper ways.

As a statistical technique, AI is highly useful and represents a powerful step forward in our ability to scientifically and objectively understand our world. But of course, AI is used for many things, not just scientific discovery. As it is integrated into commercial offerings, it allows developers to automate activities to a vast new extent.

While using AI to understand our world is quite a positive capability, AI that is used for product offerings can be much more of a mixed bag. To profit from AI, a company must provide an exciting tool that goes beyond objective scientific exploration. It should promise to minimize bias, accelerate decision-making, find better leads or offer something equally compelling.

Powerful financial forces are helping to propel AI technologies to market, but buyers need to beware. Given the lack of comprehensive regulation and understanding of these tools, many of them fail in problematic ways. For example, an AI tool that purports to help summarize information might be biased or “hallucinate” results that have no basis in reality.

So how do you evaluate how well any given AI technique is working?

Understand your AI lens.

What are the top concerns for your business when it comes to deploying AI: legal compliance, bias mitigation, business process improvement or something else? For most organizations, compliance is a primary concern, especially with regulations like the EU AI Act.

But if your only lens is compliance, you may end up deploying an AI tool that meets legal minima but has no demonstrable positive impact on your business. For example, in the United States, you would be legally safe if you selected new employees by flipping a coin. Doing so would introduce no bias against protected classes and no invasion of data privacy, the two most frequently regulated issues. But you would also be deciding who to hire in a completely random and ineffective manner. Thus, legal compliance is not enough.

Focus on the outcomes.

From a business perspective, it is not important whether a given tool has an AI component or not. What matters is the benefit the tool provides to your business. While it is always a good idea to understand how any complex tool works, it can be difficult and expensive to develop a full understanding of complex algorithmic and AI tools.

For this reason, organizations should focus on the outcomes of these tools. This does not mean the promised results that an AI vendor lists on its website or in its marketing materials. Advertised outcomes are often only one lens and may fail to meet more scientific standards of rigor.

For example, if an AI-based new hire assessment claims to mitigate bias against federally protected classes, that is likely a positive characteristic of the tool. But does the vendor have data that suggests the tool also leads to more effective decisions? If not, then you might consider saving money and just flipping that trusty coin. Also, ask the vendor how it calculated its results for mitigating bias. Remember the old saying popularized by Mark Twain: “There are three kinds of lies: lies, damned lies, and statistics.” It is exceedingly easy to create a marketing claim that sounds amazing but is, in reality, flimsy.

For another example, let's say a vendor tests its AI on a sample of 50 cases and finds that the average score of the 10 people from a particular minority group in the sample is 70, while the average score of everyone else is 80. Further, let's specify that an older, non-AI assessment showed an average score of 55 for individuals in the minority group and 60 for everyone else. The vendor could then claim that, on average, individuals in the minority group scored 27% higher using its AI tool, without mentioning that everyone else's scores rose by even more (33%) and that the gap between the groups doubled from 5 points to 10. Plus, a sample of just 50, with only 10 people in the group of interest, is unlikely to be large enough to draw any stable conclusions. As a consumer of tools such as this, ask questions to ascertain whether marketing claims are truthful.
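To make the arithmetic behind that claim concrete, here is a minimal sketch in Python. It uses only the four averages from the example above; the function and variable names are illustrative and not part of any vendor's method.

```python
def percent_change(old, new):
    """Relative change from old to new, expressed in percent."""
    return (new - old) / old * 100.0

# Average scores from the example above (old, non-AI assessment vs. new AI tool).
old_minority, new_minority = 55, 70
old_others, new_others = 60, 80

minority_gain = percent_change(old_minority, new_minority)
others_gain = percent_change(old_others, new_others)

# The gap between the groups, in raw score points, before and after.
gap_before = old_others - old_minority
gap_after = new_others - new_minority

print(f"Minority group gain: {minority_gain:.0f}%")  # the number in the marketing claim
print(f"Everyone else gain:  {others_gain:.0f}%")    # the number left out of it
print(f"Gap before: {gap_before} points, gap after: {gap_after} points")
```

The headline figure is accurate on its own, which is exactly why it is worth asking for the comparison numbers alongside it.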

Establish an impartial system for measurement.

To ensure you are evaluating an algorithmic tool comprehensively, it’s important to use a measurement framework that’s created in a scientific, thoughtful and objective way and that looks at the following outcomes:

• Legality And Compliance: Is the tool legal and compliant with current and likely future rules?

• Fairness: Is the tool fair for all classes of users and does it generate fair results?

• Effectiveness: Does the tool produce a clear and positive business impact?

• Accuracy: Is there rigorous testing data proving that accuracy is high?

• Human Centricity: Does the tool improve the working lives of individuals, not just the organization?

When measuring these outcomes, it is also critical to understand the difference between a point-in-time audit and a continuous quality assurance process. While audit results can look fine at a given time, what you really want to do is set up a system that can monitor these criteria continuously.

AI evaluation is in its infancy, but focusing on the categories above will help ensure that you are not snowed by marketing hyperbole and that you select a tool with a comprehensively positive impact on your business. Once organizations prioritize the evaluation of their AI tools and systems, they will discover technologies that deliver greater value, are free of bias and meet regulatory standards.

