AI Being Employed to Further Mission of Mass Media Company with a Heritage in Large-Scale Text Classification
Dr. Khalid Al-Kofahi, Head of AI at Thomson Reuters, has years of experience applying AI technologies to solve business problems. He sees content as a key differentiator because it is both the input to the machine and the output of the machine for many AI applications. He predicts AI will automate tasks humans have to do manually, freeing knowledge workers to move up the value chain. During his more than 20 years with Thomson Reuters, Khalid has led the development of many advanced technologies, including natural language processing applications to mine information from text, large-scale text classification, recommender systems, question answering and summarization. Dr. Al-Kofahi is based in Toronto. Thomson Reuters Corporation is a Canadian multinational mass media and information firm. The firm was founded in Toronto, Ontario, Canada.
Khalid has a Ph.D. in Computer Engineering from Rensselaer Polytechnic Institute in Troy, NY, an MS in Computer Engineering from Rochester Institute of Technology in Rochester, NY, and a BS in Electrical Engineering from Jordan University of Science and Technology in Jordan. He recently took a few minutes to talk about his work with John P. Desmond, AI Trends editor.
AI Trends: Thank you for talking to us today. Can you describe your role at Thomson Reuters?
Dr. Khalid Al-Kofahi: Thank you for having me, John. I head Thomson Reuters R&D and the Center for AI and Cognitive Computing. Essentially, we are a team of scientists, engineers, architects, and solution designers with expertise in AI, software development, and design-thinking methodology. As a team, we have a very simple objective: to help the business identify and realize product opportunities that are enabled by AI and machine learning, focusing primarily on how professionals find information, how they interact with information-based solutions, and how they ultimately make decisions.
Could you describe the strategic role that AI is playing at Thomson Reuters?
Thomson Reuters is in the business of building information-based solutions for professionals. That means that for each of our verticals, we aggregate content, we author content, we enhance that content, organize it, mine it and connect it. We then make it findable through push, pull, and navigate (in other words, search, recommender systems, and navigation-based discovery). We then build vertical applications and capabilities for specific use cases, whether you are trying to draft a document or perform some compliance functions, and so on.
Each of these tasks, whether on the content publishing side, in the delivery process, or in the vertical applications, provides numerous opportunities for AI to either introduce automation or augmentation, or to deliver a superior experience. I can give you examples of this as we talk further, but essentially we focus on introducing AI at each of these touch points in our content creation, delivery, and customer experience.
Could you describe some of the key AI-based applications at Thomson Reuters, and the key technologies that they employ?
Let me talk about it from two perspectives, the back end and the front end. On the back end we have mature content publishing workflows to collect, create, enhance and organize content, with some editorial tasks (e.g., writing and classifying headnotes) that have been going on since the late 1800s. Almost 25 years ago, we started looking at each of these editorial tasks as a potential target for automation or machine-assist. The objective is not necessarily to reduce cost (which is a good thing) but to improve quality and consistency and to scale our efforts while at the same time improving time to market. As a result, we put in production dozens of applications for text classification, named entity extraction and resolution, concept extraction, relation and event extraction, sentiment analysis, summarization, and so on.
On the customer-facing side, we have developed powerful online platforms focusing on helping professionals find information, understand it, and make decisions. On the findability side, we developed a number of vertical search engines that are tuned to our domains, including robust open-ended natural language question answering, as well as personalized and self-adaptive recommender systems. On the understanding side, we developed applications to help professionals author documents, analyze their own as well as other documents, perform large-scale document review tasks, perform risk mining tasks and so on. On the decision support side, we developed large-scale analytics applications to help people make decisions based on data and insights as opposed to intuition.
From a technology perspective, we use most of the machine learning tools and algorithms that are out there, with significant focus on natural language processing and information retrieval.
What challenges do you face in the implementation of AI at Thomson Reuters?
In any vertical application or industry, an AI solution requires three key ingredients: content, subject matter expertise, and AI expertise. Most organizations face challenges in each of these areas.
Content is key. It is both the input to the machine and the output from the machine. It is why many of our customers come to us in the first place: they want content, and they want content that is accurate, that is enhanced, that is current, that is connected, and that is trusted. So the challenge is to develop processes—manual, automatic and machine-assisted—that help us maintain world-class content that is increasingly designed not just for human consumption, but for machine consumption as well.
The second ingredient is subject matter expertise, which is needed to ensure that we are solving the right problems, and that we are able to train and validate the algorithms as well as capture the nuances of our domains in a way we can compute on. Developing solutions requires subject matter experts (SMEs) who have some understanding of what AI can do and how it works, as well as AI and ML experts who have some understanding of the domain and its nuances. The primary challenge here is that building such teams takes time.
The third ingredient is AI expertise and skills. Our primary challenge here is how to scale beyond our resources. As you know, the competition for AI talent is very strong, and just like many organizations, we can’t find enough people to satisfy the business demand. So our answer to that challenge should not only be, “Hire more people.” Our answer should involve broadening the circle of participants in developing AI solutions to include larger technology organizations, because not every AI solution requires a PhD in Machine Learning or Natural Language Processing. That is the first part.
The second part of the answer is about finding the right balance between build, buy, and partner. We have a strong track record of building in-house. Unfortunately, our track record in partnering with the right external teams and organizations is still developing.
Finally, the biggest challenge, in my mind, is about creating a compelling vision of what AI-enabled experiences look like, with sufficient detail for the various job roles to make it actionable. We then need to figure out how to get there incrementally. Going for it in one shot is too disruptive and deprives us of many lessons along the way. So that’s on the product vision, the AI vision side.
Can you talk about any new products or services, or innovative work that you’re doing now?
In the last 12 months, we released three major AI-powered products for our customers: Westlaw Edge for our legal customers, Checkpoint Edge for our tax and accounting customers, and, in July of this year, Westlaw Edge Quick Check, an AI-powered document analysis tool to help attorneys write better briefs and motions, as well as defend against briefs and motions from opposing counsel.
If you look at the underlying capabilities in each of these three products, you will see things like document analysis, vertical search, question answering, summarization, and analytics to support data-driven decision making. Plenty of opportunities remain in each of these areas, and while I can’t talk about specific products that we are working on, from a capability perspective, these are the kinds of capabilities that we’d like to include in many of our products.
Each of these capabilities can enable a diverse set of products. For example, a capability like document analysis cuts across a broad set of use cases from contract authoring, to contract analysis, to analyzing briefs and motions, to analyzing filings of public companies or mapping regulations to business activities, and so on.
From a product design perspective, we try to remain true to certain design principles. For example, the design principles for our search engines are Responsive (to customers’ information need), Proactive (recommends content the customer did not specifically ask for), Robust (to query / question variation), Natural (supports interactions in natural language) and Intuitive (from an experience perspective). I hope to write an article about these in the coming weeks.
Is the software and services industry providing the solutions you need and are there any partners that you’d care to mention?
The short answer is yes. For context, we are not in the business of building AI tools for others to consume; we are in the business of developing applications for our customers using available AI tools and algorithms. And as such, we heavily use open source machine learning and AI tools that have been developed by companies like Google, Facebook, Microsoft, and other institutions including universities. We are definitely very grateful for them.
In addition to using open-source tools, we have been, in the last few years, exploring more work with the cloud-based AI providers, including Amazon AWS, Microsoft Azure, and IBM Watson. There is much more to AI-powered applications than the algorithms themselves. This is where working with these providers can significantly reduce time to market.
So you’re satisfied the industry is being responsive?
There are plenty of powerful, robust and easy to use tools that are also available for free. I have no complaints. Of course, we are always looking for something better, more powerful, but we can solve plenty of relevant and impactful problems with what already exists.
Are you able to find the skilled people you need to work on the AI systems?
We are able to find skilled people, but not enough of them. The demand is much bigger than our capacity. The fact that we are looking for skills in natural language processing, information retrieval, and so on in addition to having expertise in machine learning makes finding these people a bit harder.
Given the state of the market and the competition for talent, we find ourselves in a continuous hiring mode. It can be tiring, but that is the nature of the game. I don’t think we are alone in that situation.
Do you have any programs for growing the talent you need?
Yes, like many organizations, we have internal programs focusing on growing the machine learning talent from our larger technology organization. We designed programs and learning pathways, and we make tools available as well as data sets to be used for certain types of problems. Clearly, we are not Coursera and our focus is not to develop original learning material, but rather to suggest a sequence of courses and tutorials that are relevant to our business and the problems we work on.
For full transparency, a lot more is needed and I can tell you that this is an active discussion within our technology organization as well as with our HR and talent teams.
Will AI be taking the place of some number of writers and journalists at Reuters? What’s the outlook in that area?
Very good question. And if I can expand your question to other professionals in our industry: lawyers and accountants. I don’t believe that AI or automation will replace jobs in their entirety, at least not any time soon. I expect AI to have a much bigger impact at the task level. This includes both task automation as well as task augmentation, where the knowledge worker and the machine interact and collaborate on a specific task. And as machines learn from human interactions, the balance between what is a human task versus a machine task will continue to shift in favor of the machine, thus freeing knowledge workers to focus on more meaningful tasks.
This is not a bad thing, because, in all honesty, there are certain tasks that ought to be automated. For example, people don’t go to law school so they can spend endless hours doing document review, e.g., in e-discovery where the opposing party can literally give you a truckload of documents. For these kinds of tasks the manual process is undesirable, too slow and too expensive. Even from a human point of view, automation is a good thing.
What advice would you have for young people interested in a career involving working with AI in business?
I’d say make sure you always have time for learning. Learning is not something I do once and move on to applying what I learned. It is something I should always be doing. Especially in our fields, the technology and tools are changing at a breakneck pace and if I am not careful, I can [have] dated skills very quickly.
The second thing for anyone who wants to do AI in industry is, it is not about the AI, it is about the value you want to create with it. Always start with the value proposition; what do you want to achieve with AI? What are the benefits to your organization and for your customers? After that, you can have a compelling AI narrative of where you want to take the market or the industry, etc. But you need to have that starting point. And before I forget—if I learned anything in my 25 years in applied R&D, it is this: more complex algorithms do not mean more business value. Some of our biggest commercial successes came from mainstream algorithms. Algorithm complexity provides you with short-term differentiation until others catch up to you. That is it. Be a pragmatist. Make the algorithms as complex as they need to be to solve the problem, but no more complex.
Third, if you want to be successful in building AI applications for a vertical industry, you need to develop an appreciation and an understanding of the domain itself. When I joined the Legal business back in 1995, I knew nothing about US law. I spent weeks reading about the US legal system, and read at least a couple of hundred case opinions, dozens of statutes and regulations, and a few analytical articles. It was fun for the first few hours but not after that. But it was necessary, at least for me. It gave me a better intuition about the domain, and it made me a better problem solver and a better designer.
Finally, I’d say try to have some focus. Having a broad skillset is a good thing and is often necessary. But in addition, pick one or two areas and dive deeper. Become an expert in something. Become an outlier. That makes you a specialist, puts you in the driver’s seat, and positions you to have a bigger impact.
That sounds like good advice. It’s challenging for many people to have a focus.
When I review resumes, I often see people who have been doing machine learning or data science for 3 or 4 years and they have worked on 20 different projects, with no apparent connection between them. Strong on problem diversity, I give them that. But because these projects are not connected, such a person is not well positioned for depth and, if one is not careful, can end up looking like the AI version of a handyman or woman.
Like an AI jack of all trades.
Yes. And the challenge is, without a theme to help someone focus, you don’t have continuity. Focus not only helps one develop deeper expertise, it also allows them to build a brand. An identity. They become experts that people go to for advice.
Learn more at Dr. Al-Kofahi’s blog at Thomson Reuters.