Knowledge Mining in Azure: Q&A
I recently took part in the Astana EdTech forum, and during the event we had an interesting discussion about Knowledge Mining and the process of turning your enterprise data into valuable assets. For some reason the interview was never published, so I decided to share it here. I hope you find it interesting.
Can you tell us about data handling?
Working with data means extracting valuable insights from all the available information, both structured and unstructured, which in the end helps to engage customers, transform products, empower employees and improve operations.
How can you use big data?
Let me give you a concrete example. Expertime’s partner, Scinan, developed software whose goal is to make tens of thousands of scientific results accessible and understandable through knowledge graphs. The inspiration came from the founders’ experience in medical research and in new computer science technologies. There is a big difference between what is done in the research laboratory and what eventually ends up in the field. Their mission is to facilitate access to, and analysis of, scientific publications in order to foster new progress for our societies. Based on Microsoft Academic Graph and Cognitive Search, their solution helps you go beyond search results so you can see connections between articles, browse related research, and see how your field is evolving.
So, what is Scinan's Augmented Bibliographic Explorer? It is a knowledge base of more than 270 million articles, books and patents across 19 disciplines. Because it is structured as a knowledge graph in which sources are interconnected, the explorer spares you from browsing tons of web pages, and it enables students to quickly master the research and analysis of scientific literature when they enter the job market. The tool offers faculties an information system focused on enhancing, exchanging and learning knowledge through advanced representation of, and interaction with, scientific publications, built on AI and knowledge graph technologies. In this way Scinan lets us learn faster and better in all areas of science.
This is a perfect example of applying big data and AI to build a tool that had never existed before, and of turning data into knowledge.
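To make the idea of "connections between articles" more tangible, here is a minimal sketch of a citation graph in Python. It is not Scinan's implementation: the article identifiers, the `CITATIONS` data and the function names are all hypothetical, and the ranking rule (count shared neighbours) is simply one plausible way to surface related research.

```python
from collections import defaultdict

# Hypothetical toy data: article id -> ids of the articles it cites.
CITATIONS = {
    "A": ["C", "D"],
    "B": ["C", "D", "E"],
    "F": ["E"],
    "G": [],
}

def build_graph(citations):
    """Return an undirected adjacency map linking each article to what it cites."""
    graph = defaultdict(set)
    for source, targets in citations.items():
        graph[source]  # touch the key so isolated articles still appear as nodes
        for target in targets:
            graph[source].add(target)
            graph[target].add(source)
    return graph

def related_articles(graph, article):
    """Rank articles not directly linked to `article` by shared neighbours."""
    scores = defaultdict(int)
    for neighbour in graph[article]:
        for candidate in graph[neighbour]:
            if candidate != article and candidate not in graph[article]:
                scores[candidate] += 1
    # Highest number of shared neighbours first, ties broken by id.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))

graph = build_graph(CITATIONS)
print(related_articles(graph, "A"))
```

In this toy data, articles A and B both cite C and D, so B is suggested as related to A even though neither cites the other. A real knowledge graph would carry far richer relations (authors, fields, venues), but the principle of navigating links rather than result lists is the same.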
How would you explain the process of Knowledge Mining in plain and simple language?
The knowledge mining solution area focuses on the core challenge of turning your data (often unstructured) into a knowledge source. Unlike a simple database search, this kind of challenge requires more sophisticated algorithms and approaches, for instance to analyze video or archives of scanned documents. However, recent innovations across vision, speech, language, decision and search mean that, for the first time, it is possible for our applications to interpret unstructured data in a human-like fashion, and to understand traditional text data more deeply. Together, search and AI create a unique solution for finding value in your data, in the end helping to engage customers, transform products, empower employees and improve operations.
How does it help to improve the work and the business processes of the company?
Turning your unstructured data into a structured format can cost time and money. When data stays unstructured, though, the consequences show up everywhere. Internally, decisions are less informed and take longer to make, and work is done manually. Externally, apps can be difficult to navigate, and customers may not be able to find relevant content and products. Thus, information has the potential to become an asset or a burden, depending on how you use it.
How can it be used for good - to make it an asset, and vice versa - a burden?
Knowledge mining has several key scenarios, depending on what you’re trying to do with your information. For instance, the problem may be surfacing the information most relevant to your customer on a website or in an application. The benefit of applying knowledge mining here is higher sales, greater customer satisfaction and a better user experience. Or the problem may be making sense of all types of content (PDFs, Word documents, images), work that is done manually today. There, the benefit is to make existing business processes and decision making scalable using all the information available, helping to reduce time, streamline work, increase sales, and identify risks and opportunities.
Data may also turn into a burden if you do not derive any additional value from it. For instance, imagine a business unit of a large company that manually mines critical data from drilling and completion reports. This is a laborious, time-consuming, and error-prone process. In that situation, developers and businesses can’t extract information from their forms and documents quickly, accurately, and in a way tailored to their specific content. This implies heavy manual intervention at exorbitant cost. Subject matter experts have no time to focus on higher-value activities, and information flows less rapidly, which slows down operational control.
What companies, in what areas of business today need to implement AI-enriched Knowledge mining?
Any company willing to innovate with knowledge mining as part of its larger digital transformation initiatives may start applying Knowledge mining along with AI. Depending on your industry or business, certain types of information might be especially important to understand. You need room to extend your search solution to fit those specific needs by creating industry-specific classifiers and skills, such as clause extraction for a specific type of legal information or identification of one type of pharmaceutical drug. You may also want to optimize your search experience based on the behaviors or information specific to your business, and you can integrate your own data to create custom models and classifiers that feed into your AI platform. At the core of every search problem, a product needs to be able to take in information, synthesize it, and surface its relevant components. The main goal for companies is therefore to find and clearly define this search problem.
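As a rough illustration of what an industry-specific "skill" such as clause extraction might look like, here is a minimal Python sketch. Everything in it is an assumption made for the example: the heading convention ("Clause 4: ..."), the `content` and `clauses` field names, and the function names. It does not reproduce any particular platform's skill API; it only shows the shape of an enrichment step that takes a document in and returns it with extra structured fields.

```python
import re

# Assumption: each clause starts with a heading such as "Clause 4:".
# A clause body runs until the next heading or the end of the text.
CLAUSE_PATTERN = re.compile(
    r"Clause\s+(\d+)\s*:\s*(.+?)(?=Clause\s+\d+\s*:|$)",
    re.DOTALL,
)

def extract_clauses(text):
    """Map clause number -> clause text, with whitespace normalized."""
    return {
        int(number): " ".join(body.split())
        for number, body in CLAUSE_PATTERN.findall(text)
    }

def enrich(document):
    """Return a copy of the document with a hypothetical 'clauses' field added."""
    enriched = dict(document)
    enriched["clauses"] = extract_clauses(document["content"])
    return enriched

contract = {
    "id": "contract-1",
    "content": "Clause 1: The supplier delivers monthly. "
               "Clause 2: Late delivery incurs\n a penalty.",
}
print(enrich(contract)["clauses"])
```

A production skill would more likely rely on a trained model than on a regular expression, but the contract with the rest of the pipeline stays the same: raw content in, searchable fields out.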
In what areas is data mining most widespread and why?
In my personal opinion, Knowledge Mining has been very useful in digitizing contracts. In combination with AI, it allows companies to ingest contracts, extract content, and then find meaning through custom ML models. By helping companies better understand and manage their contractual obligations, Knowledge Mining improves compliance, reduces risk, and streamlines operations.
Moreover, for companies in highly regulated industries, the ability to verify and search through archived data with ease can be the difference between extracting timely insights and incurring hefty fines. Consequently, those companies need an easy way to manage their archived data in the cloud and enable their customers to ask complex questions of petabyte-sized datasets both quickly and cost-effectively.
This applies in particular to financial services, healthcare, pharmaceuticals, and insurance. No company in these industries can afford to overlook it, given the hefty fines handed out to those unable to provide specific data along with audit and accountability information on request.
Where to study?
There is no single recommendation, as any skill is valuable, and any specialist’s opinion is a welcome contribution when you start your digital transformation with Knowledge Mining.
Where do you start and what would you advise for aspiring professionals in this field?
You should be conscious of the fact that developers and businesses are at different stages of their digital transformation journey. Thus, you should start by exploring the market. Ideally, you need to find a solution that will help you accelerate your digital transformation with applied AI no matter where you are in the journey. Your core search engine should be built on a back end with sophisticated models and algorithms to rank, prioritize, and synthesize your customers’ searches. However, you can also get more insights from your data with pre-built AI services and OCR tools that let you search different types of information. And if you’re at a more advanced stage, find a platform where you can plug in your own ML models as well. You can start simple and add more complexity when ready.
For developers, I would suggest starting by learning big data concepts and understanding how indexing and crawling work.
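For readers who want a first feel for indexing, here is a minimal sketch of an inverted index in Python. It is a teaching toy under simple assumptions, not how any production search engine is built: the sample documents, the tokenizer, and the scoring rule (sum of term counts) are all made up for illustration.

```python
import re
from collections import Counter, defaultdict

# Hypothetical sample documents: id -> text.
DOCS = {
    "contract-1": "Supply contract with a penalty clause and a renewal clause",
    "contract-2": "Employment contract; this contract has no penalty",
    "report-1": "Drilling and completion report",
}

def tokenize(text):
    """Lowercase the text and split it into alphanumeric terms."""
    return re.findall(r"[a-z0-9]+", text.lower())

def build_index(documents):
    """Build term -> Counter(doc id -> number of occurrences)."""
    index = defaultdict(Counter)
    for doc_id, text in documents.items():
        for term in tokenize(text):
            index[term][doc_id] += 1
    return index

def search(index, query):
    """Return ids of documents containing every query term, best match first."""
    terms = tokenize(query)
    if not terms:
        return []
    postings = [index.get(term, Counter()) for term in terms]
    matching = set(postings[0])
    for posting in postings[1:]:
        matching &= set(posting)
    # Score each document by the total count of query terms it contains.
    scores = {doc: sum(p[doc] for p in postings) for doc in matching}
    return sorted(scores, key=lambda doc: (-scores[doc], doc))

index = build_index(DOCS)
print(search(index, "contract penalty"))
```

The point of the structure is that a query never scans the documents themselves: it only looks up each term in the index and intersects the results. Crawling is the step that comes before this, collecting the documents that `build_index` would then process.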