READ BETWEEN THE LINES: How AI Accelerates Data Mining from Images and Text


Unlocking Value Hidden In Unstructured Data

“Data locked away in text, audio, social media, and other unstructured sources can be a competitive advantage for firms that figure out how to use it.” – Tom Harbert, MIT Sloan

Supply chains have become more complex and difficult to manage due to market volatility, the rising complexity of product portfolios, and an increased focus on the supply chain’s impact on the environment. But armed with Big Data and Analytics, supply chain leaders can proactively make data-driven decisions.

Data is everywhere, so the challenge is no longer where to find data but how to extract value from it quickly and accurately. Big Data analytics would be faster and easier if all data were structured, but in the real world this is not the case.

Previously, the majority of data was structured. Now, unstructured data dominates digital spaces (as represented in the graph above): 80%–90% of the data that people and machines generate is unstructured. This makes unstructured data critical to optimizing the supply chain. If you analyze it effectively, you will obtain a clearer view of your supply chain processes and networks and reap greater competitive advantages.

Why Many Organizations Fail To Harness Unstructured Data

“…analysis and mining of data are getting more complex with [the] massive growth of unstructured big data.” – Journal of Big Data

According to a Deloitte survey, less than 20% of organizations can take advantage of unstructured data.1 Another study found that 95% of companies cited the need to manage unstructured data as a problem for their business.

Semi-structured and unstructured data (such as text, images, video, and audio) can be tedious and time-consuming to analyze because it is not organized in a clearly defined model or framework. Before analysis, it has to be extracted or transformed into a form that is easier to interpret, and it does not fit conventional analysis models.

  • Unstructured Data Is Harder To Search And Categorize: Unlike structured data, unstructured data does not sit in standard databases or formatted rows and tables. It can be scattered across various sources in different formats, from paper records and email correspondence to engineering drawings, web pages, and social media content. Silos can make it even harder to search, amalgamate, and categorize unstructured data.
  • Extraction Is Complex And Time-Consuming: Before you can take advantage of unstructured data, you need to extract structured information from it. For example, to identify and eliminate duplicate or overlapping products, you need to analyze images or engineering drawings, which are unstructured. To match each image with its possible duplicate, you need to identify and map its key points. The extracted information (the identified and mapped key points) is structured and therefore easier to interpret. Can this be done manually? The short answer is yes. But the entire analytics process, from data collection to categorization, extraction, and mining, will be resource-intensive and prone to errors. The more unstructured data you need to analyze, the more complex, costly, and laborious it gets.
  • You Can’t Mine Unstructured Data Using Conventional Models: When mining or analyzing unstructured data, you can’t use the same models you use for structured data. You may need to formulate a new model each time you analyze a new set of unstructured data for a new purpose or goal. This can be problematic when done manually.
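
To make the key-point idea in the extraction example more concrete, here is a minimal sketch of how two drawings could be compared once their key points have been turned into numbers. It assumes each key point has already been detected and described as a short numeric vector by some image-processing step that the article does not name; the descriptor values, the function names, and the 0.75 ratio threshold are all illustrative assumptions, not a description of any particular product.

```python
import math


def distance(a, b):
    """Euclidean distance between two descriptor vectors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def match_keypoints(desc_a, desc_b, ratio=0.75):
    """Return (i, j) pairs where key point i of image A matches key point j of image B.

    A match is kept only when the closest candidate is clearly closer than the
    second closest (a ratio test), which discards ambiguous matches.
    """
    matches = []
    for i, da in enumerate(desc_a):
        ranked = sorted((distance(da, db), j) for j, db in enumerate(desc_b))
        if not ranked:
            continue
        if len(ranked) == 1 or ranked[0][0] < ratio * ranked[1][0]:
            matches.append((i, ranked[0][1]))
    return matches


def duplicate_score(desc_a, desc_b, ratio=0.75):
    """Fraction of A's key points with a confident match in B (0.0 to 1.0)."""
    if not desc_a:
        return 0.0
    return len(match_keypoints(desc_a, desc_b, ratio)) / len(desc_a)


# Hypothetical descriptors: one small vector per key point.
drawing_1 = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
drawing_2 = [(0.02, 0.01), (1.01, 0.0), (0.0, 0.98)]  # near-copy of drawing_1
drawing_3 = [(5.0, 5.0), (5.1, 5.0), (5.0, 5.1)]      # unrelated part

print(duplicate_score(drawing_1, drawing_2))  # high score: likely duplicate
print(duplicate_score(drawing_1, drawing_3))  # low score: likely different parts
```

The output of such a comparison is structured (a list of matched pairs and a single score), which is what makes it straightforward to sort, threshold, and review candidate duplicates at scale.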

To overcome the complexities of analyzing unstructured data, you need an intelligent and scalable technology solution that allows you to optimize unstructured data search, collection, categorization, extraction, and mining. The solution should allow you to automatically, flexibly, and nimbly build and train models for interpreting extracted information from unstructured data.

AI Can Simplify And Accelerate Text and Image Extraction

“AI’s ability to analyze huge volumes of data, understand relationships, provide visibility into operations, and support better decision making makes AI a potential game-changer.” – McKinsey

More than 97% of companies are investing in Big Data and Artificial Intelligence (AI). Those that have already done so were able to automate 70% of data processing tasks and 64% of data collection tasks, which helped simplify and accelerate analytics.

AI offers more than just automation. What sets AI apart from other technologies is its learning capability, which is continuous, adaptive, dynamic, and scalable. Its intelligence grows with the data it consumes. Its inference is not rigid or limited by predefined programs, making it suitable for analyzing highly complex data formats and types.

AI in Action

To give you an idea of how AI-powered Big Data solutions work, let’s use Mareana’s Unstructured Data Hub (UDH) as an example.

Mareana’s UDH uses AI and smart algorithms to liberate, synchronize, unify, and contextualize data from images, text, emails, and other unstructured data sources. It helps accelerate and simplify text and image extraction and mining by empowering you to:

Enable a Google-like Enterprise Data Search

Mareana’s UDH uses intelligent algorithms to simplify unstructured data search and collection. It liberates data and brings it together in a single hub, so that all data users can easily find the relevant files, images, and data they need. Its algorithms help data users classify unstructured information into appropriate groups, improve search ranking, and link information from different records. Its Google-like search platform also allows for easy identification of duplicates.
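
As a rough illustration of what a unified search over mixed document types involves, the sketch below builds a small inverted index and ranks results with a simple TF-IDF-style score. This is a generic textbook technique, not a description of how UDH is implemented; the class name, file names, and document texts are hypothetical, and the text of each file is assumed to have been extracted already.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric terms."""
    return re.findall(r"[a-z0-9]+", text.lower())


class SearchIndex:
    """Tiny inverted index: maps each term to the documents that contain it."""

    def __init__(self):
        self.postings = defaultdict(dict)  # term -> {doc_id: term count}
        self.doc_count = 0

    def add(self, doc_id, text):
        for term, count in Counter(tokenize(text)).items():
            self.postings[term][doc_id] = count
        self.doc_count += 1

    def search(self, query):
        """Return doc ids ranked by summed (term count x rarity weight)."""
        scores = defaultdict(float)
        for term in tokenize(query):
            docs = self.postings.get(term, {})
            if not docs:
                continue
            # Terms found in fewer documents get a higher weight.
            idf = math.log(1 + self.doc_count / len(docs))
            for doc_id, count in docs.items():
                scores[doc_id] += count * idf
        return sorted(scores, key=lambda d: (-scores[d], d))


# Hypothetical records from different sources, already converted to text.
index = SearchIndex()
index.add("batch-0417.pdf", "Batch record for sterile saline, lot 0417, filled at the Dublin site")
index.add("drawing-88.dwg", "Engineering drawing of a titanium bone screw, 4.5 mm")
index.add("email-221.msg", "Supplier email about delayed titanium shipment for bone screw production")

print(index.search("titanium screw drawing"))
print(index.search("saline lot"))
```

Because every source ends up in the same index, one query can surface a drawing and a related email together, which is the kind of cross-record linking the paragraph above describes.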

Digitize Paper Batch Records And Unstructured Drawings

Digitization of paper batch records and unstructured drawings is key to accelerating data extraction and mining. Once digitized, unstructured data can be extracted and interpreted automatically — rather than manually — using digital solutions.

Manual data transformation is tedious and costly. Mareana’s UDH uses AI to:

  • Easily and quickly digitize unstructured data.
  • Understand its content.
  • Extract metadata.
  • Classify it into appropriate groups, allowing easy search and consumption of its contents.
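
The last two steps in the list above (extracting metadata and classifying) can be sketched with plain rules once a record has been digitized to text. The example below assumes the digitization step has already produced the text; the field patterns, keyword lists, and sample record are hypothetical, and an AI-based system would typically learn such rules from data rather than hard-code them.

```python
import re

# Hypothetical field patterns; real batch records vary by site and template.
FIELD_PATTERNS = {
    "batch_no": re.compile(r"Batch\s*(?:No\.?|Number)\s*[:#]?\s*([A-Z0-9-]+)", re.I),
    "product": re.compile(r"Product\s*:\s*(.+)", re.I),
    "date": re.compile(r"\b(\d{4}-\d{2}-\d{2})\b"),
}

# Hypothetical keyword lists for a simple rule-based classification.
CATEGORY_KEYWORDS = {
    "batch_record": ("batch", "lot", "yield"),
    "drawing": ("drawing", "tolerance", "dimension"),
}


def extract_metadata(text):
    """Return a dict of the fields whose pattern matched the text."""
    metadata = {}
    for field, pattern in FIELD_PATTERNS.items():
        match = pattern.search(text)
        if match:
            metadata[field] = match.group(1).strip()
    return metadata


def classify(text):
    """Pick the category with the most keyword hits (substring counts)."""
    lowered = text.lower()
    counts = {
        category: sum(lowered.count(word) for word in keywords)
        for category, keywords in CATEGORY_KEYWORDS.items()
    }
    best = max(counts, key=counts.get)
    return best if counts[best] > 0 else "unclassified"


# Hypothetical text produced by digitizing a paper batch record.
ocr_text = """Batch No: B-2041
Product: Sterile Saline 0.9%
Manufactured 2021-03-14, yield 98.2%"""

print(extract_metadata(ocr_text))
print(classify(ocr_text))
```

The result is a small structured record (field names, values, and a category label) that can be stored, searched, and analyzed like any other structured data.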

Automate Data Extraction

Mareana’s UDH uses AI and smart algorithms to simplify data extraction, eliminating manual and non-standard steps that are prone to error. It automates the extraction of handwritten content and of the metadata associated with it. As a result, workload and cost are significantly reduced, and extraction takes hours or days instead of weeks or months (depending on the volume of data).

Quickly And Automatically Analyze Large Datasets

Mareana’s UDH uses industrialized machine learning and modeling to analyze large datasets from multiple internal and external systems. You can take advantage of its library of proprietary industry/process heuristics, Machine Learning, Natural Language Processing, and Mathematical Modeling to comprehensively mine data and quickly make inferences.

Tangible Benefits

Businesses that leveraged Mareana’s UDH were able to reap the following business benefits:

  • Working Capital Improvement: A global medical device company went through a series of successful acquisitions. This left it with numerous duplicate parts and overlapping products, and with a significant portion of its working capital tied up in excess inventory. It leveraged Mareana’s UDH to reduce duplicate parts and saved $300M.
  • Optimized PLM, Reduced Regulatory Risks: By using Mareana’s UDH, a global pharma company was able to expeditiously digitize, extract, and analyze paper documents scattered across 100+ global locations. The company was able to establish data governance, eliminate costly rework, improve product portfolio management, and reduce regulatory risks.
  • Improved Collaboration And Product Innovation: Mareana helped a global pharmaceutical company implement a Google-search-like capability using UDH. This helped accelerate evidence generation, improve evidence sharing, and simplify access to previously hard-to-locate documents. The client saved $1.5 million in touch time and reduced cycle time.

The possibilities of AI are limitless. Its ability to continuously learn and infinitely scale makes it a viable solution for transforming texts, images, and other unstructured information into data that is easy to analyze and interpret.


Unlocking the value trapped in unstructured data should not be difficult, complex, or costly. You can extract and mine unstructured data easily and quickly if you take advantage of AI.

Mareana’s Unstructured Data Hub is AI-powered, allowing you to expedite the transformation, extraction, and mining of unstructured data. It empowers you to simplify the entire data analytics process by liberating data, optimizing data search, automating data classification, and building flexible models based on current needs. It allows you to make data-driven decisions; significantly reduce costs, risks, errors, and waste; and focus on quality and process improvement.