Email Data Extraction Based Catalog Enrichment

Helped a shipping brokerage company enrich its ships catalogue by automatically ingesting incoming emails.

Project Synopsis

A ship management firm's ability to close a business deal depends in part on how much information it has about vessels in shipping fleets around the world. One of Rare Mile's clients, a large shipping company based in Europe, received thousands of emails every day from affiliates all over the world with information about ships available in different regions. These emails were processed manually by a few associates, who read the information in each email and updated the central ships repository.

This approach had the following constraints:

  • Manual updates were slow and error prone
  • There was always a backlog of ship information that had been received but not yet processed
  • The client could not tell whether it was losing business opportunities because ship information was out of date

Rare Mile Solution

We worked closely with the client's team and rolled out our Email Analytics engine, which automatically ingested incoming emails as they arrived, extracted information about ships written in plain English, and updated the central ships repository. The solution was based on our proprietary information ingestion algorithm, enriched with business-provided rules. As the system processed more information and identified new patterns, it gained accuracy over time.

This solution had the following highlights:

  • Real time, automated enrichment of ships catalogue
  • Availability of accurate information for business users
  • Cost savings as a result of process automation

Project Highlights

  • Solution Designed & Implemented in 8 Weeks
  • Fixed Price Engagement
  • Completely freed up 3 people who had been manually managing catalog information

About The Project

This project involved extracting ship attributes from unstructured information arriving in several hundred emails daily.
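
To give a feel for what rule-driven attribute extraction can look like, here is a minimal Java sketch. It is not the engine used on this project: the class name, attribute names, and regular-expression rules are all hypothetical, and plain pattern matching stands in for the NLP and custom rules engine listed under Technologies Used.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Illustrative sketch only: a hypothetical rule set, not the production engine. */
public class ShipAttributeExtractor {

    // Hypothetical business rules: attribute name -> pattern with one capture group.
    private static final Map<String, Pattern> RULES = new LinkedHashMap<>();
    static {
        // "MV Ocean Star" or "M/V Ocean Star": capitalised words after the prefix.
        RULES.put("vesselName",
                Pattern.compile("\\bM/?V\\s+([A-Z][A-Za-z]+(?:\\s+[A-Z][A-Za-z]+)*)"));
        // "52,000 DWT" or "52000 dwt".
        RULES.put("deadweightTonnes",
                Pattern.compile("(?i)\\b(\\d{1,3}(?:,\\d{3})*|\\d+)\\s*DWT\\b"));
        // "built 2010".
        RULES.put("builtYear",
                Pattern.compile("(?i)\\bbuilt\\s+(\\d{4})\\b"));
        // "open Rotterdam": the port where the vessel is available.
        RULES.put("openPort",
                Pattern.compile("\\b[Oo]pen\\s+([A-Z][A-Za-z]+)"));
    }

    /** Applies every rule to the email text and keeps the first match of each. */
    public static Map<String, String> extract(String emailBody) {
        Map<String, String> attributes = new LinkedHashMap<>();
        for (Map.Entry<String, Pattern> rule : RULES.entrySet()) {
            Matcher matcher = rule.getValue().matcher(emailBody);
            if (matcher.find()) {
                String value = matcher.group(1);
                if (rule.getKey().equals("deadweightTonnes")) {
                    value = value.replace(",", ""); // drop thousands separators
                }
                attributes.put(rule.getKey(), value);
            }
        }
        return attributes;
    }
}
```

Calling ShipAttributeExtractor.extract on a sentence such as "MV Ocean Star, 52,000 DWT, built 2010, open Rotterdam 15 March." returns a map with vesselName "Ocean Star", deadweightTonnes "52000", builtYear "2010" and openPort "Rotterdam". Real emails vary far more than this, which is why a system of this kind would need richer language processing and a growing rule set.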

Technologies Used

Java Based Backend
Natural Language Processing (NLP)
Open Source Libraries
Custom Rules Engine

Client Details