
Challenges
The Contacts PDFs need to be run through an Optical Character Recognition (OCR) system.
Searches target only topics/headings, so only the headings need to be extracted from the text.
Handling and searching a large amount of data.
Results
An API was built that takes a contact number and a search term as input.
It outputs the potential pages where the term could be found.
The client’s team then integrates the returned pages into their PDF reader to enable direct search and read operations.
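
The lookup described above can be sketched roughly as follows. This is a minimal illustration only, not the delivered implementation: it assumes the OCR and heading-extraction steps have already produced (contact number, page, heading) records, and the names build_index and find_pages, the in-memory dictionary, and the sample records are all hypothetical.

```python
from collections import defaultdict


def build_index(records):
    """Group extracted headings by contact number.

    `records` is an iterable of (contact_number, page, heading) tuples,
    assumed to come from an earlier OCR + heading-extraction step.
    """
    index = defaultdict(list)
    for contact_number, page, heading in records:
        index[contact_number].append((page, heading))
    return dict(index)


def find_pages(index, contact_number, term):
    """Return sorted page numbers whose heading contains `term`.

    Matching here is a simple case-insensitive substring test; the real
    system may use a different matching strategy.
    """
    needle = term.strip().casefold()
    if not needle:
        return []
    pages = {
        page
        for page, heading in index.get(contact_number, [])
        if needle in heading.casefold()
    }
    return sorted(pages)


# Made-up sample data, for illustration only.
sample_records = [
    ("A-001", 3, "Payment Terms"),
    ("A-001", 7, "Termination"),
    ("A-001", 12, "Early Termination Fees"),
    ("A-002", 2, "Termination"),
]
sample_index = build_index(sample_records)
```

With this sample data, find_pages(sample_index, "A-001", "termination") returns the candidate pages for that one document only, which is the kind of list a PDF reader could use to jump straight to the relevant page. Indexing headings alone keeps the searchable data small compared with indexing the full OCR text.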