data capture

Harness the Power of AI for Statutes with Our Cognitive Data Capture Solution

Soko helps clients stay ahead of financial crimes with its participation reporting service. Its expert team focuses on reducing fraud, criminal activity, and money laundering. With this service, clients can quickly see a person’s involvement in other companies, gaining an advantage in preventing financial crimes.

Technologies:
Algorithms: Transformers, Embeddings, TL-GAN, GAN-based noise, YOLO, Faster R-CNN, NER (Named entity recognition), LSTM
Libraries: TensorFlow, AsanteOCR, Camelot, QR detection, OpenCV, spaCy
Development: MEAN stack (MongoDB, Express, Angular, Node.js), FastAPI

The Challenge

A team of experts in compliance, technology, risk and data management aims to reduce financial crime and to help its clients protect themselves from fraudsters, criminals, terrorists and money launderers.

For this purpose, the client offers a company participation reporting service. Given the RUT (Chilean tax identification number) of a natural or legal person, the service prepares a report showing every company the person is (or was) a part of and the percentage of participation held in each.
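As a rough illustration of the lookup described above, the following sketch normalizes a RUT and filters participation records for it. The record fields, function names and sample data are hypothetical, and the source does not say how RUTs are validated; the sketch assumes the widely used modulo-11 check digit.

```python
from dataclasses import dataclass
from typing import List


def rut_check_digit(body: str) -> str:
    """Compute the modulo-11 check digit for the numeric part of a RUT."""
    total = 0
    factor = 2
    for digit in reversed(body):
        total += int(digit) * factor
        factor = 2 if factor == 7 else factor + 1  # factors cycle 2..7
    result = 11 - (total % 11)
    if result == 11:
        return "0"
    if result == 10:
        return "K"
    return str(result)


def normalize_rut(raw: str) -> str:
    """Strip dots and hyphen, verify the check digit, return 'NNNNNNNN-D'."""
    cleaned = raw.replace(".", "").replace("-", "").upper()
    if len(cleaned) < 2:
        raise ValueError(f"invalid RUT: {raw!r}")
    body, digit = cleaned[:-1], cleaned[-1]
    if not body.isdigit() or rut_check_digit(body) != digit:
        raise ValueError(f"invalid RUT: {raw!r}")
    return f"{body}-{digit}"


@dataclass(frozen=True)
class Participation:
    """One hypothetical record: a person's share in one company."""
    rut: str           # normalized RUT of the person
    company: str       # company name
    percentage: float  # percentage of participation
    current: bool      # False if the person was, but no longer is, a part


def participation_report(rut: str, records: List[Participation]) -> List[Participation]:
    """Return all records for the given RUT, current participations first."""
    key = normalize_rut(rut)
    rows = [r for r in records if r.rut == key]
    return sorted(rows, key=lambda r: (not r.current, r.company))
```

Sorting current participations ahead of past ones is only one possible presentation; the actual report layout is not described in the source.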

The company has a team of 30 people who review the history of commercial statutes published in the Official Gazette of Chile and enter them manually into a web form designed for this purpose. The work is highly repetitive and error-prone because of the lexical complexity with which notaries write these documents, and it takes an inordinate amount of time, since the history comprises 3 million commercial statutes.

The process of drilling in mining consists of obtaining a soil sample by diamond drilling. These samples, which easily reach thousands of feet, are placed in trays intended for this purpose, tabulated, and photographed at high resolution. The geologist visually detects and counts fractures, classifying them as natural or induced, depending on whether they are real fractures existing in the earth layers or were caused by drilling or moving the samples.

The Solution

The solution proposed by the Mootech team began with training multiple machine learning models. To do so, every entity in a total of 1,000 corporate bylaws was labeled by hand.
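To make the labeling step concrete, here is a minimal sketch of what one hand-labeled example could look like, using character-offset spans, a common convention for named entity recognition training data. The sentence, the label names (PARTNER, RUT, PERCENTAGE, COMPANY) and the helper functions are all hypothetical; the source does not describe the actual annotation scheme.

```python
from typing import List, Tuple

# (start offset, end offset, label): a common shape for NER annotations
Span = Tuple[int, int, str]


def make_span(text: str, fragment: str, label: str) -> Span:
    """Locate the first occurrence of fragment in text and label it."""
    start = text.index(fragment)  # raises ValueError if the fragment is absent
    return (start, start + len(fragment), label)


def spans_overlap(spans: List[Span]) -> bool:
    """Return True if any two spans share characters."""
    ordered = sorted(spans)
    return any(a[1] > b[0] for a, b in zip(ordered, ordered[1:]))


# Invented sentence standing in for a fragment of a corporate bylaw
text = "Juan Perez, RUT 12.345.678-5, contributes 60% of the capital of Alfa Ltda."

annotations = [
    make_span(text, "Juan Perez", "PARTNER"),
    make_span(text, "12.345.678-5", "RUT"),
    make_span(text, "60%", "PERCENTAGE"),
    make_span(text, "Alfa Ltda.", "COMPANY"),
]

for start, end, label in annotations:
    print(label, repr(text[start:end]))
```

Checking that spans do not overlap before training is a cheap way to catch labeling mistakes early.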

The project was divided into 3 stages, one per document type (statutes of creation, statutes of modification and statutes of dissolution), so that our data science team could concentrate on specific models.

The development team built a web application that shows users the corporate bylaws whose detected entities did not pass the automatic validations. The user then reviews the entities (updating them if necessary), which ensures that erroneous data is not inserted into the database.
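The routing between automatic approval and manual review might be sketched as follows. The specific validation rules (a non-empty company name, at least one partner, percentages summing to 100) are assumptions chosen for illustration; the source does not list the validations actually used.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class ExtractedStatute:
    """Hypothetical output of the entity-extraction models for one statute."""
    company_name: str
    partners: Dict[str, float]  # RUT -> percentage of participation


def validate(statute: ExtractedStatute) -> List[str]:
    """Return the list of failed checks; an empty list means all passed."""
    failures = []
    if not statute.company_name.strip():
        failures.append("missing company name")
    if not statute.partners:
        failures.append("no partners detected")
    elif abs(sum(statute.partners.values()) - 100.0) > 0.01:
        failures.append("participation percentages do not sum to 100")
    return failures


def route(statute: ExtractedStatute) -> Tuple[str, List[str]]:
    """Send clean extractions straight through, the rest to a reviewer."""
    failures = validate(statute)
    if failures:
        return ("manual_review", failures)
    return ("auto_approved", failures)
```

Returning the failed checks alongside the decision lets a review screen point the user at the entities that need attention.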

Our data scientists selected semantic segmentation algorithms for image processing and model training with the previously labeled data.

The CRISP-DM methodology was used, and at each completed iteration the results were benchmarked with different experts (geologists), whose feedback went into the model as new training data.

Our solution also included a web service that makes the system interoperable, i.e., able to be consumed and integrated by the client entity’s systems. Additionally, we made the solution scalable through the automation practices known as MLOps, which provide continuous model retraining and help identify issues that could affect the solution in a production environment.

results

What we achieved

Thanks to the model implemented by Mototech, the client was able to process the history of 3 million corporate statutes in 3 months of execution. 70% of the documents were approved automatically, without requiring manual intervention.

The remaining 30% of the documents went through the manual process, where validating or updating a document with its recognized entities displayed took 1 minute on average.
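The reported figures can be combined into a quick estimate of the remaining manual effort. This is simple arithmetic on the numbers stated above (3 million statutes, 70% approved automatically, 1 minute per manually reviewed document) and assumes the averages hold across the whole history.

```python
TOTAL_STATUTES = 3_000_000   # size of the historical archive
AUTO_APPROVED_PCT = 70       # share approved without manual intervention
MANUAL_MINUTES_PER_DOC = 1   # average manual validation time

auto_docs = TOTAL_STATUTES * AUTO_APPROVED_PCT // 100
manual_docs = TOTAL_STATUTES - auto_docs
manual_hours = manual_docs * MANUAL_MINUTES_PER_DOC / 60

print(auto_docs)     # 2100000 documents approved automatically
print(manual_docs)   # 900000 documents reviewed by a person
print(manual_hours)  # 15000.0 hours of manual validation in total
```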

Geologists have a visual tool that preloads (in less than a second) all the fractures detected in an image, so their task is reduced to verifying this information quickly and efficiently.

In addition to reducing analysis time, the introduction of the prototype lowered the seniority level required for this task and helped unify the criteria for fracture selection.

In real-world testing, system users report that response capacity and quality of analysis improve significantly, reducing the average time a claim spends waiting to be classified from a week to less than 5 seconds, resulting in greater efficiency in the management process and financial market supervision in general.