Success Story: Large-Scale Real-Time Image Content Moderation

NCC presenting the success story

TÜBİTAK-TRUBA, one of the two major HPC centers in Turkey, coordinates NCC Turkey. The Middle East Technical University (METU), Sabanci University (SU), and the Istanbul Technical University National Center for High-Performance Computing (UHeM) are third-party partners and cooperate with TÜBİTAK-TRUBA in the decision-making process. For each project, TÜBİTAK-TRUBA provides technical support to researchers and monitors running jobs. It also regularly organises online meetings with the researchers to discuss problems and progress. This project was conducted under the auspices of TÜBİTAK-TRUBA with the oversight described above. On the research side, the METU researcher team carried out the project and presents the success story.

Industrial Organisations Involved:

Founded in 2010, Machinetutors (https://machinetutors.com/) provides machine learning consultancy and customized AI software development services. Machinetutors empowers businesses all over the world by solving real-world problems. Machinetutors has two products: mtDATA, a data collection and annotation services platform, and mtAPI, a set of SaaS AI solutions with pre-trained models, customization options, and scalable infrastructure.

The client is a British content moderation SaaS start-up. Company name and further details are confidential.

Technical/scientific Challenge:

This project addresses the problem of large-scale real-time image-based content moderation. The system is deployed to a production environment where tens of thousands of users browse the internet daily. To meet the business requirements, the system must be accurate and must run in real time. Moreover, each model must be small enough that multiple copies can run simultaneously on a single GPU, which reduces server costs. A major challenge has been making several models work efficiently together.
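The link between model size and server cost can be made concrete with a little arithmetic. The following is a minimal sketch, not the project's actual capacity planning: the function names and every number in it (GPU memory, reserved memory, per-copy footprint, required number of copies) are illustrative assumptions and do not come from the project.

```python
def replicas_per_gpu(gpu_mem_mib: int, model_mem_mib: int, reserved_mib: int = 0) -> int:
    """Number of model copies that fit in GPU memory (hypothetical helper).

    model_mem_mib is the assumed inference-time footprint of one copy,
    reserved_mib is memory assumed to be kept free for the runtime.
    """
    if model_mem_mib <= 0:
        raise ValueError("model_mem_mib must be positive")
    usable = gpu_mem_mib - reserved_mib
    return max(0, usable // model_mem_mib)


def gpus_needed(required_replicas: int, per_gpu: int) -> int:
    """GPUs required to host the requested number of copies (ceiling division)."""
    if per_gpu <= 0:
        raise ValueError("the model does not fit on the GPU")
    return -(-required_replicas // per_gpu)


# Illustrative numbers only: a 16 GiB GPU with 1 GiB kept free.
small = replicas_per_gpu(16384, 1500, reserved_mib=1024)   # smaller model
large = replicas_per_gpu(16384, 6000, reserved_mib=1024)   # larger model

print(small, gpus_needed(20, small))
print(large, gpus_needed(20, large))
```

Under these assumed figures, the smaller model needs far fewer GPUs to serve the same number of concurrent copies, which is the cost argument made above.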

Business impact:

Our client, a SaaS online content moderation start-up, is currently the number one content filter in its target market worldwide, thanks to the success of the AI-supported features developed during this project.

Feedback on all our models from our client's users is positive, and they consider it to be the best product on the market. Thousands of users now browse the Internet with their own adjusted moderation level. Our client has already reached its financial break-even point.

HPC’s speed and cost benefits enabled the project to be delivered successfully and on time. As a result of this collaboration, all of the engineers on the Machinetutors team are now proficient in using the TRUBA infrastructure. We were able to work effectively and efficiently with our colleagues from TRUBA and look forward to the next project.

Solution:

To solve this problem, we developed three main models. The first is a multi-label NSFW classifier that detects the NSFW level (light, medium, hard) and predicts other labels, such as real person and clothing characteristics. The second is a one-stage, body-based age and gender detection model. Current age and gender methods are face-based and two-stage: they first run a face detector and then run the age and gender model on each detected face bounding box. When multiple faces are present in an image, this approach fails to meet the real-time requirement. The third is a segmentation model. The three models run in a pipeline that can be configured for various scenarios.
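One way such a three-model pipeline could be wired together is sketched below. This is a hypothetical illustration, not the deployed system: the function and field names, the extra "none" level, the early exit, the order of the stages, and the rule that only "medium" and "hard" images are segmented are all assumptions. The three models are replaced by stub functions so that the sketch runs on its own.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List, Optional, Sequence


@dataclass
class ModerationResult:
    nsfw_level: str                                   # "none" (assumed), "light", "medium" or "hard"
    labels: List[str] = field(default_factory=list)   # other classifier labels
    persons: List[Dict[str, Any]] = field(default_factory=list)
    mask: Optional[Any] = None                        # segmentation output, if requested


def moderate(
    image: Any,
    classifier: Callable[[Any], Dict[str, Any]],
    detector: Callable[[Any], List[Dict[str, Any]]],
    segmenter: Callable[[Any], Any],
    mask_levels: Sequence[str] = ("medium", "hard"),
) -> ModerationResult:
    """Run the three models in sequence for one image (assumed ordering)."""
    cls = classifier(image)
    result = ModerationResult(nsfw_level=cls["level"], labels=list(cls["labels"]))

    # Assumed scenario: images classified as safe skip the remaining models.
    if result.nsfw_level == "none":
        return result

    # One-stage detection: a single call per image, however many people appear.
    result.persons = detector(image)

    # Assumed scenario: only the stronger levels are segmented.
    if result.nsfw_level in mask_levels:
        result.mask = segmenter(image)
    return result


# Stub models standing in for the real networks (hypothetical outputs).
def stub_classifier(image: Dict[str, Any]) -> Dict[str, Any]:
    return {"level": image["level"], "labels": ["real_person"]}


def stub_detector(image: Dict[str, Any]) -> List[Dict[str, Any]]:
    return [{"age": 30, "gender": "female", "box": (0, 0, 10, 10)}]


def stub_segmenter(image: Dict[str, Any]) -> str:
    return "mask"


if __name__ == "__main__":
    for level in ("none", "light", "hard"):
        print(moderate({"level": level}, stub_classifier, stub_detector, stub_segmenter))
```

Passing the models in as callables keeps the scenarios configurable: a different `mask_levels` value or a different detector changes the behaviour without touching the pipeline function.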

Benefits:

Machinetutors:

  • With this collaboration, we were able to run many experiments in parallel and quickly see the effects of the model updates.
  • With the ability to train with large batch sizes on newer GPUs, our experiments completed much faster.
  • Being able to access many GPUs at the same time enabled us to tune the hyper-parameters of each model to improve the results.
  • The speed and cost-efficiency provided by this support have helped us gain a considerable competitive advantage in the global AI ecosystem.

Success story highlights:

  • Keywords: Artificial Intelligence, Machine Learning, Deep Learning, Content Moderation, Classification, Segmentation, Object Detection, Data Collection and Annotation
  • Industry sector: Computer Science, Artificial Intelligence, Software
  • Technology: HPC, AI

Contact:

contact@machinetutors.com

This project has received funding from the European High-Performance Computing Joint Undertaking (JU) under grant agreement No 951732. The JU receives support from the European Union’s Horizon 2020 research and innovation programme and from Germany, Bulgaria, Austria, Croatia, Cyprus, the Czech Republic, Denmark, Estonia, Finland, Greece, Hungary, Ireland, Italy, Lithuania, Latvia, Poland, Portugal, Romania, Slovenia, Spain, Sweden, the United Kingdom, France, the Netherlands, Belgium, Luxembourg, Slovakia, Norway, Switzerland, Turkey, the Republic of North Macedonia, Iceland, and Montenegro.