Objective:

Automate the evaluation of image and caption submissions to ensure quality, relevance, and compliance with safety standards.

Key Features:

  • AI-powered caption analysis using NVIDIA NeMo Guardrails to filter inappropriate, explicit, or nonsensical content.
  • Image assessment using LangSmith’s Image Analyzer to detect NSFW content and assign a quality score.
  • Automated text-image relevancy check to verify that captions accurately describe their corresponding images.
  • Structured scoring mechanism that integrates text and image evaluation for comprehensive content moderation.
  • Reduction of manual moderation effort while enhancing data integrity and compliance.
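
A minimal sketch of how such a scoring mechanism might combine the three checks, assuming each upstream check has already produced a result. The field names, weights, and acceptance threshold below are hypothetical placeholders for illustration, not values from the project; the caption filter and the NSFW flag are treated as hard gates, and the image quality and relevancy scores are blended into one number.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class Evaluation:
    """Outputs of the upstream checks (all field names are hypothetical)."""
    caption_ok: bool      # caption passed the safety/quality filter
    image_nsfw: bool      # image was flagged as NSFW
    image_quality: float  # image quality score in [0.0, 1.0]
    relevancy: float      # caption-image relevancy score in [0.0, 1.0]


# Assumed weights and threshold; tune these against real moderation data.
QUALITY_WEIGHT = 0.4
RELEVANCY_WEIGHT = 0.6
ACCEPT_THRESHOLD = 0.7


def score_submission(ev: Evaluation) -> Tuple[float, str]:
    """Return (score, decision) for one image-caption submission."""
    # Hard gates: a safety failure rejects regardless of the other scores.
    if not ev.caption_ok:
        return 0.0, "rejected: caption failed safety filter"
    if ev.image_nsfw:
        return 0.0, "rejected: image flagged NSFW"

    # Weighted blend of the two continuous scores.
    score = QUALITY_WEIGHT * ev.image_quality + RELEVANCY_WEIGHT * ev.relevancy
    if score >= ACCEPT_THRESHOLD:
        return score, "accepted"
    return score, "rejected: score below threshold"
```

Keeping the safety checks as gates rather than weighted terms means a high-quality, highly relevant image cannot offset an unsafe caption.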

Results:

  • Enhanced content quality with automated rejection of inappropriate or irrelevant submissions.
  • More consistent data through reliable validation of image-text correlation.
  • Significant reduction in manual intervention required for content review.

Customization for IT Staff:

  • Adaptable to IT service management (ITSM) platforms for analyzing documentation, screenshots, and support tickets.
  • Modifiable to assess the quality and technical accuracy of internal documentation.
  • AI-driven automation to categorize, validate, and improve the quality of IT knowledge base articles.

Timeline (13 Weeks):

  • Caption Quality Check (NVIDIA NeMo Guardrails): 2 weeks
  • Image Analysis (LangSmith’s Image Analyzer): 3 weeks
  • Text Analysis: 1 week
  • Score Calculation: 3 weeks
  • Image-Text Relevancy Check: 4 weeks

Tech Stack:

Python, Flask, LangChain, LLM, GPT-4o, Bing Search, ElevenLabs