Wasabi AiR provides key machine learning (ML) services:
Audio Classification automatically identifies and categorizes sounds or audio clips based on predefined labels such as speech, music, or environmental noise.
Logo Detection identifies brand logos within images or video frames using computer vision techniques.
Natural Language Description summarizes content in plain language, capturing the key topics, scenes, or events so that the content is easier to understand, search, and organize with metadata tags.
OCR (optical character recognition) converts printed or handwritten text in images or scanned documents into machine-readable text.
Person Detection determines whether people are present in an image or video and generates vector embeddings.
Speech To Text converts spoken language in audio or video recordings into written text using automatic speech recognition.
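
Vector embeddings, such as those mentioned under Person Detection, are commonly compared with a similarity measure to judge whether two detections resemble each other. The following is a minimal sketch of that general idea in Python, not a description of how Wasabi AiR works internally. The function name `cosine_similarity`, the variable names, and the four-dimensional vectors are hypothetical and made up for illustration; real embeddings are typically much longer, and the format in which Wasabi AiR exposes them is not covered here.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Hypothetical embeddings, invented for this example only.
detection_a = [0.12, 0.80, 0.35, 0.41]
detection_b = [0.10, 0.78, 0.40, 0.39]  # close to detection_a
detection_c = [0.90, 0.05, 0.10, 0.02]  # far from detection_a

print(cosine_similarity(detection_a, detection_b))
print(cosine_similarity(detection_a, detection_c))
```

A score near 1 suggests the two vectors point in nearly the same direction, while a lower score suggests they differ. Any threshold for treating two detections as a match would need to be tuned against real data.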