Search API in Wasabi AiR
    • 09 Sep 2024

    To search the metadata in Wasabi AiR, make the following request:

    POST /api/data/search
    {
        "query": {query},
        "limit": {limit},
        "page": {page},
        "types": {types},
        "fields": {fields},
        "sort_fields": {sort_fields},
        "only": {only},
        "filters": {filters},
        "aggregations": {aggregations},
        "hit_counts": {
            "advertising": 1,
            "audio_classification": 1,
            "caption": 2,
            "description": 3,
            "labels": 5,
            "locations": 8,
            "logos": 13,
            "keyword": 21,
            "nsfw": 21,
            "ocr": 13,
            "people": 8,
            "sound": 5,
            "speech_to_text": 3,
            "sport": 2,
            "tag": 1,
            "text_content": 1
        }
    }
    • query (string) - The text to search for (see Full Text Search).
    • limit (int) - The number of items to return per page.
    • page (int) - The zero-based index of the page to return (0 is the first page).
    • types (array of strings) - Optional. The media types to include (default: all types).
    • fields (array of strings) - Optional. The fields on which to perform the full-text search (default: all fields).
    • sort_fields.field (string) - Optional. The field used for sorting.
    • sort_fields.asc (bool) - Optional. Set to true to sort in ascending order.
    • only (array of strings) - Optional. The fields to return with the results (default: all fields).
    • filters (object) - Optional. An object describing additional filters to apply (see Filters).
    • aggregations (object) - Optional. An object describing aggregations to request (see Aggregations).
    • hit_counts (object) - Hit counts by item data type (any type with 0 hits is omitted).
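
As a rough illustration of the request described above, the following Python sketch assembles the JSON body and posts it to `/api/data/search`. The helper names `build_search_body` and `search`, the `base_url` and `headers` arguments, the `Content-Type` header, and the default `limit` of 10 are all assumptions made for this example. The article does not specify the host, the authentication scheme, or any defaults. Optional parameters are passed through unchanged, so the sketch makes no claim about their exact shape.

```python
import json
import urllib.request

# Optional request keys named in the parameter list above.
OPTIONAL_KEYS = {"types", "fields", "sort_fields", "only", "filters", "aggregations"}


def build_search_body(query, limit=10, page=0, **optional):
    """Build the JSON body for POST /api/data/search.

    Optional keys are copied as given and left out when they are None.
    The default limit of 10 is an arbitrary choice for this sketch.
    """
    unknown = set(optional) - OPTIONAL_KEYS
    if unknown:
        raise ValueError(f"unsupported search parameters: {sorted(unknown)}")
    body = {"query": query, "limit": limit, "page": page}
    body.update({key: value for key, value in optional.items() if value is not None})
    return body


def search(base_url, body, headers=None):
    """Send the search request and return the decoded JSON response.

    base_url and headers are hypothetical placeholders: supply the host
    and whatever authentication your deployment requires.
    """
    request = urllib.request.Request(
        base_url.rstrip("/") + "/api/data/search",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json", **(headers or {})},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


# Build a body that asks for the first page of 25 items and only two result fields.
example_body = build_search_body("annie", limit=25, only=["name", "mime_type"])
print(json.dumps(example_body, indent=2))
```

Keeping body construction separate from the network call makes the request easy to inspect or log before it is sent.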

    Response

    A typical search query response is:

    {
        "query": "{query}",
        "limit": {limit},
        "page": {page},
        "total_hits": {total_hits},
        "results": [
            {
                "result": {
                    "_id": "b44698697fc7de5725d1e1fe22d38e32",
                    "last_modified": "2016-07-20T21:21:09Z",
                    "location_id": "AVbdTJJcuT9aoTJxqbUD",
                    "location_kind": "azure",
                    "location_name": "Demo Content",
                    "mime_type": "audio/x-wav",
                    "name": "Annie and Brie/Raw Audio/AB3-1C_2561.wav",
                    "stow_container_id": "annie-and-brie",
                    "stow_container_name": "annie-and-brie",
                    "stow_url": "azure://democontent.blob.core.windows.net/annie-and-brie/Annie%20and%20Brie/Raw%20Audio/AB3-1C_2561.wav"
                },
                "highlight": [
                    {"field": "title", "fragments": ["example <em>query</em>"]},
                    {"field": "otherfield", "fragments": ["another text <em>highlight here</em>"]}
                ],
                "score": 0.93453264
            }
        ],
        "filters": {},
        "aggregations": {},
        "hit_counts": {
            "advertising": 1,
            "audio_classification": 1,
            "caption": 2,
            "description": 3,
            "labels": 5,
            "locations": 8,
            "logos": 13,
            "keyword": 21,
            "nsfw": 21,
            "ocr": 13,
            "people": 8,
            "sound": 5,
            "speech_to_text": 3,
            "sport": 2,
            "tag": 1,
            "text_content": 1
        }
    }

    query, limit and page repeat the input values that yielded the results.

    • total_hits (int) - The approximate total number of hits for the search.
    • results (array) - An array of result objects (see the Result Fields section below).
    • highlight (object) - An object containing HTML that indicates why an item matched (see the Highlight Fields section below).
    • score (float) - A decimal value indicating how relevant the item is to the search query (0 is not relevant, 1 is most relevant). Low values are common.
    • filters (object) - An object describing the filters that were applied in the search request.
    • aggregations (object) - The aggregation results.
    • hit_counts (object) - Hit counts by item data type (any type with 0 hits is omitted).
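
A minimal Python sketch of reading such a response follows. It pulls the item ID, name, and score out of each entry in `results`, and estimates the number of pages from `total_hits` and `limit`. The function names and the sample values (`total_hits` of 45, `limit` of 10) are illustrative assumptions. Because `total_hits` is approximate, the page count is only an estimate.

```python
import math


def summarize_results(response):
    """Return (item id, name, score) for every entry in response["results"]."""
    rows = []
    for hit in response.get("results", []):
        item = hit.get("result", {})
        rows.append((item.get("_id"), item.get("name"), hit.get("score")))
    return rows


def estimated_page_count(total_hits, limit):
    """Estimate how many pages a search spans; pages are numbered from 0."""
    if limit <= 0:
        raise ValueError("limit must be positive")
    return math.ceil(total_hits / limit)


# Trimmed-down response using the item from the example above;
# total_hits and limit are made-up values for illustration.
sample = {
    "total_hits": 45,
    "limit": 10,
    "results": [
        {
            "result": {
                "_id": "b44698697fc7de5725d1e1fe22d38e32",
                "name": "Annie and Brie/Raw Audio/AB3-1C_2561.wav",
            },
            "score": 0.93453264,
        }
    ],
}

for item_id, name, score in summarize_results(sample):
    print(f"{score:.3f}  {item_id}  {name}")
print("pages:", estimated_page_count(sample["total_hits"], sample["limit"]))
```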

    Highlight Fields

    • title (string) - If the match occurred within the name, the title explains where. If this is empty, assume result.name.
    • fulltext (string) - An HTML preview of why a search result is relevant to the query.
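
The sample response earlier shows `highlight` as a list of entries, each with a `field` name and a list of HTML `fragments` in which matches are wrapped in `<em>` tags. Assuming that shape, the Python sketch below strips the tags for plain-text display and collects the matched text. The helper names are hypothetical, and the code does not handle any other markup a fragment might contain.

```python
import re

EM_TAG = re.compile(r"</?em>")          # an opening or closing <em> tag
EM_SPAN = re.compile(r"<em>(.*?)</em>")  # the text wrapped by one <em> pair


def plain_fragments(highlight):
    """Map each highlighted field to its fragments with <em> tags removed."""
    return {
        entry["field"]: [EM_TAG.sub("", fragment) for fragment in entry.get("fragments", [])]
        for entry in highlight
    }


def matched_terms(highlight):
    """Collect the text wrapped in <em>...</em> across all fragments, in order."""
    terms = []
    for entry in highlight:
        for fragment in entry.get("fragments", []):
            terms.extend(EM_SPAN.findall(fragment))
    return terms


# The highlight entries from the sample response.
highlight = [
    {"field": "title", "fragments": ["example <em>query</em>"]},
    {"field": "otherfield", "fragments": ["another text <em>highlight here</em>"]},
]
print(plain_fragments(highlight))
print(matched_terms(highlight))
```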

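    Some examples of setting time ranges as a filter are:

    The last hour, using a relative from value and an empty to value:

    "filters": {
        "ranges": [
            {
                "field": "last_harvested",
                "from": "now-1h",
                "to": ""
            }
        ]
    }

    A fixed 15-minute window, using absolute UTC timestamps:

    "filters": {
        "ranges": [
            {
                "field": "last_harvested",
                "from": "2017-12-31T19:30:00.000Z",
                "to": "2017-12-31T19:45:00.000Z"
            }
        ]
    }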
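
The range filters above can be generated programmatically. The following Python sketch builds a `filters` object with a single range, and formats timezone-aware datetimes as UTC timestamps with millisecond precision. The helper names are hypothetical. Treating an empty `to` as an open upper bound is an assumption based on the relative-time example, not something the article states.

```python
from datetime import datetime, timezone


def to_timestamp(moment):
    """Format a timezone-aware datetime as UTC with millisecond precision."""
    if moment.tzinfo is None:
        raise ValueError("moment must be timezone-aware")
    utc = moment.astimezone(timezone.utc)
    return utc.strftime("%Y-%m-%dT%H:%M:%S") + f".{utc.microsecond // 1000:03d}Z"


def range_filter(field, start, end=""):
    """Build a filters object holding one range on the given field.

    An empty end mirrors the relative-time example and is assumed
    to leave the upper bound open.
    """
    return {"ranges": [{"field": field, "from": start, "to": end}]}


# Relative range: everything harvested in the last hour.
last_hour = range_filter("last_harvested", "now-1h")

# Absolute range: a fixed 15-minute window.
window = range_filter(
    "last_harvested",
    to_timestamp(datetime(2017, 12, 31, 19, 30, tzinfo=timezone.utc)),
    to_timestamp(datetime(2017, 12, 31, 19, 45, tzinfo=timezone.utc)),
)
```

Either object can be supplied as the `filters` value of the search request body.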
    
    ## Result Fields
    
    The result objects contain an overview of Item Object metadata.
    
    * `_id` - (string) Unique Item ID
    * `last_modified` - (timestamp) When the item was last modified
    * `location_id` - (string) The ID of the Location where this Item was found
    * `location_kind` - (string) The Location Kind (see the [Location Kinds API](Location_Kinds_API.md) for more information) of the Location where this Item was found
    * `mime_type` - (string) MIME type for the item
    * `name` - (string) Name of the item (usually filename)
    * `stow_container_id` - (string) Stow Container ID of where this Item was found
    * `stow_container_name` - (string) Name of the Stow Container where this Item was found
    * `stow_url` - (string) The Stow URL of this Item.
    
    ## Search Analytics
    
    To get an overview of your platform data as high-level groupings, make the following request:
    
    GET /api/v3/search/analytics
    
    ### Analytics Response
    
    ```json
    {
        "explicit_content": [
            {
                "key": "false",
                "count": 58
            },
            {
                "key": "true",
                "count": 4
            }
        ],
        "files_by_category": [
            {
                "key": "cat2",
                "count": 5
            },
            {
                "key": "cat3",
                "count": 3
            },
            ...
        ],
        "files_by_extension": [
            {
                "key": "jpg",
                "count": 200
            },
            {
                "key": "mp3",
                "count": 17
            },
            {
                "key": "mp4",
                "count": 16
            },
            ...
        ],
        "files_by_location": [
            {
                "key": "theberg",
                "count": 144
            },
            {
                "key": "loadnstore",
                "count": 65
            },
            ...
        ],
        "files_by_type": [
            {
                "key": "image",
                "count": 191
            },
            {
                "key": "audio",
                "count": 24
            },
            {
                "key": "video",
                "count": 17
            },
            {
                "key": "document",
                "count": 11
            },
            {
                "key": "archive",
                "count": 1
            },
            ...
        ]
    }
    ```

    A successful request returns Status OK (200). Any unexpected error returns Status Internal Server Error (500).
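
As a hedged sketch of consuming this endpoint, the Python below requests the analytics, turns a 500 status into a descriptive error, and converts one grouping into a key-to-count mapping. The `base_url` and `headers` arguments and both function names are assumptions for illustration. The host and authentication scheme are not described in this article.

```python
import json
import urllib.error
import urllib.request


def fetch_analytics(base_url, headers=None):
    """GET /api/v3/search/analytics and return the decoded JSON.

    base_url and headers are hypothetical placeholders for the host
    and any authentication your deployment requires.
    """
    request = urllib.request.Request(
        base_url.rstrip("/") + "/api/v3/search/analytics",
        headers=headers or {},
        method="GET",
    )
    try:
        with urllib.request.urlopen(request) as response:
            return json.load(response)
    except urllib.error.HTTPError as error:
        if error.code == 500:
            raise RuntimeError("analytics request failed with an unexpected server error") from error
        raise


def group_counts(analytics, group):
    """Turn one grouping, such as files_by_type, into a {key: count} dict."""
    return {bucket["key"]: bucket["count"] for bucket in analytics.get(group, [])}


# One grouping from the sample response above.
sample = {"explicit_content": [{"key": "false", "count": 58}, {"key": "true", "count": 4}]}
counts = group_counts(sample, "explicit_content")
flagged_share = counts["true"] / sum(counts.values())
print(f"{flagged_share:.1%} of items flagged as explicit")
```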