Items API

Once harvesting has begun, you can access the item data using the Items API.

Item Object

The data that makes up an Item is outlined in Item Object.

Getting Item Without Metadata

GET /api/data/v3/items/{id}

{id} - (string) The ID of the item to get.

Response

{
    "item": {
        "id": "f66082c13a7f3a10ebf89405433bb80f",
        "location_id": "5c7ec14434f90e13219e3ece821dce55",
        "container_id": "da971705615c7d57583a57fa170ab696",
        "file_size": 15327,
        "etag": "cff1e0c414d74cbcc436e1502f61cc8f",
        "file_extension": "jpg",
        "path": "",
        "file_path": "",
        "folder_path": "",
        "gm_item_type": "video",
        "hash_c4id": "c42YSyLSKjGuuqoUQPiM4qHeYs5CBNhmt5DDoAbzdxxEprrQwD6YmAi3FurRK9tS2kgj5msCq8rUqk95YWeAqwB7CM",
        "hash_md5": "cff1e0c414d74cbcc436e1502f61cc8f",
        "hash_sha1": "06fdba756542993f0070f4a863eceeb1bcefda33",
        "hash_sha512": "630fa3947ed13bb6cdc4a303dba65eb4edfbf66b7ba2670bbb14a744419e0ab101106d1a5a901f1e0d4f876142585b316f1e1fb461db9758500e2e26b677bc85",
        "harvester_version": "2.0.3134",
        "last_harvested": "2019-06-05T19:46:30.176133Z",
        "last_modified": "2019-04-16T13:44:34Z",
        "location_kind": "local",
        "location_name": "local",
        "mime_type": "video/mp4",
        "mime_category": "video",
        "name": "tears.mp4",
        "parent_id": "",
        "root_id": "0eed8099520a60a2bd3701655f0fbe81",
        "segment_interval": 2,
        "shared_link": "",
        "stow_container_id": "/data/videos",
        "stow_container_name": "",
        "stow_url": "s3://https://s3-us-west-2.amazonaws.com/3item/kid.jpg",
        "thumbnail": {
            "path": "thumbnailer/sprite.jpg",
            "type": "sprite",
            "frame_count": 30,
            "height": 152,
            "width": 270
        },
        "stow_metadata": [
            {
                "name": "mtime",
                "value": "2019-07-02T21:57:41Z"
            },
            {
                "name": "mode",
                "value": "644"
            },
            {
                "name": "name",
                "value": "Jeff_with_location.JPG"
            },
            ...
        ],
        "stow_tags": [],
        "drm": false,
        "created_at": "2019-06-05T19:46:30.266769Z",
        "updated_at": "2019-06-05T19:46:30.266769Z",
        "in_progress": false,
        "preview": {
            "path": "fb3b37ce2a06c3aba49e07c7eb87acae/video_previews/preview.mp4",
            "mime_type": "video/mp4"
        },
        "duration": 734167
    }
}

Status codes:

200 (success)
404 (item not found)
500 (unexpected error)
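
As a minimal sketch, the request above could be issued from Python using only the standard library. The base URL, the bearer-token Authorization header, and the helper names item_url and get_item are assumptions for illustration; this section does not describe how requests are authenticated.

```python
import json
import urllib.parse
import urllib.request


def item_url(base_url: str, item_id: str) -> str:
    """Build the URL for GET /api/data/v3/items/{id}."""
    # Percent-encode the ID so unexpected characters cannot alter the path.
    encoded_id = urllib.parse.quote(item_id, safe="")
    return f"{base_url.rstrip('/')}/api/data/v3/items/{encoded_id}"


def get_item(base_url: str, item_id: str, token: str) -> dict:
    """Fetch an item and return the object stored under the "item" key.

    urllib raises urllib.error.HTTPError for 404 and 500 responses.
    """
    request = urllib.request.Request(
        item_url(base_url, item_id),
        headers={"Authorization": f"Bearer {token}"},  # assumed auth scheme
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)["item"]
```

Callers would typically catch urllib.error.HTTPError and inspect its code attribute to distinguish a missing item (404) from an unexpected error (500).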

Deleting by ID

To delete an item by its ID:

DELETE /api/data/v3/items/{id}

Status codes:

204 (no content)
500 (unexpected error)

Bulk Reading by IDs

You can return several items by providing a list of their IDs.

Provide at least one ID and no more than 50 IDs per request.

POST /api/data/v3/items/bulk
{
    "ids": ["76ef280bc613f9eb3dace1c89efe982e", "5187feb5044ef4485e7bc0e2e72f79d1"]
}

Status codes:

204 (no content)
422 (unprocessable entity)
500 (unexpected error)
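
Because a single request accepts at most 50 IDs, a client holding a longer list has to split it into batches. The following is one possible sketch of that step; the function name bulk_read_bodies is hypothetical, and sending each body to POST /api/data/v3/items/bulk is left to the caller.

```python
import json
from typing import List

MAX_IDS_PER_REQUEST = 50  # limit stated for bulk reads


def bulk_read_bodies(ids: List[str]) -> List[bytes]:
    """Return one JSON request body per batch of at most 50 IDs."""
    if not ids:
        # The API requires at least one ID, so fail before building a request.
        raise ValueError("at least one ID must be provided")
    bodies = []
    for start in range(0, len(ids), MAX_IDS_PER_REQUEST):
        batch = ids[start:start + MAX_IDS_PER_REQUEST]
        bodies.append(json.dumps({"ids": batch}).encode("utf-8"))
    return bodies
```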

Bulk Deleting by IDs

You can delete several items by providing a list of their IDs.

At least one ID must be provided.

DELETE /api/data/v3/items/bulk
{
    "ids": ["76ef280bc613f9eb3dace1c89efe982e", "5187feb5044ef4485e7bc0e2e72f79d1"]
}

Status codes:

204 (no content)
422 (unprocessable entity)
500 (unexpected error)

Updating an Item's Custom Asset Title

This requires the “edit:item” scope.

PATCH /api/data/v3/items/{id}
{
    "gm_asset_title": "my custom asset title"
}

{id} - (string) The ID of the item to update.
gm_asset_title - (string) The custom asset title.

Response

{
    "item": {
        "id": "f66082c13a7f3a10ebf89405433bb80f",
        "gm_asset_title": "my custom asset title"
    }
}

Status codes:

200 (success)
403 (the user does not have permission to alter data)
404 (item not found)
500 (unexpected error)
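
A hedged sketch of building this PATCH request with Python's standard library follows. The base URL, the Content-Type header, the bearer-token header, and the function name asset_title_request are assumptions; only the path and the gm_asset_title body field come from the description above.

```python
import json
import urllib.request


def asset_title_request(base_url: str, item_id: str, title: str, token: str) -> urllib.request.Request:
    """Build (but do not send) a PATCH request that sets gm_asset_title."""
    body = json.dumps({"gm_asset_title": title}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url.rstrip('/')}/api/data/v3/items/{item_id}",
        data=body,
        method="PATCH",
        headers={
            "Content-Type": "application/json",   # assumed
            "Authorization": f"Bearer {token}",   # assumed auth scheme
        },
    )
```

Passing the returned object to urllib.request.urlopen sends it; a 403 response would then surface as urllib.error.HTTPError.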

Searching Within an Item

To search within the information associated with an item:

GET /api/data/v3/search/item/{item_id}?q={query}

{item_id} - (string) The ID of the item within which to search.
{query} - (string) The query string.

Response

The response is an object keyed by metadata field. Each field contains a histogram (timeline) of contiguous ranges in which the query value appears within the item. Fields without matches are omitted.

{
	"advertising": {
		"histogram": [
			{
				"start": 2,
				"end": 4
			},
			...
			{
				"start": 80,
				"end": 88
			}
		]
	},
	"audio_classification": {
		"histogram": [
			{
				"start": 0,
				"end": 4
			},
			...
			{
				"start": 60,
				"end": 72
			}
		]
	},
	"caption": {
		"histogram": [
			{
				"start": 0.03,
				"end": 4.92
			},
			...
			{
				"start": 207.355,
				"end": 211.436
			}
		]
	},
	"description": {
		"histogram": [
			{
				"start": 8,
				"end": 20
			},
			...
			{
				"start": 110,
				"end": 122
			}
		]
	},
	"location": {
		"histogram": [
			{
				"start": 10,
				"end": 14
			}
		]
	},
	"logo": {
		"histogram": [
			{
				"start": 20,
				"end": 34
			}
		]
	},
	"keyword": {
		"histogram": [
			{
				"start": 2,
				"end": 134
			}
		]
	},
	"mature_content": {
		"histogram": [
			{
				"start": 30,
				"end": 34
			}
		]
	},
	"ocr": {
		"histogram": [
			{
				"start": 16,
				"end": 46
			},
			...
			{
				"start": 112,
				"end": 122
			}
		]
	},
	"people": {
		"Kim Ryan": [
			{
				"start": 8,
				"end": 14
			},
			...
			{
				"start": 118,
				"end": 122
			}
		],
		"Kim Smith": [
			{
				"start": 68,
				"end": 72
			}
		]
	},
	"sound": {
		"histogram": [
			{
				"start": 16,
				"end": 46
			},
			...
			{
				"start": 112,
				"end": 122
			}
		]
	},
	"speech_to_text": {
		"histogram": [
			{
				"start": 0.1,
				"end": 50.53
			},
			...
			{
				"start": 110.32,
				"end": 130.08
			}
		]
	},
	"sport": {
		"histogram": [
			{
				"start": 100,
				"end": 164
			}
		]
	},
	"tag": {
		"histogram": [
			{
				"start": 64,
				"end": 66
			}
		]
	},
	"text_content": {
		"histogram": [
			{
				"start": 55,
				"end": 101
			}
		]
	}
}

A successful request returns a Status OK (200). If an unexpected error occurs, a Status Internal Server Error (500) is returned.
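
The sketch below shows one way a client might normalize a search response into a flat mapping of labels to (start, end) ranges. It assumes the shape of the sample above, where most fields store ranges under "histogram" while "people" is keyed by person name; the function name matched_ranges is hypothetical.

```python
from typing import Dict, List, Tuple


def matched_ranges(search_response: dict) -> Dict[str, List[Tuple[float, float]]]:
    """Map each label to the list of (start, end) ranges where the query matched."""
    ranges: Dict[str, List[Tuple[float, float]]] = {}
    for field, groups in search_response.items():
        for key, segments in groups.items():
            # "histogram" is a generic container; any other key (such as a
            # person's name) is kept as part of the label.
            label = field if key == "histogram" else f"{field}:{key}"
            ranges[label] = [(seg["start"], seg["end"]) for seg in segments]
    return ranges
```

Fields with no matches are simply absent from the response, so they are also absent from the result.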

Getting All Metadata

To get all metadata of an item by its ID, make the following request:

GET /api/data/items/{id}

{id} - (string) The ID of the item to get.
Do not include the only parameter.

Response

The response is a JSON document containing ALL metadata for the item.

Getting the metadata.json File for an Item

GET /files/{item_id}/metadata2.json

Response

The response is a JSON document containing the same data as the metadata.json file.

Selective Data

The Items API enables you to be selective about what data you get. You may:

Use the only parameter to get specific leaf fields, or
Use the include parameter to specify root objects to include.

You cannot use both only and include parameters at the same time.

Getting Specific Leaf Fields

To get only a list of specific fields, specify them using the only parameter.

GET /api/data/items/{id}?only=field1,field2,field3

field1, field2, field3 - (comma separated list) List of fields to include.
Only leaf fields are supported, so you must know the full path to the fields to get.
If you need to get entire groups of data, consider using the include parameter.

This endpoint is extremely efficient and is preferred over include. For a complete list of acceptable fields, perform a GET request without any parameters to see the entire data payload.

Getting Specific Groups of Data

You can get groups of data at a time using the include parameter.

GET /api/data/items/{id}?include=obj1,obj2,obj3

obj1, obj2, obj3 - (comma separated list) List of root fields to include.
Only root fields are supported. To get only fields from within objects, use the only parameter instead.

This endpoint is not as efficient as using the only parameter, but it is more convenient. For a complete list of acceptable fields, see Item Object.
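
Since only and include are mutually exclusive, a client helper can enforce that rule before a request is sent. This is a minimal sketch; the base URL and the function name selective_item_url are assumptions.

```python
from typing import Optional, Sequence
from urllib.parse import urlencode


def selective_item_url(
    base_url: str,
    item_id: str,
    only: Optional[Sequence[str]] = None,
    include: Optional[Sequence[str]] = None,
) -> str:
    """Build a GET /api/data/items/{id} URL with either only or include."""
    if only and include:
        raise ValueError("only and include cannot be used at the same time")
    url = f"{base_url.rstrip('/')}/api/data/items/{item_id}"
    if only:
        # safe="," keeps the comma separators readable in the query string.
        return url + "?" + urlencode({"only": ",".join(only)}, safe=",")
    if include:
        return url + "?" + urlencode({"include": ",".join(include)}, safe=",")
    return url  # no parameters: the full metadata payload
```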

Associating an Item With Categories

An item can be associated with one or more categories. To associate an item with categories, use:

POST /api/data/items/{id}/categories
{
	"categories": ["cat1", "cat2"]
}

To disassociate categories from an item, use:

DELETE /api/data/items/{id}/categories/{categories}

{categories} - A URL-encoded, comma-separated list of categories (for example, cat1,other%20category).
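
The {categories} path segment can be produced by percent-encoding each category and joining the results with commas, as in this sketch. The function name categories_path is hypothetical. Note that a category name that itself contains a comma would be encoded as %2C here; whether the server accepts that is not stated in this section.

```python
from typing import Sequence
from urllib.parse import quote


def categories_path(item_id: str, categories: Sequence[str]) -> str:
    """Build the path for DELETE /api/data/items/{id}/categories/{categories}."""
    # Encode each category on its own so the separating commas stay literal.
    encoded = ",".join(quote(category, safe="") for category in categories)
    return f"/api/data/items/{item_id}/categories/{encoded}"
```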

Getting Timelines

A timeline is a set of contiguous blocks of time where data is found. The response is separated by individual labels/identifiers.

A Status Not Found (404) is returned if {id} cannot be found.

Technical Cues

The technical cues endpoint is a wrapper for all the technical spanning metadata for a video.

GET /api/data/v3/items/{id}/timeline/technical-cues

Response

{
    "technical_cues": {
        "black_frames": {
            "histogram": [
                {
                    "start": 3.26993,
                    "end": 7.07373,
                    "start_frame": 99,
                    "end_frame": 213,
                    "start_control_time_code": "00:00:03:08",
                    "end_control_time_code": "00:00:07:02",
                    "start_relative_time_code": "00:13:25:10",
                    "end_relative_time_code": "00:13:29:04"
                },
                {
                    "start": 137.104,
                    "end": 137.471,
                    "start_frame": 4110,
                    "end_frame": 4121,
                    "start_control_time_code": "00:02:16:29",
                    "end_control_time_code": "00:02:17:10",
                    "start_relative_time_code": "00:15:39:01",
                    "end_relative_time_code": "00:15:39:12"
                }
            ]
        },
        "color_bars": {
            "histogram": [
                {
                    "start": 32,
                    "end": 54
                }
            ]
        },
        "credits": {
            "histogram": [
                {
                    "start": 3.26993,
                    "end": 7.07373,
                    "start_frame": 99,
                    "end_frame": 213,
                    "start_control_time_code": "00:00:03:08",
                    "end_control_time_code": "00:00:07:02",
                    "start_relative_time_code": "00:13:25:10",
                    "end_relative_time_code": "00:13:29:04"
                },
                {
                    "start": 137.104,
                    "end": 137.471,
                    "start_frame": 4110,
                    "end_frame": 4121,
                    "start_control_time_code": "00:02:16:29",
                    "end_control_time_code": "00:02:17:10",
                    "start_relative_time_code": "00:15:39:01",
                    "end_relative_time_code": "00:15:39:12"
                }
            ]
        },
        "detected_shots": {
            "histogram": [
                {
                    "start": 3.26993,
                    "end": 7.07373,
                    "start_frame": 99,
                    "end_frame": 213,
                    "start_control_time_code": "00:00:03:08",
                    "end_control_time_code": "00:00:07:02",
                    "start_relative_time_code": "00:13:25:10",
                    "end_relative_time_code": "00:13:29:04"
                },
                {
                    "start": 137.104,
                    "end": 137.471,
                    "start_frame": 4110,
                    "end_frame": 4121,
                    "start_control_time_code": "00:02:16:29",
                    "end_control_time_code": "00:02:17:10",
                    "start_relative_time_code": "00:15:39:01",
                    "end_relative_time_code": "00:15:39:12"
                }
            ]
        },
        "digital_slates": {
            "histogram": [
                {
                    "start": 3.26993,
                    "end": 7.07373,
                    "start_frame": 99,
                    "end_frame": 213,
                    "start_control_time_code": "00:00:03:08",
                    "end_control_time_code": "00:00:07:02",
                    "start_relative_time_code": "00:13:25:10",
                    "end_relative_time_code": "00:13:29:04"
                },
                {
                    "start": 137.104,
                    "end": 137.471,
                    "start_frame": 4110,
                    "end_frame": 4121,
                    "start_control_time_code": "00:02:16:29",
                    "end_control_time_code": "00:02:17:10",
                    "start_relative_time_code": "00:15:39:01",
                    "end_relative_time_code": "00:15:39:12"
                }
            ]
        },
        "silence": {
            "histogram": [
                {
                    "start": 3.23333,
                    "end": 7.27165,
                    "start_frame": 98,
                    "end_frame": 219,
                    "start_control_time_code": "00:00:03:07",
                    "end_control_time_code": "00:00:07:08",
                    "start_relative_time_code": "00:13:25:09",
                    "end_relative_time_code": "00:13:29:10"
                },
                {
                    "start": 82.278,
                    "end": 83.9624,
                    "start_frame": 2467,
                    "end_frame": 2517,
                    "start_control_time_code": "00:01:22:06",
                    "end_control_time_code": "00:01:23:26",
                    "start_relative_time_code": "00:14:44:08",
                    "end_relative_time_code": "00:14:45:28"
                }
            ]
        },
        "slates": {
            "all": [
                {
                    "start": 3.26993,
                    "end": 7.07373,
                    "start_frame": 99,
                    "end_frame": 213,
                    "start_control_time_code": "00:00:03:08",
                    "end_control_time_code": "00:00:07:02",
                    "start_relative_time_code": "00:13:25:10",
                    "end_relative_time_code": "00:13:29:04"
                },
                {
                    "start": 137.104,
                    "end": 137.471,
                    "start_frame": 4110,
                    "end_frame": 4121,
                    "start_control_time_code": "00:02:16:29",
                    "end_control_time_code": "00:02:17:10",
                    "start_relative_time_code": "00:15:39:01",
                    "end_relative_time_code": "00:15:39:12"
                }
            ]
        },
        "start_end": {
            "histogram": [
                {
                    "start": 0,
                    "end": 137.471,
                    "start_frame": 1,
                    "end_frame": 4121,
                    "start_control_time_code": "00:00:00:00",
                    "end_control_time_code": "00:02:17:10",
                    "start_relative_time_code": "00:13:22:02",
                    "end_relative_time_code": "00:15:39:12"
                }
            ]
        },
        "textless": {
            "histogram": [
                {
                    "start": 3.23333,
                    "end": 7.27165,
                    "start_frame": 98,
                    "end_frame": 219,
                    "start_control_time_code": "00:00:03:07",
                    "end_control_time_code": "00:00:07:08",
                    "start_relative_time_code": "00:13:25:09",
                    "end_relative_time_code": "00:13:29:10"
                },
                {
                    "start": 82.278,
                    "end": 83.9624,
                    "start_frame": 2467,
                    "end_frame": 2517,
                    "start_control_time_code": "00:01:22:06",
                    "end_control_time_code": "00:01:23:26",
                    "start_relative_time_code": "00:14:44:08",
                    "end_relative_time_code": "00:14:45:28"
                }
            ]
        }
    }
}

A successful call returns a Status OK (200). If any unexpected errors occurred in the process of fulfilling the response, a Status Internal Server Error (500) is returned.
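
Because the wrapper groups segments by cue type, it can be convenient to flatten the response into one chronologically ordered list. The sketch below assumes the response shape shown in the sample, including the detail that "slates" stores its segments under "all" rather than "histogram"; the function name flatten_cues is hypothetical.

```python
from typing import List, Optional, Tuple

Cue = Tuple[str, float, float, Optional[int], Optional[int]]


def flatten_cues(response: dict) -> List[Cue]:
    """Return (cue_type, start, end, start_frame, end_frame) rows sorted by start."""
    rows: List[Cue] = []
    for cue_type, groups in response.get("technical_cues", {}).items():
        # Each cue type holds one or more named lists ("histogram" or "all").
        for segments in groups.values():
            for seg in segments:
                rows.append((
                    cue_type,
                    seg["start"],
                    seg["end"],
                    seg.get("start_frame"),  # None when frame data is absent
                    seg.get("end_frame"),
                ))
    rows.sort(key=lambda row: row[1])
    return rows
```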

Audio

GET /api/data/v3/items/{id}/timeline/audio

Response

{
	"audio": {
		"Speech": [
			{
				"start": 10,
				"end": 60
			},
			{
				"start": 120,
				"end": 130
			}
		],
		"explosion": [
			{
				"start": 3.014,
				"end": 4.56
			}
		]
	}
}

A successful call returns a Status OK (200). If any unexpected errors occurred in the process of fulfilling the response, a Status Internal Server Error (500) is returned.
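
Label-keyed timelines such as this one lend themselves to simple aggregation. As an illustration, the following sketch totals the covered time per label, expressed in the same units as the start and end values; the function name total_time_per_label is hypothetical.

```python
from typing import Dict


def total_time_per_label(timeline: Dict[str, list]) -> Dict[str, float]:
    """Sum (end - start) over every segment of each label."""
    return {
        label: sum(seg["end"] - seg["start"] for seg in segments)
        for label, segments in timeline.items()
    }
```

For the sample above, the function would be called with the object stored under the "audio" key.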

Color Bars

GET /api/data/v3/items/{id}/timeline/color-bars

Response

{
	"color_bars": [
     	{
            "start": 3.26993,
            "end": 7.07373,
            // if no frame information found the fields below will not be set
            "start_frame": 99,
            "end_frame": 213,
            "start_control_time_code": "00:00:03:08",
            "end_control_time_code": "00:00:07:02",
            "start_relative_time_code": "00:13:25:10",
            "end_relative_time_code": "00:13:29:04"
     	}
 	]
}

A successful call returns a Status OK (200). If any unexpected errors occurred in the process of fulfilling the response, a Status Internal Server Error (500) is returned.
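
As the comment in the sample notes, the frame and time code fields are not set when no frame information is found, so client code should treat them as optional. A minimal sketch of that handling, using a hypothetical describe_segment helper:

```python
def describe_segment(seg: dict) -> str:
    """Format a timeline segment, adding frame numbers only when present."""
    base = f"{seg['start']}-{seg['end']}"
    if "start_frame" in seg and "end_frame" in seg:
        return f"{base} (frames {seg['start_frame']}-{seg['end_frame']})"
    return base
```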

Black Frames

GET /api/data/v3/items/{id}/timeline/black-frames

Response

{
	"black_frames": [
		{
            "start": 3.26993,
            "end": 7.07373,
            "start_frame": 99,
            "end_frame": 213,
            "start_control_time_code": "00:00:03:08",
            "end_control_time_code": "00:00:07:02",
            "start_relative_time_code": "00:13:25:10",
            "end_relative_time_code": "00:13:29:04"
        },
        {
            "start": 137.104,
            "end": 137.471,
            "start_frame": 4110,
            "end_frame": 4121,
            "start_control_time_code": "00:02:16:29",
            "end_control_time_code": "00:02:17:10",
            "start_relative_time_code": "00:15:39:01",
            "end_relative_time_code": "00:15:39:12"
        }
		...
	]
}

A successful call returns a Status OK (200). If any unexpected errors occurred in the process of fulfilling the response, a Status Internal Server Error (500) is returned.

Credits

GET /api/data/v3/items/{id}/timeline/credits

Response

{
	"credits": [
     	{
            "start": 0.1,
            "end": 1.1
     	}
 	]
}

A successful call returns a Status OK (200). If any unexpected errors occurred in the process of fulfilling the response, a Status Internal Server Error (500) is returned.

Custom Tags (Amazon Rekognition)

GET /api/data/v3/items/{id}/timeline/customtags/amazonrek

Response

{
	"tags": {
		"Tag1Name": [
			{
				"start": 10,
				"end": 60
			},
			{
				"start": 120,
				"end": 130
			}
		],
		"Tag2Name": [
			{
				"start": 3.014,
				"end": 4.56
			}
		]
	}
}

A successful call returns a Status OK (200). If any unexpected errors occurred in the process of fulfilling the response, a Status Internal Server Error (500) is returned.

Detected Shots (Valossa Extractor)

GET /api/data/v3/items/{id}/timeline/detected-shots

Response

{
	"detected_shots": [
		{
            "start": 3.26993,
            "end": 7.07373,
            "start_frame": 99,
            "end_frame": 213,
            "start_control_time_code": "00:00:03:08",
            "end_control_time_code": "00:00:07:02",
            "start_relative_time_code": "00:13:25:10",
            "end_relative_time_code": "00:13:29:04"
        },
        {
            "start": 137.104,
            "end": 137.471,
            "start_frame": 4110,
            "end_frame": 4121,
            "start_control_time_code": "00:02:16:29",
            "end_control_time_code": "00:02:17:10",
            "start_relative_time_code": "00:15:39:01",
            "end_relative_time_code": "00:15:39:12"
        }
		...
	]
}

Digital Slates

GET /api/data/v3/items/{id}/timeline/digital-slates

Response

{
	"digital_slates": [
     	{
            "start": 0.1,
            "end": 1.1
     	}
 	]
}

A successful call returns a Status OK (200). If any unexpected errors occurred in the process of fulfilling the response, a Status Internal Server Error (500) is returned.

Insights

GET /api/data/v3/items/{item_id}/insights/{insight_group_id}

Response

{
    "insights": [
        {
            "group_name": "Supplier",
            "color": "#4DD0E1",
            "words": [
                "content delivery",
                "exclusive",
                "hollywood",
                "payments",
                "pepsi",
                "price increase",
                "term",
                "termination"
            ],
            "matches": [
                {
                    "type": "captions",
                    "timeline": [
                        {
                            "start_at": 30.03,
                            "end_at": 44.97,
                            "count": 1
                        }
                    ],
                    "source": "2minuteVideo.srt"
                },
                {
                    "type": "captions",
                    "timeline": [
                        {
                            "start_at": 30.03,
                            "end_at": 44.97,
                            "count": 1
                        }
                    ],
                    "source": "2minuteVideo.srt"
                },
                {
                    "type": "captions",
                    "timeline": [
                        {
                            "start_at": 30.03,
                            "end_at": 44.97,
                            "count": 1
                        }
                    ],
                    "source": "2minuteVideo.srt"
                }
            ]
        }
    ]
}

A successful call returns a Status OK (200). If any unexpected errors occurred in the process of fulfilling the response, a Status Internal Server Error (500) is returned.
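
One way to summarize an insights response is to add up the count values per insight group and match type. This sketch assumes the response shape shown in the sample; the function name match_counts is hypothetical.

```python
from collections import Counter


def match_counts(response: dict) -> Counter:
    """Total the timeline counts for each (group_name, match type) pair."""
    counts: Counter = Counter()
    for group in response.get("insights", []):
        for match in group.get("matches", []):
            key = (group["group_name"], match["type"])
            counts[key] += sum(span["count"] for span in match["timeline"])
    return counts
```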

Mature Content

GET /api/data/v3/items/{id}/timeline/mature-content

Response

{
	"mature_content": {
		"nudity": [
			{
				"start": 0.1,
				"end": 1.1
			}
		],
		"gore": [
			{
				"start": 0.0,
				"end": 3.5
			}
		]
	}
}

A successful call returns a Status OK (200). If any unexpected errors occurred in the process of fulfilling the response, a Status Internal Server Error (500) is returned.

Locations

GET /api/data/v3/items/{id}/timeline/locations

Response

{
	"locations": {
		"Rome": [
			{
				"start": 0.1,
				"end": 1.1
			},
			{
				"start": 13.0,
				"end": 15.5
			}
		],
		"Paris": [
			{
				"start": 2.1,
				"end": 4.3
			}
		]
	}
}

A successful call returns a Status OK (200). If any unexpected errors occurred in the process of fulfilling the response, a Status Internal Server Error (500) is returned.

Logos

GET /api/data/v3/items/{id}/timeline/logos

Response

{
	"logos": {
		"Pepsi": [
			{
				"start": 0.1,
				"end": 1.1
			},
			{
				"start": 13.0,
				"end": 15.5
			}
		],
		"GrayMeta": [
			{
				"start": 2.1,
				"end": 4.3
			}
		]
	}
}

A successful call returns a Status OK (200). If any unexpected errors occurred in the process of fulfilling the response, a Status Internal Server Error (500) is returned.

Slates

GET /api/data/v3/items/{id}/timeline/slates

Response

{
    "slates": {
        "all": [
            {
                "start": 3.26993,
                "end": 7.07373,
                // if no frame information found the fields below will not be set
                "start_frame": 99,
                "end_frame": 213,
                "start_control_time_code": "00:00:03:08",
                "end_control_time_code": "00:00:07:02",
                "start_relative_time_code": "00:13:25:10",
                "end_relative_time_code": "00:13:29:04"
            },
            ...
        ]
    }
}

A successful call returns a Status OK (200). If any unexpected errors occurred in the process of fulfilling the response, a Status Internal Server Error (500) is returned.

Sports

GET /api/data/v3/items/{id}/timeline/sports

Response

{
	"sport_events": {
		"soccer": {
			"penalties": [
				{
					"start": -0.001338,
					"end": 3.998662
				}
			],
			"shots on goal": [
				{
					"start": 4.998662,
					"end": 4.998662
				}
			]
		}
	}
}

A successful call returns a Status OK (200). If any unexpected errors occurred in the process of fulfilling the response, a Status Internal Server Error (500) is returned.

Silence

GET /api/data/v3/items/{id}/timeline/silence

Response

{
	"silence": {
		"histogram": [
			{
                "start": 3.23333,
                "end": 7.27165,
                "start_frame": 98,
                "end_frame": 219,
                "start_control_time_code": "00:00:03:07",
                "end_control_time_code": "00:00:07:08",
                "start_relative_time_code": "00:13:25:09",
                "end_relative_time_code": "00:13:29:10"
            },
            {
                "start": 82.278,
                "end": 83.9624,
                "start_frame": 2467,
                "end_frame": 2517,
                "start_control_time_code": "00:01:22:06",
                "end_control_time_code": "00:01:23:26",
                "start_relative_time_code": "00:14:44:08",
                "end_relative_time_code": "00:14:45:28"
            }
		]
	}
}

A successful call returns a Status OK (200). If any unexpected errors occurred in the process of fulfilling the response, a Status Internal Server Error (500) is returned.

Start End

GET /api/data/v3/items/{id}/timeline/start-end

Response

{
	"start_end": {
		"histogram": [
			{
                "start": 0,
                "end": 137.471,
                "start_frame": 1,
                "end_frame": 4121,
                "start_control_time_code": "00:00:00:00",
                "end_control_time_code": "00:02:17:10",
                "start_relative_time_code": "00:13:22:02",
                "end_relative_time_code": "00:15:39:12"
            }
		]
	}
}

A successful call returns a Status OK (200). If any unexpected errors occurred in the process of fulfilling the response, a Status Internal Server Error (500) is returned.

Textless Material

GET /api/data/v3/items/{id}/timeline/textless

Response

{
	"textless": {
		"histogram": [
			{
				"start": 3.83129,
				"end": 23.308867
			}
		]
	}
}

Texted

GET /api/data/v3/items/{id}/timeline/texted

Response

{
	"texted": [
     	{
            "start": 3.26993,
            "end": 7.07373,
            // if no frame information found the fields below will not be set
            "start_frame": 99,
            "end_frame": 213,
            "start_control_time_code": "00:00:03:08",
            "end_control_time_code": "00:00:07:02",
            "start_relative_time_code": "00:13:25:10",
            "end_relative_time_code": "00:13:29:04"
     	}
 	]
}

A successful call returns a Status OK (200). If any unexpected errors occurred in the process of fulfilling the response, a Status Internal Server Error (500) is returned.

Getting Technical Metadata

Technical metadata is information encoded within files that Wasabi AiR can extract directly. Wasabi AiR makes this information available through either type-specific or batch APIs.

To get all technical metadata found within a given file, use the following API.

Depending on the file type and contents, the response may not include all top-level fields (audio_info, audio_peak, exiv2, geocoding, media_info, and pdf). The same applies to the fields nested within them: not every file contains all of the information that Wasabi AiR checks for.

Request

GET /api/data/v3/items/{id}/technical

{id} - (string) The identifier of the item.
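
Because any top-level field, and any field nested within it, may be absent, client code should avoid indexing into the technical metadata unconditionally. The sketch below illustrates defensive access; the helper names are hypothetical, and the nested true_peak_dbfs key is taken from the sample response.

```python
from typing import List, Optional

# Top-level fields named in this section; any of them may be missing.
KNOWN_SECTIONS = ("audio_info", "audio_peak", "exiv2", "geocoding", "media_info", "pdf")


def present_sections(technical: dict) -> List[str]:
    """List which of the known top-level fields the response contains."""
    return [name for name in KNOWN_SECTIONS if name in technical]


def true_peak_dbfs(technical: dict) -> Optional[float]:
    """Return audio_peak.true_peak_dbfs, or None if either level is absent."""
    return technical.get("audio_peak", {}).get("true_peak_dbfs")
```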

Response

{
    "audio_info": {
        "streams": [
            {
                "avg_frame_rate": "0/0",
                "bit_rate": "192000",
                "bits_per_sample": 0,
                "channel_layout": "stereo",
                "channels": 2,
                "codec_long_name": "AAC (Advanced Audio Coding)",
                "codec_name": "aac",
                "codec_tag": "0x000f",
                "codec_tag_string": "[15][0][0][0]",
                "codec_time_base": "1/48000",
                "codec_type": "audio",
                "index": 5,
                "r_frame_rate": "0/0",
                "sample_fmt": "fltp",
                "sample_rate": "48000",
                "start_pts": 725280,
                "start_time": "8.058667",
                "time_base": "1/90000",
                "disposition": {
                    "attached_pic": 0,
                    "clean_effects": 0,
                    "comment": 0,
                    "default": 0,
                    "dub": 0,
                    "forced": 0,
                    "hearing_impaired": 0,
                    "karaoke": 0,
                    "lyrics": 0,
                    "original": 0,
                    "timed_thumbnails": 0,
                    "visual_impaired": 0
                },
                "tags": {
                    "encoder": "",
                    "language": "eng",
                    "title": ""
                }
            },
            {
                "avg_frame_rate": "0/0",
                "bit_rate": "192000",
                "bits_per_sample": 0,
                "channel_layout": "stereo",
                "channels": 2,
                "codec_long_name": "AAC (Advanced Audio Coding)",
                "codec_name": "aac",
                "codec_tag": "0x000f",
                "codec_tag_string": "[15][0][0][0]",
                "codec_time_base": "1/48000",
                "codec_type": "audio",
                "index": 6,
                "r_frame_rate": "0/0",
                "sample_fmt": "fltp",
                "sample_rate": "48000",
                "start_pts": 725280,
                "start_time": "8.058667",
                "time_base": "1/90000",
                "disposition": {
                    "attached_pic": 0,
                    "clean_effects": 0,
                    "comment": 0,
                    "default": 0,
                    "dub": 0,
                    "forced": 0,
                    "hearing_impaired": 0,
                    "karaoke": 0,
                    "lyrics": 0,
                    "original": 0,
                    "timed_thumbnails": 0,
                    "visual_impaired": 0
                },
                "tags": {
                    "encoder": "",
                    "language": "spa",
                    "title": ""
                }
            }
        ]
    },
    "audio_peak": {
        "integrated_loudness": {
            "i_lufs": -29.1,
            "threshold_lufs": -39.1
        },
        "loudness_range": {
            "lra_lu": 0.1,
            "threshold_lufs": -49.1,
            "lra_low_lufs": -29.1,
            "lra_high_lufs": -29
        },
        "true_peak_dbfs": -19.2
    },
    "exiv2": {
        "normalized": {
            "resolution_x": 1024,
            "resolution_y": 680,
            "format": "image/jpeg",
            "photo": {
                "exif_version": "48 50 50 49",
                "color_space": 1,
                "pixel_x_dimension": 1024,
                "pixel_y_dimension": 680
            },
            "application2": {},
            "image": {
                "image_width": 1024,
                "image_length": 680,
                "bits_per_sample": "8 8 8",
                "photometric_interpretation": 2,
                "orientation": 1,
                "samples_per_pixel": 3,
                "x_resolution": "720000/10000",
                "y_resolution": "720000/10000",
                "resolution_unit": 2,
                "software": "Adobe Photoshop CC 2018 (Macintosh)",
                "date_time": "2018:10:12 13:43:31",
                "exif_tag": 236
            },
            "xmp": {
                "create_date": "2018-10-12T13:34:17-07:00",
                "modify_date": "2018-10-12T13:43:31-07:00",
                "metadata_date": "2018-10-12T13:43:31-07:00"
            }
        }
    },
    "geocoding": {
        "place_name": "Los Angeles",
        "country_code": "US",
        "admin_name1": "California",
        "admin_name2": "Los Angeles"
    },
    "media_info": {
        "general": {
            "audio_codecs": "AAC LC / AAC LC / AAC LC / AAC LC / AAC LC / AAC LC",
            "audio_format_list": "AAC LC / AAC LC / AAC LC / AAC LC / AAC LC / AAC LC",
            "audio_format_with_hint_list": "AAC LC / AAC LC / AAC LC / AAC LC / AAC LC / AAC LC",
            "audio_language_list": "English /  / English / Spanish / English / Spanish",
            "codec": "MPEG-TS",
            "codecs_video": "AVC",
            "commercial_name": "MPEG-TS",
            "complete_name": "/tmp/d89a7c3bdaa3cdf23420cc4e905349f1.ts",
            "count": 333,
            "count_of_audio_streams": 6,
            "count_of_stream_of_this_kind": 1,
            "count_of_video_streams": 1,
            "duration": 6034,
            "duration_time": "00:00:06.035 (00:00:06;00)",
            "file_extension": "ts",
            "file_name": "d89a7c3bdaa3cdf23420cc4e905349f1",
            "file_size": 4012484,
            "folder_name": "/tmp",
            "format": "MPEG-TS",
            "format_extensions_usually_used": "ts m2t m2s m4t m4s tmf ts tp trp ty",
            "frame_count": 360,
            "frame_rate": 59.94,
            "internet_media_type": "video/MP2T",
            "kind_of_stream": "General",
            "overall_bit_rate": 5083438,
            "overall_bit_rate_mode": "VBR",
            "video_format_list": "AVC",
            "video_format_with_hint_list": "AVC"
        },
        "audio": {
            "bit_rate": 112000,
            "bit_rate_mode": "CBR",
            "channels": 2,
            "codec": "MPEG Audio",
            "commercial_name": "MPEG Audio",
            "compression_mode": "Lossy",
            "count": 277,
            "count_of_stream_of_this_kind": 1,
            "duration": 89808,
            "duration_time": "00:01:30:18",
            "format": "MPEG Audio",
            "format_profile": "Layer 3",
            "frame_count": 3438,
            "id": "1",
            "kind_of_stream": "Audio",
            "proportion_of_this_stream": 0.99971,
            "samples_count": 3960576,
            "sampling_rate": 44100,
            "stream_order": "1",
            "stream_size": 127325
        },
        "audio_tracks": [
            {
                "bit_rate": 112000,
                "bit_rate_mode": "CBR",
                "channels": 2,
                "codec": "MPEG Audio",
                "commercial_name": "MPEG Audio",
                "compression_mode": "Lossy",
                "count": 277,
                "count_of_stream_of_this_kind": 1,
                "duration": 89808,
                "duration_time": "00:01:30:18",
                "format": "MPEG Audio",
                "format_profile": "Layer 3",
                "frame_count": 3438,
                "id": "1",
                "kind_of_stream": "Audio",
                "proportion_of_this_stream": 0.99971,
                "samples_count": 3960576,
                "sampling_rate": 44100,
                "stream_order": "1",
                "stream_size": 127325
            },
            {
                "bit_rate": 112000,
                "bit_rate_mode": "CBR",
                "channels": 2,
                "codec": "MPEG Audio",
                "commercial_name": "MPEG Audio",
                "compression_mode": "Lossy",
                "count": 277,
                "count_of_stream_of_this_kind": 1,
                "duration": 89808,
                "duration_time": "00:01:30:18",
                "format": "MPEG Audio",
                "format_profile": "Layer 3",
                "frame_count": 3438,
                "id": "2",
                "kind_of_stream": "Audio",
                "proportion_of_this_stream": 0.99971,
                "samples_count": 3960576,
                "sampling_rate": 44100,
                "stream_order": "2",
                "stream_size": 1257325
            }
        ],
        "video": {
            "bit_depth": 8,
            "bits_pixel_frame": 0.072,
            "chroma_subsampling": "4:2:0",
            "codec": "AVC",
            "codec_id": "27",
            "color_range": "Limited",
            "color_space": "YUV",
            "colour_description_present": "Yes",
            "commercial_name": "AVC",
            "count": 377,
            "count_of_stream_of_this_kind": 1,
            "display_aspect_ratio": 1.778,
            "duration": 6006,
            "duration_time": "00:00:06.006 (00:00:06;00)",
            "format": "AVC",
            "format_info": "Advanced Video Codec",
            "format_profile": "High@L4.1",
            "format_settings": "CABAC / 2 Ref Frames",
            "format_settings_cabac": "Yes",
            "format_url": "http://developers.videolan.org/x264.html",
            "frame_count": 360,
            "frame_rate": 59.94,
            "height": 720,
            "id": 481,
            "internet_media_type": "video/H264",
            "kind_of_stream": "Video",
            "pixel_aspect_ratio": 1,
            "scan_type": "Progressive",
            "stream_order": "0-0",
            "width": 1280
        },
        "image": {
            "bit_depth": 8,
            "chroma_subsampling": "4:4:4",
            "codec": "JPEG",
            "color_space": "YUV",
            "commercial_name": "JPEG",
            "compression_mode": "Lossy",
            "count": 125,
            "count_of_stream_of_this_kind": 1,
            "format": "JPEG",
            "height": 680,
            "internet_media_type": "image/jpeg",
            "kind_of_stream": "Image",
            "proportion_of_this_stream": 1,
            "stream_size": 424430,
            "width": 1024
        }
    },
    "pdf": {
        "title": "Microsoft Word - Backup4all_network_backup_solution.doc",
        "subject": "",
        "keywords": "",
        "author": "Administrator",
        "creator": "Microsoft Word - Backup4all_network_backup_solution.doc",
        "producer": "novaPDF Professional Server Ver 5.4 Build 260 (Windows XP  x32)",
        "creation_date": "2008-05-26T09:02:00Z",
        "mod_date": "0001-01-01T00:00:00Z",
        "pages": 4,
        "javascript": false,
        "encrypted": true,
        "password_protected": false,
        "page_size": "612 x 792 pts (letter)",
        "optimized": false,
        "pdf_version": 1.4,
        "page_rotation": 0,
        "tagged": false,
        "form": false
    }
}

Response codes:

200 (StatusOK) - Success.
404 (StatusNotFound) - Item not found.
500 (StatusInternalServerError) - An unexpected error occurred.

Identifying Items

To determine the Wasabi AiR ID and Stow URL for an item, you need to know the location ID, container ID, and identifier for the item within the container. For more information about item IDs, see the Stow project.

You can make the following request:

POST /api/control/item-id
{
	"location_id": "abc123",
	"container_id": "MyContainer",
	"item_id": "MyItem"
}

location_id - (string) The Wasabi AiR location ID that indicates which storage location the item is in.
container_id - (string) The container ID where the item is located (usually the bucket name).
item_id - (string) The identifier of the item (usually its name within the storage).

Response

Provided that the location, container, and item values are all valid, you are given the following response:

{
	"stow_url": "s3://unique/url/to/item",
	"gm_item_id": "67779468b22af637e2dd6a2616264b6c"
}

stow_url - (string) The Stow URL for the item.
gm_item_id - (string) The internal Wasabi AiR ID for this item.

The item does not need to have been harvested for the ID to be returned. Once the item is harvested, its ID matches the gm_item_id returned here.

Once you have obtained the identifiers for an item, you can use them in the Harvest API.
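
The request and response above can be sketched in Python as follows. This is a minimal illustration, not an official client: the host name, the helper names, and the Content-Type header are assumptions, and authentication (not covered in this section) is left out. The request is only built, not sent.

```python
import json
import urllib.request

# Hypothetical host; substitute your own Wasabi AiR deployment.
BASE_URL = "https://air.example.com"


def build_item_id_request(location_id, container_id, item_id, base_url=BASE_URL):
    """Build (but do not send) the POST /api/control/item-id request."""
    body = json.dumps({
        "location_id": location_id,
        "container_id": container_id,
        "item_id": item_id,
    }).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/api/control/item-id",
        data=body,
        method="POST",
        # Assumed header; the source does not state the required content type.
        headers={"Content-Type": "application/json"},
    )


def parse_item_id_response(raw):
    """Return (stow_url, gm_item_id) from the JSON response body."""
    doc = json.loads(raw)
    return doc["stow_url"], doc["gm_item_id"]


req = build_item_id_request("abc123", "MyContainer", "MyItem")
```

To send the request, pass req to urllib.request.urlopen and hand the response body to parse_item_id_response.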

Getting a List of Item Captions

Get a list of captions for an item:

GET /api/data/v3/items/{id}/captions

Response

A successful call returns a Status OK (200) with the following response body:

{
  "captions": [
    {
      "id": "c57149e1f0b9387294e1f5efe6cb1ef0",
      "item_id": "71ab3889e1c559865ed6bce99b349d4f",
      "source": "captions",
      "language": {
        "code": "eng",
        "confidence": 1
      }
    }
  ]
}

If an unexpected error occurs while fulfilling the request, a Status Internal Server Error (500) is returned.
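
Each caption ID in this response is the item-captions-id needed to fetch the caption text. A small Python sketch of pulling those IDs out of the response body is shown here; the function name is hypothetical and the code assumes only the fields shown in the sample above.

```python
import json


def caption_ids(response_body):
    """Map each caption ID to its language code (or None if no language is given)."""
    doc = json.loads(response_body)
    result = {}
    for caption in doc.get("captions") or []:
        language = caption.get("language") or {}
        result[caption["id"]] = language.get("code")
    return result
```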

Getting a List of Text in an Item Caption

Get a list of text contained in an item caption, along with possible NLP data:

GET /api/data/v3/items/{id}/caption/{item-captions-id}?mask={mask}

item-captions-id - (string) The ID value contained in the results of a captions request.
mask - (string) Optional. Masks the embedded NLP data for the caption text, which may return results faster. Set mask=nlp to omit NLP data from the response.

Response

A successful call returns a Status OK (200) with the following response body:

{
  "caption": [
    {
      "id": "f27f081531c42d304c285dc7306f29e7",
      "item_captions_id": "c57149e1f0b9387294e1f5efe6cb1ef0",
      "start_at": 0.03,
      "end_at": 4.92,
      "text": "Mr. Jones will speak now",
      "nlp_properties": {
        "entities": [
          {
            "text": "Mr. Jones",
            "confidence": 0.9995918273925781,
            "type": "person"
          }
        ],
        "key_phrases": [
          {
            "text": "Mr. Jones",
            "confidence": 0.9994778037071228
          }
        ],
        "sentiment": {
          "text": "neutral",
          "sentiment_confidence": {
            "Mixed": 0.014400332234799862,
            "Negative": 0.10051420331001282,
            "Neutral": 0.8749107718467712,
            "Positive": 0.010174600407481194
          }
        },
        "language": {
          "language": "en",
          "confidence": 0.9737588763237
        }
      }
    }
  ]
}

If an unexpected error occurs while fulfilling the request, a Status Internal Server Error (500) is returned.
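
As an illustration of working with the nlp_properties block, the following Python sketch lists the entities whose confidence meets a threshold. The function name and the default threshold are hypothetical. The code tolerates a missing nlp_properties field, on the assumption that this is how a response requested with mask=nlp looks.

```python
def confident_entities(caption_doc, threshold=0.9):
    """Return (start_at, text, type) for each entity at or above the threshold."""
    found = []
    for entry in caption_doc.get("caption") or []:
        # Assumed: nlp_properties is absent or null when mask=nlp is used.
        nlp = entry.get("nlp_properties") or {}
        for entity in nlp.get("entities") or []:
            if entity["confidence"] >= threshold:
                found.append((entry["start_at"], entity["text"], entity["type"]))
    return found
```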

Item Descriptions

Getting Descriptions for an Item

GET /api/data/v3/items/{id}/descriptions?page-token={page_token}&start={start}&window={window}&all={all}

page_token - The next page token provided to page the results. When provided, a next page token is all that is needed to retrieve the next page of results. If page_token is set along with other query parameters, the page_token takes precedence.
start - The time (video) or page (documents) at which to start retrieving results for a given item. This has no effect on an IMG item.
window - The time (video) or page (documents) at which to stop retrieving results for a given item. This has no effect on an IMG item.
all - Provides all entries for an item regardless of whether or not the description data is present.

If none are set, the full collection is returned without pagination. A valid page_token can be used without the addition of all, limit, and offset.

Response

A successful call returns a Status OK (200) with the following response body.

For an asset that is an IMG:

{
    "contents": {
        "img": {
            "description": {
                "id": "b2fe0c8e0ee298eb07c3c5dce457c907",
                "item_id": "fe154badd7b2d78349c214938f27547c",
                "confidence": 0.443617173003186,
                "language": {
                         "language": "en-US",
                         "confidence": 0.8
                },
                "text": "a drawing of a face"
            }
        },
        "pages": null,
        "video_frames": null
    },
    "next_page": ""
}

For an asset that is a video:

{
    "contents": {
        "img": null,
        "pages": null,
        "video_frames": [
            {
                "description": {
                    "id": "d4acd831bf8d56e6a6e1fcb054228f29",
                    "item_id": "dddcbe782d810eb80f531361b5799a53",
                    "confidence": 0.7532465814725752,
                    "language": {
                        "code": "en",
                        "confidence": 0.96
                    },
                    "text": "a close up of a person"
                },
                "frame_id": "59ad92a0ed0d873de84d2cc2bd080898",
                "thumbnail_path": "video_main_frames/frame-0000000000.jpg",
                "time": 0
            },
            {
                "description": {
                    "id": "8a7f430ebb24d621021e2014ee05c6eb",
                    "item_id": "dddcbe782d810eb80f531361b5799a53",
                    "confidence": 0.2997128373277878,
                    "language": null,
                    "text": "a close up of a man with smoke coming out of it"
                },
                "frame_id": "677deeef7865c9e1b0bb497164aeca50",
                "thumbnail_path": "video_main_frames/frame-0000000001.jpg",
                "time": 2
            },
            {
                "description": {
                    "id": "ff31ffc38b95e420dfca366ef02b550d",
                    "item_id": "dddcbe782d810eb80f531361b5799a53",
                    "confidence": 0.8331186858981754,
                    "language": null,
                    "text": "a blurry image of smoke"
                },
                "frame_id": "b767a35a35e5a873a109f2d3b4df5ec2",
                "thumbnail_path": "video_main_frames/frame-0000000002.jpg",
                "time": 4
            }
        ]
    },
    "next_page": "NextPageTokenString"
}

For an asset that is a document:

{
    "contents": {
        "img": null,
        "pages": [
            {
                "images": [
                    {
                        "description": {
                            "id": "4c51651ee15c7ce414ae381bdc252622",
                            "item_id": "0ca4a8e17b66d3946f611621936896c1",
                            "confidence": 0.7192287180775676,
                            "language": {
                                "code": "en",
                                "confidence": 0.86
                            },
                            "text": "a man standing in front of a mirror posing for the camera"
                        },
                        "image_id": "24235782fb645ba35c6410617f8c3527",
                        "image_index": 0,
                        "thumbnail_path": "document_pages/thumb-pg-00000-img-00000.png"
                    }
                ],
                "page": 0,
                "description": "optionally, a page can have a description as well, or it can be embedded in the images within the page",
                "thumbnail_path": "optionally, a page can have a thumbnail path, as can each embedded image",
                "page_id": "a UUID to identify the page; optional, available when the all query parameter is set"
            },
            {
                "images": [
                    {
                        "description": {
                            "id": "62a0967c75af0b1f8ba65edd7b287929",
                            "item_id": "0ca4a8e17b66d3946f611621936896c1",
                            "confidence": 0.9312026737395315,
                            "language": null,
                            "text": "Robb Wells, John Paul Tremblay that are looking at the camera"
                        },
                        "image_id": "5fd88f4121c499aa04b9f77fa59e7788",
                        "image_index": 0,
                        "thumbnail_path": "document_pages/thumb-pg-00001-img-00000.png"
                    }
                ],
                "page": 1
            }
        ],
        "video_frames": null
    },
    "next_page": "NextPageTokenString"
}

If start and window are provided, the results may be paginated. The next page token provides a stringified token that can be used to retrieve the next page of results. If the next page token is an empty string, there are no more results.

If an unexpected error occurs while fulfilling the request, a Status Internal Server Error (500) is returned. If the item ID does not match an existing item, a Status Not Found (404) is returned.
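
The paging behavior described above can be sketched in Python as a loop that follows next_page until it is an empty string. This is an illustration under stated assumptions: fetch is a hypothetical caller-supplied function that takes a request path and returns the decoded JSON response as a dict, so the sketch stays independent of any HTTP library or authentication scheme.

```python
from urllib.parse import urlencode


def iter_description_pages(fetch, item_id, start=None, window=None):
    """Yield the "contents" of each page until next_page is an empty string."""
    params = {}
    if start is not None:
        params["start"] = start
    if window is not None:
        params["window"] = window
    while True:
        query = "?" + urlencode(params) if params else ""
        page = fetch(f"/api/data/v3/items/{item_id}/descriptions{query}")
        yield page["contents"]
        token = page.get("next_page", "")
        if not token:
            break
        # The page token takes precedence over other parameters,
        # so it is sent on its own for every following request.
        params = {"page-token": token}
```

Because the token alone is enough to retrieve the next page, the loop drops start and window after the first request.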

Curating a Description for an Item

To create a curated description, you can post to this endpoint with a valid request body:

POST /api/data/v3/items/{id}/descriptions
{
	"segment_index": float64,
	"image_index": int,
	"item_type": ENUM["image" | "video" | "document"],
	"text": string
}

item_type - The type of item.
segment_index - Set to -1 if the item is an image.
image_index - Set to -1 if the item is an image or a video.
text - Must not be an empty string.

Response

A successful call returns a Status Created (201) with the following response body, which includes the metadata associated with that segment or image index:

{
	"description": {
		"id": string,
		"item_id": string,
		"confidence": float64,
		"language": string,
		"language_confidence": float64,
		"text": string
	}
}

If there is a conflict with the segment or image index, a Status Unprocessable Entity (422) is returned. If an unexpected error occurs while fulfilling the request, a Status Internal Server Error (500) is returned.
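
The field rules for the request body can be captured in a small Python helper. This is a sketch only: the function name is hypothetical, and forcing the indexes to -1 (rather than rejecting other values) is a design choice made for the example, not documented server behavior.

```python
ITEM_TYPES = ("image", "video", "document")


def build_curation_body(item_type, text, segment_index=-1, image_index=-1):
    """Return a request body that follows the documented field rules."""
    if item_type not in ITEM_TYPES:
        raise ValueError(f"item_type must be one of {ITEM_TYPES}")
    if not text:
        raise ValueError("text must not be an empty string")
    if item_type == "image":
        segment_index = -1  # images have no segment
    if item_type in ("image", "video"):
        image_index = -1  # only documents carry an image index
    return {
        "segment_index": float(segment_index),
        "image_index": int(image_index),
        "item_type": item_type,
        "text": text,
    }
```

For example, build_curation_body("video", "a close up of a person", segment_index=4) produces a body with segment_index 4.0 and image_index -1.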

Editing a Description

To edit an existing description text, use the following request:

PATCH /api/data/v3/items/{id}/descriptions/{desc_id}

{
	"text": "new description text"
}

id - The item ID.
desc_id - The description ID.
text - An empty string is allowed.

Response

A successful call returns a Status OK (200) with the updated description:

{
	"description": {
		"id": string,
		"item_id": string,
		"confidence": float64,
		"language": string,
		"language_confidence": float64,
		"text": string
	}
}

If no description with the given desc_id is found, a Status Not Found (404) is returned. If an unexpected error occurs while fulfilling the request, a Status Internal Server Error (500) is returned.
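
A Python sketch of the edit request and its documented status codes follows. The host name, helper names, and Content-Type header are assumptions, authentication is omitted, and the request is only built, not sent.

```python
import json
import urllib.request

# Hypothetical host; substitute your own deployment.
BASE_URL = "https://air.example.com"


def build_edit_description_request(item_id, desc_id, text, base_url=BASE_URL):
    """Build (but do not send) the PATCH request that replaces a description's text."""
    return urllib.request.Request(
        url=f"{base_url}/api/data/v3/items/{item_id}/descriptions/{desc_id}",
        data=json.dumps({"text": text}).encode("utf-8"),
        method="PATCH",
        # Assumed header; the source does not state the required content type.
        headers={"Content-Type": "application/json"},
    )


def explain_status(code):
    """Translate the documented status codes into a short message."""
    return {
        200: "updated",
        404: "description not found",
        500: "unexpected server error",
    }.get(code, "undocumented status")
```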

Item OCRs

Getting OCRs for an Item

GET /api/data/v3/items/{id}/ocrs?page-token={page_token}&start={start}&window={window}&all={all}

page_token - The next page token provided to page the results. If page_token is set along with other query parameters, the page_token takes precedence.
start - The time (video) or page (documents) at which to start retrieving results for a given item. This has no effect on an IMG item.
window - The time (video) or page (documents) at which to stop retrieving results for a given item. This has no effect on an IMG item.
all - Provides all entries for an item regardless of whether or not the OCR data is present.

If none are set, the full collection is returned without pagination. A valid page_token can be used without the addition of all, limit, and offset.

Response

A successful call returns a Status OK (200) with the following response body.

For an asset that is an IMG:

{
    "contents": {
        "img": {
            "ocrs": [
                {
                    "id": "0a4877c31bd094a32a124a1c5571f751",
                    "item_id": "c21b3e9fe1fbe09814d6a7bfdf5ba313",
                    "bounding_box": {
                        "top": 441,
                        "left": 220,
                        "width": 52,
                        "height": 16
                    },
                    "confidence": 0,
                    "language": null,
                    "text": "adidas",
                    "text_type": "lines"
                },
                {
                    "id": "1afb7fa88d193d5d9ae766da04e1517f",
                    "item_id": "c21b3e9fe1fbe09814d6a7bfdf5ba313",
                    "bounding_box": {
                        "top": 462,
                        "left": 295,
                        "width": 86,
                        "height": 15
                    },
                    "confidence": 0,
                    "language": null,
                    "text": "SNALMUNCH 2012",
                    "text_type": "lines"
                },
                {
                    "id": "f4ee60381a3cd69a0787848b71255b62",
                    "item_id": "c21b3e9fe1fbe09814d6a7bfdf5ba313",
                    "bounding_box": {
                        "top": 497,
                        "left": 222,
                        "width": 243,
                        "height": 53
                    },
                    "confidence": 0,
                    "language": null,
                    "text": "SAMSUNG",
                    "text_type": "lines"
                }
            ]
        },
        "pages": null,
        "video_frames": null
    },
    "next_page": ""
}

For an asset that is a video:

{
    "contents": {
        "img": null,
        "pages": null,
        "video_frames": [
            {
                "frame_id": "5248fd3acceab63163a5bcc5ddc15d62",
                "ocrs": [
                    {
                        "id": "e45aa09fd7d6121e474459e59e8a7d4c",
                        "item_id": "3b103330378acb3ad604045ba1f4aecd",
                        "bounding_box": {
                            "top": 12,
                            "left": 14,
                            "width": 169,
                            "height": 11
                        },
                        "confidence": 0.87,
                        "language": null,
                        "text": "HIT THAT LIKE BUTTON, NATION!",
                        "text_type": "lines"
                    }
                ],
                "thumbnail_path": "video_main_frames/frame-0000000001.jpg",
                "time": 2.002
            },
            {
                "frame_id": "c5501c083166450e9885782aac29fad8",
                "ocrs": [
                    {
                        "id": "7a5d590df38aa2494a1a854640ea6b47",
                        "item_id": "3b103330378acb3ad604045ba1f4aecd",
                        "bounding_box": {
                            "top": 19,
                            "left": 249,
                            "width": 33,
                            "height": 16
                        },
                        "confidence": 0,
                        "language": null,
                        "text": "BEA",
                        "text_type": "lines"
                    },
                    {
                        "id": "8485e7771e8f0ecdc871cf8856554be1",
                        "item_id": "3b103330378acb3ad604045ba1f4aecd",
                        "bounding_box": {
                            "top": 12,
                            "left": 14,
                            "width": 169,
                            "height": 11
                        },
                        "confidence": 0,
                        "language": null,
                        "text": "HIT THAT LIKE BUTTON, NATION!",
                        "text_type": "lines"
                    }
                ],
                "thumbnail_path": "video_main_frames/frame-0000000002.jpg",
                "time": 4.004
            }
        ]
    },
    "next_page": "NP-BAwEBBVRva2VuAf-CAAEDAQVMaW1pdAEEAAEGT2Zmc2V0AQQAAQZQYXJhbXMB_4QAAAAh_4MEAQERbWFwW3N0cmluZ11zdHJpbmcB_4QAAQwBDAAAKP-CAwMDYWxsAAVzdGFydA0tMTExMTEwMC45OTk5BndpbmRvdwIxMAA="
}

For an asset that is a document:

{
    "contents": {
        "img": null,
        "pages": [
            {
                "images": [
                    {
                        "image_id": "daea649e947b430acceccc089f653c3f",
                        "image_index": 0,
                        "ocrs": [
                            {
                                "id": "095a5ca47bda4e0a9db562e23070125c",
                                "item_id": "43b89caeb2fd3c23f25b8431e644cabc",
                                "bounding_box": {
                                    "top": 15,
                                    "left": 522,
                                    "width": 217,
                                    "height": 24
                                },
                                "confidence": 0,
                                "language": null,
                                "text": "Pilgrim Programming, LLC",
                                "text_type": "lines"
                            }
                        ],
                        "thumbnail_path": "document_pages/thumb-pg-00000-img-00000.png"
                    }
                ],
                "page": 0,
                "ocrs": "optionally, a page can have OCR as well, or it can be embedded in the images within the page",
                "thumbnail_path": "optionally, a page can have a thumbnail path, as can each embedded image",
                "page_id": "a UUID to identify the page; optional, available when the all query parameter is set"
            },
            {
                "images": [
                    {
                        "image_id": "18061410280b001f26f34890d0da6b04",
                        "image_index": 0,
                        "ocrs": [
                            {
                                "id": "2b6d00e3d4d014267a011d64d50144ff",
                                "item_id": "43b89caeb2fd3c23f25b8431e644cabc",
                                "bounding_box": {
                                    "top": 1194,
                                    "left": 270,
                                    "width": 817,
                                    "height": 23
                                },
                                "confidence": 0,
                                "language": null,
                                "text": "There are no liens , claims or encumbrances which might conflict with or otherwise affect",
                                "text_type": "lines"
                            }
                        ],
                        "thumbnail_path": "document_pages/thumb-pg-00001-img-00000.png"
                    }
                ],
                "page": 1
            }
        ],
        "video_frames": null
    },
    "next_page": "NP-BAwEBBVRva2VuAf-CAAEDAQVMaW1pdAEEAAEGT2Zmc2V0AQQAAQZQYXJhbXMB_4QAAAAh_4MEAQERbWFwW3N0cmluZ11zdHJpbmcB_4QAAQwBDAAAJ_-CAwMDYWxsAAVzdGFydA0tMTExMTEwOS45OTk5BndpbmRvdwExAA=="
}
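
Because the OCR entries sit at a different depth for images, videos, and documents, a client usually needs to walk all three shapes. The Python sketch below collects the text values from the "contents" object of any of the responses above. The function name is hypothetical, and optional page-level OCR data is deliberately ignored because its structure is not shown here.

```python
def ocr_texts(contents):
    """Collect OCR text from the img, video_frames, and pages shapes."""
    texts = []
    img = contents.get("img")
    if img:
        texts += [ocr["text"] for ocr in img.get("ocrs") or []]
    for frame in contents.get("video_frames") or []:
        texts += [ocr["text"] for ocr in frame.get("ocrs") or []]
    for page in contents.get("pages") or []:
        # Only image-level OCR is read; page-level "ocrs" is skipped.
        for image in page.get("images") or []:
            texts += [ocr["text"] for ocr in image.get("ocrs") or []]
    return texts
```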

Curating an OCR for an Item

To create a curated OCR, you can post to the following endpoint with a valid request body.

POST /api/data/v3/items/{id}/ocrs
{
	"segment_index": float64,
	"image_index": int,
	"item_type": ENUM["image" | "video" | "document"],
	"text": string
}

item_type - The type of item.
segment_index - Set to -1 if the item is an image.
image_index - Set to -1 if the item is an image or a video.
text - Must not be an empty string.

Response

A successful call returns a Status Created (201) with the following response body, which includes the metadata associated with that segment or image index:

{
	"ocr": {
		"id": string,
		"item_id": string,
		"bounding_box": {
			"top": int,
			"left": int,
			"width": int,
			"height": int
		},
		"confidence": float64,
		"language": string,
		"language_confidence": float64,
		"text": string,
		"text_type": string
	}
}

Editing an OCR

To edit an existing OCR text, use the following request:

PATCH /api/data/v3/items/{id}/ocrs/{ocr_id}
{
	"text": "new ocr text"
}

id - The item ID.
ocr_id - The OCR ID.
text - An empty string is allowed.

Response

A successful call returns a Status OK (200) with the updated OCR:

{
	"ocr": {
		"id": string,
		"item_id": string,
		"bounding_box": {
			"top": int,
			"left": int,
			"width": int,
			"height": int
		},
		"confidence": float64,
		"language": string,
		"language_confidence": float64,
		"text": string,
		"text_type": string
	}
}

If no OCR with the given ocr_id is found, a Status Not Found (404) is returned. If an unexpected error occurs while fulfilling the request, a Status Internal Server Error (500) is returned.

Deleting an OCR

To delete an existing OCR, use the following request:

DELETE /api/data/v3/items/{id}/ocrs/{ocr_id}

id - The item ID.
ocr_id - The OCR ID.

Response

A successful call returns a Status OK (200) with the ID of the deleted OCR:

{
	"id": "5218778fd97e4960ebfe40529985fc17"
}

Item Speech to Texts

To get speech to texts for an item:

GET /api/data/v3/items/{id}/speech-to-texts?mask={mask}

mask - (string) Optional. Masks the embedded NLP data for a speech-to-text (STT) result, which may return results faster. Set mask=nlp to omit NLP data from the response.

Response

A successful call returns a Status OK (200) with the following response body:

{
    "transcripts": [
        {
            "source": "amazon_transcribe",
            "track": 2,
            "transcript": [
                {
                    "id": "f83e777b8bf149438d61109a5d9dbf6f",
                    "item_id": "8447caddb4c1501a291bf343d6886586",
                    "start_at": 0,
                    "end_at": 10.05,
                    "text": "Wiggle room Small additions to Cuba with yourself So look at my harvest one file here I have a number of different",
                    "language": null,
                    "nlp_properties": {
                        "entities": [
                            {
                                "text": "Cuba",
                                "confidence": 0.9795709848403931,
                                "type": "location"
                            },
                            {
                                "text": "one file",
                                "confidence": 0.6889344453811646,
                                "type": "quantity"
                            }
                        ],
                        "key_phrases": [
                            {
                                "text": "Wiggle room Small additions",
                                "confidence": 0.7353704571723938
                            },
                            {
                                "text": "Cuba",
                                "confidence": 0.9996484518051147
                            },
                            {
                                "text": "my harvest one file",
                                "confidence": 0.8326694965362549
                            },
                            {
                                "text": "a number",
                                "confidence": 0.9973674416542053
                            }
                        ],
                        "sentiment": {
                            "text": "neutral",
                            "sentiment_confidence": {
                                "Mixed": 0.004104138817638159,
                                "Negative": 0.012511652894318104,
                                "Neutral": 0.934111475944519,
                                "Positive": 0.04927277937531471
                            }
                        },
                        "language": {
                            "language": "en",
                            "confidence": 0.9973103404045105
                        }
                    }
                },
                ...
            ]
        }
    ]
}

The next page token provides a stringified token that can be used to retrieve the next page of results. If the next page token is an empty string, there are no more results.

If an unexpected error occurs while fulfilling the request, a Status Internal Server Error (500) is returned.
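
To turn the segmented response into readable text, the segments of each transcript can be ordered by start_at and joined. The Python sketch below does this per source and track. The function name is hypothetical, and the code relies only on the fields shown in the sample response.

```python
def transcript_text_by_track(doc):
    """Return {(source, track): full text} with segments ordered by start_at."""
    result = {}
    for entry in doc.get("transcripts") or []:
        segments = sorted(entry.get("transcript") or [], key=lambda s: s["start_at"])
        result[(entry["source"], entry["track"])] = " ".join(s["text"] for s in segments)
    return result
```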

Downloading a Subtitles File for an Item

GET /api/data/v3/items/{id}/speech-to-texts/downloads?format={format}&source={source}

format - Optional. Either vtt or srt. Generates the downloaded file in the specified format. Defaults to srt if not specified.
source - Optional. If specified, it looks specifically for the given source. If not specified, it traverses all possible sources looking for a transcript.

Response

A successful call returns a Status OK (200) with the file in the body of the response.

If any unexpected errors occurred in the process of fulfilling this request or response cycle, a Status Internal Server Error (500) is returned.

If the source parameter is specified and there are no entries from the given source, a Not Found (404) error is returned.

If no source parameter is specified and a transcript cannot be found for any source, a Not Found (404) error is returned.
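As a minimal sketch of how the download URL could be assembled, the helper below leaves out optional parameters that are not set, so the server applies its defaults. The host name and the function name subtitles_url are assumptions made for illustration; only the path and the format and source parameters come from this section.

```python
from urllib.parse import quote, urlencode

BASE_URL = "https://example.invalid"  # hypothetical host, replace with your own


def subtitles_url(item_id, fmt=None, source=None, base_url=BASE_URL):
    """Build the subtitles download URL, leaving out unset optional parameters."""
    if fmt is not None and fmt not in ("vtt", "srt"):
        raise ValueError("format must be 'vtt' or 'srt'")
    params = {}
    if fmt is not None:
        params["format"] = fmt
    if source is not None:
        params["source"] = source
    path = f"/api/data/v3/items/{quote(item_id, safe='')}/speech-to-texts/downloads"
    query = urlencode(params)
    return f"{base_url}{path}?{query}" if query else f"{base_url}{path}"
```

A caller would issue a GET against the returned URL and treat a 404 as "no transcript available" rather than as a failure of the request itself.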

Getting Item Thumbnails

To get thumbnails for an item:

GET /api/data/v3/items/{id}/thumbnails

Response

A successful call returns a Status OK (200) with the following response body.

For an asset that is a video:

{
    "contents": {
        "thumbnail": {
            "path": "thumbnailer/sprite.jpg",
            "type": "sprite",
            "frame_count": 30,
            "height": 152,
            "width": 270
        },
        "video_frames": [
            {
                "time": 1,
                "frame_id": "16d63b1c711727218a44e4c0a8d43a20",
                "thumbnail": "video_main_frames/frame-0000000000.jpg"
            },
            {
                "time": 2,
                "frame_id": "0b9d3ac13e51b621d535f812bcdd45fb",
                "thumbnail": "video_main_frames/frame-0000000001.jpg"
            },
            ...
        ]
    }
}

For an asset that is a document:

{
    "contents": {
        "thumbnail": {
            "path": "thumbnailer/thumb.png",
            "type": "image",
            "frame_count": 0,
            "height": 152,
            "width": 270
        },
        "pages": [
            {
                "page": 0,
                "page_id": "28e5fa9e0b736baa7f2f7843a024adc9",
                "thumbnail_path": "document_pages/thumb-pg-00000.png",
                "images": [
                    {
                        "image_index": 0,
                        "image_id": "907b26c326a53e67f32f1cbf8ccbba54",
                        "thumbnail_path": "document_pages/thumb-pg-00000-img-00000.png"
                    }
                ]
            },
            {
                "page": 1,
                "page_id": "6b3446f091ad62dc1ea90e6b619666f8",
                "thumbnail_path": "document_pages/thumb-pg-00001.png",
                "images": [
                    {
                        "image_index": 0,
                        "image_id": "09ca78c5e94ed7cb328bc2980aacacfe",
                        "thumbnail_path": "document_pages/thumb-pg-00001-img-00000.png"
                    }
                ]
            },
            ...
        ]
    }
}

The images field is populated when embedded images are detected within the document, and is nil when none are detected.

For all other assets:

{
    "contents": {
        "thumbnail": {
            "path": "thumbnailer/thumb.jpg",
            "type": "image",
            "frame_count": 0,
            "height": 152,
            "width": 270
        }
    }
}

If the thumbnail field is " ", the asset did not have a thumbnail created. This happens with text, caption, and archive files.

If any unexpected errors occurred in the process of fulfilling this request or response cycle, a Status Internal Server Error (500) is returned.
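Because the response shape differs by asset type, client code may want one routine that collects every thumbnail path regardless of shape. The sketch below assumes the response has already been parsed into a Python dict and uses only the field names shown in the samples above; the function name thumbnail_paths is hypothetical.

```python
def thumbnail_paths(response):
    """Collect every thumbnail path from a parsed thumbnails response."""
    contents = response.get("contents") or {}
    paths = []

    # Item-level thumbnail, present for all three asset types.
    thumb = contents.get("thumbnail")
    if isinstance(thumb, dict) and thumb.get("path"):
        paths.append(thumb["path"])

    # Video assets: one thumbnail per frame.
    for frame in contents.get("video_frames") or []:
        if frame.get("thumbnail"):
            paths.append(frame["thumbnail"])

    # Document assets: one thumbnail per page, plus any embedded images.
    for page in contents.get("pages") or []:
        if page.get("thumbnail_path"):
            paths.append(page["thumbnail_path"])
        for image in page.get("images") or []:
            if image.get("thumbnail_path"):
                paths.append(image["thumbnail_path"])
    return paths
```

Using `or []` guards against fields that are missing or null, which matters for documents without embedded images.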

Updating a Speech to Text Entry

PATCH /api/data/v3/items/{item_id}/speech-to-texts/{s2t_id}
{
    "text": "my new speech to text"
}

Response

This returns the updated speech to text entry. Otherwise, an error is returned:

Status Not Found (404) if passing an invalid item_id or s2t_id.
Status Unprocessable Entity (422) if there is a validation error.
Status Internal Server Error (500) if some other error occurs.
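The following sketch shows one way the PATCH request could be constructed with the Python standard library. The base URL, the Content-Type header, and the function names are assumptions; authentication is omitted because this section does not describe it.

```python
import json
import urllib.request

# Error statuses listed above, mapped to short descriptions.
UPDATE_ERRORS = {
    404: "invalid item_id or s2t_id",
    422: "validation error",
    500: "other server error",
}


def build_update_request(base_url, item_id, s2t_id, text):
    """Build (but do not send) the PATCH request for a speech to text entry."""
    url = f"{base_url}/api/data/v3/items/{item_id}/speech-to-texts/{s2t_id}"
    body = json.dumps({"text": text}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},  # assumed, not stated in the source
        method="PATCH",
    )


def describe_error(status):
    """Return a short description for a documented error status."""
    return UPDATE_ERRORS.get(status, "unexpected status")
```

Passing the request to urllib.request.urlopen sends it; a urllib.error.HTTPError raised there carries the status in its code attribute, which can be passed to describe_error.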

Deleting a Speech to Text Entry

DELETE /api/data/v3/items/{item_id}/speech-to-texts/{s2t_id}

Response

Upon success, a 204 No Content is returned. Otherwise, an error is returned:

Status Not Found (404) if passing an invalid item_id or s2t_id.
Status Internal Server Error (500) if some other error occurs.

Getting Item Custom Tags

To get custom tags for an item:

GET /api/data/v3/items/{id}/customtags/amazonrek?page-token={page_token}&start={start}&window={window}&all={all}

page_token - The next page token provided to page the results. When provided, a next page token is all that is needed to retrieve the next page of results. If page_token is set along with other query parameters, the page_token takes precedence.
start - The time (video) or page (documents) to indicate where to start retrieving results for a given item. This has no effect on an IMG item.
window - The time (video) or page (documents) to indicate where to end retrieving results for a given item. This has no effect on an IMG item.
all - Provides all entries for an item regardless of whether or not the custom tag data is present.

If none are set, the full collection is returned without pagination. A valid page_token can be used without the addition of all, limit, and offset.

Response

A successful call returns a Status OK (200) with the following response body.

For an asset that is an IMG:

{
    "contents": {
        "img": {
            "custom_tags": [
                {
                    "id": "8abdb55f697f0a28a8264f3f0a320d09",
                    "confidence": 0.851,
                    "name": "Les Paul",
                    "bounding_box": {
                        "top": 1.1,
                        "left": 2.2,
                        "width": 3.3,
                        "height": 4.4
                    }
                }
            ]
        }
    },
    "next_page_token": ""
}

For an asset that is a video:

{
    "contents": {
        "video_frames": [
            {
                "time": 4.004,
                "frame_id": "adb7e0d8787d2da396ee6e79ab80b0b0",
                "thumbnail": "video_main_frames/frame-0000000002.jpg",
                "custom_tags": [
                    {
                        "id": "cb76d5633637c0d060026adaf87a2804",
                        "confidence": 0.8071,
                        "name": "Les Paul",
                        "bounding_box": {
                            "top": 1.1,
                            "left": 2.2,
                            "width": 3.3,
                            "height": 4.4
                        }
                    }
                ]
            },
            {
                "time": 8.008,
                "frame_id": "8e9ebfb5b7fa100a636fe1aeef01ea5f",
                "thumbnail": "video_main_frames/frame-0000000004.jpg",
                "custom_tags": [
                    {
                        "id": "a7be4458334a717525a902ce4df2f358",
                        "confidence": 0.81195,
                        "name": "Stratocaster",
                        "bounding_box": {
                            "top": 1.1,
                            "left": 2.2,
                            "width": 3.3,
                            "height": 4.4
                        }
                    }
                ]
            }
        ]
    },
    "next_page_token": "NP-BAwEBBVRva2VuAf-CAAEDAQVMaW1pdAEEAAEGT2Zmc2V0AQQAAQZQYXJhbXMB_4QAAAAh_4MEAQERbWFwW3N0cmluZ11zdHJpbmcB_4QAAQwBDAAAIv-CAwMFc3RhcnQHMzYuMDAwMQZ3aW5kb3cCMTYDYWxsAAA="
}

For an asset that is a document:

{
    "contents": {
        "pages": [
            {
                "page": 22,
                "images": [
                    {
                        "image_index": 11,
                        "image_id": "7f6d4b027ba6cf303a3cd4108e99b866",
                        "thumbnail_path": "document_pages/thumb-pg-00022-img-00011.png",
                        "custom_tags": [
                            {
                                "id": "72157a50ea72d788ca102171639a3f45",
                                "confidence": 0.8273,
                                "name": "Gibson SG",
                                "bounding_box": {
                                    "top": 1.1,
                                    "left": 2.2,
                                    "width": 3.3,
                                    "height": 4.4
                                }
                            }
                        ]
                    }
                ]
            },
            {
                "page": 34,
                "images": [
                    {
                        "image_index": 0,
                        "image_id": "ebf1c28240735ad2b63a348a9f6671ee",
                        "thumbnail_path": "document_pages/thumb-pg-00034-img-00000.png",
                        "custom_tags": [
                            {
                                "id": "628cd34b9b13f7b5a16d19d62923ce47",
                                "confidence": 0.81385,
                                "name": "Paul Reed Smith",
                                "bounding_box": {
                                    "top": 1.1,
                                    "left": 2.2,
                                    "width": 3.3,
                                    "height": 4.4
                                }
                            }
                        ]
                    }
                ]
            }
        ]
    },
    "next_page_token": "NP-BAwEBBVRva2VuAf-CAAEDAQVMaW1pdAEEAAEGT2Zmc2V0AQQAAQZQYXJhbXMB_4QAAAAh_4MEAQERbWFwW3N0cmluZ11zdHJpbmcB_4QAAQwBDAAAIv-CAwMFc3RhcnQHMzYuMDAwMQZ3aW5kb3cCMTYDYWxsAAA="
}

The next page token provides a stringified token that can be used to retrieve the next page of results. If the next page token is an empty string, there are no more results.

If any unexpected errors occurred in the process of fulfilling this request or response cycle, a Status Internal Server Error (500) is returned.
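The paging rule above (keep requesting until the next page token is an empty string) can be sketched as a small generator. Here fetch_page is a hypothetical callable supplied by the caller: it takes a page token (empty for the first request) and returns the parsed response as a dict. The key name next_page_token follows the sample responses in this section.

```python
def iter_pages(fetch_page):
    """Yield each page of results until the next page token is empty."""
    token = ""
    while True:
        response = fetch_page(token)
        yield response
        token = response.get("next_page_token") or ""
        if not token:
            break


def collect_video_frames(fetch_page):
    """Gather the video_frames entries from every page into one list."""
    frames = []
    for response in iter_pages(fetch_page):
        contents = response.get("contents") or {}
        frames.extend(contents.get("video_frames") or [])
    return frames
```

Keeping the HTTP call behind fetch_page means the loop can be exercised with canned responses, and the same loop works for any endpoint that returns this token.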

Getting Item Logos

To get logos for an item:

GET /api/data/v3/items/{id}/logos?page-token={page_token}&start={start}&window={window}&all={all}

page_token - The next page token provided to page the results. When provided, a next page token is all that is needed to retrieve the next page of results. If page_token is set along with other query parameters, the page_token takes precedence.
start - The time (video) or page (documents) to indicate where to start retrieving results for a given item. This has no effect on an IMG item.
window - The time (video) or page (documents) to indicate where to end retrieving results for a given item. This has no effect on an IMG item.
all - Provides all entries for an item regardless of whether or not the logo data is present.

If none are set, the full collection is returned without pagination. A valid page_token can be used without the addition of all, limit, and offset.

Response

A successful call returns a Status OK (200) with the following response body.

For an asset that is an IMG:

{
    "contents": {
        "img": {
            "logos": [
                {
                    "id": "8abdb55f697f0a28a8264f3f0a320d09",
                    "confidence": 0.851,
                    "name": "Adidas",
                    "bounding_box": {
                        "top": 1083,
                        "left": 236,
                        "width": 62,
                        "height": 88
                    }
                }
            ]
        }
    },
    "next_page_token": ""
}

For an asset that is a video:

{
    "contents": {
        "video_frames": [
            {
                "time": 4.004,
                "frame_id": "adb7e0d8787d2da396ee6e79ab80b0b0",
                "thumbnail": "video_main_frames/frame-0000000002.jpg",
                "logos": [
                    {
                        "id": "cb76d5633637c0d060026adaf87a2804",
                        "confidence": 0.8071,
                        "name": "eastern connecticut state university",
                        "bounding_box": {
                            "top": 7,
                            "left": 6,
                            "width": 180,
                            "height": 21
                        }
                    }
                ]
            },
            {
                "time": 8.008,
                "frame_id": "8e9ebfb5b7fa100a636fe1aeef01ea5f",
                "thumbnail": "video_main_frames/frame-0000000004.jpg",
                "logos": [
                    {
                        "id": "a7be4458334a717525a902ce4df2f358",
                        "confidence": 0.81195,
                        "name": "eastern connecticut state university",
                        "bounding_box": {
                            "top": 7,
                            "left": 5,
                            "width": 182,
                            "height": 21
                        }
                    }
                ]
            }
        ]
    },
    "next_page_token": "NP-BAwEBBVRva2VuAf-CAAEDAQVMaW1pdAEEAAEGT2Zmc2V0AQQAAQZQYXJhbXMB_4QAAAAh_4MEAQERbWFwW3N0cmluZ11zdHJpbmcB_4QAAQwBDAAAIv-CAwMFc3RhcnQHMzYuMDAwMQZ3aW5kb3cCMTYDYWxsAAA="
}

For an asset that is a document:

{
    "contents": {
        "pages": [
            {
                "page": 22,
                "images": [
                    {
                        "image_index": 11,
                        "image_id": "7f6d4b027ba6cf303a3cd4108e99b866",
                        "thumbnail_path": "document_pages/thumb-pg-00022-img-00011.png",
                        "logos": [
                            {
                                "id": "72157a50ea72d788ca102171639a3f45",
                                "confidence": 0.8273,
                                "name": "misako",
                                "bounding_box": {
                                    "top": 0,
                                    "left": 306,
                                    "width": 1079,
                                    "height": 782
                                }
                            }
                        ]
                    }
                ]
            },
            {
                "page": 34,
                "images": [
                    {
                        "image_index": 0,
                        "image_id": "ebf1c28240735ad2b63a348a9f6671ee",
                        "thumbnail_path": "document_pages/thumb-pg-00034-img-00000.png",
                        "logos": [
                            {
                                "id": "628cd34b9b13f7b5a16d19d62923ce47",
                                "confidence": 0.81385,
                                "name": "colgate",
                                "bounding_box": {
                                    "top": 436,
                                    "left": 375,
                                    "width": 172,
                                    "height": 151
                                }
                            }
                        ]
                    }
                ]
            }
        ]
    },
    "next_page_token": "NP-BAwEBBVRva2VuAf-CAAEDAQVMaW1pdAEEAAEGT2Zmc2V0AQQAAQZQYXJhbXMB_4QAAAAh_4MEAQERbWFwW3N0cmluZ11zdHJpbmcB_4QAAQwBDAAAIv-CAwMFc3RhcnQHMzYuMDAwMQZ3aW5kb3cCMTYDYWxsAAA="
}

The next page token provides a stringified token that can be used to retrieve the next page of results. If the next page token is an empty string, there are no more results.

If any unexpected errors occurred in the process of fulfilling this request or response cycle, a Status Internal Server Error (500) is returned.
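Since logos can sit under img, video_frames, or pages depending on the asset type, a flattening helper can make the three shapes uniform. This is a sketch under the assumption that the response is already parsed into a dict; the function name and the (kind, position, detection) tuple layout are illustrative choices, not part of the API.

```python
def iter_detections(response, key="logos"):
    """Yield (kind, position, detection) for every detection in a response.

    position is None for an IMG, the frame time for a video,
    and the page number for a document.
    """
    contents = response.get("contents") or {}

    img = contents.get("img")
    if img:
        for detection in img.get(key) or []:
            yield ("img", None, detection)

    for frame in contents.get("video_frames") or []:
        for detection in frame.get(key) or []:
            yield ("video_frame", frame.get("time"), detection)

    for page in contents.get("pages") or []:
        for image in page.get("images") or []:
            for detection in image.get(key) or []:
                yield ("page", page.get("page"), detection)
```

For example, `sorted({d["name"] for _, _, d in iter_detections(response)})` lists the distinct logo names found in one response.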

Getting Item Mature Content

To get mature content categories for an item:

GET /api/data/v3/items/{id}/mature-content?page-token={page_token}&start={start}&window={window}&all={all}

page_token - The next page token provided to page the results. When provided, a next page token is all that is needed to retrieve the next page of results. If page_token is set along with other query parameters, the page_token takes precedence.
start - The time (video) or page (documents) to indicate where to start retrieving results for a given item. This has no effect on an IMG item.
window - The time (video) or page (documents) to indicate where to end retrieving results for a given item. This has no effect on an IMG item.
all - Provides all entries for an item regardless of whether or not the mature content data is present.

If none are set, the full collection is returned without pagination. A valid page_token can be used without the addition of all, limit, and offset.

Response

A successful call returns a Status OK (200) with the following response body.

For an asset that is an IMG:

{
    "contents": {
        "img": {
            "img_id": "0dc050b8997014a97a7585d57ba7a842",
            "mature_content": [
                {
                    "id": "89b0a7ceae5c157ffa2bb609112b13ac",
                    "item_id": "e926f9da91bb002aeb9eb4affcd0b885",
                    "segment_index": -1,
                    "image_index": -1,
                    "metadata_id": "0dc050b8997014a97a7585d57ba7a842",
                    "name": "adult",
                    "confidence": 0.9845221042633057,
                    "source": "azure"
                },
                {
                    "id": "1694acfc0920e4c5c1ed1818a7d374f6",
                    "item_id": "e926f9da91bb002aeb9eb4affcd0b885",
                    "segment_index": -1,
                    "image_index": -1,
                    "metadata_id": "0dc050b8997014a97a7585d57ba7a842",
                    "name": "racy",
                    "confidence": 0.9923509359359741,
                    "source": "azure"
                }
            ]
        }
    },
    "next_page": ""
}

For an asset that is a video:

{
    "contents": {
        "video_frames": [
            {
                "time": 10.01,
                "frame_id": "8700e93f682bad0af6c568500041c381",
                "thumbnail": "video_main_frames/frame-0000000005.jpg",
                "mature_content": [
                    {
                        "id": "d5dcbcd6a5f9bbde9eb8b66d4ba96ff2",
                        "item_id": "f7f611715fe91c98abd241fd1f9567ba",
                        "segment_index": 10.01,
                        "image_index": -1,
                        "metadata_id": "8700e93f682bad0af6c568500041c381",
                        "name": "racy",
                        "confidence": 0.9602822661399841,
                        "source": "azure"
                    }
                ]
            }
        ]
    },
    "next_page": ""
}

For an asset that is a document:

{
    "contents": {
        "pages": [
            {
                "page": 0,
                "images": [
                    {
                        "image_index": 0,
                        "image_id": "88e096b93f357ce138ffb84c12971837",
                        "thumbnail_path": "document_pages/thumb-pg-00000-img-00000.png",
                        "mature_content": [
                            {
                                "id": "18198fe070802fca122b400de113b196",
                                "item_id": "738a3f8a44ad00b0a5a3d17ea4bd4673",
                                "segment_index": 0,
                                "image_index": 0,
                                "metadata_id": "88e096b93f357ce138ffb84c12971837",
                                "name": "racy",
                                "confidence": 0.9949294924736023,
                                "source": "azure"
                            },
                            {
                                "id": "4784891bf0b8c051a69edbce5bf4e605",
                                "item_id": "738a3f8a44ad00b0a5a3d17ea4bd4673",
                                "segment_index": 0,
                                "image_index": 0,
                                "metadata_id": "88e096b93f357ce138ffb84c12971837",
                                "name": "adult",
                                "confidence": 0.9898415207862854,
                                "source": "azure"
                            }
                        ]
                    }
                ]
            }
        ]
    },
    "next_page": ""
}

The next page token provides a stringified token that can be used to retrieve the next page of results. If the next page token is an empty string, there are no more results.

If any unexpected errors occurred in the process of fulfilling this request or response cycle, a Status Internal Server Error (500) is returned.
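A common use of this response is to decide whether an item should be flagged. The sketch below reduces a parsed response to the highest confidence seen for each category and then applies a threshold. The function names and the 0.9 default threshold are assumptions for illustration; the field names come from the samples above.

```python
def max_confidence_by_category(response):
    """Map each mature content category name to its highest confidence."""
    contents = response.get("contents") or {}

    # Gather every object that can carry a mature_content list.
    holders = []
    if contents.get("img"):
        holders.append(contents["img"])
    holders.extend(contents.get("video_frames") or [])
    for page in contents.get("pages") or []:
        holders.extend(page.get("images") or [])

    best = {}
    for holder in holders:
        for entry in holder.get("mature_content") or []:
            name = entry["name"]
            best[name] = max(best.get(name, 0.0), entry["confidence"])
    return best


def categories_above(response, threshold=0.9):
    """Return the sorted category names whose best confidence meets the threshold."""
    best = max_confidence_by_category(response)
    return sorted(name for name, conf in best.items() if conf >= threshold)
```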

Item Tags

Getting Tags for an Item

GET /api/data/v3/items/{id}/tags?page-token={page_token}&start={start}&window={window}&all={all}

page_token - The next page token provided to page the results. When provided, a next page token is all that is needed to retrieve the next page of results. If page_token is set along with other query parameters, the page_token takes precedence.
start - The time (video) or page (documents) to indicate where to start retrieving results for a given item. This has no effect on an IMG item.
window - The time (video) or page (documents) to indicate where to end retrieving results for a given item. This has no effect on an IMG item.
all - Provides all entries for an item regardless of whether or not the tag data is present.

If none are set, the full collection is returned without pagination. A valid page_token can be used without the addition of all, limit, and offset.

Response

A successful call returns a Status OK (200) with the following response body.

For an asset that is an IMG:

{
    "contents": {
        "img": {
            "tags": [
                {
                    "id": "eec2accad1cc1b757f3035bd8253ac04",
                    "text": "window",
                    "confidence": 0.9015815854072571
                },
                {
                    "id": "e95fc00150dbdf08ec3eb2e75c638ac1",
                    "text": "stained glass",
                    "confidence": 0.9015815854072571
                },
                {
                    "id": "c06ee06a066475e692b8b56433ab371a",
                    "text": "light",
                    "confidence": 0.8380934019465514
                },
                {
                    "id": "7cfb5e8e12cdfaeb10086a5a9f693786",
                    "text": "sphere",
                    "confidence": 0.5156272603992966
                },
                {
                    "id": "ef689cbeb566845b46be786438cb8782",
                    "text": "church",
                    "confidence": 0.25103029243584873
                }
            ]
        },
        "pages": null,
        "video_frames": null
    },
    "next_page": ""
}

For an asset that is a video:

{
    "contents": {
        "img": null,
        "pages": null,
        "video_frames": [
            {
                "frame_id": "3a222b4f2e58529ce0dc321cf299dadb",
                "tags": [
                    {
                        "id": "b12f152bda07508c9a482d74fef7dac3",
                        "text": "summer",
                        "confidence": 0.312589002008962
                    },
                    {
                        "id": "a090dafce32110bde870b89055e75939",
                        "text": "autumn",
                        "confidence": 0.20671001655433419
                    }
                ],
                "thumbnail_path": "video_main_frames/frame-0000000000.jpg",
                "time": 0
            },
            {
                "frame_id": "6decb0045799aef3f95dbc6327a992fd",
                "tags": [
                    {
                        "id": "8153baacdde17515bb7d89aa09e10347",
                        "text": "firefighter",
                        "confidence": 0.9791649580001832
                    },
                    {
                        "id": "e5c846b2a75496779eb7c32b1f567a86",
                        "text": "person",
                        "confidence": 0.9791649580001831
                    },
                    {
                        "id": "6dcff09bdec2deb23d355d89e48a8f34",
                        "text": "smoke",
                        "confidence": 0.5069368303763367
                    }
                ],
                "thumbnail_path": "video_main_frames/frame-0000000001.jpg",
                "time": 2
            }
        ]
    },
    "next_page": "NP-BAwEBBVRva2VuAf-CAAEDAQVMaW1pdAEEAAEGT2Zmc2V0AQQAAQZQYXJhbXMB_4QAAAAh_4MEAQERbWFwW3N0cmluZ11zdHJpbmcB_4QAAQwBDAAAJ_-CAwMDYWxsAAVzdGFydA0tMTExMTEwOS45OTk5BndpbmRvdwExAA=="
}

For an asset that is a document:

{
    "contents": {
        "img": null,
        "pages": [
            {
                "images": [
                    {
                        "image_id": "6ae8b0cd1a26070884f08dac7336a2c2",
                        "image_index": 0,
                        "tags": [
                            {
                                "id": "e6d5e65e71d93bbada112bfbcb6a5ed0",
                                "text": "person",
                                "confidence": 0.9971379041671753
                            },
                            {
                                "id": "5a322b094c87b670ea71bf8eea7cc7ed",
                                "text": "man",
                                "confidence": 0.9918940663337708
                            }
                        ],
                        "thumbnail_path": "document_pages/thumb-pg-00000-img-00000.png"
                    }
                ],
                "page": 0,
                "tags": "optionally, a page can have tags as well, or they can be embedded in the images within the page as shown above",
                "thumbnail_path": "optionally, a page can have a thumbnail path, in addition to each embedded image's thumbnail path above",
                "page_id": "a UUID that identifies the page; optional, but guaranteed to be available when the all query parameter is set"
            },
            {
                "images": [
                    {
                        "image_id": "b604734057d3462894d6c1e75a8517b3",
                        "image_index": 0,
                        "tags": [
                            {
                                "id": "2ca9919a848d662a254315f34de006ca",
                                "text": "standing",
                                "confidence": 0.8040973544120789
                            },
                            {
                                "id": "807f196d29b0094d558479d50fddbb74",
                                "text": "crowd",
                                "confidence": 0.0062681203708052635
                            }
                        ],
                        "thumbnail_path": "document_pages/thumb-pg-00001-img-00000.png"
                    }
                ],
                "page": 1
            }
        ],
        "video_frames": null
    },
    "next_page": "NP-BAwEBBVRva2VuAf-CAAEDAQVMaW1pdAEEAAEGT2Zmc2V0AQQAAQZQYXJhbXMB_4QAAAAh_4MEAQERbWFwW3N0cmluZ11zdHJpbmcB_4QAAQwBDAAAJ_-CAwMDYWxsAAVzdGFydA0tMTExMTEwOS45OTk5BndpbmRvdwExAA=="
}

The next page token provides a stringified token that can be used to retrieve the next page of results. If the next page token is an empty string, there are no more results.

If any unexpected errors occurred in the process of fulfilling this request or response cycle, a Status Internal Server Error (500) is returned.

Adding Tags to an Item

POST /api/data/v3/items/{id}/tags

Request body:

{
    "metadata_id": "metadata UUID",
    "tags": ["tag1", "tag2", "tag3"]
}

"tags" - A list of tags you want to add. All tags are deduplicated before being applied to the segment.

Response

A successful call returns a Status Created (201) with the metadata segment (or segment parent) in its new state, including any added tags. The response looks identical to the get tags response, except that it includes only the edited IMG/timeframe/page data.
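As an illustration of the request above, the sketch below builds the POST request with the standard library. The base URL, the Content-Type header, and the function name are assumptions. Deduplicating on the client is optional, since the text above says tags are deduplicated before being applied; it is done here only to keep the payload small.

```python
import json
import urllib.request


def build_add_tags_request(base_url, item_id, metadata_id, tags):
    """Build (but do not send) the POST request that adds tags to a segment."""
    # dict.fromkeys drops duplicates while preserving the original order.
    unique_tags = list(dict.fromkeys(tags))
    body = json.dumps({"metadata_id": metadata_id, "tags": unique_tags})
    return urllib.request.Request(
        f"{base_url}/api/data/v3/items/{item_id}/tags",
        data=body.encode("utf-8"),
        headers={"Content-Type": "application/json"},  # assumed, not stated in the source
        method="POST",
    )
```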

Deleting Tags From a Segment by Name

DELETE /api/data/v3/items/{id}/tags?metaID={meta_id}&tagName={tag_name}

meta_id - The segment or image UUID. If an image metadata ID is provided, it removes the tag name from all sibling images under that segment.
tag_name - The name of the tag(s) to be deleted. This may result in deletion of multiple tags from the segment if they share names.

Response

A successful call returns a Status No Content (204) with no response body.
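Tag names can contain spaces (for example "stained glass"), so the query string should be URL-encoded rather than concatenated by hand. A minimal sketch, assuming a caller-supplied base URL and a hypothetical helper name:

```python
from urllib.parse import urlencode


def delete_tag_url(base_url, item_id, meta_id, tag_name):
    """Build the DELETE URL with metaID and tagName safely encoded."""
    query = urlencode({"metaID": meta_id, "tagName": tag_name})
    return f"{base_url}/api/data/v3/items/{item_id}/tags?{query}"
```

Note that urlencode encodes a space as "+", which is the standard form encoding for query strings.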

Getting Contents for a Document Item

GET /api/data/v3/items/{id}/text-contents?page-token={page_token}&start={start}&window={window}&all={all}&mask={mask}

page_token - The next page token provided to page the results. When provided, a next page token is all that is needed to retrieve the next page of results. If page_token is set along with other query parameters, the page_token takes precedence.
start - The page (documents) to indicate where to start retrieving results for a given item. This has no effect on an IMG item.
window - The page (documents) to indicate where to end retrieving results for a given item. This has no effect on an IMG item.
all - Provides all entries for an item regardless of whether or not the text content data is present.
mask - Enables you to mask the embedded NLP data for a text content entry, which may result in faster results. Set mask=nlp to remove NLP data from being provided.

If none are set, the full collection is returned without pagination. A valid page_token can be used without the addition of all, limit and offset.

Response

A successful call returns a Status OK (200) with the following response body.

For an asset that is a document:

{
    "contents": {
        "pages": [
            {
                "page": 0,
                "page_id": "1a15b610dd82bdfec2ee8ecf5597cc97",
                "thumbnail_path": "document_pages/thumb-pg-00000.png",
                "text_content": {
                    "id": "65a5d28b38228ebce12f8bab67e0f386",
                    "metadatas_id": "1a15b610dd82bdfec2ee8ecf5597cc97",
                    "text": "This is an example pdf\n\n\f",
                    "language": {
                    	"code": "en-US",
                    	"confidence": 0.7
                    },
                    "nlp_properties": {
                        "entities": null,
                        "key_phrases": null,
                        "sentiment": {
                            "text": "neutral",
                            "sentiment_confidence": {
                                "Mixed": 0.013882513158023357,
                                "Negative": 0.16380225121974945,
                                "Neutral": 0.7018685936927795,
                                "Positive": 0.12044669687747955
                            }
                        },
                        "language": {
                            "language": "en",
                            "confidence": 0.9962568283081055
                        }
                    }
                }
            },
            {
                "page": 1,
                "page_id": "1bf644159b1b1c1e3025df76f7c66110",
                "thumbnail_path": "document_pages/thumb-pg-00001.png",
                "text_content": {
                    "id": "62773d56d314a578f746a080460195a7",
                    "metadatas_id": "1bf644159b1b1c1e3025df76f7c66110",
                    "text": "This is page 2 of the example pdf\n\n\f",
                    "language": null,
                    "nlp_properties": {
                        "entities": [
                            {
                                "text": "page 2",
                                "confidence": 0.8359338045120239,
                                "type": "quantity"
                            }
                        ],
                        "key_phrases": [
                            {
                                "text": "page 2",
                                "confidence": 0.9814304709434509
                            }
                        ],
                        "sentiment": {
                            "text": "neutral",
                            "sentiment_confidence": {
                                "Mixed": 0.007130879443138838,
                                "Negative": 0.10643889009952545,
                                "Neutral": 0.783736526966095,
                                "Positive": 0.10269377380609512
                            }
                        },
                        "language": {
                            "language": "en",
                            "confidence": 0.9866665005683899
                        }
                    }
                }
            }
        ]
    },
    "next_page": ""
}

For the same asset but with the mask set to remove NLP information:

{
    "contents": {
        "pages": [
            {
                "page": 0,
                "page_id": "1a15b610dd82bdfec2ee8ecf5597cc97",
                "thumbnail_path": "document_pages/thumb-pg-00000.png",
                "text_content": {
                    "id": "65a5d28b38228ebce12f8bab67e0f386",
                    "metadatas_id": "1a15b610dd82bdfec2ee8ecf5597cc97",
                    "text": "This is an example pdf\n\n\f",
                    "language": {
                       "code": "en-US",
                       "confidence": 0.7
                    }
                }
            },
            {
                "page": 1,
                "page_id": "1bf644159b1b1c1e3025df76f7c66110",
                "thumbnail_path": "document_pages/thumb-pg-00001.png",
                "text_content": {
                    "id": "62773d56d314a578f746a080460195a7",
                    "metadatas_id": "1bf644159b1b1c1e3025df76f7c66110",
                    "text": "This is page 2 of the example pdf\n\n\f",
                    "language": null
                }
            }
        ]
    },
    "next_page": ""
}

If start and window are provided, the results may be paginated. The next_page field holds a string token that can be used to retrieve the next page of results. If next_page is an empty string, there are no more results.
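
The pagination loop described above can be sketched in Python as follows. This section does not say how the token is sent back to the API, so the sketch takes a hypothetical caller-supplied fetch_page function that performs the request; only the contents.pages and next_page fields come from the response shown above.

```python
from typing import Callable, Dict, Iterator, Optional


def iter_pages(fetch_page: Callable[[Optional[str]], Dict]) -> Iterator[Dict]:
    """Yield every page object, following next_page tokens until exhausted.

    fetch_page is a hypothetical caller-supplied function: it receives the
    token from the previous response (None on the first call) and returns
    the decoded JSON body of the next response.
    """
    token: Optional[str] = None
    while True:
        body = fetch_page(token)
        # Each response carries its pages under contents.pages.
        for page in body.get("contents", {}).get("pages", []):
            yield page
        token = body.get("next_page", "")
        if not token:  # An empty string means there are no more results.
            break
```

Passing the request logic in as a function keeps the loop independent of the HTTP client and of the exact query parameter used for the token.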

Getting Item Text Tokens

To get the content of text files (.txt):

GET /api/data/v3/items/{id}/tokens

Response

A successful call returns a Status OK (200) with the following response body:

{
  "tokens": "The quick brown fox jumps over the lazy dog"
}

If any unexpected errors occurred in the process of fulfilling this request or response cycle, a Status Internal Server Error (500) is returned.
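
As an illustration, the request could be issued with Python's standard library as sketched below. The base URL is a placeholder, and authentication is omitted because it is not covered in this section; the helper names are hypothetical.

```python
import json
import urllib.error
import urllib.parse
import urllib.request


def tokens_url(base_url: str, item_id: str) -> str:
    """Build the tokens endpoint URL for an item."""
    quoted_id = urllib.parse.quote(item_id, safe="")
    return f"{base_url.rstrip('/')}/api/data/v3/items/{quoted_id}/tokens"


def parse_tokens(body: bytes) -> str:
    """Extract the text content from a 200 response body."""
    return json.loads(body)["tokens"]


def get_tokens(base_url: str, item_id: str) -> str:
    """Fetch the text content of an item (no authentication; an assumption)."""
    try:
        with urllib.request.urlopen(tokens_url(base_url, item_id)) as resp:
            return parse_tokens(resp.read())
    except urllib.error.HTTPError as err:
        # urlopen raises HTTPError for error statuses such as 500.
        raise RuntimeError(f"tokens request failed with status {err.code}") from err
```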

Getting Item Extractor Runs

To get extractor runs for an item:

GET /api/data/v3/items/{id}/extractors?run-type={run-type}

{run-type} - (enum) One of the following values:
- all - Returns all the extractors run for an item in its lifetime.
- latest - Returns the extractors from the last run.
- erroneous - Returns the extractors that have errors registered from their last run.
- unique - Returns a list of unique extractors from the history of the item. The latest extractor run for each extractor type is returned.

Response

A successful call returns a Status OK (200) with the following response body:

{
	"extractors_runtime": [
		{
			"request_id": String,
			"err": String,
			"success": Bool,
			"skipped": Bool,
			"runtime": Integer duration in nanoseconds,
			"start_at": Zulu Timestamp,
			"end_at": Zulu Timestamp,
			"info": {
				"name": String,
				"version": Integer
			} 
		}
		...
	]
}

If the item is not found, a Status Not Found (404) is returned. If any unexpected errors occurred in the process of fulfilling this request or response cycle, a Status Internal Server Error (500) is returned.
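
The following Python sketch shows one way to build the request URL and to summarize failed runs from the response. The helper names and the base URL are hypothetical, and treating a run as failed when it is neither successful nor skipped is an assumption; the field names and the nanosecond runtime come from the response body above.

```python
from typing import Dict, List, Tuple
from urllib.parse import urlencode

# The four values documented for the run-type query parameter.
RUN_TYPES = {"all", "latest", "erroneous", "unique"}


def extractor_runs_url(base_url: str, item_id: str, run_type: str) -> str:
    """Build the extractor runs URL, rejecting undocumented run types."""
    if run_type not in RUN_TYPES:
        raise ValueError(f"run_type must be one of {sorted(RUN_TYPES)}")
    query = urlencode({"run-type": run_type})
    return f"{base_url.rstrip('/')}/api/data/v3/items/{item_id}/extractors?{query}"


def failed_runs(body: Dict) -> List[Tuple[str, str, float]]:
    """Return (extractor name, error, runtime in seconds) for failed runs."""
    failures = []
    for run in body.get("extractors_runtime", []):
        if not run["success"] and not run["skipped"]:
            # runtime is reported in nanoseconds; convert to seconds.
            failures.append((run["info"]["name"], run["err"], run["runtime"] / 1e9))
    return failures
```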

Getting Extractor Source Files

Item Amazon Transcribe Source Word File Download

To download the source words for an item's Amazon Transcribe transcription, use the following endpoint:

GET /api/files/{item_id}/sourcefiles/amazon_transcribe.json

Response

The response is a list of the words and punctuation marks that make up the transcription. The following example shows how each type appears in a source file.

{
    "words": [
        {
            "start_time": "0",
            "end_time": "0",
            "type": "punctuation",
            "alternatives": [
                {
                    "confidence": "0",
                    "content": "."
                }
            ]
        },
        {
            "start_time": "0.28",
            "end_time": "0.34",
            "type": "pronunciation",
            "alternatives": [
                {
                    "confidence": "0.215",
                    "content": "Yeah"
                }
            ]
        },
        ...
    ]
}
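
A source file in this shape can be flattened into readable text, for example with the Python sketch below. Picking the highest-confidence alternative and attaching punctuation to the preceding word are illustrative choices, not behavior defined by the API.

```python
from typing import Dict, List


def words_to_text(source: Dict) -> str:
    """Join the best alternative of each word into a single string."""
    parts: List[str] = []
    for word in source.get("words", []):
        alternatives = word.get("alternatives") or []
        if not alternatives:
            continue
        # Confidence values are strings in the source file, so convert
        # them to floats before comparing.
        best = max(alternatives, key=lambda alt: float(alt["confidence"]))
        if word["type"] == "punctuation" and parts:
            # Attach punctuation to the preceding word without a space.
            parts[-1] += best["content"]
        else:
            parts.append(best["content"])
    return " ".join(parts)
```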

Getting Distinct Item Types

Request

GET /api/data/v3/items/types

Response

A list of distinct item types from analyzed items in the system is returned.

{
    "types": [
        "audio",
        "image/raster",
        "video"
    ]
}

Listing Extractor History for an Item

Request

GET /api/data/v3/items/{id}/extractors/history

Response

The response includes a list of every extractor that has run, along with details about each historical run of that extractor.

{
    "extractors": [
        {
            "id": "archive",
            "runs": [
                {
                    "request_id": "5df2a0e51fa4fd2993cf27a4ac4d26ab",
                    "error": "",
                    "success": true,
                    "skipped": false,
                    "start_at": "2019-12-12T20:19:50.094366Z",
                    "end_at": "2019-12-12T20:19:50.094429Z"
                },
                {
                    "request_id": "5deacee3b42b927cd17fc10094ab16d6",
                    "error": "",
                    "success": true,
                    "skipped": false,
                    "start_at": "2019-12-06T21:57:55.924024Z",
                    "end_at": "2019-12-06T21:57:55.924068Z"
                }
            ]
        },
        {
            "id": "document_pages",
            "runs": [
                {
                    "request_id": "5df2a0e51fa4fd2993cf27a4ac4d26ab",
                    "error": "",
                    "success": true,
                    "skipped": false,
                    "start_at": "2019-12-12T20:19:51.257149Z",
                    "end_at": "2019-12-12T20:19:51.257279Z"
                },
                {
                    "request_id": "5deacee3b42b927cd17fc10094ab16d6",
                    "error": "",
                    "success": true,
                    "skipped": false,
                    "start_at": "2019-12-06T21:58:06.406967Z",
                    "end_at": "2019-12-06T21:58:06.407036Z"
                }
            ]
        }
    ]
}
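
As a usage illustration, the Python sketch below reads a history response and reports how long the most recent run of each extractor took. The function names are hypothetical; the field names are taken from the response above.

```python
from datetime import datetime, timedelta
from typing import Dict


def parse_zulu(timestamp: str) -> datetime:
    """Parse a Zulu timestamp such as 2019-12-12T20:19:50.094366Z."""
    # Older Python versions of fromisoformat() reject a trailing "Z",
    # so replace it with an explicit UTC offset.
    return datetime.fromisoformat(timestamp.replace("Z", "+00:00"))


def latest_run_durations(body: Dict) -> Dict[str, timedelta]:
    """Map each extractor id to the duration of its most recent run."""
    durations: Dict[str, timedelta] = {}
    for extractor in body.get("extractors", []):
        runs = extractor.get("runs") or []
        if not runs:
            continue
        # Do not rely on response ordering; pick the latest start time.
        latest = max(runs, key=lambda run: parse_zulu(run["start_at"]))
        durations[extractor["id"]] = parse_zulu(latest["end_at"]) - parse_zulu(latest["start_at"])
    return durations
```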

Listing Frames (FrameDNA)

To list frames in an item available for FrameDNA:

GET /api/data/v3/items/{id}/frames

Response

A list of frames is returned. Use the frame_id of a frame in the FrameDNA detail call.

{
	"count": 2,
	"frames": [
		{
			"frame_id": "b00fd699530f12452353f2532ebcefcf",
			"time_seconds": 0,
			"thumbnail": {
				"path": "video_main_frames/frame-0000000000.jpg",
				"type": "",
				"frame_count": 0,
				"height": 336,
				"width": 624
			}
		},
		{
			"frame_id": "8dd3e5ec19525a1b764fcac5318fa4de",
			"time_seconds": 2,
			"thumbnail": {
				"path": "video_main_frames/frame-0000000001.jpg",
				"type": "",
				"frame_count": 0,
				"height": 336,
				"width": 624
			}
		}
	]
}
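
To connect this listing to the FrameDNA detail call, a client could derive one detail URL per frame, as in the Python sketch below. The base URL and function name are placeholders; the path follows the frame-dna endpoint documented in this API.

```python
from typing import Dict, List


def frame_dna_urls(base_url: str, item_id: str, frames_body: Dict) -> List[str]:
    """Build a FrameDNA detail URL for every frame in a frames response."""
    base = base_url.rstrip("/")
    return [
        f"{base}/api/data/v3/items/{item_id}/frame-dna/{frame['frame_id']}"
        for frame in frames_body.get("frames", [])
    ]
```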

Getting FrameDNA for a Given Frame

GET /api/data/v3/items/{id}/frame-dna/{frame_id}

Response

A list of visual metadata is returned for the requested frame.

{
	"frame_dna": {
		"frame_id": "3a74a72d97414034ca3a23d0fb47fcbf",
		"time_seconds": 2.002,
		"thumbnail": {
			"path": "video_main_frames/frame-0000000001.jpg",
			"type": "",
			"frame_count": 0,
			"height": 360,
			"width": 640
		},
		"adult_categories": [
			{
				"id": "812ee64889e16096fbe63a3bd0310a9e",
				"category": "porn_detection"
			},
			{
				"id": "f8d6138daf6d2f98d6f4360efcb9f517",
				"category": "suggestive_nudity_detection"
			}
		],
		"faces": null,
		"ocr": [
			{
				"id": "9204e628d5ceb246eb1c25b40e65f335",
				"text": "ALL OF THE",
				"text_type": "lines",
				"order": 0
			},
			{
				"id": "d69999f160e966bc31e33d26f6b3a799",
				"text": "GAME OF HRONES",
				"text_type": "lines",
				"order": 1
			},
			{
				"id": "a5e65387116cc45e85dd17f36db14b46",
				"text": "SEX & NUDITY",
				"text_type": "lines",
				"order": 2
			},
			{
				"id": "546bc7fd9960a137719a08285fdeb23e",
				"text": "SEASON FIVE",
				"text_type": "lines",
				"order": 3
			}
		],
		"tags": [
			{
				"id": "6c399a5adb5d37e30c8995653d9d149b",
				"text": "text"
			},
			{
				"id": "550f30757f06d6a77c22326367a268e9",
				"text": "design"
			},
			{
				"id": "608dada3bb3f0377bb54d1061dd7d68c",
				"text": "poster"
			},
			{
				"id": "4cf6facfbbc7af5da784e913db542e38",
				"text": "alcohol"
			}
		],
		"description": {
			"id": "346b1c99ee0f9de724c5ef65f3acc00f",
			"text": "a close up of food",
			"language": ""
		},
		"custom_tags": null,
		"locations": null,
		"logos": [
			{
				"id": "869dc02aa066b3d7f1f051af220759dc",
				"name": "A Game of Thrones"
			}
		],
		"technical_cues": {
			"black_frame": true,
			"color_bars": false,
			"credits": false,
			"digital_slate": false,
			"slate": false,
			"texted": false
		}
	}
}
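
Several fields in this response, such as faces and locations, are null when nothing was detected, so client code should not assume a list. The Python sketch below reads the OCR lines in their stated order with that in mind; the function name is hypothetical.

```python
from typing import Dict, List


def ocr_lines(body: Dict) -> List[str]:
    """Return the OCR text of a FrameDNA response, sorted by "order"."""
    frame = body.get("frame_dna") or {}
    # "ocr" may be null, in which case there are no lines to return.
    entries = frame.get("ocr") or []
    return [entry["text"] for entry in sorted(entries, key=lambda e: e["order"])]
```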
