Duplicates API in Wasabi AiR


The Duplicates API returns information about items that have identical checksums and file sizes. Such items are considered duplicate copies of the same file.

Retrieving Duplicate Stats

GET /api/data/v3/duplicates-footprint

Response

{
    "total_count": 14,
    "total_number": 126,
    "total_size": 184925196
}
  • total_count - (int) The number of distinct hash and file_size combinations that have duplicates.
  • total_number - (int) The total number of duplicate items.
  • total_size - (int) The total size of duplicate items.

Status codes:

  • 200 (success)
  • 500 (unexpected error)
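
The request above could be issued from Python roughly as follows, using only the standard library. This is a minimal sketch: the base URL and the bearer-token Authorization header are assumptions for illustration (this page does not specify either), and the helper names are hypothetical.

```python
import json
import urllib.request

FOOTPRINT_PATH = "/api/data/v3/duplicates-footprint"


def footprint_url(base_url: str) -> str:
    """Join a caller-supplied (hypothetical) base URL with the documented path."""
    return base_url.rstrip("/") + FOOTPRINT_PATH


def parse_footprint(body: str) -> dict:
    """Extract the three documented integer fields from a response body."""
    data = json.loads(body)
    keys = ("total_count", "total_number", "total_size")
    return {key: int(data[key]) for key in keys}


def get_footprint(base_url: str, token: str) -> dict:
    """Fetch the duplicate stats.

    The Authorization header is an assumed auth scheme, not one documented
    on this page. urlopen raises urllib.error.HTTPError on a 500 response.
    """
    request = urllib.request.Request(
        footprint_url(base_url),
        headers={"Authorization": f"Bearer {token}"},
    )
    with urllib.request.urlopen(request) as response:
        return parse_footprint(response.read().decode("utf-8"))
```

Keeping the parsing separate from the network call makes it easy to check against the sample response shown above without a live server.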

Listing Duplicated Hashes

GET /api/data/v3/duplicates

Optional Query Parameters

  • page-token - (string) A page token to fetch an additional page of results.
  • limit - (int) The number of results to return (default: 50, min: 1, max: 1000).
  • limit-items - (int) The number of item IDs to return for each hash (default: 20, min: 1, max: 1000).
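
As an illustration of the parameters above, the following sketch builds the request path and query string and rejects values outside the documented ranges before anything is sent. The function name is hypothetical, and the client-side validation is a convenience assumed here, not something the API requires.

```python
from urllib.parse import urlencode

DUPLICATES_PATH = "/api/data/v3/duplicates"


def duplicates_query(limit=50, limit_items=20, page_token=None):
    """Build the path and query string for the duplicates listing endpoint."""
    # Both limits share the documented range of 1 to 1000.
    for name, value in (("limit", limit), ("limit-items", limit_items)):
        if not 1 <= value <= 1000:
            raise ValueError(f"{name} must be between 1 and 1000, got {value}")
    params = {"limit": limit, "limit-items": limit_items}
    if page_token:  # omit the token when requesting the first page
        params["page-token"] = page_token
    return f"{DUPLICATES_PATH}?{urlencode(params)}"


print(duplicates_query(limit=100, limit_items=5))
# /api/data/v3/duplicates?limit=100&limit-items=5
```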

Response

{
    "duplicates": [
        {
            "hash": "1bf6808152b518ec38b18780f96526ef",
            "items": [
                "0ddc337d7f399b406073f5f8e5324fc4",
                "3566d2ff4aea65d30eda983c2c749274",
                "444ccc35017f1e39188d735d7ec126a0",
                "4a86bb348ba1050511777d40a2471d3e",
                "69af63e7f10ad1b8fa5e9a05f149bbea",
                "7d63cf40504e9f2501250f731fea03b0",
                "84deb4ebb35d5c78cd69b9dce4eaf4b5",
                "cbbf806d29b43fdbeb9f2516e4813525",
                "faa93144670268595c8acc823e861de7"
            ],
            "count": 9,
            "file_size": 614912,
            "footprint_size": 4919296,
            "items_next_page": ""
        },
        {
            "hash": "fd083c80752b24b6085eb773ed1b9609",
            "items": [
                "4223082baa8abcb70c1fcd600b0f7b83",
                "82d37a0a16911b49a001e5686ad58a07",
                "92306656ab6c366b113af0cc6ca0abad",
                "92fb0a09883700b8c2116eaccf2b80e8",
                "9c8b1b15613e22f3474291f3ea2d8ca6",
                "be265a032be6d5b759e1ead68df875b9",
                "c27582011f2966350c7a64c834d88637",
                "dcb2c7f92cecac5b2a89fdd046a9bac8",
                "fe63caef8d1089b930a76dfc21dd5634"
            ],
            "count": 9,
            "file_size": 3598,
            "footprint_size": 28784,
            "items_next_page": ""
        },
        {
            "hash": "c37470f71fb1bc9c7242c5282274cdd8",
            "items": [
                "017dc9a61a040855f819f48b85d3119b",
                "651f0dc8cd2d2fe9d2b8462dfa03d04b",
                "82aaa25ff23826e21629d9f391b83d18",
                "84e5ae7f3fcd7caa4259d1480f057a73"
            ],
            "count": 4,
            "file_size": 54047,
            "footprint_size": 162141,
            "items_next_page": ""
        },
        {
            "hash": "6461a4d3c389f561207065dfd8d4c01b",
            "items": [
                "0fc10bd128c62699c80d3af21d5356ba",
                "eb41c0739e90a24d7d855933c5d3c231"
            ],
            "count": 2,
            "file_size": 367922,
            "footprint_size": 367922,
            "items_next_page": ""
        }
    ],
    "next_page": ""
}
  • duplicates - (array) An array of hash objects:
    • hash - (string) The cryptographic hash.
    • items - (array of strings) A list of item IDs that share that hash.
    • count - (int) The total number of items with that hash.
    • file_size - (int) The size (in bytes) of one of the files.
    • footprint_size - (int) The total size (in bytes) of the redundant copies, calculated as (count - 1) * file_size.
    • items_next_page - (string) A page token that can be used to retrieve the next page of items with a given hash (see the GET /api/data/v3/duplicates/{hash} endpoint below). If this is an empty string, there are no additional items with the given hash.
  • next_page - (string) A page token that can be used to retrieve the next page of hashes. If this is an empty string, there are no additional results to return.
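
The footprint_size arithmetic can be checked with a few lines of Python. The values below are taken from entries in the sample response; the function name is only illustrative.

```python
def footprint_size(count: int, file_size: int) -> int:
    """Bytes taken up by the redundant copies: every copy except one."""
    return (count - 1) * file_size


print(footprint_size(9, 3598))    # 28784
print(footprint_size(4, 54047))   # 162141
print(footprint_size(2, 367922))  # 367922
```

For a hash with only two items, the footprint equals the size of a single file, since only one copy is redundant.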

Status codes:

  • 200 (success)
  • 500 (unexpected error)
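
To walk through every page of hashes, a client can keep passing the returned next_page value back as page-token until it comes back empty. The sketch below shows that loop in Python. It takes a caller-supplied fetch_page function (hypothetical, not part of the API) so that the HTTP and authentication details, which this page does not cover, stay out of the way.

```python
from typing import Callable, Iterator, Optional


def iter_duplicates(fetch_page: Callable[[Optional[str]], dict]) -> Iterator[dict]:
    """Yield every hash object, following next_page until it is empty.

    fetch_page is a hypothetical helper: it receives a page token
    (None for the first page) and returns the decoded JSON response.
    """
    token = None
    while True:
        page = fetch_page(token)
        yield from page.get("duplicates", [])
        token = page.get("next_page", "")
        if not token:  # an empty string means there are no more pages
            break


# Usage with canned pages standing in for real responses:
pages = {
    None: {"duplicates": [{"hash": "aaa", "count": 2}], "next_page": "t1"},
    "t1": {"duplicates": [{"hash": "bbb", "count": 3}], "next_page": ""},
}
hashes = [entry["hash"] for entry in iter_duplicates(lambda t: pages[t])]
print(hashes)  # ['aaa', 'bbb']
```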

Retrieving a List of Items for a Given Hash

GET /api/data/v3/duplicates/{hash}

Optional Query Parameters

  • page-token - (string) A page token to fetch an additional page of results.
  • limit - (int) The number of results to return (default: 20, min: 1, max: 1000).

Response

{
    "hash": "1bf6808152b518ec38b18780f96526ef",
    "items": [
        "0ddc337d7f399b406073f5f8e5324fc4",
        "3566d2ff4aea65d30eda983c2c749274",
        "444ccc35017f1e39188d735d7ec126a0",
        "4a86bb348ba1050511777d40a2471d3e",
        "69af63e7f10ad1b8fa5e9a05f149bbea",
        "7d63cf40504e9f2501250f731fea03b0",
        "84deb4ebb35d5c78cd69b9dce4eaf4b5",
        "cbbf806d29b43fdbeb9f2516e4813525",
        "faa93144670268595c8acc823e861de7"
    ],
    "count": 9,
    "file_size": 614912,
    "footprint_size": 4919296,
    "items_next_page": ""
}
  • hash - (string) The cryptographic hash.
  • items - (array of strings) A list of item IDs that share that hash.
  • count - (int) The total number of items with that hash.
  • file_size - (int) The size (in bytes) of one of the files.
  • footprint_size - (int) The total size (in bytes) of the redundant copies, calculated as (count - 1) * file_size.
  • items_next_page - (string) A page token that can be used to retrieve the next page of items with the hash. If this is an empty string, there are no additional items with the given hash.

Status codes:

  • 200 (success)
  • 500 (unexpected error)
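
When a hash has more items than fit in one response, the remaining item IDs can be gathered by following items_next_page. The Python sketch below builds the per-hash request path and collects all item IDs. The function names and the fetch_page callable are hypothetical, and how the request is authenticated and sent is left to the caller.

```python
from typing import Callable, List, Optional
from urllib.parse import quote, urlencode


def items_path(hash_value: str, limit: int = 20, page_token: Optional[str] = None) -> str:
    """Build the path and query string for GET /api/data/v3/duplicates/{hash}."""
    if not 1 <= limit <= 1000:
        raise ValueError(f"limit must be between 1 and 1000, got {limit}")
    params = {"limit": limit}
    if page_token:
        params["page-token"] = page_token
    return f"/api/data/v3/duplicates/{quote(hash_value, safe='')}?{urlencode(params)}"


def collect_items(fetch_page: Callable[[Optional[str]], dict]) -> List[str]:
    """Gather all item IDs for one hash by following items_next_page.

    fetch_page is a hypothetical helper: it receives a page token
    (None for the first page) and returns the decoded JSON response.
    """
    items: List[str] = []
    token = None
    while True:
        page = fetch_page(token)
        items.extend(page.get("items", []))
        token = page.get("items_next_page", "")
        if not token:  # an empty string means all items have been returned
            return items


print(items_path("1bf6808152b518ec38b18780f96526ef", limit=100))
# /api/data/v3/duplicates/1bf6808152b518ec38b18780f96526ef?limit=100
```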