Batches endpoint
The Batches endpoint lets you create and manipulate batches. A batch is a RESTful object that contains a user-defined list of files. A batch can be used as input for an AI Hub app, allowing for batch processing of all files you’ve added to the batch. Batches are useful for logically grouping files that you want to easily and repeatedly use as an app’s input.
For commercial users, a batch is stored in the default drive of the workspace you specified when creating the batch. If the default drive changes, but is still connected, you can still use the batch as input for running an app. However, you won’t be able to upload additional files to the batch. If the default drive is disconnected, you won’t be able to use batches stored on that drive as input for any app run.
In this document, URL_BASE refers to the root URL of your Instabase instance, such as aihub.instabase.com. API_ROOT defines where to route API requests for batch operations, and its value is URL_BASE followed by /api/v2/batches.
import json, requests
url_base = "https://aihub.instabase.com"
api_root = url_base + '/api/v2/batches'
To make calls to AI Hub APIs, you must define your API token and send it with your requests. API_TOKEN in the following examples refers to your API token.
You can generate and manage API tokens from your AI Hub user settings. See the authorization documentation for details.
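As a minimal sketch of how the token is sent, the helper below builds the header dictionary that the examples in this document pass to requests. The function name build_headers and the environment variable AIHUB_API_TOKEN are illustrative assumptions, not part of the API. The optional IB-Context header is described in the sections that follow.

```python
import os


def build_headers(api_token, org_id=None):
    """Return request headers for AI Hub API calls.

    build_headers is a hypothetical helper, not part of any Instabase SDK.
    """
    if not api_token:
        raise ValueError("An API token is required.")
    headers = {"Authorization": f"Bearer {api_token}"}
    if org_id is not None:
        # IB-Context is optional; it is added only when an organization ID is given.
        headers["IB-Context"] = org_id
    return headers


# Assumption: the token is stored in an environment variable named AIHUB_API_TOKEN.
API_TOKEN = os.environ.get("AIHUB_API_TOKEN", "")
```

Reading the token from the environment keeps it out of source files; any other secret store works equally well.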
Create a batch
Method | Syntax |
---|---|
PUT | API_ROOT |
Description
Create a new batch by sending a PUT request. You must upload files to the batch separately.
Request headers
Name | Description | Values |
---|---|---|
IB-Context | Optional. Specify the organization ID as this header’s value to create the batch inside the organization. Learn more. |
Request body
Key | Type | Description |
---|---|---|
batch_name | string | Name of the batch. Maximum length is 255 characters. |
workspace_name | string | Optional. The name of the workspace in which to add the batch; the batch is created in the default drive of the workspace. If not specified, the default location for commercial users is the default drive of your personal workspace. For community users, the default location is your personal workspace’s Instabase Drive. |
Response status
A 2XX status code indicates the request was successful.
Status | Meaning |
---|---|
200 OK | Batch was created |
Response schema
Key | Type | Description |
---|---|---|
batch_id | string | Batch ID of the newly created batch. |
Example request
import requests
url = api_root
headers = {
'Authorization': f'Bearer {API_TOKEN}'
}
data = {
'batch_name': 'my new batch'
}
resp = requests.put(url, headers=headers, data=data)
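The request body rules above (a required batch_name of at most 255 characters and an optional workspace_name) can be checked before the request is sent. The following is a sketch only; build_create_batch_body is a hypothetical helper name and the client-side validation is an added convenience, not something the API requires.

```python
def build_create_batch_body(batch_name, workspace_name=None):
    """Build the request body for creating a batch (hypothetical helper)."""
    if not batch_name:
        raise ValueError("batch_name is required.")
    if len(batch_name) > 255:
        raise ValueError("batch_name must be at most 255 characters.")
    body = {"batch_name": batch_name}
    if workspace_name is not None:
        # workspace_name is optional; omit it to use the default location.
        body["workspace_name"] = workspace_name
    return body
```

The returned dictionary can be passed as the data argument in the example request, and the new batch's ID can then be read from resp.json()['batch_id'].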
Upload file to batch
Method | Syntax |
---|---|
PUT | API_ROOT/<BATCH_ID>/files/<FILENAME> |
Description
Use this endpoint to upload a file to a batch, or to update the contents of a previously uploaded file in a batch.
In the request URL, <BATCH_ID> is the batch’s ID and <FILENAME> is a user-defined name for the file. The file name must include the file extension.
Files are uploaded one at a time, and the suggested maximum size for each file is 10 MB. For larger files, use the multipart file upload endpoint.
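A short sketch of the two rules above: pick the upload path from the file size, and build the request URL from a file name that includes an extension. The helper names are hypothetical. Reading 10 MB as 10 * 1024 * 1024 bytes and percent-encoding the file name are both assumptions; the source does not state either.

```python
import os
from urllib.parse import quote

# Assumption: "10 MB" is read as 10 * 1024 * 1024 bytes.
SINGLE_UPLOAD_LIMIT_BYTES = 10 * 1024 * 1024


def needs_multipart(file_size):
    """Return True when the file exceeds the suggested single-request size."""
    return file_size > SINGLE_UPLOAD_LIMIT_BYTES


def file_upload_url(api_root, batch_id, filename):
    """Build API_ROOT/<BATCH_ID>/files/<FILENAME> for a single-request upload."""
    if not os.path.splitext(filename)[1]:
        raise ValueError("The file name must include a file extension.")
    # Percent-encode the name so spaces and other special characters are URL-safe.
    return f"{api_root}/{batch_id}/files/{quote(filename, safe='')}"
```

For example, a 25 MB scan would be routed to the multipart endpoints, while a 2 MB PDF can be sent with a single PUT.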
Request headers
Name | Description | Values |
---|---|---|
Content-Length | The size of the file, in bytes. |
Request body
Supply the raw contents of the file to be uploaded. See the example request, being sure to replace <LOCAL_FILEPATH> with the full path to the file on the machine that’s running the script.
Response status
Status | Meaning |
---|---|
204 No Content | File was successfully uploaded. |
404 Not Found | Batch with ID <BATCH_ID> does not exist. |
Response schema
There is no response body.
Example request
import requests
url = api_root + f'/<BATCH_ID>/files/<FILENAME>'
headers = {
'Authorization': f'Bearer {API_TOKEN}'
}
# Open in binary mode so the raw bytes are sent unchanged.
with open('<LOCAL_FILEPATH>', 'rb') as f:
    data = f.read()
resp = requests.put(url, headers=headers, data=data)
Where <LOCAL_FILEPATH> is the full path to the file on the machine that’s running the script.
Get batch
Method | Syntax |
---|---|
GET | API_ROOT/<BATCH_ID> |
Description
Retrieve information about a batch by sending this GET request.
Request headers
There are no additional request headers.
Request body
There is no request body. Use the request URL to provide the batch’s ID (BATCH_ID).
Response status
Status | Meaning |
---|---|
200 OK | Batch successfully retrieved. |
404 Not Found | Batch <BATCH_ID> does not exist, or access was denied. |
Response schema
Key | Type | Description |
---|---|---|
id | string | The batch’s batch ID. |
name | string | The batch’s name. |
workspace_name | string | The name of the workspace in which the batch exists. |
mount_point_name | string | The name of the connected drive in which the batch is stored. |
repo_owner | string | The owner of the workspace (also known as the repo) in which the batch exists. |
batch_owner | string | Username of the user that created the batch. |
created_at_ms | int | When the batch was created, as a Unix timestamp in milliseconds. |
updated_at_ms | int | When the batch was last updated, as a Unix timestamp in milliseconds. |
path_suffix | string | Batch path suffix from the mount point. |
{
"id": 2465,
"name": "my new batch",
"workspace_name": "my-repo",
"mount_point_name": "Instabase Drive",
"repo_owner": "test_admin",
"batch_owner": "test_admin",
"created_at_ms": 1709592306000,
"updated_at_ms": 1709592306000,
"path_suffix": "file-batches/c53a8a50-6b7b-4184-9aa3-132e9e8fa780"
}
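The created_at_ms and updated_at_ms values can be converted to readable dates as sketched below. The _ms suffix and the magnitude of the sample values suggest milliseconds since the Unix epoch; that reading is an assumption, and ms_to_datetime is a hypothetical helper name.

```python
from datetime import datetime, timezone


def ms_to_datetime(timestamp_ms):
    """Convert a Unix timestamp in milliseconds to a timezone-aware UTC datetime."""
    return datetime.fromtimestamp(timestamp_ms / 1000, tz=timezone.utc)


# Value taken from the sample response above.
created = ms_to_datetime(1709592306000)
print(created.isoformat())  # 2024-03-04T22:45:06+00:00
```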
Example request
import requests
url = api_root + f'/<BATCH_ID>'
headers = {
'Authorization': f'Bearer {API_TOKEN}'
}
resp = requests.get(url, headers=headers)
List batches
Method | Syntax |
---|---|
GET | API_ROOT?workspace_name=<WORKSPACE_NAME>&username=<USERNAME>&limit=<LIMIT>&offset=<OFFSET> |
Description
Return a list of batches, with results filtered using query parameters.
Query parameter | Description |
---|---|
workspace_name | Optional. Filter to batches in the specified workspace. |
username | Optional. Filter to batches created by the specified username (user ID). |
limit | Optional. If paginating results, specify how many batches to return. |
offset | Optional. If paginating results, specify the offset of the returned list. |
Request headers
Name | Description |
---|---|
IB-Context | Optional. To list batches under a commercial organization, specify its organization ID here. Learn more. |
Request body
There is no request body. Use the request URL’s query parameters to filter the list of batches.
Response status
Status | Meaning |
---|---|
200 OK | Request successful. |
401 Unauthorized | User is not allowed to query for batches with provided filters. |
Response schema
Key | Type | Description |
---|---|---|
batches | List | List of batches. Each batch object follows the Get batch response schema. |
Example request
import requests
# define filters
workspace_name = 'my-workspace'
limit = 100
headers = {
'Authorization': f'Bearer {API_TOKEN}',
'IB-Context': 'my-org'
}
url = api_root
resp = requests.get(url, headers=headers, params={ 'workspace_name': workspace_name, 'limit': limit })
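The limit and offset parameters can be combined into a simple pagination loop. The sketch below keeps the HTTP call outside the loop: fetch_page is a caller-supplied function that takes (limit, offset) and returns the batches list from one response. Both iter_batches and fetch_page are hypothetical names, and stopping on a short page is an assumption about how the endpoint signals the end of the results.

```python
def iter_batches(fetch_page, limit=100):
    """Yield batches page by page.

    fetch_page is a caller-supplied function (limit, offset) -> list of batch dicts.
    """
    offset = 0
    while True:
        page = fetch_page(limit, offset)
        yield from page
        if len(page) < limit:
            # Assumption: a short (or empty) page means there are no more results.
            break
        offset += len(page)
```

With requests, fetch_page could wrap the example request above, passing limit and offset in params and returning resp.json()['batches'].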
Delete batch
Method | Syntax |
---|---|
DELETE | API_ROOT/<BATCH_ID> |
Description
Delete a batch and all of its files. This is an asynchronous operation that must be checked for completion. See details on polling jobs.
Specify the <BATCH_ID> in the URL to identify the batch to be deleted.
Request headers
There are no additional request headers.
Request body
There is no request body. Use the request URL to specify the batch to be deleted.
Response status
Status | Meaning |
---|---|
202 Accepted | Batch deletion request accepted. Poll the job ID to check completion status. |
404 Not Found | A batch with ID <BATCH_ID> does not exist. |
Response schema
Key | Type | Description |
---|---|---|
job_id | string | The job ID of the operation. Use the job ID with the poll batches job endpoint to check batch deletion status. |
Example request
import time
import requests
url = api_root + '/<BATCH_ID>'
headers = {
    'Authorization': f'Bearer {API_TOKEN}'
}
resp = requests.delete(url, headers=headers)
job_id = resp.json()['job_id']
poll_url = f'{api_root}/jobs/{job_id}'
# Poll until the job reaches a terminal state, pausing between requests.
while True:
    resp = requests.get(poll_url, headers=headers)
    state = resp.json()['state']
    if state in ('COMPLETE', 'FAILED', 'CANCELLED'):
        break
    time.sleep(1)
Delete file from batch
Method | Syntax |
---|---|
DELETE | API_ROOT/<BATCH_ID>/files/<FILENAME> |
Description
Delete a file from a batch by sending this DELETE request. Use <BATCH_ID> and <FILENAME> in the request URL to specify the file to be deleted.
Request headers
There are no additional request headers.
Request body
There is no request body. Use the request URL to specify the file to be deleted.
Response status
A 2XX status code indicates the request was successful.
Status | Meaning |
---|---|
202 Accepted | Indicates that the deletion request has been accepted. Poll the deletion job for completion status. |
404 Not Found | Batch with <BATCH_ID> does not exist. |
Response headers
Name | Description | Values |
---|---|---|
Location | Optional. Present if the status is 202 Accepted. | The full URL to poll the status of the deletion job. |
Response schema
There is no response body.
Example request
import requests
url = api_root + f'/<BATCH_ID>/files/<FILENAME>'
headers = {
'Authorization': f'Bearer {API_TOKEN}'
}
resp = requests.delete(url, headers=headers)
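Because this endpoint returns no body, the deletion job is identified only by the Location response header. Since that header holds the full polling URL, it can be passed to requests.get directly. If the bare job ID is needed, for logging for instance, the sketch below extracts it. It assumes the URL ends in /jobs/<JOB_ID>, matching the Poll batches job syntax; job_id_from_location is a hypothetical helper name.

```python
from urllib.parse import urlparse


def job_id_from_location(location):
    """Return the last path segment of a job polling URL (hypothetical helper)."""
    path = urlparse(location).path.rstrip("/")
    return path.rsplit("/", 1)[-1]


# Illustrative URL only; a real value comes from resp.headers['Location'].
print(job_id_from_location("https://aihub.instabase.com/api/v2/batches/jobs/abc123"))  # abc123
```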
Poll batches job
Method | Syntax |
---|---|
GET | API_ROOT/jobs/<JOB_ID> |
Description
Use this endpoint to poll asynchronous jobs created when deleting a batch or deleting a file from a batch. In the request URL, <JOB_ID> is the job_id value returned by a Delete batch request. For a Delete file from batch request, the Location response header already contains the full URL to poll.
Request headers
There are no additional request headers.
Request body
There is no request body. Use the request URL to specify the job to poll.
Response status
Status | Meaning |
---|---|
200 OK | Request successful |
Response schema
Key | Type | Description |
---|---|---|
state | string | The status of the job. Possible values are COMPLETE, FAILED, CANCELLED, RUNNING, or PENDING. |
message | string | Job status message. |
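A polling loop built on the state field might look like the following sketch. It treats COMPLETE, FAILED, and CANCELLED as terminal states, which is inferred from their names rather than stated by the API. The function wait_for_job and its parameters are hypothetical; get_status is a caller-supplied function that performs the GET request and returns the parsed response body.

```python
import time

# Assumption: these three states are final; RUNNING and PENDING mean "keep polling".
TERMINAL_STATES = {"COMPLETE", "FAILED", "CANCELLED"}


def wait_for_job(get_status, interval_s=1.0, timeout_s=60.0,
                 sleep=time.sleep, clock=time.monotonic):
    """Call get_status() until the job reaches a terminal state or the timeout expires.

    get_status returns the parsed response body, for example
    {"state": "RUNNING", "message": "..."}.
    """
    deadline = clock() + timeout_s
    while True:
        status = get_status()
        if status["state"] in TERMINAL_STATES:
            return status
        if clock() >= deadline:
            raise TimeoutError(f"Job still {status['state']} after {timeout_s} seconds.")
        # Pause between requests to avoid flooding the endpoint.
        sleep(interval_s)
```

Callers should still inspect the returned state, because a FAILED or CANCELLED job also ends the loop; the message field may explain why.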
Multipart file upload
To upload a file larger than 10 MB in size to a batch, you must use multipart upload. Multipart upload involves three endpoints and the following steps:
- Start a multipart upload session, specifying the batch ID, file name, and file size. This call returns a session ID and maximum part_size to reference when uploading each part.
- Split the file into parts, according to the specified part_size, and upload each part individually.
- When all parts are uploaded, commit the session.
Multipart upload example request
import json
import os
import requests
headers = {
    'Authorization': f'Bearer {API_TOKEN}'
}
# 1. Create the session. BATCH_ID and FILENAME identify the destination in the batch.
local_filepath = '<LOCAL_FILEPATH>'
size = os.path.getsize(local_filepath)
resp = requests.post(f'{api_root}/multipart-upload',
                     headers=headers,
                     data=json.dumps({'batch_id': BATCH_ID, 'filename': FILENAME, 'file_size': size}))
session_endpoint = resp.headers['location']
part_size = resp.json()['part_size']
# 2. Upload the parts in order, numbering them from 1.
parts = []
part_num = 1
with open(local_filepath, 'rb') as input_file:
    part = input_file.read(part_size)
    while part:
        part_resp = requests.put(f'{session_endpoint}/parts/{part_num}', headers=headers, data=part)
        parts.append({'part_num': part_num, 'part_id': part_resp.json()['part_id']})
        part = input_file.read(part_size)
        part_num += 1
# 3. Commit all the uploaded parts.
commit_resp = requests.post(session_endpoint, headers=headers, data=json.dumps({'action': 'commit', 'parts': parts}))
Where <LOCAL_FILEPATH> is the full path to the file on the machine that’s running the script, BATCH_ID is the batch’s ID, and FILENAME is the name to give the uploaded file, including its extension.
Start multipart upload session
Method | Syntax |
---|---|
POST | API_ROOT/multipart-upload |
Description
Start a multipart upload session. Use this endpoint when you need to upload a file larger than 10 MB to a batch.
Request headers
There are no additional request headers.
Request body
Key | Type | Description |
---|---|---|
batch_id | int | The batch’s batch ID. |
filename | string | A file name for the uploaded file, including the file extension. Maximum of 255 characters. |
file_size | int | The file size, in bytes. |
Response status
A 2XX status code indicates the request was successful.
Status | Meaning |
---|---|
201 Created | Indicates that the multipart upload session has been initiated. |
404 Not Found | Batch with <BATCH_ID> does not exist. |
Response headers
Name | Description | Values |
---|---|---|
Location | Optional. Present if the status is 201 Created. | The session endpoint URL to use in the subsequent multipart upload requests, in the form: API_ROOT/multipart-upload/sessions/<SESSION_ID>. |
Response schema
Key | Type | Description |
---|---|---|
part_size | int | The number of bytes each part should be when uploading to the session. Each part should match the part_size, except for the final part, which can be smaller than the part_size. |
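To make the part_size rule concrete, the sketch below computes how many parts a file needs and how long each one is. The helper names are hypothetical and the numbers in the usage line are illustrative, not values returned by the API.

```python
def expected_part_count(file_size, part_size):
    """Return how many parts a file of file_size bytes splits into."""
    if part_size <= 0:
        raise ValueError("part_size must be positive.")
    # Integer ceiling division avoids floating-point rounding on large files.
    return (file_size + part_size - 1) // part_size


def part_lengths(file_size, part_size):
    """List the byte length of each part, in upload order."""
    count = expected_part_count(file_size, part_size)
    return [min(part_size, file_size - i * part_size) for i in range(count)]


print(part_lengths(25, 10))  # [10, 10, 5]
```

Only the last entry can be shorter than part_size, which matches the rule in the table above.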
Example request
See full multipart upload request.
Upload part to session
Method | Syntax |
---|---|
PUT | API_ROOT/multipart-upload/sessions/<SESSION_ID>/parts/<PART_NUM> |
Description
Upload one part of a file to the multipart upload session. Each part’s size must match the part_size returned by the Start multipart upload session call, except for the final part, which can be smaller.
Obtain <SESSION_ID> from the Location header of the Start multipart upload session response. <PART_NUM> is the part’s position in the file: consecutive integers starting at 1, one for every part uploaded.
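The numbering rule can be sketched as a small generator that reads a binary file object in part_size chunks and pairs each chunk with its part number. iter_parts is a hypothetical helper name, and the in-memory buffer below stands in for a real file opened in binary mode.

```python
import io


def iter_parts(fileobj, part_size):
    """Yield (part_num, data) pairs read from a binary file object."""
    part_num = 1
    while True:
        data = fileobj.read(part_size)
        if not data:
            break
        yield part_num, data
        part_num += 1


# In-memory stand-in for a file opened with open(path, 'rb').
sample = io.BytesIO(b"abcdefghij")
for part_num, data in iter_parts(sample, 4):
    print(part_num, data)
```

Each yielded pair maps onto one PUT request: part_num goes in the URL and data is the raw request body.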
Request headers
There are no additional request headers.
Request body
Raw content of the part to be uploaded.
Response status
A 2XX status code indicates the request was successful.
Status | Meaning |
---|---|
201 Created | Indicates that the part has been successfully uploaded. |
Response schema
Key | Type | Description |
---|---|---|
part_id | int | ID of the uploaded part. |
part_num | int | The part number of the uploaded part, indicating upload order. Identical to <PART_NUM> in the request URL. |
Example request
See full multipart upload request.
Commit session
Method | Syntax |
---|---|
POST | API_ROOT/multipart-upload/sessions/<SESSION_ID> |
Description
After uploading all parts to a multipart upload session, use this endpoint to commit and close the multipart upload session, or to cancel the session.
Request headers
There are no additional request headers.
Request body
Key | Type | Description |
---|---|---|
action | string | String literal of either commit or abort. |
parts | List | A list of the uploaded parts. |
parts/part_num | int | The part number of the uploaded part. |
parts/part_id | int | The part ID of the uploaded part. |
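A commit body can be assembled from the part_num and part_id values collected while uploading. The sketch below also checks that the part numbers are consecutive from 1 before serializing. build_commit_body is a hypothetical helper; sorting the parts and validating them on the client are added precautions, not documented requirements.

```python
import json


def build_commit_body(parts):
    """Return the JSON body for committing a session (hypothetical helper).

    parts is a list of {"part_num": int, "part_id": int} dicts.
    """
    ordered = sorted(parts, key=lambda p: p["part_num"])
    numbers = [p["part_num"] for p in ordered]
    if numbers != list(range(1, len(ordered) + 1)):
        raise ValueError("Part numbers must be consecutive integers starting at 1.")
    return json.dumps({"action": "commit", "parts": ordered})
```

The returned string can be passed as the data argument of the POST request to the session endpoint.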
Response status
A 2XX status code indicates the request was successful.
Status | Meaning |
---|---|
204 No Content | Indicates that the uploaded parts have been successfully committed and the multipart upload session is closed. |
Response schema
There is no response body.
Example request
See full multipart upload request.