Apps quick start
The AI Hub APIs let you create and integrate an end-to-end solution. From uploading files to running an app, you can chain these API calls together in a script for a complete workflow. This quick start guide provides a complete script you can use to build your own workflow, and it walks you through each step, explaining each section of the script and any values you must define.
While the scripts in this guide are written in Python, you can write them in any scripting language.
API setup
When making calls to AI Hub APIs, you use request headers to authorize and identify your requests.
- Authorization is done using the `Authorization` header and is how you pass your API token. All requests must include an API token. You can generate and manage API tokens from your AI Hub user settings. In the sample script, an `API_TOKEN` variable is used.
- Identification is done using the `IB-Context` header. This header is required for all commercial users who want to complete the call using the commercial organization's context.
  - To use your commercial account to complete the request, set the `IB-Context` header to your organization ID.
  - To use your community account to complete the request, omit the header or define it as your user ID.
  - If the `IB-Context` header is undefined, your community account is used to complete the request.

Tip: You can find your user ID and organization ID on the APIs settings page.
Script excerpt
This section of the script is where you define the `API_TOKEN` variable's value and can define the `IB-Context` header.
```python
API_TOKEN = '<YOUR-API-TOKEN>'  # Enter your API token.
API_ROOT = 'https://aihub.instabase.com/api'
API_HEADERS = {
    'Authorization': f'Bearer {API_TOKEN}',
    'IB-Context': '<ORGANIZATION-ID or USER-ID>'  # Enter ORGANIZATION_ID to use your commercial account. Omit or use USER_ID to use your community account.
}
```
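Because the `IB-Context` header is optional for community accounts, it can help to build the headers conditionally instead of editing the dictionary by hand. The helper below is a hypothetical sketch (not part of the AI Hub API or the sample script) showing one way to do that:

```python
def build_headers(api_token, org_id=None):
    """Build AI Hub request headers (hypothetical helper).

    Omitting org_id leaves out the IB-Context header, so the
    request is completed using your community account.
    """
    headers = {'Authorization': f'Bearer {api_token}'}
    if org_id is not None:
        headers['IB-Context'] = org_id
    return headers
```

For example, `build_headers(API_TOKEN)` targets your community account, while `build_headers(API_TOKEN, '<ORGANIZATION-ID>')` targets your commercial organization.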
Uploading files to a batch
You can create a batch of files to use as input for your app run. In AI Hub, a batch is a user-defined group of files. You can add and remove files from the batch, but the batch itself has a constant ID, so you can easily and repeatedly use all files in the batch as an app’s input.
For more information, see the batches endpoint.
Script excerpt
When run, this section of the script makes a series of calls to the Batches endpoint to create a batch and upload files to it.
```python
INPUT_FILEPATHS = ['<../sample1.pdf>', '<sample2.docx>', '<inner/folder/sample3.png>']  # Replace with file paths for the input files to be uploaded.

def create_batch(batch_name, workspace_name=None):
    resp = requests.put(f'{API_ROOT}/v2/batches', headers=API_HEADERS, data=json.dumps({'batch_name': batch_name, 'workspace_name': workspace_name}))
    return resp.json()['batch_id']

def upload_input_file(local_filepath, batch_id, filename):
    with open(local_filepath, 'rb') as input_file:
        requests.put(f'{API_ROOT}/v2/batches/{batch_id}/files/{filename}',
                     headers=API_HEADERS,
                     data=input_file.read())

BATCH_ID = create_batch('<YOUR-BATCH-NAME>', '<WORKSPACE-NAME>')  # Define a name for the batch and, optionally, specify a workspace in which to create the batch. If WORKSPACE-NAME is not defined, the default is accepted.

for input_filepath in INPUT_FILEPATHS:
    upload_input_file(input_filepath, BATCH_ID, os.path.basename(input_filepath))
```
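The excerpt assumes every request succeeds. In practice, a failed `PUT` (for example, one sent with an invalid token) returns an error status, and `resp.json()['batch_id']` then fails with a confusing `KeyError`. A small wrapper like the following (a hypothetical helper, not part of the sample script) surfaces the real failure first:

```python
def checked_json(resp):
    """Raise a readable error if an AI Hub API call failed;
    otherwise return the decoded JSON body. Hypothetical helper."""
    if not resp.ok:
        raise RuntimeError(f'AI Hub API error {resp.status_code}: {resp.text}')
    return resp.json()
```

In `create_batch`, returning `checked_json(resp)['batch_id']` then reports the API's status code and message instead of a `KeyError`.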
User-defined values
You can define the following values in the script.
| Parameter or variable | Type | Required | Description |
|---|---|---|---|
| `INPUT_FILEPATHS` | list of strings | Yes | A list of file paths to the input files to be uploaded to the created batch. For each file, specify the complete path to the file on the machine that's running the script. Each file can be up to 50 MB or 800 pages in size. For files larger than 10 MB, use the batches multipart file upload endpoint. See limitations for complete information about storage and upload limits. |
| `<YOUR-BATCH-NAME>` | string | Yes | Name of the batch. Maximum length is 255 characters. |
| `<WORKSPACE-NAME>` | string | No | The name of the workspace in which to add the batch; the batch is created in the default drive of the workspace. If not specified, the default location for commercial users is the default drive of your personal workspace. For community users, the default location is your personal workspace's Instabase Drive. |
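Because the size limits above are enforced server-side, you can pre-check each file locally before uploading. This sketch is hypothetical (the helper and its return values are not part of the API); it routes a file based on the 50 MB and 10 MB thresholds from the table:

```python
MAX_FILE_BYTES = 50 * 1024 * 1024        # 50 MB per-file upload limit
MULTIPART_THRESHOLD = 10 * 1024 * 1024   # above 10 MB, use the multipart endpoint

def choose_upload_route(size_bytes):
    """Classify a file by size (hypothetical helper): returns
    'single-put', 'multipart', or 'too-large'."""
    if size_bytes > MAX_FILE_BYTES:
        return 'too-large'
    if size_bytes > MULTIPART_THRESHOLD:
        return 'multipart'
    return 'single-put'
```

You might call this with `os.path.getsize(input_filepath)` before uploading, skipping or rerouting files as needed. Note that the 800-page limit can't be checked this way; only byte size is available locally.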
Processing files through an AI Hub app
To process your input files through an AI Hub app, call the `/v2/zero-shot-idp/projects/app/run` endpoint.
When running apps, there are two methods to specify the app's input:

- Preferred: Using a batch ID and processing all files in the specified batch. The sample script uses this method and automatically passes through the batch ID created in the previous step.
- Supported: Using the `input_dir` parameter to specify a file path within your Instabase Drive or a connected drive, and processing all files in the specified folder. The sample script must be adjusted to use this method.
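Because the two input methods are mutually exclusive, a payload builder can enforce that exactly one is provided. The helper below is a hypothetical sketch (not part of the sample script) of building the request body for either method:

```python
import json

def build_run_payload(app_name, batch_id=None, input_dir=None):
    """Build the app-run request body (hypothetical helper).
    Exactly one of batch_id or input_dir must be provided."""
    if (batch_id is None) == (input_dir is None):
        raise ValueError('specify exactly one of batch_id or input_dir')
    payload = {'name': app_name}
    if batch_id is not None:
        payload['batch_id'] = batch_id
    else:
        payload['input_dir'] = input_dir
    return json.dumps(payload)
```

For example, `build_run_payload(APP_NAME, batch_id=BATCH_ID)` produces the batch-based body the sample script sends, while `build_run_payload(APP_NAME, input_dir='<file path>')` produces the folder-based variant.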
Script excerpt
When run, this section of the script runs the app and processes all files in the specified batch. A job ID is returned, which is used to check job status.
```python
APP_NAME = '<EXAMPLE-APP-NAME>'  # Enter the app name.

# Step 1: Run app
run_app_payload = json.dumps({
    'batch_id': BATCH_ID,  # BATCH_ID from previous step
    'name': APP_NAME
    # optional: 'input_dir': '<file path>'. If using 'input_dir' to specify a file path, remove 'batch_id': BATCH_ID.
    # optional: 'output_workspace': '<workspace-name>'. If not specified, the default is used.
    # optional: 'owner': '<owner-name>'. If not specified, the default is your user ID.
})
url = f'{API_ROOT}/v2/zero-shot-idp/projects/app/run'
run_app_resp = requests.post(url, headers=API_HEADERS, data=run_app_payload)
job_id = run_app_resp.json().get('job_id')
```
User-defined values
You can define the following values in the script.
You don't need to define the `batch_id` value in this section of the script, as it's passed through from the results of the previous step.
| Parameter or variable | Type | Required | Description |
|---|---|---|---|
| `<EXAMPLE-APP-NAME>` | string | Yes | The name of the app to run. Defines the `APP_NAME` variable. |
| `input_dir` | string | No | The path of the input folder, in a connected drive or Instabase Drive. See using file paths as input for formatting guidance. If using this parameter to specify input, you must remove the `batch_id` parameter from the script. |
| `output_workspace` | string | No | The workspace in which to run the app. The output is saved to the default drive of the specified workspace. If not defined, the default is: community users run in and save to the personal workspace's Instabase Drive (`<USER-ID>/my-repo/Instabase Drive`); commercial users run in and save to the organization's default drive (`<ORGANIZATION-ID>/<USER-ID>/<DEFAULT-DRIVE>`). |
| `owner` | string | No | The account that generated the app. If not specified, defaults to your AI Hub username. For custom AI Hub apps belonging to you, accept the default. For public AI Hub apps published by Instabase, specify `instabase`. |
For more information about customizing an app run, including using webhooks and running apps in-memory, see the run apps endpoint.
Checking job status
During the app run, you can check the job status with the `/v1/jobs/status` endpoint. You can use the job ID returned by the previous step to check job status yourself, but the provided script polls the job status continuously until the status is `DONE`. When the processing files step is done, the get results step starts automatically.
You can also configure webhooks with the previous request to provide alerts when the app run is complete.
For more information about this endpoint, see the job status endpoint.
Script excerpt
When run, this section of the script checks the job status, and continues polling the job status until the job returns a status of `DONE`. There are no values you must define in the script, as the job ID is passed in from the previous response.
```python
url = f'{API_ROOT}/v1/jobs/status?type=flow&job_id={job_id}'
job_status = ''
while job_status != 'DONE':
    time.sleep(5)
    job_status_resp = requests.get(url, headers=API_HEADERS)
    job_status = job_status_resp.json().get('state')
```
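The loop above polls forever if the job never reaches `DONE` (for example, if the run fails). A more defensive sketch adds a timeout; the helper and its parameters are hypothetical, and no assumptions are made here about the API's failure states:

```python
import time

def poll_until_done(fetch_state, target='DONE', interval=5, timeout=600):
    """Call fetch_state() every `interval` seconds until it returns
    `target`; raise TimeoutError after `timeout` seconds elapse.
    Hypothetical helper, not part of the sample script."""
    deadline = time.monotonic() + timeout
    while True:
        state = fetch_state()
        if state == target:
            return state
        if time.monotonic() >= deadline:
            raise TimeoutError(f'job still {state!r} after {timeout}s')
        time.sleep(interval)
```

You might pass it `lambda: requests.get(url, headers=API_HEADERS).json().get('state')` as `fetch_state`.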
Getting results
After the app run is complete, you can fetch your results with the `/v1/flow_binary/results` endpoint.
For more information about this endpoint, see the run results endpoint.
Script excerpt
When run, this section of the script gets the results of your app run and returns them in JSON format. The API can return results in either JSON or CSV format.
```python
# Now the job is complete, get the results
output_folder_path = job_status_resp.json()['results'][0]['output_folder']
results_payload = json.dumps({
    'file_offset': 0,  # Results are paginated with a limit of 20 files per page. This parameter needs to be updated across requests for a job with >20 files.
    'ibresults_path': f'{output_folder_path}/batch.ibflowresults'
})
url = f'{API_ROOT}/v1/flow_binary/results'
results_resp = requests.post(url, headers=API_HEADERS, data=results_payload)
```
User-defined values
You can define the following values in the script.
| Parameter or variable | Type | Required | Description |
|---|---|---|---|
| `file_offset` | integer | No | Initial file index to start returning results from. Use this when dealing with large, paginated results that exceed the API's default limit of 20 files. Defaults to `0`. |
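For runs with more than 20 files, the pages can be collected in a loop by advancing `file_offset` in steps of 20. This generic sketch is hypothetical: it assumes you can extract each page's file entries from the response, and it stops when a short page comes back:

```python
def fetch_all_pages(fetch_page, page_size=20):
    """Collect paginated results (hypothetical helper). fetch_page(offset)
    must return the list of file entries starting at that offset."""
    items, offset = [], 0
    while True:
        page = fetch_page(offset)
        items.extend(page)
        if len(page) < page_size:
            return items
        offset += page_size
```

Here `fetch_page` would wrap the `POST` to `/v1/flow_binary/results`, passing `offset` as `file_offset` and pulling the file entries out of the JSON response.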
Complete workflow
This example Python script shows a complete, end-to-end workflow based on calls to AI Hub APIs. When run, this script performs the following tasks:
1. Creates a batch object and uploads files to the batch.
2. Runs the app you've specified using the batch as input.
3. Automatically polls the app run job's status for completion.
4. Returns the app run's results in JSON format.
User-defined values
To recap, this table outlines all values in the complete script that can or must be defined.
| Parameter or variable | Type | Required | Description |
|---|---|---|---|
| `<YOUR-API-TOKEN>` | string | Yes | An OAuth token. Defines the `API_TOKEN` variable. |
| `IB-Context` | string | No, but must be defined to use commercial context. | Defines the context (account) to use when completing the request. If you're a member of a commercial organization, enter your organization ID to use your commercial account to complete the request. If left blank or omitted, defaults to your user ID and your community account is used to complete the request. |
| `INPUT_FILEPATHS` | list of strings | Yes | A list of file paths to the input files to be uploaded to the created batch. For each file, specify the complete path to the file on the machine that's running the script. Each file can be up to 50 MB or 800 pages in size. For files larger than 10 MB, use the batches multipart file upload endpoint. |
| `<YOUR-BATCH-NAME>` | string | Yes | Name of the batch. Maximum length is 255 characters. |
| `<WORKSPACE-NAME>` | string | No | The name of the workspace in which to add the batch; the batch is created in the default drive of the workspace. If not specified, the default location for commercial users is the default drive of your personal workspace. For community users, the default location is your personal workspace's Instabase Drive. |
| `<EXAMPLE-APP-NAME>` | string | Yes | The name of the app to run. Defines the `APP_NAME` variable. |
| `output_workspace` | string | No | The workspace in which to run the app. The output is saved to the default drive of the specified workspace. If not defined, the default is: community users run in and save to the personal workspace's Instabase Drive (`<USER-ID>/my-repo/Instabase Drive`); commercial users run in and save to the organization's default drive (`<ORGANIZATION-ID>/<USER-ID>/<DEFAULT-DRIVE>`). |
| `input_dir` | string | No | The path of the input folder, in a connected drive or Instabase Drive. See using file paths as input for formatting guidance. If using this parameter to specify input, you must remove the `batch_id` parameter from the script. |
| `owner` | string | No | The account that generated the app. If not specified, defaults to your AI Hub username. For custom AI Hub apps belonging to you, accept the default. For public AI Hub apps published by Instabase, specify `instabase`. |
| `file_offset` | integer | No | Initial file index to start returning results from. Use this when dealing with large, paginated results that exceed the API's default limit of 20 files. Defaults to `0`. |
Complete script
```python
import json
import os
import time

import requests

API_TOKEN = '<YOUR-API-TOKEN>'  # Enter your API token.
APP_NAME = '<EXAMPLE-APP-NAME>'  # Enter your app name.
INPUT_FILEPATHS = ['<../sample1.pdf>', '<sample2.docx>', '<inner/folder/sample3.png>']  # Replace with file paths for the input files to be uploaded.
API_ROOT = 'https://aihub.instabase.com/api'
API_HEADERS = {
    'Authorization': f'Bearer {API_TOKEN}',
    'IB-Context': '<ORGANIZATION-ID or USER-ID>'  # Enter ORGANIZATION_ID to use your commercial account. Omit or use USER_ID to use your community account.
}

def create_batch(batch_name, workspace_name=None):
    resp = requests.put(f'{API_ROOT}/v2/batches', headers=API_HEADERS, data=json.dumps({'batch_name': batch_name, 'workspace_name': workspace_name}))
    return resp.json()['batch_id']

def upload_input_file(local_filepath, batch_id, filename):
    with open(local_filepath, 'rb') as input_file:
        requests.put(f'{API_ROOT}/v2/batches/{batch_id}/files/{filename}',
                     headers=API_HEADERS,
                     data=input_file.read())

def run_app(batch_id, app_name):
    run_app_payload = json.dumps({
        'batch_id': batch_id,
        'name': app_name
        # optional: 'input_dir': '<file path>'. If using 'input_dir' to specify a file path, omit 'batch_id': batch_id.
        # optional: 'output_workspace': '<workspace-name>'. If not specified, the default is used.
        # optional: 'owner': '<owner-name>'. If not specified, the default is your user ID.
    })
    url = f'{API_ROOT}/v2/zero-shot-idp/projects/app/run'
    run_app_resp = requests.post(url, headers=API_HEADERS, data=run_app_payload)
    job_id = run_app_resp.json().get('job_id')
    return job_id

def get_results(job_id):
    # Using the job ID from the app run, gets the current status. If complete, gets the results.
    url = f'{API_ROOT}/v1/jobs/status?type=flow&job_id={job_id}'
    job_status = ''
    while job_status != 'DONE':
        time.sleep(5)
        job_status_resp = requests.get(url, headers=API_HEADERS)
        job_status = job_status_resp.json().get('state')
    # Now the job is complete, gets the results.
    output_folder_path = job_status_resp.json()['results'][0]['output_folder']
    results_payload = json.dumps({
        'file_offset': 0,
        'ibresults_path': f'{output_folder_path}/batch.ibflowresults'
    })
    url = f'{API_ROOT}/v1/flow_binary/results'
    results_resp = requests.post(url, headers=API_HEADERS, data=results_payload)
    return results_resp.json()

if __name__ == "__main__":
    # 1. Creates a new batch.
    batch_id = create_batch('<YOUR-BATCH-NAME>', '<WORKSPACE-NAME>')  # Define a name for the batch and, optionally, specify a workspace in which to create the batch. If WORKSPACE-NAME is not defined, the default is accepted.
    # 2. Uploads files into batch.
    for input_filepath in INPUT_FILEPATHS:
        upload_input_file(input_filepath, batch_id, os.path.basename(input_filepath))
    # 3. Runs app using batch as input.
    job_id = run_app(batch_id, APP_NAME)
    # 4. Gets results of app run.
    results = get_results(job_id)
```