Apps quick start

The AI Hub APIs let you create and integrate an end-to-end solution. From uploading files to running an app, you can chain these API calls together in a script for a complete workflow. This quick start guide provides a full example script you can adapt, and walks you through each step, explaining each section of the script and any values you must define.

Info

While the scripts in this guide are written in Python, you can write them in any scripting language.

API setup

When making calls to AI Hub APIs, you use request headers to authorize and identify your requests.

  • Authorization is done using the Authorization header and is how you pass your API token. All requests must include an API token. You can generate and manage API tokens from your AI Hub user settings. In the sample script, an API_TOKEN variable is used.

  • Identification is done using the IB-Context header. This header is required for all commercial users who want to complete the call using the commercial organization’s context.

    • To use your commercial account to complete the request, set the IB-Context header as your organization ID.

    • To use your community account to complete the request, omit the header or define it as your user ID.

    • If the IB-Context header is undefined, your community account is used to complete the request.

    Tip

    You can find your user ID and organization ID on the APIs settings page.
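The context logic above can be sketched as a small helper that builds the request headers. `build_headers` is a hypothetical name used only for illustration, not part of the AI Hub API:

```python
def build_headers(api_token, ib_context=None):
    """Build AI Hub request headers.

    Omitting ib_context (or passing your user ID) targets your community
    account; passing your organization ID targets your commercial account.
    """
    headers = {'Authorization': f'Bearer {api_token}'}
    if ib_context is not None:
        headers['IB-Context'] = ib_context
    return headers

# Commercial context: pass the organization ID.
commercial = build_headers('my-token', 'my-org-id')
# Community context: omit the header entirely.
community = build_headers('my-token')
```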

Script excerpt

This section of the script is where you define the API_TOKEN variable’s value and can define the IB-Context header.

API_TOKEN = '<YOUR-API-TOKEN>' # Enter your API token.
API_ROOT = 'https://aihub.instabase.com/api'
API_HEADERS = {
    'Authorization': f'Bearer {API_TOKEN}',
    'IB-Context': '<ORGANIZATION-ID or USER-ID>' # Enter ORGANIZATION_ID to use your commercial account. Omit or use USER_ID to use your community account.
}

Uploading files to a batch

You can create a batch of files to use as input for your app run. In AI Hub, a batch is a user-defined group of files. You can add and remove files from the batch, but the batch itself has a constant ID, so you can easily and repeatedly use all files in the batch as an app’s input.

For more information, see the batches endpoint.

Script excerpt

When run, this section of the script makes a series of calls to the Batches endpoint to create a batch and upload files to it.

INPUT_FILEPATHS = ['<../sample1.pdf>', '<sample2.docx>', '<inner/folder/sample3.png>']  # Replace with file paths for the input files to be uploaded.
def create_batch(batch_name, workspace_name=None):
  resp = requests.put(f'{API_ROOT}/v2/batches', headers=API_HEADERS, data=json.dumps({'batch_name': batch_name, 'workspace_name': workspace_name }))
  return resp.json()['batch_id']

def upload_input_file(local_filepath, batch_id, filename):
  with open(local_filepath, 'rb') as input_file:
    requests.put(f'{API_ROOT}/v2/batches/{batch_id}/files/{filename}',
                 headers=API_HEADERS,
                 data=input_file.read())

BATCH_ID = create_batch('<YOUR-BATCH-NAME>', '<WORKSPACE-NAME>')  # Define a name for the batch and, optionally, specify a workspace in which to create the batch. If WORKSPACE-NAME is not defined, the default is accepted.
for input_filepath in INPUT_FILEPATHS:
  upload_input_file(input_filepath, BATCH_ID, os.path.basename(input_filepath))

User-defined values

You can define the following values in the script.

Parameter or variable Type Required Description
INPUT_FILEPATHS list of strings Yes A list of file paths to the input files to be uploaded to the created batch. For each file, specify the complete path to the file in the machine that’s running the script.

Each file can be up to 50 MB or 800 pages in size. For files larger than 10 MB, use the batches multipart file upload endpoint.

See limitations for complete information about storage and upload limits.
<YOUR-BATCH-NAME> string Yes Name of the batch. Maximum length is 255 characters.
<WORKSPACE-NAME> string No The name of the workspace in which to add the batch; the batch is created in the default drive of the workspace. If not specified, the default location for commercial users is the default drive of your personal workspace. For community users, the default location is your personal workspace’s Instabase Drive.

Processing files through an AI Hub app

To process your input files through an AI Hub app, call the /v2/zero-shot-idp/projects/app/run endpoint.

Info

When running apps, there are two methods to specify the app’s input:

  • Preferred: Using a batch ID and processing all files in the specified batch. The sample script uses this method and automatically passes through the batch ID created in the previous step.

  • Supported: Using the input_dir parameter to specify a file path within your Instabase Drive or a connected drive, and processing all files in the specified folder. The sample script must be adjusted to use this method.
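If you adapt the script to use the input_dir method instead, the payload might look like this minimal sketch; the folder path is a placeholder, and the batch_id parameter must be removed:

```python
import json

# Payload for the input_dir method: specify a folder path instead of a batch ID.
# The path shown is a placeholder; use a real path in your Instabase Drive or a
# connected drive.
run_app_payload = json.dumps({
    'name': '<EXAMPLE-APP-NAME>',
    'input_dir': '<path/to/input/folder>'
    # Note: 'batch_id' is omitted; the two input methods are mutually exclusive.
})
```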

Script excerpt

When run, this section of the script runs the app and processes all files in the specified batch. A job ID is returned, which is used to check the job status.

APP_NAME = '<EXAMPLE-APP-NAME>'  # Enter the app name.

# Step 1: Run app
run_app_payload = json.dumps({
      'batch_id': BATCH_ID, # BATCH_ID from previous step
      'name': APP_NAME
      # optional: 'input_dir': '<file path>', If using 'input_dir' to specify a file path, remove 'batch_id': BATCH_ID.
      # optional: 'output_workspace': '<workspace-name>', If not specified, the default is used.
      # optional: 'owner': '<owner-name>', If not specified, the default is your user ID.
    })

url = f'{API_ROOT}/v2/zero-shot-idp/projects/app/run'
run_app_resp = requests.post(url, headers=API_HEADERS, data=run_app_payload)

job_id = run_app_resp.json().get('job_id')

User-defined values

You can define the following values in the script.

Info

You don’t need to define the batch_id value in this section of the script, as it’s passed through from the results of the previous step.

Parameter or variable Type Required Description
<EXAMPLE-APP-NAME> string Yes The name of the app to run. Defines the APP_NAME variable.
input_dir string No The path of the input folder, in a connected drive or Instabase Drive. See using file paths as input for formatting guidance. If using this parameter to specify input, you must remove the batch_id parameter from the script.
output_workspace string No The workspace in which to run the app. The output is saved to the default drive of the specified workspace. If not defined, the default is:

- Community users: Runs in and saves to the personal workspace’s Instabase Drive (<USER-ID>/my-repo/Instabase Drive).

- Commercial users: Runs in and saves to the organization’s default drive (<ORGANIZATION-ID>/<USER-ID>/<DEFAULT-DRIVE>).
owner string No The account that generated the app. If not specified, defaults to your AI Hub username. For custom AI Hub apps belonging to you, accept the default. For public AI Hub apps published by Instabase, specify instabase.

For more information about customizing an app run, including using webhooks and running apps in-memory, see the run apps endpoint.

Checking job status

During the app run, you can check the job status with the /v1/jobs/status endpoint. You can use the job ID returned by the previous step to check the job status yourself, but the provided script polls the job status continuously until the status is DONE. When the file processing step is done, the get results step starts automatically.

Tip

You can also configure webhooks with the previous request to provide alerts when the app run is complete.

For more information about this endpoint, see the job status endpoint.

Script excerpt

When run, this section of the script checks the job status and continues polling until the job returns a status of DONE. There are no values to define in this section, as the job ID is passed in from the previous response.

url = f'{API_ROOT}/v1/jobs/status?type=flow&job_id={job_id}'

job_status = ''
while job_status != 'DONE':
    time.sleep(5)
    job_status_resp = requests.get(url, headers=API_HEADERS)
    job_status = job_status_resp.json().get('state')
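The loop above polls indefinitely, so a failed or stalled job would never let it exit. One way to bound it is a sketch like the following, where `wait_for_done` is a hypothetical helper that accepts any zero-argument status-fetching callable:

```python
import time

def wait_for_done(fetch_status, timeout_s=600, interval_s=5):
    """Poll fetch_status() until it returns 'DONE' or the timeout elapses.

    fetch_status is any zero-argument callable returning the job state, e.g.
    lambda: requests.get(url, headers=API_HEADERS).json().get('state').
    """
    deadline = time.monotonic() + timeout_s
    state = None
    while time.monotonic() < deadline:
        state = fetch_status()
        if state == 'DONE':
            return state
        time.sleep(interval_s)
    raise TimeoutError(f'Job not DONE after {timeout_s}s (last state: {state})')
```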

Getting results

After the app run is complete, you can fetch your results with the /v1/flow_binary/results endpoint.

For more information about this endpoint, see the run results endpoint.

Script excerpt

When run, this section of the script gets the results of your app run and returns them in JSON format. The API can return results in JSON or CSV format.

# Now the job is complete, get the results
output_folder_path = job_status_resp.json()['results'][0]['output_folder']

results_payload = json.dumps({
        'file_offset': 0, # Results are paginated with a limit of 20 files per page. This parameter needs to be updated across requests for a job with >20 files.
        'ibresults_path': f'{output_folder_path}/batch.ibflowresults'
})

url = f'{API_ROOT}/v1/flow_binary/results'
results_resp = requests.post(url, headers=API_HEADERS, data=results_payload)

User-defined values

You can define the following values in the script.

Parameter or variable Type Required Description
file_offset integer No Initial file index to start returning results from. Use this when dealing with large results that are paginated and exceed the default limit of 20 that is returned by the API. Defaults to 0.
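Because results are paginated at 20 files per page, a job with more files requires repeated requests with an increasing file_offset. A minimal sketch of that loop follows; it assumes the parsed response exposes per-file records under a key such as 'files', which is an assumption to check against the actual response schema:

```python
def fetch_all_results(fetch_page, page_size=20):
    """Collect results across pages by stepping file_offset in page_size increments.

    fetch_page(offset) is any callable that performs the POST to
    /v1/flow_binary/results with the given file_offset and returns the
    parsed JSON. The 'files' key below is an assumed response field.
    """
    all_files = []
    offset = 0
    while True:
        page = fetch_page(offset)
        files = page.get('files', [])  # assumed key; adjust to the real schema
        all_files.extend(files)
        if len(files) < page_size:  # a short page means the last page was reached
            break
        offset += page_size
    return all_files
```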

Complete workflow

This example Python script shows a complete, end-to-end workflow based on calls to AI Hub APIs. When run, this script performs the following tasks:

  • Creates a batch object and uploads files to the batch.

  • Runs the app you’ve specified using the batch as input.

  • Automatically polls the app run job’s status for completion.

  • Returns the app run’s results in JSON format.

User-defined values

To recap, this table outlines all values in the complete script that can or must be defined.

Parameter or variable Type Required Description
<YOUR-API-TOKEN> string Yes An OAuth token. Defines the API_TOKEN variable.
IB-Context string No, but must be defined to use commercial context. Defines the context (account) to use when completing the request. If you’re a member of a commercial organization, to use your commercial account to complete the request, enter your organization ID. If left blank or omitted, defaults to your user ID and your community account is used to complete the request.
INPUT_FILEPATHS list of strings Yes A list of file paths to the input files to be uploaded to the created batch. For each file, specify the complete path to the file in the machine that’s running the script.

Each file can be up to 50 MB or 800 pages in size. For files larger than 10 MB, use the batches multipart file upload endpoint.
<YOUR-BATCH-NAME> string Yes Name of the batch. Maximum length is 255 characters.
<WORKSPACE-NAME> string No The name of the workspace in which to add the batch; the batch is created in the default drive of the workspace. If not specified, the default location for commercial users is the default drive of your personal workspace. For community users, the default location is your personal workspace’s Instabase Drive.
<EXAMPLE-APP-NAME> string Yes The name of the app to run. Defines the APP_NAME variable.
output_workspace string No The workspace in which to run the app. The output is saved to the default drive of the specified workspace. If not defined, the default is:

- Community users: Runs in and saves to the personal workspace’s Instabase Drive (<USER-ID>/my-repo/Instabase Drive).

- Commercial users: Runs in and saves to the organization’s default drive (<ORGANIZATION-ID>/<USER-ID>/<DEFAULT-DRIVE>).
input_dir string No The path of the input folder, in a connected drive or Instabase Drive. See using file paths as input for formatting. If using this parameter to specify input, you must remove the batch_id parameter from the script.
owner string No The account that generated the app. If not specified, defaults to your AI Hub username. For custom AI Hub apps belonging to you, accept the default. For public AI Hub apps published by Instabase, specify instabase.
file_offset integer No Initial file index to start returning results from. Use this when dealing with large results that are paginated and exceed the default limit of 20 that is returned by the API. Defaults to 0.

Complete script

import requests
import time
import json
import os

API_TOKEN = '<YOUR-API-TOKEN>' # Enter your API token.
APP_NAME = '<EXAMPLE-APP-NAME>' # Enter your app name
INPUT_FILEPATHS = ['<../sample1.pdf>', '<sample2.docx>', '<inner/folder/sample3.png>']  # Replace with file paths for the input files to be uploaded.

API_ROOT = 'https://aihub.instabase.com/api'
API_HEADERS = {
    'Authorization': f'Bearer {API_TOKEN}',
    'IB-Context': '<ORGANIZATION-ID or USER-ID>' # Enter ORGANIZATION_ID to use your commercial account. Omit or use USER_ID to use your community account.
}

def create_batch(batch_name, workspace_name=None):
    resp = requests.put(f'{API_ROOT}/v2/batches', headers=API_HEADERS, data=json.dumps({'batch_name': batch_name, 'workspace_name': workspace_name }))
    return resp.json()['batch_id']

def upload_input_file(local_filepath, batch_id, filename):
    with open(local_filepath, 'rb') as input_file:
        requests.put(f'{API_ROOT}/v2/batches/{batch_id}/files/{filename}',
                        headers=API_HEADERS,
                        data=input_file.read())

def run_app(batch_id, app_name):
    run_app_payload = json.dumps({
      'batch_id': batch_id,
      'name': app_name,
      # optional: 'input_dir': '<file path>', If using 'input_dir' to specify a file path, omit 'batch_id': BATCH_ID.
      # optional: 'output_workspace': '<workspace-name>', If not specified, the default is used.
      # optional: 'owner': '<owner-name>', If not specified, the default is your user ID.
    })

    url = f'{API_ROOT}/v2/zero-shot-idp/projects/app/run'
    run_app_resp = requests.post(url, headers=API_HEADERS, data=run_app_payload)

    job_id = run_app_resp.json().get('job_id')
    return job_id

def get_results(job_id):
    # Using the Job ID from the app run, gets the current status. If complete, gets the results.
    url = f'{API_ROOT}/v1/jobs/status?type=flow&job_id={job_id}'

    job_status = ''
    while job_status != 'DONE':
        time.sleep(5)
        job_status_resp = requests.get(url, headers=API_HEADERS)
        job_status = job_status_resp.json().get('state')

    # Now the job is complete, gets the results.
    output_folder_path = job_status_resp.json()['results'][0]['output_folder']

    results_payload = json.dumps({
        'file_offset': 0,
        'ibresults_path': f'{output_folder_path}/batch.ibflowresults'
    })

    url = f'{API_ROOT}/v1/flow_binary/results'
    results_resp = requests.post(url, headers=API_HEADERS, data=results_payload)

    return results_resp.json()

if __name__ == "__main__":
    # 1. Creates a new batch.
    batch_id = create_batch('<YOUR-BATCH-NAME>', '<WORKSPACE-NAME>') # Define a name for the batch and, optionally, specify a workspace in which to create the batch. If WORKSPACE-NAME is not defined, the default is accepted.
    # 2. Uploads files into batch.
    for input_filepath in INPUT_FILEPATHS:
        upload_input_file(input_filepath, batch_id, os.path.basename(input_filepath))
    # 3. Runs app using batch as input.
    job_id = run_app(batch_id, APP_NAME)
    # 4. Gets results of app run.
    results = get_results(job_id)