Document Type Settings JSON Structure
This section describes how to create Settings JSON for the following Human Task types: Form Task, Classification Task, and Information Extraction Task.
Form Task
Below you can find an example of the Settings JSON for the Sample Form Task:
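The sketch below builds one such settings object as a Python dict and serializes it to JSON. It uses only the setting names described in this section; the group title, field names, labels, and values are hypothetical and chosen for illustration.

```python
import json

# Hypothetical Form Task settings; every name and value below is illustrative.
form_settings = {
    "autoSave": True,
    "taskTypeLabel": "Customer Form",
    "appLanguage": "en",
    "showOrderNumber": False,
    "groups": [
        {
            "groupTitle": "Contact details",
            "fields": [
                {"name": "full_name", "label": "Full name", "type": "input", "required": True},
                {
                    "name": "contact_method",
                    "label": "Preferred contact method",
                    "type": "select",
                    "autocomplete": True,
                    "items": [
                        {"value": "email", "label": "Email"},
                        {"value": "phone", "label": "Phone"},
                    ],
                },
                {"name": "notes", "label": "Notes", "type": "textarea", "minRows": 2, "maxRows": 6},
            ],
        }
    ],
}

# Serialize to the JSON text that would be supplied as Settings JSON.
settings_json = json.dumps(form_settings, indent=2)
print(settings_json)
```

Because every field name must be unique for the entire form, it can be worth checking the names across all groups before the settings are used.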
These settings contain:
- autoSave (boolean) (optional) - allows saving intermediate task results automatically. By default, the setting has a "false" value.
- taskTypeLabel (string) (optional) - sets the task title. By default, it is set to "Forms".
- appLanguage (string) (optional) - sets the Human Task localization. Currently available options are "en" and "ru". The default is "en".
- showOrderNumber (boolean) (optional) - allows displaying the position of the item in the group. By default, the setting has a "false" value.
- groups (list of objects) (required) - list of group settings. Each group in the list has the following structure:
  - groupTitle (string) (required) - the group name to display on the Human Task.
  - fields (list of objects) (required) - list of field settings. Each field in the list has the following structure:
    - name (string) (required) - the key used to retrieve the field's value from the output result. It is never displayed on the Human Task and must be unique for the entire form. Required for the following types: input, number_input, textarea, date, radio_group, select, checkbox, checkbox_group.
    - label (string) (required) - the label to display on the Human Task.
    - type (string) (required) - specifies the type of the field. Must be one of the supported types: input, number_input, textarea, date, radio_group, select, checkbox, checkbox_group, info.
    - required (boolean) (optional) - marks the field as mandatory. The user cannot submit the Human Task until all required fields are filled in.
    - multiple (boolean) (optional) - is used only if "type": "select". Allows selecting multiple items in the select dropdown list. Can be combined with the autocomplete setting.
    - autocomplete (boolean) (optional) - is used only if "type": "select". Allows searching the items in the select dropdown list. Can be combined with the multiple setting.
    - mask (string) (optional) - is used only if "type": "input". Sets a string of characters that defines the format of valid input values. Default format characters: "9" for 0-9 characters; "a" for A-Z and a-z characters; "*" for A-Z, a-z, 0-9 characters. To use a literal "9" character (e.g., in a phone code), put "\" before the character (e.g., "+4\9 99 999 99"). If "required": true and a mask are both set for "type": "input", the input must be filled in completely according to the mask.
    - minRows (number) (optional) - is used only if "type": "textarea". Specifies the minimum number of lines to display in the textarea. The default value is 2.
    - maxRows (number) (optional) - is used only if "type": "textarea". Specifies the maximum number of lines to display in the textarea before a scrollbar appears. If the content exceeds maxRows, a scrollbar appears. There is no default value; when maxRows is not set, the field grows without limit.
    - disabled (boolean) (optional) - disables the field so that the user cannot fill it in.
    - validationRegExp (string) (optional) - a regular expression used to validate the entered value.
    - description (string) (optional) - adds a description to the field. Can be added for the following types: input, number_input, textarea, date, radio_group, select, checkbox, checkbox_group.
    - errorMessage (string) (optional) - sets the error text shown when the entered value does not match the regular expression in the validationRegExp setting.
    - items (list of objects) (required only if "type" is "checkbox_group", "radio_group", or "select"; not used for other types) - specifies the items available for selection. Each item in the list has the following properties:
      - value (string) (required) - the value of the item.
      - label (string) (optional) - the label of the item to display. If no label is set, value is displayed instead. If both are set, label is displayed on the Human Task, but value is written to the Human Task output.
      - disabled (boolean) (optional) - disables the item so that the user cannot select it.
    - markLabel (string) (optional) - is used only if "type": "checkbox". Sets the label of the checkbox.
    - text (string or array) (optional) - is used only if "type": "info". Displays non-editable text in the info field. If text is an array of strings, each string is rendered as a separate paragraph.
    - keyboard (boolean) (optional) - is used only if "type": "date". Enables manual date entry in the date picker. By default, the setting has a "false" value.
    - bucket (string) (required) - is used only if "type": "file_input". Specifies the name of the MinIO S3 bucket in which the files added while filling out the form are stored.
    - path (string) (required) - is used only if "type": "file_input". Specifies the name of the folder in the MinIO S3 bucket in which the files added while filling out the form are stored.
    - accept (string) (optional) - is used only if "type": "file_input". Specifies the file types shown by the filter in File Explorer.
    - format (string) (optional) - is used only if "type": "date". Sets a custom date format. By default, the setting has a "MM/DD/YYYY" value. Supports date formatting tokens such as MM, DD, and YYYY, which make up the default value.
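The mask rules described for the mask setting can be illustrated with a small matcher. This is a sketch of the documented behavior, not the form component's own implementation: it assumes the three default format characters ("9", "a", and "*" for A-Z, a-z, 0-9), treats a backslash as making the next character literal, and treats every other character as fixed text. The function names are hypothetical.

```python
import re

# Default format characters as described for the mask setting.
FORMAT_CHARS = {"9": "[0-9]", "a": "[A-Za-z]", "*": "[A-Za-z0-9]"}


def mask_to_regex(mask: str) -> "re.Pattern[str]":
    """Translate a mask string into a regular expression for a completely filled value."""
    parts = []
    i = 0
    while i < len(mask):
        ch = mask[i]
        if ch == "\\" and i + 1 < len(mask):
            # A backslash makes the next character literal, e.g. "\9" is the digit 9.
            parts.append(re.escape(mask[i + 1]))
            i += 2
            continue
        # Format characters become character classes; anything else is fixed text.
        parts.append(FORMAT_CHARS.get(ch, re.escape(ch)))
        i += 1
    return re.compile("".join(parts))


def matches_mask(mask: str, value: str) -> bool:
    """Return True when the value fills the mask completely."""
    return mask_to_regex(mask).fullmatch(value) is not None


phone_mask = "+4\\9 99 999 99"  # in JSON text the backslash is likewise written as "\\"
print(matches_mask(phone_mask, "+49 12 345 67"))  # True
print(matches_mask(phone_mask, "+48 12 345 67"))  # False
```

The second call fails because the escaped "9" in the mask only accepts the literal digit 9, while the unescaped "9" characters accept any digit.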
Classification Task
Below you can find an example of the Settings JSON for the Sample Classification Task:
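As an illustration, the following sketch assembles a Classification Task settings object from the setting names described in this section and serializes it to JSON. The category names, labels, instruction text, and threshold are hypothetical. It also assumes that the keys of labels are the category names, and it falls back to the category name per entry when a label is missing.

```python
import json

# Hypothetical Classification Task settings; names and values are illustrative.
classification_settings = {
    "autoSave": False,
    "taskTypeLabel": "Document Classification",
    "taskInstructionText": "Select the category that best describes the document.",
    "appLanguage": "en",
    "multipleChoice": True,
    "scoreThreshold": 0.8,
    "categories": ["invoice", "receipt", "contract"],
    # Assumption: labels maps each category name to its display label.
    "labels": {"invoice": "Invoice", "receipt": "Receipt", "contract": "Contract"},
}

# Labels are optional; use the category name when no label is available.
display_labels = [
    classification_settings.get("labels", {}).get(category, category)
    for category in classification_settings["categories"]
]

settings_json = json.dumps(classification_settings, indent=2)
print(settings_json)
print(display_labels)
```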
These settings contain:
- autoSave (boolean) (optional) - allows saving intermediate task results automatically. By default, the setting has a "false" value.
- taskInstructionText (string) (optional) - the instruction text that appears in the popup.
- taskInstructionLink (string) (optional) - the link to a remote instructions source. If at least one of taskInstructionText or taskInstructionLink is set, the Instructions button appears.
- taskTypeLabel (string) (optional) - sets the task title. By default, it is set to "Document Classification".
- appLanguage (string) (optional) - sets the Human Task localization. Currently available options are "en" and "ru". The default is "en".
- multipleChoice (boolean) (optional) - enables multiple choice mode, in which several categories can be selected instead of one. By default, the setting has a "false" value.
- commentsSectionName (string) (optional) - renames the "More" tab. By default, the setting has a "More" value.
- categories (list of strings) (required) - list of predefined classes that your documents may belong to.
- labels (object) (optional) - map of labels for the predefined classes to display on the Human Task. If not specified, class names from categories are used as labels.
- scoreThreshold (decimal) (optional) - is used when multipleChoice is true to auto-select categories after ML. When multipleChoice is false, scoreThreshold has no effect because the category with the highest score is selected automatically.
- metadata (list of objects) (optional) - configures the "More" tab in Workspace and in the Document view in Document sets. The "More" tab is disabled if metadata is not defined. The "Invalid Document" checkbox on the "More" tab can be configured but cannot be removed. If the "Invalid Document" checkbox is checked, all validations are disabled.
Default "Invalid Document" checkbox can be customized using the settings below:
- name (string) (required) - declares the section that configures the name and description of the "Invalid Document" checkbox. The value is "isInvalid" and must not be changed; if it is changed, the markLabel and description settings fall back to their default values.
- markLabel (string) (optional) - sets the name of the default checkbox on the "More" tab. By default, the setting has an "Invalid Document" value.
- description (string) (optional) - adds a description to the default "Invalid Document" checkbox. By default, the description is not displayed.
Additional fields can be added to the "More" tab using the settings below:
- name (string) (required) - the key used to retrieve the field's data from the "More" tab in the Human Task output. Required for the following types: input, number_input, textarea, date, radio_group, select, checkbox, checkbox_group.
- label (string) (optional) - sets the display name of the field.
- type (string) (required) - specifies the type of the field. Must be one of the supported types: input, number_input, textarea, date, radio_group, select, checkbox, checkbox_group, info.
- required (boolean) (optional) - marks the field as mandatory on the "More" tab. The user cannot submit the Human Task until all required fields on the "More" tab are filled in, unless the "Invalid Document" checkbox is checked.
- multiple (boolean) (optional) - is used only if "type": "select". Allows selecting multiple items in the select dropdown list. Can be combined with the autocomplete setting.
- autocomplete (boolean) (optional) - is used only if "type": "select". Allows searching the items in the select dropdown list. Can be combined with the multiple setting.
- mask (string) (optional) - is used only if "type": "input". Sets a string of characters that defines the format of valid input values. Default format characters: "9" for 0-9 characters; "a" for A-Z and a-z characters; "*" for A-Z, a-z, 0-9 characters. To use a literal "9" character (e.g., in a phone code), put "\" before the character (e.g., "+4\9 99 999 99"). If "required": true and a mask are both set for "type": "input", the input must be filled in completely according to the mask.
- minRows (number) (optional) - is used only if "type": "textarea". Specifies the minimum number of lines to display in the textarea. The default value is 2.
- maxRows (number) (optional) - is used only if "type": "textarea". Specifies the maximum number of lines to display in the textarea before a scrollbar appears. If the content exceeds maxRows, a scrollbar appears. There is no default value; when maxRows is not set, the field grows without limit.
- disabled (boolean) (optional) - disables the field so that the user cannot fill it in.
- validationRegExp (string) (optional) - a regular expression used to validate the entered value.
- description (string) (optional) - adds a description to the field. Can be added for the following types: input, number_input, textarea, date, radio_group, select, checkbox, checkbox_group.
- errorMessage (string) (optional) - sets the error text shown when the entered value does not match the regular expression in the validationRegExp setting.
- items (list of objects) (required only if "type" is "checkbox_group", "radio_group", or "select"; not used for other types) - specifies the items available for selection. Each item in the list has the following properties:
  - value (string) (required) - the value of the item.
  - label (string) (optional) - the label of the item to display. If no label is set, value is displayed instead. If both are set, label is displayed on the Human Task, but value is written to the Human Task output.
  - disabled (boolean) (optional) - disables the item so that the user cannot select it.
- markLabel (string) (optional) - is used only if "type": "checkbox". Sets the label of the checkbox.
- text (string or array) (optional) - is used only if "type": "info". Displays non-editable text in the info field. If text is an array of strings, each string is rendered as a separate paragraph.
- keyboard (boolean) (optional) - is used only if "type": "date". Enables manual date entry in the date picker. By default, the setting has a "false" value.
- format (string) (optional) - is used only if "type": "date". Sets a custom date format. By default, the setting has a "MM/DD/YYYY" value. Supports date formatting tokens such as MM, DD, and YYYY, which make up the default value.
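The interaction between multipleChoice and scoreThreshold described earlier in this list can be sketched as a small selection function. The function name and the shape of the ML scores (a mapping from category to score) are hypothetical. Two further assumptions are marked in the comments: a score equal to the threshold counts as passing, and nothing is pre-selected in multiple choice mode when no threshold is set.

```python
def preselect_categories(scores, multiple_choice, score_threshold=None):
    """Return the categories to pre-select from ML scores (category -> score)."""
    if not scores:
        return []
    if not multiple_choice:
        # Single choice: the highest-scoring category is selected; the threshold is ignored.
        return [max(scores, key=scores.get)]
    if score_threshold is None:
        # Assumption: without a threshold nothing is pre-selected.
        return []
    # Assumption: a score equal to the threshold counts as passing.
    return [category for category, score in scores.items() if score >= score_threshold]


ml_scores = {"invoice": 0.91, "receipt": 0.84, "contract": 0.12}
print(preselect_categories(ml_scores, multiple_choice=True, score_threshold=0.8))
print(preselect_categories(ml_scores, multiple_choice=False, score_threshold=0.8))
```

With these scores the first call selects "invoice" and "receipt", and the second selects only "invoice".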
Below you can find an example of the Settings JSON for the Sample Classification Task with formatted metadata:
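A minimal sketch of settings with a configured "More" tab follows. It customizes the built-in "Invalid Document" checkbox and adds two fields. Only the setting names come from this section; the field names, labels, tab name, and categories are hypothetical.

```python
import json

metadata = [
    # Customizes the built-in "Invalid Document" checkbox; the name must stay "isInvalid".
    {
        "name": "isInvalid",
        "markLabel": "Unreadable document",
        "description": "Check if the document cannot be classified.",
    },
    {
        "name": "review_comment",
        "label": "Comment",
        "type": "textarea",
        "minRows": 2,
        "maxRows": 5,
    },
    {
        "name": "quality",
        "label": "Scan quality",
        "type": "radio_group",
        "required": True,
        "items": [
            {"value": "good", "label": "Good"},
            {"value": "poor", "label": "Poor"},
        ],
    },
]

settings = {
    "categories": ["invoice", "receipt"],
    "commentsSectionName": "Review",
    "metadata": metadata,
}
settings_json = json.dumps(settings, indent=2)
print(settings_json)
```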
Information Extraction Task
Below you can find an example of the Settings JSON for the Sample Information Extraction Task:
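The sketch below puts together an Information Extraction Task settings object from the setting names described in this section. All category names, patterns, and texts are hypothetical. It assumes that each regexFunctions value is stored as a string holding a JavaScript lambda that returns true for valid values, and that a hotkey entry is a single key name; neither detail is confirmed by this section.

```python
import json

# Hypothetical Information Extraction Task settings; names and values are illustrative.
ie_settings = {
    "taskTypeLabel": "Information Extraction",
    "autoSave": True,
    "allowCustomValue": True,
    "excludeUndefinedEntities": True,
    "appLanguage": "en",
    # Assumption: each validator is stored as a string holding a JavaScript lambda.
    "regexFunctions": {
        "isPositiveAmount": "(value) => parseFloat(value) > 0",
    },
    "categories": [
        {
            "name": "invoice_number",
            "label": "Invoice number",
            "multiple": False,
            "required": True,
            "regex": "^[A-Z]{2}-\\d{6}$",
            "helperText": "Expected format: two capital letters, a dash, six digits.",
            "iconType": "text",
        },
        {
            "name": "total_amount",
            "label": "Total amount",
            "multiple": False,
            "regexFnc": ["isPositiveAmount"],
            "iconType": "money",
            "hotkey": ["t"],  # assumption: a single key name
        },
        {"name": "line_item", "label": "Line item", "multiple": True, "iconType": "multiple"},
    ],
}

# Every name in regexFnc should refer to a key defined in regexFunctions.
unknown = [
    fn
    for category in ie_settings["categories"]
    for fn in category.get("regexFnc", [])
    if fn not in ie_settings["regexFunctions"]
]
settings_json = json.dumps(ie_settings, indent=2)
print(settings_json)
print(unknown)
```

An empty list on the last line means that all referenced validator names are defined.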
These settings contain:
- regexFunctions (object) (optional) - an object of "key": "value" pairs that defines custom field validators. Each key is a custom function name that can be referenced in the "regexFnc" option of a field under the "categories" setting. Each value is a custom JavaScript lambda function.
- taskInstructionText (string) (optional) - the instruction text that appears in the popup.
- taskInstructionLink (string) (optional) - the link to a remote instructions source. If at least one of taskInstructionText or taskInstructionLink is set, the Instructions button appears.
- taskTypeLabel (string) (optional) - sets the task title. By default, it is set to "Information Extraction".
- autoSave (boolean) (optional) - allows saving intermediate task results automatically. By default, the setting has a "false" value.
- allowCustomValue (boolean) (optional) - allows entering a custom value that is not linked to the input document image. This is useful when OCR fails to extract some text from the document.
- appLanguage (string) (optional) - sets the Human Task localization. Currently available options are "en" and "ru". The default is "en".
- excludeUndefinedEntities (boolean) (optional) - limits the output to the fields that are configured in the "categories" setting. By default, the setting has a "false" value.
- commentsSectionName (string) (optional) - renames the "More" tab.
- categories (list of objects) (required) - list of fields to extract from the document, where each item contains:
  - name (string) (required) - the key used to retrieve the field's data from the Human Task output.
  - label (string) (optional) - the label for the field displayed on the Human Task. If not specified, the value from name is used as the label.
  - multiple (boolean) (required) - indicates whether the field may have multiple values (used when the document contains a list of item details, e.g., a list of products).
  - required (boolean) (optional) - marks the field as mandatory for extraction. The user cannot submit the Human Task until values are specified for all required fields.
  - helperText (string) (optional) - additional text shown when the value is invalid according to the validators specified by the regexp settings.
  - regex (string) (optional) - specifies the regular expression used to validate values.
  - regexFnc (list of strings) (optional) - list of names of validator functions defined in "regexFunctions".
  - iconType (string) (optional) - sets the field icon. Currently available options are "date", "money", "multiple", and "text". The default icon is "text".
  - hotkey (list of strings) (optional) - list of keyboard shortcuts that can be pressed to select the item in the category.
- metadata (list of objects) (optional) - configures the "More" tab in Workspace and in the Document view in Document sets. The "More" tab is disabled if metadata is not defined. The "Invalid Document" checkbox on the "More" tab can be configured but cannot be removed. If the "Invalid Document" checkbox is checked, all validations are disabled.
Default "Invalid Document" checkbox can be customized using the settings below:
- name (string) (required) - declares the section that configures the name and description of the "Invalid Document" checkbox. The value is "isInvalid" and must not be changed; if it is changed, the markLabel and description settings fall back to their default values.
- markLabel (string) (optional) - sets the name of the default checkbox on the "More" tab. By default, the setting has an "Invalid Document" value.
- description (string) (optional) - adds a description to the default "Invalid Document" checkbox. By default, the description is not displayed.
Additional fields can be added to the "More" tab using the settings below:
- name (string) (required) - the key used to retrieve the field's data from the "More" tab in the Human Task output. Required for the following types: input, number_input, textarea, date, radio_group, select, checkbox, checkbox_group.
- label (string) (optional) - sets the display name of the field.
- type (string) (required) - specifies the type of the field. Must be one of the supported types: input, number_input, textarea, date, radio_group, select, checkbox, checkbox_group, info.
- required (boolean) (optional) - marks the field as mandatory on the "More" tab. The user cannot submit the Human Task until all required fields on the "More" tab are filled in, unless the "Invalid Document" checkbox is checked.
- multiple (boolean) (optional) - is used only if "type": "select". Allows selecting multiple items in the select dropdown list. Can be combined with the autocomplete setting.
- autocomplete (boolean) (optional) - is used only if "type": "select". Allows searching the items in the select dropdown list. Can be combined with the multiple setting.
- mask (string) (optional) - is used only if "type": "input". Sets a string of characters that defines the format of valid input values. Default format characters: "9" for 0-9 characters; "a" for A-Z and a-z characters; "*" for A-Z, a-z, 0-9 characters. To use a literal "9" character (e.g., in a phone code), put "\" before the character (e.g., "+4\9 99 999 99"). If "required": true and a mask are both set for "type": "input", the input must be filled in completely according to the mask.
- minRows (number) (optional) - is used only if "type": "textarea". Specifies the minimum number of lines to display in the textarea. The default value is 2.
- maxRows (number) (optional) - is used only if "type": "textarea". Specifies the maximum number of lines to display in the textarea before a scrollbar appears. If the content exceeds maxRows, a scrollbar appears. There is no default value; when maxRows is not set, the field grows without limit.
- disabled (boolean) (optional) - disables the field so that the user cannot fill it in.
- validationRegExp (string) (optional) - a regular expression used to validate the entered value.
- description (string) (optional) - adds a description to the field. Can be added for the following types: input, number_input, textarea, date, radio_group, select, checkbox, checkbox_group.
- errorMessage (string) (optional) - sets the error text shown when the entered value does not match the regular expression in the validationRegExp setting.
- items (list of objects) (required only if "type" is "checkbox_group", "radio_group", or "select"; not used for other types) - specifies the items available for selection. Each item in the list has the following properties:
  - value (string) (required) - the value of the item.
  - label (string) (optional) - the label of the item to display. If no label is set, value is displayed instead. If both are set, label is displayed on the Human Task, but value is written to the Human Task output.
  - disabled (boolean) (optional) - disables the item so that the user cannot select it.
- markLabel (string) (optional) - is used only if "type": "checkbox". Sets the label of the checkbox.
- text (string or array) (optional) - is used only if "type": "info". Displays non-editable text in the info field. If text is an array of strings, each string is rendered as a separate paragraph.
- keyboard (boolean) (optional) - is used only if "type": "date". Enables manual date entry in the date picker. By default, the setting has a "false" value.
- format (string) (optional) - is used only if "type": "date". Sets a custom date format. By default, the setting has a "MM/DD/YYYY" value. Supports date formatting tokens such as MM, DD, and YYYY, which make up the default value.
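The way required, validationRegExp, errorMessage, and the "Invalid Document" checkbox interact on the "More" tab can be sketched as follows. This is an illustration of the documented rules, not the product's code. It assumes the filled-in values arrive as a mapping keyed by field name with the checkbox stored under "isInvalid"; the function name and the fallback messages are hypothetical. Python's re module is used here, so a pattern written for a JavaScript engine may behave differently in edge cases.

```python
import re


def validate_more_tab(fields, values):
    """Return a dict of field name -> error text for a filled-in "More" tab."""
    if values.get("isInvalid"):
        # A checked "Invalid Document" checkbox disables all validations.
        return {}
    errors = {}
    for field in fields:
        name = field.get("name")
        value = values.get(name)
        if value in (None, "", []):
            if field.get("required"):
                errors[name] = "This field is required."  # hypothetical message
            continue
        pattern = field.get("validationRegExp")
        if pattern and isinstance(value, str) and not re.search(pattern, value):
            # Fallback text is hypothetical; errorMessage comes from the settings.
            errors[name] = field.get("errorMessage", "Invalid value.")
    return errors


fields = [
    {"name": "po_number", "type": "input", "required": True,
     "validationRegExp": "^PO-[0-9]{4}$", "errorMessage": "Use the format PO-1234."},
    {"name": "comment", "type": "textarea"},
]
print(validate_more_tab(fields, {"po_number": "1234"}))
print(validate_more_tab(fields, {"po_number": "1234", "isInvalid": True}))
```

The first call reports the configured errorMessage for po_number; the second returns no errors because the document is marked as invalid.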
Below you can find an example of the Settings JSON for the Sample Information Extraction Task with formatted metadata:
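As a sketch, the settings below add an info text, a date field, a masked input, and a multi-select to the "More" tab of an Information Extraction Task. The setting names come from this section; the field names, labels, and category are hypothetical. Note that a backslash in a mask is written as "\\" both in the Python literal and in the resulting JSON text.

```python
import json

ie_metadata = [
    # Keeps the default "Invalid Document" checkbox label explicit.
    {"name": "isInvalid", "markLabel": "Invalid Document"},
    # An info field only displays text, one paragraph per string.
    {"type": "info", "text": ["Fill in the fields below.", "All dates refer to the document date."]},
    {"name": "received_on", "label": "Received on", "type": "date",
     "format": "MM/DD/YYYY", "keyboard": True},
    {"name": "phone", "label": "Phone", "type": "input", "mask": "+4\\9 99 999 99"},
    {
        "name": "languages",
        "label": "Languages",
        "type": "select",
        "multiple": True,
        "autocomplete": True,
        "items": [
            {"value": "en", "label": "English"},
            {"value": "de", "label": "German"},
        ],
    },
]

settings = {
    "categories": [{"name": "invoice_number", "multiple": False}],
    "metadata": ie_metadata,
}
settings_json = json.dumps(settings, indent=2)
print(settings_json)
```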