Post-Processing

Overview

Post-processors are implementations of the PostProcessor<D extends DpDocument> interface that the Document Processor uses to change document content during document processing. Post-processors are plugged in through the document type settings JSON. Platform post-processors are typically used to refine and manipulate extracted data, improving the accuracy and relevance of the information obtained after the PREPROCESS and ML actions and during VALIDATION. This article provides technical information about the out-of-the-box (OOTB) post-processors.

Configuring Post-Processors in DocumentType JSON

To add post-processors to document processing, define a configuration in the DocumentType JSON. The configuration specifies the post-processors to execute and their settings. Each post-processor is referenced by its name and may require specific parameters. Post-processors can be added:

  • as the last step of the PREPROCESS action (Document Set processor and Automation Process flows)
  • as the last step of the ML action (Document Set processor and Automation Process flows)
  • in the VALIDATION flow of the Automation Process flows

Below is an example of how to specify the preprocessing, ML post-processing, and validation configuration in the DocumentType JSON:

DocumentType JSON
{
	// DocumentType Settings JSON
	. . . . .		
	"preprocessPostProcessors": [
		{
		"name": "removeWordIfConfidenceLessThan",
		"confidence": "50.0"
		}
	],
	"mlPostProcessors": [
		{
		"entityName": "Invoice Number",
		"name": "regexReplacement",
		"rules": {
			"o|O|e|c|C|Q|p|P": "0",
			"I|i|j": "1",
			"b|G": "6",
			"B": "8",
			"q": "9"
		}
		}
	 ],
	 "validators": [
		{
		"entityName": "Quantity",
		"name": "isBigDecimal",
		"strict": true,
		"message": {
			"severity": "error",
			"text": "Quantity should be a number."
		}
		}
	 ]
 }
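
To make the effect of the regexReplacement rules in the example above more concrete, here is a minimal sketch in plain Java. It assumes that the rules are applied one after another in declaration order, each through String#replaceAll. The order the platform actually uses is not stated in this article, and the names RegexReplacementSketch, apply, and invoiceNumberRules are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class RegexReplacementSketch {

    /** Applies every rule to the content, in the iteration order of the map (assumption). */
    static String apply(String content, Map<String, String> rules) {
        String result = content;
        for (Map.Entry<String, String> rule : rules.entrySet()) {
            // Key is the match regex, value is the replacement, as in String#replaceAll
            result = result.replaceAll(rule.getKey(), rule.getValue());
        }
        return result;
    }

    /** The "Invoice Number" rules from the sample configuration, in declaration order. */
    static Map<String, String> invoiceNumberRules() {
        Map<String, String> rules = new LinkedHashMap<>();
        rules.put("o|O|e|c|C|Q|p|P", "0");
        rules.put("I|i|j", "1");
        rules.put("b|G", "6");
        rules.put("B", "8");
        rules.put("q", "9");
        return rules;
    }

    public static void main(String[] args) {
        // Letters that OCR commonly confuses with digits are replaced by those digits
        System.out.println(apply("1O2I5b", invoiceNumberRules()));
    }
}
```

Note that such a rule set rewrites every matching letter, so it only makes sense for entities that are expected to contain digits only.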

Post-Processors specialization

A post-processor class can change any field of the DpDocument. However, because documents can belong to different Human Task Types, a post-processor handles only its specific htType and cannot be applied to another.

A post-processor can also be phase-oriented, that is, it changes, analyzes, or checks only inputJson, outputJson, or modelOutputJson, or all of them. The OOTB processors listed below are therefore grouped by phase, and each entry includes its specialization information.

PreProcessors

These are post-processors that work with inputJson but can also affect outputJson and modelOutputJson.

name | parameters | sample config | description
removeWordIfConfidenceLessThan

confidence - (required, string) the confidence threshold, a double value passed as a string
DocumentType JSON
{
	"name": "removeWordIfConfidenceLessThan",
	"confidence": "30.0"
}

Removes all words from the HOCR input JSON whose confidence is less than the provided confidence value.

Intended for use as a preprocessor, although it can also be used in other phases. It removes the words from entities (if any), changes the entities' content, and sets the custom value flag.

ie | classification | html-ie | html-classification
Yes | Yes | No | No

Implementation classes:

eu.ibagroup.easyrpa.ap.cldp.postprocessing.ClPostProcessingStrategies
eu.ibagroup.easyrpa.ap.iedp.postprocessing.MlIePostProcessingStrategies
removePages

pages - (required, string) the pages in the format "1,3-4,9-"
DocumentType JSON
{
	"name": "removePages",
	"pages": "1,3-4,7-"
}

Removes the specified pages. Intended for use as a preprocessor, although it can also be used in other phases. It removes the words from entities (if any), changes the entities' content, and sets the custom value flag.

ie | classification | html-ie | html-classification
Yes | Yes | No | No

Implementation classes:

eu.ibagroup.easyrpa.ap.cldp.postprocessing.ClPostProcessingStrategies
eu.ibagroup.easyrpa.ap.iedp.postprocessing.MlIePostProcessingStrategies
leaveOnlyPages

pages - (required, string) the pages in the format "1,3-4,9-"
DocumentType JSON
{
	"name": "leaveOnlyPages",
	"pages": "1,3-4,7-"
}

Leaves only the specified pages. Intended for use as a preprocessor, although it can also be used in other phases. It removes the words from entities (if any), changes the entities' content, and sets the custom value flag.

ie | classification | html-ie | html-classification
Yes | Yes | No | No

Implementation classes:

eu.ibagroup.easyrpa.ap.cldp.postprocessing.ClPostProcessingStrategies
eu.ibagroup.easyrpa.ap.iedp.postprocessing.MlIePostProcessingStrategies
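
The pages format used by removePages and leaveOnlyPages can be read as a comma-separated list of single pages and ranges. The sketch below shows one possible interpretation. It assumes 1-based page numbers and that an open-ended range such as "9-" runs to the last page. PageSpecSketch and matches are hypothetical names, not part of the platform API.

```java
final class PageSpecSketch {

    /** Returns true if the page number (1-based, assumption) is selected by a spec such as "1,3-4,9-". */
    static boolean matches(String spec, int page) {
        for (String part : spec.split(",")) {
            String range = part.trim();
            if (range.isEmpty()) {
                continue;
            }
            int dash = range.indexOf('-');
            if (dash < 0) {
                // Single page, for example "1"
                if (Integer.parseInt(range) == page) {
                    return true;
                }
                continue;
            }
            int from = Integer.parseInt(range.substring(0, dash).trim());
            String toText = range.substring(dash + 1).trim();
            // An empty upper bound ("9-") is treated as "up to the last page"
            int to = toText.isEmpty() ? Integer.MAX_VALUE : Integer.parseInt(toText);
            if (page >= from && page <= to) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        for (int page = 1; page <= 10; page++) {
            System.out.println(page + " selected: " + matches("1,3-4,9-", page));
        }
    }
}
```

Under this reading, removePages would drop the pages for which matches returns true, and leaveOnlyPages would drop all the others.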

PostProcessors

These are post-processors that work with modelOutputJson.

name | parameters | sample config | description
removeEntityIfContentEqualsTo

entityName - (required, string) target entity name to process

targetContent - (required, string) the target content that triggers the removal

DocumentType JSON
{
	"name": "removeEntityIfContentEqualsTo",
	"entityName": "Address",
	"targetContent": "Not existed"
}

Removes an entity in modelOutputJson if its content equals the target content.

ie | classification | html-ie | html-classification
Yes | No | Yes | No

Implementation classes:

eu.ibagroup.easyrpa.ap.iedp.postprocessing.MlIePostProcessingStrategies
eu.ibagroup.easyrpa.ap.iehtmldp.postprocessing.MlHtmlIePostProcessingStrategies
changeEntityContentIfContentEqualsTo

entityName - (required, string) target entity name to process

targetContent - (required, string) the target content that triggers the change

changeTo - (required, string) the new content to replace the target content

DocumentType JSON
{
	"name": "changeEntityContentIfContentEqualsTo",
	"entityName": "Payment Amount",
	"targetContent": "$10k",
	"changeTo": "$10.000"
}

Changes the content of an entity in modelOutputJson if its content equals the target content.

ie | classification | html-ie | html-classification
Yes | No | Yes | No

Implementation classes:

eu.ibagroup.easyrpa.ap.iedp.postprocessing.MlIePostProcessingStrategies
eu.ibagroup.easyrpa.ap.iehtmldp.postprocessing.MlHtmlIePostProcessingStrategies
changeEntityContentIfContentContains

entityName - (required, string) target entity name to process

targetContent - (required, string) the target content that triggers the change

changeTo - (required, string) the new content to replace the target content

DocumentType JSON
{
	"name": "changeEntityContentIfContentContains",
	"entityName": "Status",
	"targetContent": "IN-PROCESS",
	"changeTo": "IN PROGRESS"
}

Changes the content of an entity in modelOutputJson if its content contains the target content.

ie | classification | html-ie | html-classification
Yes | No | Yes | No

Implementation classes:

eu.ibagroup.easyrpa.ap.iedp.postprocessing.MlIePostProcessingStrategies
eu.ibagroup.easyrpa.ap.iehtmldp.postprocessing.MlHtmlIePostProcessingStrategies
removeLeadingAndTrailingNonAlphanumericCharacters

entityName - (required, string) target entity name to process
DocumentType JSON
{
	"name": "removeLeadingAndTrailingNonAlphanumericCharacters",
	"entityName": "Reference Number"
}

Removes leading and trailing non-alphanumeric characters from the target entity content in modelOutputJson.

ie | classification | html-ie | html-classification
Yes | No | Yes | No

Implementation classes:

eu.ibagroup.easyrpa.ap.iedp.postprocessing.MlIePostProcessingStrategies
eu.ibagroup.easyrpa.ap.iehtmldp.postprocessing.MlHtmlIePostProcessingStrategies
regexReplacement

entityName - (required, string) target entity name to process

rules - (required, map) the rules, each mapping a match regex to its replacement, as in String#replaceAll(String, String)

DocumentType JSON
{
	"name": "regexReplacement",
	"entityName": "Expression",
	"rules": {
		"\\+": "plus",
		"=": "equals",
		"\\s+": " ",
		"(l)(?<digitsAfterL>[\\d]+)": "1#{digitsAfterL}"
	}
}

Replaces the parts of an entity's content in modelOutputJson that match a regex from the rules.

ie | classification | html-ie | html-classification
Yes | No | Yes | No

Implementation classes:

eu.ibagroup.easyrpa.ap.iedp.postprocessing.MlIePostProcessingStrategies
eu.ibagroup.easyrpa.ap.iehtmldp.postprocessing.MlHtmlIePostProcessingStrategies
trim

entityName - (required, string) target entity name to process
DocumentType JSON
{
	"entityName": "Reference Number",
	"name": "trim"
}

Trims the entity value in modelOutputJson.

ie | classification | html-ie | html-classification
Yes | No | Yes | No

Implementation classes:

eu.ibagroup.easyrpa.ap.iedp.postprocessing.MlIePostProcessingStrategies
eu.ibagroup.easyrpa.ap.iehtmldp.postprocessing.MlHtmlIePostProcessingStrategies
tokens

entityName - (required, string) target entity name to process

regex - (optional, string) the regex used to split the content into tokens, default is " "

tokenIndexes - the indexes of the tokens to keep

DocumentType JSON
{
	"entityName": "Reference Number",
	"name": "tokens",
	"tokenIndexes": [
		1
	]
}

Keeps only the specified tokens of the entity content in modelOutputJson.

ie | classification | html-ie | html-classification
Yes | No | Yes | No

Implementation classes:

eu.ibagroup.easyrpa.ap.iedp.postprocessing.MlIePostProcessingStrategies
eu.ibagroup.easyrpa.ap.iehtmldp.postprocessing.MlHtmlIePostProcessingStrategies
removeEntities

entityNames - (required, array of strings) the entity names to remove
DocumentType JSON
{
	"name": "removeEntities",
	"entityNames": [
		"Product Description",
		"Price"
	]
}

Removes specified entities in modelOutputJson.

ie | classification | html-ie | html-classification
Yes | No | Yes | No

Implementation classes:

eu.ibagroup.easyrpa.ap.iedp.postprocessing.MlIePostProcessingStrategies
eu.ibagroup.easyrpa.ap.iehtmldp.postprocessing.MlHtmlIePostProcessingStrategies
mergeCloseEntities

entityName - (required, string) target entity name to process

width - (required, int) max horizontal offset in pixels between the bottom-left points

height - (required, int) max vertical offset in pixels between the bottom-left points

DocumentType JSON
{
	"entityName": "Product Description",
	"name": "mergeCloseEntities",
	"width": 5,
	"height": 30
}

Merges entities that are placed in the same table cell of a PDF but were tagged by the model as separate entities. Two entities can be recognized as belonging to the same table cell only if they are very close to each other, so entities in modelOutputJson are merged when they are close to each other.

ie | classification | html-ie | html-classification
Yes | No | No | No

Implementation classes:

eu.ibagroup.easyrpa.ap.iedp.postprocessing.MlIePostProcessingStrategies
mergeEntitiesIfContentBecomesEqualTo

entityName - (required, string) target entity name to process

targetContent - the target concatenated content that triggers the merge

DocumentType JSON
{
	"name": "mergeEntitiesIfContentBecomesEqualTo",
	"entityName": "Customer Name",
	"targetContent": "JOHNDOE"
}

Merges entities in modelOutputJson if their content becomes equal to the target concatenated content.

ie | classification | html-ie | html-classification
Yes | No | No | No

Implementation classes:

eu.ibagroup.easyrpa.ap.iedp.postprocessing.MlIePostProcessingStrategies
removePagesWithoutEntities


DocumentType JSON
{
	"name": "removePagesWithoutEntities"
}

Removes all pages that do not contain any entities in modelOutputJson.

ie | classification | html-ie | html-classification
Yes | No | No | No

Implementation classes:

eu.ibagroup.easyrpa.ap.iedp.postprocessing.MlIePostProcessingStrategies
ocrPositionBasedGrouping


DocumentType JSON
{
	"name": "ocrPositionBasedGrouping"
}

Applies the OCR position-based grouping in modelOutputJson to the OCR entities of the provided document type.

ie | classification | html-ie | html-classification
Yes | No | No | No

Implementation classes:

eu.ibagroup.easyrpa.ap.iedp.postprocessing.MlIePostProcessingStrategies
htmlTableRowBasedGrouping


DocumentType JSON
{
	"name": "htmlTableRowBasedGrouping"
}

Applies the HTML table row-based grouping in modelOutputJson to the HTML entities of the provided document type.

ie | classification | html-ie | html-classification
No | No | Yes | No

Implementation classes:

eu.ibagroup.easyrpa.ap.iehtmldp.postprocessing.MlHtmlIePostProcessingStrategies

renderContentForImageEntity


DocumentType JSON
{
	"name": "renderContentForImageEntity",
	"entityname": "Signature"
}

Replaces the content of the given entity in modelOutputJson with the S3 link to the rendered image, using the following logic:

  1. Finds the entities that have the given name, no content, and a root bbox. For each entity found:
  2. Finds the related page image link in the input
  3. Downloads the image and extracts a sub-image based on the bbox
  4. Saves the sub-image to S3
  5. Sets the entity content to the sub-image link

ie | classification | html-ie | html-classification
Yes | No | No | No

Implementation classes:

eu.ibagroup.easyrpa.ap.iedp.postprocessing.MlIePostProcessingStrategies
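
The width and height parameters of mergeCloseEntities, listed above, bound the distance between the bottom-left points of two entities. The following sketch shows how such a closeness check could look. It assumes that both offsets are compared inclusively and independently. CloseEntitiesSketch and areClose are hypothetical names, and the platform's actual merge logic may differ.

```java
final class CloseEntitiesSketch {

    /**
     * Returns true if two bottom-left points are within the allowed offsets.
     *
     * @param ax     x of the first entity's bottom-left point, in pixels
     * @param ay     y of the first entity's bottom-left point, in pixels
     * @param bx     x of the second entity's bottom-left point, in pixels
     * @param by     y of the second entity's bottom-left point, in pixels
     * @param width  max horizontal offset, as in the "width" parameter
     * @param height max vertical offset, as in the "height" parameter
     */
    static boolean areClose(int ax, int ay, int bx, int by, int width, int height) {
        // Inclusive comparison is an assumption of this sketch
        return Math.abs(ax - bx) <= width && Math.abs(ay - by) <= height;
    }

    public static void main(String[] args) {
        // Two lines of the same table cell: nearly the same left edge, one line apart
        System.out.println(areClose(100, 200, 103, 225, 5, 30));
        // Entities in neighboring columns: the horizontal offset is too large
        System.out.println(areClose(100, 200, 110, 225, 5, 30));
    }
}
```

With the sample values "width": 5 and "height": 30, the check favors entities stacked vertically with nearly the same left edge, which matches the multi-line table cell case described for this post-processor.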

Validators

These are post-processors that check outputJson.

name | parameters | sample config | description
mustBeReviewedByHuman


DocumentType JSON
{
	"name": "mustBeReviewedByHuman"
}

Sends the document to a human for review if it has not been reviewed by one yet.

ie | classification | html-ie | html-classification
Yes | Yes | Yes | Yes

Implementation classes:

eu.ibagroup.easyrpa.ap.dp.validation.CommonValidationStrategies
hasAllCategories

categories - (required, array of strings) the list of categories

message - the validation message map, which contains the severity string (info, warning, or error) and the text

DocumentType JSON
{
	"name": "hasAllCategories",
	"categories": [
		"News",
		"New"
	],
	"message": {
		"severity": "error",
		"text": "A error message"
	}
}

Checks that classified document has all the specified categories.

ie | classification | html-ie | html-classification
No | Yes | No | Yes

Implementation classes:

eu.ibagroup.easyrpa.ap.dp.validation.ClValidationStrategies
dontHaveAllCategories

categories - (required, array of strings) the list of categories

message - the validation message map, which contains the severity string (info, warning, or error) and the text

DocumentType JSON
{
	"name": "dontHaveAllCategories",
	"categories": [
		"News",
		"New"
	],
	"message": {
		"severity": "error",
		"text": "A error message"
	}
}

Checks that the classified document does not have all the specified categories.

ie | classification | html-ie | html-classification
No | Yes | No | Yes

Implementation classes:

eu.ibagroup.easyrpa.ap.dp.validation.ClValidationStrategies
hasAnyCategory

categories - (required, array of strings) the list of categories

message - the validation message map, which contains the severity string (info, warning, or error) and the text

DocumentType JSON
{
	"name": "hasAnyCategory",
	"categories": [
		"News",
		"New"
	],
	"message": {
		"severity": "error",
		"text": "A error message"
	}
}

Checks that the classified document has any of the specified categories.

ie | classification | html-ie | html-classification
No | Yes | No | Yes

Implementation classes:

eu.ibagroup.easyrpa.ap.dp.validation.ClValidationStrategies
atLeastCategoriesSelected

length - (required, integer) the minimum number of selected categories

message - the validation message map, which contains the severity string (info, warning, or error) and the text

DocumentType JSON
{
	"name": "atLeastCategoriesSelected",
	"length": 2,
	"message": {
		"severity": "error",
		"text": "A error message"
	}
}

Checks that the classified document has at least the minimum number of categories selected. (Useful for multi-class classification.)

ie | classification | html-ie | html-classification
No | Yes | No | Yes

Implementation classes:

eu.ibagroup.easyrpa.ap.dp.validation.ClValidationStrategies
noMoreThanCategoriesSelected

length - (required, integer) the maximum number of selected categories

message - the validation message map, which contains the severity string (info, warning, or error) and the text

DocumentType JSON
{
	"name": "noMoreThanCategoriesSelected",
	"length": 2,
	"message": {
		"severity": "error",
		"text": "A error message"
	}
}

Checks that the classified document has no more than the maximum number of categories selected. (Useful for multi-class classification.)

ie | classification | html-ie | html-classification
No | Yes | No | Yes

Implementation classes:

eu.ibagroup.easyrpa.ap.dp.validation.ClValidationStrategies
regexCheck

entityName - (required, string) target entity name to process

regex - (required, string) matching regex

message - the validation message map, which contains the severity string (info, warning, or error) and the text

DocumentType JSON
{
	"name": "regexCheck",
	"entityName": "Invoice Number",
	"regex": "\\d{6}",
	"message": {
		"severity": "error",
		"text": "A error message"
	}
}

Checks the entity value using regex.

ie | classification | html-ie | html-classification
Yes | No | Yes | No

Implementation classes:

eu.ibagroup.easyrpa.ap.iedp.validation.IeValidationStrategies
eu.ibagroup.easyrpa.ap.iehtmldp.validation.IeHtmlValidationStrategies
notEmpty

entityName - (required, string) target entity name to process

message - the validation message map, which contains the severity string (info, warning, or error) and the text

DocumentType JSON
{
	"name": "notEmpty",
	"entityName": "Invoice Number",
	"message": {
		"severity": "error",
		"text": "A error message"
	}
}

Checks that the entity is not null or empty.

ie | classification | html-ie | html-classification
Yes | No | Yes | No

Implementation classes:

eu.ibagroup.easyrpa.ap.iedp.validation.IeValidationStrategies
eu.ibagroup.easyrpa.ap.iehtmldp.validation.IeHtmlValidationStrategies
isInteger

entityName - (required, string) target entity name to process

strict - (optional, boolean, false by default) enables strict parsing, in which no related symbols are allowed

message - the validation message map, which contains the severity string (info, warning, or error) and the text

DocumentType JSON
{
	"name": "isInteger",
	"entityName": "Total",
	"message": {
		"severity": "error",
		"text": "A error message"
	}
}

Checks that the entity value can be parsed as an integer.

ie | classification | html-ie | html-classification
Yes | No | Yes | No

Implementation classes:

eu.ibagroup.easyrpa.ap.iedp.validation.IeValidationStrategies
eu.ibagroup.easyrpa.ap.iehtmldp.validation.IeHtmlValidationStrategies
isLong

entityName - (required, string) target entity name to process

strict - (optional, boolean, false by default) enables strict parsing, in which no related symbols are allowed

message - the validation message map, which contains the severity string (info, warning, or error) and the text

DocumentType JSON
{
	"name": "isLong",
	"entityName": "Total",
	"message": {
		"severity": "error",
		"text": "A error message"
	}
}

Checks that the entity value can be parsed as a long.

ie | classification | html-ie | html-classification
Yes | No | Yes | No

Implementation classes:

eu.ibagroup.easyrpa.ap.iedp.validation.IeValidationStrategies
eu.ibagroup.easyrpa.ap.iehtmldp.validation.IeHtmlValidationStrategies
isFloat

entityName - (required, string) target entity name to process

strict - (optional, boolean, false by default) enables strict parsing, in which no related symbols are allowed

message - the validation message map, which contains the severity string (info, warning, or error) and the text

DocumentType JSON
{
	"name": "isFloat",
	"entityName": "Total",
	"message": {
		"severity": "error",
		"text": "A error message"
	}
}

Checks that the entity value can be parsed as a float.

ie | classification | html-ie | html-classification
Yes | No | Yes | No

Implementation classes:

eu.ibagroup.easyrpa.ap.iedp.validation.IeValidationStrategies
eu.ibagroup.easyrpa.ap.iehtmldp.validation.IeHtmlValidationStrategies
isDouble

entityName - (required, string) target entity name to process

strict - (optional, boolean, false by default) enables strict parsing, in which no related symbols are allowed

message - the validation message map, which contains the severity string (info, warning, or error) and the text

DocumentType JSON
{
	"name": "isDouble",
	"entityName": "Total",
	"message": {
		"severity": "error",
		"text": "A error message"
	}
}

Checks that the entity value can be parsed as a double.

ie | classification | html-ie | html-classification
Yes | No | Yes | No

Implementation classes:

eu.ibagroup.easyrpa.ap.iedp.validation.IeValidationStrategies
eu.ibagroup.easyrpa.ap.iehtmldp.validation.IeHtmlValidationStrategies
isBigDecimal

entityName - (required, string) target entity name to process

strict - (optional, boolean, false by default) enables strict parsing, in which no related symbols are allowed

message - the validation message map, which contains the severity string (info, warning, or error) and the text

DocumentType JSON
{
	"name": "isBigDecimal",
	"entityName": "Invoice Number",
	"message": {
		"severity": "error",
		"text": "A error message"
	}
}

Checks that the entity value can be parsed as a BigDecimal.

ie | classification | html-ie | html-classification
Yes | No | Yes | No

Implementation classes:

eu.ibagroup.easyrpa.ap.iedp.validation.IeValidationStrategies
eu.ibagroup.easyrpa.ap.iehtmldp.validation.IeHtmlValidationStrategies
isDate

entityName - (required, string) target entity name to process

formats - (required, list of strings) the accepted date formats; the check fails if none of them matches

message - the validation message map, which contains the severity string (info, warning, or error) and the text

DocumentType JSON
{
	"name": "isDate",
	"entityName": "Invoice Date",
	"formats": [
		"dd/MM/yyyy",
		"yyyy-MM-dd",
		"dd MMM, yyyy"
	],
	"message": {
		"severity": "error",
		"text": "A error message"
	}
}

Checks that the entity value can be parsed as a date in one of the specified formats.

ie | classification | html-ie | html-classification
Yes | No | Yes | No

Implementation classes:

eu.ibagroup.easyrpa.ap.iedp.validation.IeValidationStrategies
eu.ibagroup.easyrpa.ap.iehtmldp.validation.IeHtmlValidationStrategies
isCurrency

entityName - (required, string) target entity name to process

message - the validation message map, which contains the severity string (info, warning, or error) and the text

DocumentType JSON
{
	"name": "isCurrency",
	"entityName": "Total Currency",
	"message": {
		"severity": "error",
		"text": "A error message"
	}
}

Checks that the entity value could be parsed as a currency.

ie | classification | html-ie | html-classification
Yes | No | Yes | No

Implementation classes:

eu.ibagroup.easyrpa.ap.iedp.validation.IeValidationStrategies
eu.ibagroup.easyrpa.ap.iehtmldp.validation.IeHtmlValidationStrategies
isAmount

entityName - (required, string) target entity name to process

strict - (optional, boolean, false by default) enables strict parsing, in which no related symbols are allowed

message - the validation message map, which contains the severity string (info, warning, or error) and the text

DocumentType JSON
{
	"name": "isAmount",
	"entityName": "Total",
	"message": {
		"severity": "error",
		"text": "A error message"
	}
}

Checks that the entity value could be parsed as an amount.

ie | classification | html-ie | html-classification
Yes | No | Yes | No

Implementation classes:

eu.ibagroup.easyrpa.ap.iedp.validation.IeValidationStrategies
eu.ibagroup.easyrpa.ap.iehtmldp.validation.IeHtmlValidationStrategies
isSimilarTo

entityName - (required, string) target entity name to process

possibleValues - (list of strings) the possible values

message - the validation message map, which contains the severity string (info, warning, or error) and the text

DocumentType JSON
{
	"name": "isSimilarTo",
	"entityName": "Invoice Type",
	"possibleValues": [
		"Invoice",
		"Billing"
	],
	"message": {
		"severity": "error",
		"text": "A error message"
	}
}

Checks that the entity value could be one of the specified strings.

ie | classification | html-ie | html-classification
Yes | No | Yes | No

Implementation classes:

eu.ibagroup.easyrpa.ap.iedp.validation.IeValidationStrategies
eu.ibagroup.easyrpa.ap.iehtmldp.validation.IeHtmlValidationStrategies
lengthMoreThan

entityName - (required, string) target entity name to process

length - (required, int) the length to check

message - the validation message map, which contains the severity string (info, warning, or error) and the text

DocumentType JSON
{
	"name": "lengthMoreThan",
	"entityName": "Invoice Number",
	"length": 9,
	"message": {
		"severity": "error",
		"text": "A error message"
	}
}

Checks that the entity value is longer than N symbols.

ie | classification | html-ie | html-classification
Yes | No | Yes | No

Implementation classes:

eu.ibagroup.easyrpa.ap.iedp.validation.IeValidationStrategies
eu.ibagroup.easyrpa.ap.iehtmldp.validation.IeHtmlValidationStrategies
lengthLessThan

entityName - (required, string) target entity name to process

length - (required, int) the length to check

message - the validation message map, which contains the severity string (info, warning, or error) and the text

DocumentType JSON
{
	"name": "lengthLessThan",
	"entityName": "Invoice Number",
	"length": 10,
	"message": {
		"severity": "error",
		"text": "A error message"
	}
}

Checks that the entity value is shorter than N symbols.

ie | classification | html-ie | html-classification
Yes | No | Yes | No

Implementation classes:

eu.ibagroup.easyrpa.ap.iedp.validation.IeValidationStrategies
eu.ibagroup.easyrpa.ap.iehtmldp.validation.IeHtmlValidationStrategies
hasAllValues

entityName - (required, string) target entity name to process

mandatoryValues - (list of strings) the mandatory values

message - the validation message map, which contains the severity string (info, warning, or error) and the text

DocumentType JSON
{
	"name": "hasAllValues",
	"entityName": "Signature",
	"mandatoryValues": [
		"ceoSignature",
		"cooSignature"
	],
	"message": {
		"severity": "error",
		"text": "Document must be signed by both CEO and COO."
	}
}

Checks that the entity values contain all of the specified strings.

ie | classification | html-ie | html-classification
Yes | No | No | No

Implementation classes:

eu.ibagroup.easyrpa.ap.iedp.validation.IeValidationStrategies
autoMapEntitiesToDocument

 

DocumentType JSON
{
	"name": "autoMapEntitiesToDocument"
}

This post-processor performs no actual validation. Instead, for a valid document it attempts to auto-map all extracted values into an AP default datastore entity according to the following rules:

  • each extracted value name is first matched to a datastore entity field annotated with '@MapFromExtractedEntity' whose value is exactly the same as the name
  • if there is no match, the normalized name is matched to any entity column field with the same normalized name (normalization here means lowercasing and removing spaces)
  • if a match is found, the extracted value is set to the datastore entity field using the following logic:
    • it is converted with the parser specified in the field's @MapFromExtractedEntity annotation
    • if no parser is specified, a standard parser is applied if one is found for the field type
    • if the field type is String, no conversion is applied

ie | classification | html-ie | html-classification
Yes | No | No | No

Implementation classes:

eu.ibagroup.easyrpa.ap.iedp.validation.IeValidationStrategies
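
The name-matching rules of autoMapEntitiesToDocument listed above can be illustrated with a small, self-contained sketch. It models the annotated fields as a plain map from annotation value to field name and the column fields as a list of names. Both structures, and the names AutoMapSketch, normalize, and findField, are assumptions made for illustration. The platform works with real annotations on the datastore entity, and whether normalization applies to the Java field name or to the column name is not stated in this article.

```java
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Optional;

final class AutoMapSketch {

    /** Normalization as described in the rules: lowercasing and removing spaces. */
    static String normalize(String name) {
        return name.toLowerCase(Locale.ROOT).replace(" ", "");
    }

    /**
     * Resolves the target field for an extracted value name.
     *
     * @param annotatedFields annotation value to field name (stand-in for annotated fields)
     * @param columnFields    names of all entity column fields
     */
    static Optional<String> findField(String extractedName,
                                      Map<String, String> annotatedFields,
                                      List<String> columnFields) {
        // Rule 1: exact match against the annotation value
        String exact = annotatedFields.get(extractedName);
        if (exact != null) {
            return Optional.of(exact);
        }
        // Rule 2: match by normalized name
        String wanted = normalize(extractedName);
        return columnFields.stream()
                .filter(field -> normalize(field).equals(wanted))
                .findFirst();
    }

    public static void main(String[] args) {
        System.out.println(findField("Invoice Number",
                Map.of(), List.of("total", "invoiceNumber")));
    }
}
```

The conversion step (parser lookup by annotation or field type) is left out of the sketch, because it depends on parser classes that are not described here.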

Creating Custom Post-Processors

Post-processors are implementations of the PostProcessor<D extends DpDocument> interface, but depending on their specialization they also need access to entities, categories, validation messages, and so on. Here are the common steps to create post-processor and validator classes:

  1. Create a new Java class in your Automation Process project (it can be placed in any package) for the custom post-processor.
  2. Extend the class from BasePostProcessor<D extends DpDocument>.
  3. Use the document and repository classes from your project (if you extend them), or DpDocument and DpDocumentRepository if not.
  4. Annotate the class with @PostProcessorStrategies to indicate that it contains post-processor methods, and specify in the annotation value the human task types that your post-processors support (an empty value or several htTypes means multiple types, a case that requires deeper knowledge and is not covered here).
  5. To bring in the OOTB methods, implement the interface according to the following table:

    Human Task Type | PostProcessor | Validator
    ie | IePostProcessorBase<D extends DpDocument> | IeValidatorBase<D extends DpDocument>
    classification | ClPostProcessorBase<D extends DpDocument> | ClValidatorBase<D extends DpDocument>
    html-ie | IeHtmlPostProcessorBase<D extends DpDocument> | IeHtmlValidatorBase<D extends DpDocument>
    html-classification | ClPostProcessorBase<D extends DpDocument> | ClValidatorBase<D extends DpDocument>
  6. Implement one or more post-processor methods within the class. Each method should correspond to a specific piece of post-processing logic.
  7. Annotate each post-processor method with @PostProcessorMethod("myPostProcessorName"), where "myPostProcessorName" is the unique name of the custom post-processor.
  8. Annotate the parameters that must be provided through the configuration in the DocumentType JSON with @PostProcessorParameter("paramName").

Here is an example of a custom IE post-processor Java class:

MyCustomPostProcessor.java
import eu.ibagroup.easyrpa.ap.dp.util.ie.postprocessing.execution.PostProcessorMethod;
import eu.ibagroup.easyrpa.ap.dp.util.ie.postprocessing.execution.PostProcessorParameter;
import eu.ibagroup.easyrpa.ap.dp.util.ie.postprocessing.execution.PostProcessorStrategies;

@PostProcessorStrategies("ie")
public class MyCustomPostProcessor extends BasePostProcessor<MyDpDocument> implements IePostProcessorBase<MyDpDocument> {

	@Inject
	@Getter
	private MyDpDocumentSetRepository documentRepository;

	@PostProcessorMethod("myCustomPostProcessor")
	public void myCustomPostProcessor(@PostProcessorParameter("paramName") String paramValue) {
		// Custom post-processing logic here
		// Use paramValue as needed
	}

}

public interface MyDpDocumentSetRepository extends DpDocumentRepository<MyDpDocument> {
}

@Data
@Entity(value = "MY_DOCUMENTS")
public class MyDpDocument extends DpDocument {

	@Column("my_column")
	private String myColumn;
}

Here, implementing IePostProcessorBase<MyDpDocument> gives the class simple access to the extracted entities through the ExtractedEntities interface.

Using Custom Post-Processors in DocumentType JSON

After creating custom post-processors, developers can use them by configuring the DocumentType JSON. The configuration includes the custom post-processor names and the required parameters. A custom post-processor listed under mlPostProcessors is executed automatically during the ML post-processing phase.

Below is an example of how to specify a custom post-processor in the DocumentType JSON:

DocumentType JSON
{
	"mlPostProcessors": [
	{
		"name": "myCustomPostProcessor",
		"paramName": "paramValue"
		// Additional configuration for the custom post-processor can be added here
	}
	// Additional post-processors and their configurations can be added here
	]
}
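To make the link between the JSON keys and the annotated Java method more concrete, here is a minimal sketch of name-based dispatch. It is only an illustration under stated assumptions, not the platform's actual mechanism: the two annotations are local stand-ins with the same simple names as the platform ones, the configuration is modelled as a plain Map<String, String>, and the invoke helper is hypothetical.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Method;
import java.lang.reflect.Parameter;
import java.util.Map;

// Illustrative sketch only: the annotations are local stand-ins and the
// dispatch logic is an assumption, not the platform implementation.
final class PostProcessorDispatchSketch {

    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.METHOD)
    @interface PostProcessorMethod {
        String value();
    }

    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.PARAMETER)
    @interface PostProcessorParameter {
        String value();
    }

    /** Hypothetical post-processor class that records the value it received. */
    static final class MyCustomPostProcessor {
        String lastValue;

        @PostProcessorMethod("myCustomPostProcessor")
        public void myCustomPostProcessor(@PostProcessorParameter("paramName") String paramValue) {
            lastValue = paramValue;
        }
    }

    /** Finds the method named by config "name" and binds its arguments from the other keys. */
    static void invoke(Object target, Map<String, String> config) {
        String name = config.get("name");
        for (Method method : target.getClass().getDeclaredMethods()) {
            PostProcessorMethod marker = method.getAnnotation(PostProcessorMethod.class);
            if (marker == null || !marker.value().equals(name)) {
                continue;
            }
            Parameter[] parameters = method.getParameters();
            Object[] args = new Object[parameters.length];
            for (int i = 0; i < parameters.length; i++) {
                PostProcessorParameter key = parameters[i].getAnnotation(PostProcessorParameter.class);
                // Parameters without the annotation are left as null in this sketch.
                args[i] = key == null ? null : config.get(key.value());
            }
            try {
                method.invoke(target, args);
            } catch (ReflectiveOperationException e) {
                throw new IllegalStateException("Post-processor call failed: " + name, e);
            }
            return;
        }
        throw new IllegalArgumentException("No post-processor named " + name);
    }

    public static void main(String[] args) {
        MyCustomPostProcessor processor = new MyCustomPostProcessor();
        invoke(processor, Map.of("name", "myCustomPostProcessor", "paramName", "paramValue"));
        System.out.println(processor.lastValue); // prints paramValue
    }
}
```

In this sketch the "name" key selects the method and every other key is matched to a parameter by the value of its @PostProcessorParameter annotation, which is why the JSON key and the annotation value must be spelled identically.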

Test Post-Processors

You can experiment with post-processors and check how they work on your own data.

Test PreProcessors and PostProcessors

This method lets you try out both pre-processors and post-processors.

  1. Upload your documents into a document set whose settings correspond to your documents.
  2. Then get the test package from Nexus: https://<CS host>/nexus/repository/rpaplatform/eu/ibagroup/samples/ap/easy-rpa-test-ap/<version>/easy-rpa-test-ap-<version>-bin.zip and import the following automation processes:
    • Cl PP Test Document Processor;
    • Cl HTML PP Test Document Processor;
    • IE PP Test Document Processor;
    • IE HTML PP Test Document Processor.

These are test document processors for post-processors. They differ from the usual ones in that the call to the ML task is switched off, but they still call the ML post-processors.

The testing process looks as follows:

  1. Switch the Document Processor to the corresponding test one;
  2. Import your documents into the document set;
  3. Update the document set's document type by adding the pre-processors and post-processors to test;
  4. Call PREPROCESS for the document;
  5. Check the results of your pre-processors;
  6. Open the document in Human Task (using the document set button);
  7. Tag the document using model output (you could also tag human output if your post-processor needs it);
  8. Call ML; only the post-processors are called (the output you prepared for testing is kept);
  9. Check the results of your post-processors.

Test Validators

  1. Prepare your document set in the same way as described in the section above;
  2. Call PREPROCESS so that you can prepare the output;
  3. Add your validator to mlPostProcessors;
  4. Call ML;
  5. Check the corresponding output using the document set button.