Post-Processing
Overview
Post-processors are instances of the PostProcessor<D extends DpDocument> interface that the Document Processor uses to change document content during document processing. Post-processors are plugged in through the document type settings JSON. Platform post-processors are usually used to refine and manipulate extracted data, improving the accuracy and relevance of the information obtained after the PREPROCESS and ML actions and during VALIDATION. This article provides technical information about the out-of-the-box (OOTB) post-processors.
Configuring Post-Processors in DocumentType JSON
To add post-processors to document processing, define a configuration in the DocumentType JSON. The configuration specifies the post-processors to be executed and their settings. Each post-processor is referenced by its name and may require specific parameters. Post-processors can be added:
- at the last step of the PREPROCESS action (Document Set processor and Automation Process flows)
- at the last step of the ML action (Document Set processor and Automation Process flows)
- in the VALIDATION flow of the Automation Process flows
Below is an example of how to specify the pre-processing, ML post-processing, and validation configuration in the DocumentType JSON:

    { // DocumentType Settings JSON
      . . . . .
      "preprocessPostProcessors": [
        {
          "name": "removeWordIfConfidenceLessThan",
          "confidence": "50.0"
        }
      ],
      "mlPostProcessors": [
        {
          "entityName": "Invoice Number",
          "name": "regexReplacement",
          "rules": {
            "o|O|e|c|C|Q|p|P": "0",
            "I|i|j": "1",
            "b|G": "6",
            "B": "8",
            "q": "9"
          }
        }
      ],
      "validators": [
        {
          "entityName": "Quantity",
          "name": "isBigDecimal",
          "strict": true,
          "message": {
            "severity": "error",
            "text": "Quantity should be a number."
          }
        }
      ]
    }
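The regexReplacement rules in this example map characters that OCR commonly confuses with digits. The effect can be illustrated with a minimal sketch, assuming the rules are applied one after another in the order they are listed, each through String#replaceAll. The class and method names below are hypothetical and are not part of the platform API.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Illustrative sketch only; not the platform's implementation.
class RegexRulesSketch {

    // Applies each rule in insertion order using String#replaceAll (assumed order).
    static String applyRules(String content, Map<String, String> rules) {
        String result = content;
        for (Map.Entry<String, String> rule : rules.entrySet()) {
            result = result.replaceAll(rule.getKey(), rule.getValue());
        }
        return result;
    }

    // The rules from the "Invoice Number" example configuration.
    static Map<String, String> invoiceNumberRules() {
        Map<String, String> rules = new LinkedHashMap<>();
        rules.put("o|O|e|c|C|Q|p|P", "0");
        rules.put("I|i|j", "1");
        rules.put("b|G", "6");
        rules.put("B", "8");
        rules.put("q", "9");
        return rules;
    }

    public static void main(String[] args) {
        // A misread invoice number becomes all digits.
        System.out.println(applyRules("1O0I5bBq", invoiceNumberRules())); // prints 10015689
    }
}
```

Such rules only make sense for entities that are expected to contain digits exclusively, because every matching letter is replaced.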
Post-Processors specialization
A post-processor class can change any field of the DpDocument. However, because documents can belong to different Human Task Types, a post-processor handles only its specific htType and cannot be applied to another.
A post-processor can also be phase-oriented, i.e. it changes, analyses, or checks only inputJson, outputJson, or modelOutputJson, or all of them. The OOTB processors listed below are therefore grouped by phase, with specialization information for each.
PreProcessors
Pre-processors are post-processors that work with inputJson, but they can also affect outputJson and modelOutputJson.
name | parameters | sample config | description
---|---|---|---
removeWordIfConfidenceLessThan | confidence - (required, string) the confidence threshold, a double value passed as a string | { "name": "removeWordIfConfidenceLessThan", "confidence": "30.0" } | Removes from the HOCR input JSON all words whose confidence is less than the provided value. Intended for use as a pre-processor, but can also be used in other phases. It also removes the words from entities (if any), changes the entities' content, and sets the custom value flag. Implementation classes: eu.ibagroup.easyrpa.ap.cldp.postprocessing.ClPostProcessingStrategies, eu.ibagroup.easyrpa.ap.iedp.postprocessing.MlIePostProcessingStrategies
removePages | pages - (required, string) the pages in a format such as "1,3-4,9-" | { "name": "removePages", "pages": "1,3-4,7-" } | Removes the specified pages. Intended for use as a pre-processor, but can also be used in other phases. It also removes the words from entities (if any), changes the entities' content, and sets the custom value flag. Implementation classes: eu.ibagroup.easyrpa.ap.cldp.postprocessing.ClPostProcessingStrategies, eu.ibagroup.easyrpa.ap.iedp.postprocessing.MlIePostProcessingStrategies
leaveOnlyPages | pages - (required, string) the pages in a format such as "1,3-4,9-" | { "name": "leaveOnlyPages", "pages": "1,3-4,7-" } | Leaves only the specified pages. Intended for use as a pre-processor, but can also be used in other phases. It also removes the words from entities (if any), changes the entities' content, and sets the custom value flag. Implementation classes: eu.ibagroup.easyrpa.ap.cldp.postprocessing.ClPostProcessingStrategies, eu.ibagroup.easyrpa.ap.iedp.postprocessing.MlIePostProcessingStrategies
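The pages parameter of removePages and leaveOnlyPages combines single pages, closed ranges, and an open-ended range. A minimal sketch of how such a specification could be expanded is shown here. It assumes 1-based page numbers and that a trailing dash such as "7-" means "from page 7 to the last page"; neither assumption is confirmed by the platform documentation, and PageSpecSketch is a hypothetical name.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch only; not the platform's parser.
class PageSpecSketch {

    // Expands a spec such as "1,3-4,7-" into page numbers, given the document's page count.
    static List<Integer> expand(String spec, int pageCount) {
        List<Integer> pages = new ArrayList<>();
        for (String part : spec.split(",")) {
            String token = part.trim();
            if (token.isEmpty()) {
                continue;
            }
            int dash = token.indexOf('-');
            int from;
            int to;
            if (dash < 0) {
                // Single page, e.g. "1".
                from = Integer.parseInt(token);
                to = from;
            } else {
                // Range, e.g. "3-4", or open-ended range, e.g. "7-" (assumed to run to the last page).
                from = Integer.parseInt(token.substring(0, dash).trim());
                String end = token.substring(dash + 1).trim();
                to = end.isEmpty() ? pageCount : Integer.parseInt(end);
            }
            // Clamp to the pages that actually exist and skip duplicates.
            for (int page = Math.max(from, 1); page <= Math.min(to, pageCount); page++) {
                if (!pages.contains(page)) {
                    pages.add(page);
                }
            }
        }
        return pages;
    }
}
```

For a 9-page document, expand("1,3-4,7-", 9) yields pages 1, 3, 4, 7, 8 and 9 under these assumptions.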
PostProcessors
These are post-processors that work with modelOutputJson.
name | parameters | sample config | description
---|---|---|---
removeEntityIfContentEqualsTo | entityName - (required, string) target entity name to process; targetContent - (required, string) the target content that triggers the removal | { "name": "removeEntityIfContentEqualsTo", "entityName": "Address", "targetContent": "Not existed" } | Removes an entity in modelOutputJson if its content equals the target content. Implementation classes: eu.ibagroup.easyrpa.ap.iedp.postprocessing.MlIePostProcessingStrategies, eu.ibagroup.easyrpa.ap.iehtmldp.postprocessing.MlHtmlIePostProcessingStrategies
changeEntityContentIfContentEqualsTo | entityName - (required, string) target entity name to process; targetContent - (required, string) the target content that triggers the change; changeTo - (required, string) the new content to replace the target content | { "name": "changeEntityContentIfContentEqualsTo", "entityName": "Payment Amount", "targetContent": "$10k", "changeTo": "$10.000" } | Changes the content of an entity in modelOutputJson if its content equals the target content. Implementation classes: eu.ibagroup.easyrpa.ap.iedp.postprocessing.MlIePostProcessingStrategies, eu.ibagroup.easyrpa.ap.iehtmldp.postprocessing.MlHtmlIePostProcessingStrategies
changeEntityContentIfContentContains | entityName - (required, string) target entity name to process; targetContent - (required, string) the target content that triggers the change; changeTo - (required, string) the new content to replace the target content | { "name": "changeEntityContentIfContentContains", "entityName": "Status", "targetContent": "IN-PROCESS", "changeTo": "IN PROGRESS" } | Changes the content of an entity in modelOutputJson if its content contains the target content. Implementation classes: eu.ibagroup.easyrpa.ap.iedp.postprocessing.MlIePostProcessingStrategies, eu.ibagroup.easyrpa.ap.iehtmldp.postprocessing.MlHtmlIePostProcessingStrategies
removeLeadingAndTrailingNonAlphanumericCharacters | entityName - (required, string) target entity name to process | { "name": "removeLeadingAndTrailingNonAlphanumericCharacters", "entityName": "Reference Number" } | Removes leading and trailing non-alphanumeric characters from the target entity content in modelOutputJson. Implementation classes: eu.ibagroup.easyrpa.ap.iedp.postprocessing.MlIePostProcessingStrategies, eu.ibagroup.easyrpa.ap.iehtmldp.postprocessing.MlHtmlIePostProcessingStrategies
regexReplacement | entityName - (required, string) target entity name to process; rules - (required, map) rules that provide both the match regex and the replacement, as in String#replaceAll(String, String) | { "name": "regexReplacement", "entityName": "Expression", "rules": { "\\+" : "plus", "=" : "equals", "\\s+" : " ", "(l)(?<digitsAfterL>[\\d]+)" : "1#{digitsAfterL}" } } | Replaces the content of an entity in modelOutputJson where it matches a regex from the rules. Implementation classes: eu.ibagroup.easyrpa.ap.iedp.postprocessing.MlIePostProcessingStrategies, eu.ibagroup.easyrpa.ap.iehtmldp.postprocessing.MlHtmlIePostProcessingStrategies
trim | entityName - (required, string) target entity name to process | { "entityName": "Reference Number", "name": "trim" } | Trims the entity value in modelOutputJson. Implementation classes: eu.ibagroup.easyrpa.ap.iedp.postprocessing.MlIePostProcessingStrategies, eu.ibagroup.easyrpa.ap.iehtmldp.postprocessing.MlHtmlIePostProcessingStrategies
tokens | entityName - (required, string) target entity name to process; regex - the optional regex used to split the content into tokens, default is " "; tokenIndexes - the indexes of the tokens to keep | { "entityName": "Reference Number", "name": "tokens", "tokenIndexes": [ 1 ] } | Leaves only the specified tokens in the entity in modelOutputJson. Implementation classes: eu.ibagroup.easyrpa.ap.iedp.postprocessing.MlIePostProcessingStrategies, eu.ibagroup.easyrpa.ap.iehtmldp.postprocessing.MlHtmlIePostProcessingStrategies
removeEntities | entityNames - (required, array of strings) the entity names to remove | { "name": "removeEntities", "entityNames": [ "Product Description", "Price" ] } | Removes the specified entities in modelOutputJson. Implementation classes: eu.ibagroup.easyrpa.ap.iedp.postprocessing.MlIePostProcessingStrategies, eu.ibagroup.easyrpa.ap.iehtmldp.postprocessing.MlHtmlIePostProcessingStrategies
mergeCloseEntities | entityName - (required, string) target entity name to process; width - (required, int) max horizontal offset in pixels between the entities' bottom-left points; height - (required, int) max vertical offset in pixels between the entities' bottom-left points | { "entityName": "Product Description", "name": "mergeCloseEntities", "width": 5, "height": 30 } | Merges entities in modelOutputJson if they are close to each other. This helps when entities placed in the same table cell of a PDF are tagged by the model as separate entities; two entities can be recognized as belonging to the same cell only if they are very close to each other. Implementation classes: eu.ibagroup.easyrpa.ap.iedp.postprocessing.MlIePostProcessingStrategies
mergeEntitiesIfContentBecomesEqualTo | entityName - (required, string) target entity name to process; targetContent - the target concatenated content that triggers the merge | { "name": "mergeEntitiesIfContentBecomesEqualTo", "entityName": "Customer Name", "targetContent": "JOHNDOE" } | Merges entities in modelOutputJson if their concatenated content becomes equal to the target content. Implementation classes: eu.ibagroup.easyrpa.ap.iedp.postprocessing.MlIePostProcessingStrategies
removePagesWithoutEntities | | { "name": "removePagesWithoutEntities" } | Removes all pages that don't contain any entities in modelOutputJson. Implementation classes: eu.ibagroup.easyrpa.ap.iedp.postprocessing.MlIePostProcessingStrategies
ocrPositionBasedGrouping | | { "name": "ocrPositionBasedGrouping" } | Applies the OCR position-based grouping in modelOutputJson to the OCR entities of the provided document type. Implementation classes: eu.ibagroup.easyrpa.ap.iedp.postprocessing.MlIePostProcessingStrategies
htmlTableRowBasedGrouping | | { "name": "htmlTableRowBasedGrouping" } | Applies the HTML table row-based grouping in modelOutputJson to the HTML entities of the provided document type. Implementation classes: eu.ibagroup.easyrpa.ap.iehtmldp.postprocessing.MlHtmlIePostProcessingStrategies
renderContentForImageEntity | | { "name": "renderContentForImageEntity", "entityName": "Signature" } | Replaces the content of the given entity in modelOutputJson with the S3 link to the rendered image. Implementation classes: eu.ibagroup.easyrpa.ap.iedp.postprocessing.MlIePostProcessingStrategies
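The width and height parameters of mergeCloseEntities bound how far apart two entities may be and still count as belonging to the same table cell. A minimal sketch of such a closeness test follows. It assumes the offsets are absolute differences between the entities' bottom-left points and that the limits are inclusive; the Point class and isClose method are hypothetical stand-ins, not platform types.

```java
// Illustrative sketch only; not the platform's merge logic.
class CloseEntitiesSketch {

    // Hypothetical stand-in for an entity's bottom-left corner, in pixels.
    static final class Point {
        final int x;
        final int y;

        Point(int x, int y) {
            this.x = x;
            this.y = y;
        }
    }

    // True when both the horizontal and the vertical offset are within the limits
    // (inclusive by assumption).
    static boolean isClose(Point a, Point b, int width, int height) {
        return Math.abs(a.x - b.x) <= width && Math.abs(a.y - b.y) <= height;
    }

    public static void main(String[] args) {
        Point first = new Point(100, 200);
        // Limits taken from the sample config: "width": 5, "height": 30.
        System.out.println(isClose(first, new Point(103, 225), 5, 30)); // prints true
        System.out.println(isClose(first, new Point(110, 225), 5, 30)); // prints false
    }
}
```

A small width with a larger height, as in the sample config, would match lines of text stacked in one cell: nearly the same left edge, but different baselines.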
Validators
Validators are post-processors that check outputJson.
name | parameters | sample config | description
---|---|---|---
mustBeReviewedByHuman | | { "name": "mustBeReviewedByHuman" } | Sends the document to a human if it has not yet been reviewed by one. Implementation classes: eu.ibagroup.easyrpa.ap.dp.validation.CommonValidationStrategies
hasAllCategories | categories - (required, array of string) the categories list; message - the validation message map, containing a severity string (info, warning, error) and text | { "name": "hasAllCategories", "categories": [ "News", "New" ], "message": { "severity": "error", "text": "An error message" } } | Checks that the classified document has all the specified categories. Implementation classes: eu.ibagroup.easyrpa.ap.dp.validation.ClValidationStrategies
dontHaveAllCategories | categories - (required, array of string) the categories list; message - the validation message map, containing a severity string (info, warning, error) and text | { "name": "dontHaveAllCategories", "categories": [ "News", "New" ], "message": { "severity": "error", "text": "An error message" } } | Checks that the classified document does not have all the specified categories. Implementation classes: eu.ibagroup.easyrpa.ap.dp.validation.ClValidationStrategies
hasAnyCategory | categories - (required, array of string) the categories list; message - the validation message map, containing a severity string (info, warning, error) and text | { "name": "hasAnyCategory", "categories": [ "News", "New" ], "message": { "severity": "error", "text": "An error message" } } | Checks that the classified document has any of the specified categories. Implementation classes: eu.ibagroup.easyrpa.ap.dp.validation.ClValidationStrategies
atLeastCategoriesSelected | length - (required, integer) the minimum number of selected categories; message - the validation message map, containing a severity string (info, warning, error) and text | { "name": "atLeastCategoriesSelected", "length": 2, "message": { "severity": "error", "text": "An error message" } } | Checks that the classified document has at least the minimum number of categories selected (useful for multi-class classification). Implementation classes: eu.ibagroup.easyrpa.ap.dp.validation.ClValidationStrategies
noMoreThanCategoriesSelected | length - (required, integer) the maximum number of selected categories; message - the validation message map, containing a severity string (info, warning, error) and text | { "name": "noMoreThanCategoriesSelected", "length": 2, "message": { "severity": "error", "text": "An error message" } } | Checks that the classified document has no more than the maximum number of categories selected (useful for multi-class classification). Implementation classes: eu.ibagroup.easyrpa.ap.dp.validation.ClValidationStrategies
regexCheck | entityName - (required, string) target entity name to process; regex - (required, string) the matching regex; message - the validation message map, containing a severity string (info, warning, error) and text | { "name": "regexCheck", "entityName": "Invoice Number", "regex": "\\d{6}", "message": { "severity": "error", "text": "An error message" } } | Checks the entity value against the regex. Implementation classes: eu.ibagroup.easyrpa.ap.iedp.validation.IeValidationStrategies, eu.ibagroup.easyrpa.ap.iehtmldp.validation.IeHtmlValidationStrategies
notEmpty | entityName - (required, string) target entity name to process; message - the validation message map, containing a severity string (info, warning, error) and text | { "name": "notEmpty", "entityName": "Invoice Number", "message": { "severity": "error", "text": "An error message" } } | Checks that the entity is not null or empty. Implementation classes: eu.ibagroup.easyrpa.ap.iedp.validation.IeValidationStrategies, eu.ibagroup.easyrpa.ap.iehtmldp.validation.IeHtmlValidationStrategies
isInteger | entityName - (required, string) target entity name to process; strict - (optional, boolean, false by default) enables strict parsing, in which no accompanying symbols are allowed; message - the validation message map, containing a severity string (info, warning, error) and text | { "name": "isInteger", "entityName": "Total", "message": { "severity": "error", "text": "An error message" } } | Checks that the entity value can be parsed into an integer. Implementation classes: eu.ibagroup.easyrpa.ap.iedp.validation.IeValidationStrategies, eu.ibagroup.easyrpa.ap.iehtmldp.validation.IeHtmlValidationStrategies
isLong | entityName - (required, string) target entity name to process; strict - (optional, boolean, false by default) enables strict parsing, in which no accompanying symbols are allowed; message - the validation message map, containing a severity string (info, warning, error) and text | { "name": "isLong", "entityName": "Total", "message": { "severity": "error", "text": "An error message" } } | Checks that the entity value can be parsed into a long. Implementation classes: eu.ibagroup.easyrpa.ap.iedp.validation.IeValidationStrategies, eu.ibagroup.easyrpa.ap.iehtmldp.validation.IeHtmlValidationStrategies
isFloat | entityName - (required, string) target entity name to process; strict - (optional, boolean, false by default) enables strict parsing, in which no accompanying symbols are allowed; message - the validation message map, containing a severity string (info, warning, error) and text | { "name": "isFloat", "entityName": "Total", "message": { "severity": "error", "text": "An error message" } } | Checks that the entity value can be parsed into a float. Implementation classes: eu.ibagroup.easyrpa.ap.iedp.validation.IeValidationStrategies, eu.ibagroup.easyrpa.ap.iehtmldp.validation.IeHtmlValidationStrategies
isDouble | entityName - (required, string) target entity name to process; strict - (optional, boolean, false by default) enables strict parsing, in which no accompanying symbols are allowed; message - the validation message map, containing a severity string (info, warning, error) and text | { "name": "isDouble", "entityName": "Total", "message": { "severity": "error", "text": "An error message" } } | Checks that the entity value can be parsed into a double. Implementation classes: eu.ibagroup.easyrpa.ap.iedp.validation.IeValidationStrategies, eu.ibagroup.easyrpa.ap.iehtmldp.validation.IeHtmlValidationStrategies
isBigDecimal | entityName - (required, string) target entity name to process; strict - (optional, boolean, false by default) enables strict parsing, in which no accompanying symbols are allowed; message - the validation message map, containing a severity string (info, warning, error) and text | { "name": "isBigDecimal", "entityName": "Invoice Number", "message": { "severity": "error", "text": "An error message" } } | Checks that the entity value can be parsed into a BigDecimal. Implementation classes: eu.ibagroup.easyrpa.ap.iedp.validation.IeValidationStrategies, eu.ibagroup.easyrpa.ap.iehtmldp.validation.IeHtmlValidationStrategies
isDate | entityName - (required, string) target entity name to process; formats - (required, list of string) the possible date formats, validation fails if none of them matches; message - the validation message map, containing a severity string (info, warning, error) and text | { "name": "isDate", "entityName": "Invoice Date", "formats": [ "dd/MM/yyyy", "yyyy-MM-dd", "dd MMM, yyyy" ], "message": { "severity": "error", "text": "An error message" } } | Checks that the entity value can be parsed as a date in one of the specified formats. Implementation classes: eu.ibagroup.easyrpa.ap.iedp.validation.IeValidationStrategies, eu.ibagroup.easyrpa.ap.iehtmldp.validation.IeHtmlValidationStrategies
isCurrency | entityName - (required, string) target entity name to process; message - the validation message map, containing a severity string (info, warning, error) and text | { "name": "isCurrency", "entityName": "Total Currency", "message": { "severity": "error", "text": "An error message" } } | Checks that the entity value can be parsed as a currency. Implementation classes: eu.ibagroup.easyrpa.ap.iedp.validation.IeValidationStrategies, eu.ibagroup.easyrpa.ap.iehtmldp.validation.IeHtmlValidationStrategies
isAmount | entityName - (required, string) target entity name to process; strict - (optional, boolean, false by default) enables strict parsing, in which no accompanying symbols are allowed; message - the validation message map, containing a severity string (info, warning, error) and text | { "name": "isAmount", "entityName": "Total", "message": { "severity": "error", "text": "An error message" } } | Checks that the entity value can be parsed as an amount. Implementation classes: eu.ibagroup.easyrpa.ap.iedp.validation.IeValidationStrategies, eu.ibagroup.easyrpa.ap.iehtmldp.validation.IeHtmlValidationStrategies
isSimilarTo | entityName - (required, string) target entity name to process; possibleValues - (list of string) the possible values; message - the validation message map, containing a severity string (info, warning, error) and text | { "name": "isSimilarTo", "entityName": "Invoice Type", "possibleValues": [ "Invoice", "Billing" ], "message": { "severity": "error", "text": "An error message" } } | Checks that the entity value is one of the specified strings. Implementation classes: eu.ibagroup.easyrpa.ap.iedp.validation.IeValidationStrategies, eu.ibagroup.easyrpa.ap.iehtmldp.validation.IeHtmlValidationStrategies
lengthMoreThan | entityName - (required, string) target entity name to process; length - (required, int) the length to check; message - the validation message map, containing a severity string (info, warning, error) and text | { "name": "lengthMoreThan", "entityName": "Invoice Number", "length": 9, "message": { "severity": "error", "text": "An error message" } } | Checks that the entity value is longer than the given number of symbols. Implementation classes: eu.ibagroup.easyrpa.ap.iedp.validation.IeValidationStrategies, eu.ibagroup.easyrpa.ap.iehtmldp.validation.IeHtmlValidationStrategies
lengthLessThan | entityName - (required, string) target entity name to process; length - (required, int) the length to check; message - the validation message map, containing a severity string (info, warning, error) and text | { "name": "lengthLessThan", "entityName": "Invoice Number", "length": 10, "message": { "severity": "error", "text": "An error message" } } | Checks that the entity value is shorter than the given number of symbols. Implementation classes: eu.ibagroup.easyrpa.ap.iedp.validation.IeValidationStrategies, eu.ibagroup.easyrpa.ap.iehtmldp.validation.IeHtmlValidationStrategies
hasAllValues | entityName - (required, string) target entity name to process; mandatoryValues - (list of string) the mandatory values; message - the validation message map, containing a severity string (info, warning, error) and text | { "name": "hasAllValues", "entityName": "Signature", "mandatoryValues": [ "ceoSignature", "cooSignature" ], "message": { "severity": "error", "text": "Document must be signed by both CEO and COO." } } | Checks that the entity values contain all of the specified strings. Implementation classes: eu.ibagroup.easyrpa.ap.iedp.validation.IeValidationStrategies
autoMapEntitiesToDocument | | { "name": "autoMapEntitiesToDocument" } | Performs no actual validation. Instead, for a valid document it attempts to auto-map all extracted values into the AP default datastore entity. Implementation classes: eu.ibagroup.easyrpa.ap.iedp.validation.IeValidationStrategies
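The isDate validator accepts a value if it matches at least one of the listed formats. The idea can be sketched as follows, assuming java.time pattern syntax and English month names; the platform may use a different date API, and IsDateSketch is a hypothetical name.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeParseException;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

// Illustrative sketch only; not the platform's validator.
class IsDateSketch {

    // Returns true as soon as one format parses the whole value.
    static boolean matchesAnyFormat(String value, List<String> formats) {
        for (String format : formats) {
            try {
                LocalDate.parse(value.trim(), DateTimeFormatter.ofPattern(format, Locale.ENGLISH));
                return true;
            } catch (DateTimeParseException e) {
                // This format does not match; try the next one.
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // Formats taken from the isDate sample config.
        List<String> formats = Arrays.asList("dd/MM/yyyy", "yyyy-MM-dd", "dd MMM, yyyy");
        System.out.println(matchesAnyFormat("31/12/2023", formats));   // prints true
        System.out.println(matchesAnyFormat("05 Jan, 2024", formats)); // prints true
        System.out.println(matchesAnyFormat("12.31.2023", formats));   // prints false
    }
}
```

When a validator rejects a value, the configured message (severity and text) is what would be reported for the entity.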
Creating Custom Post-Processors
Post-processors are instances of the PostProcessor<D extends DpDocument> interface, but depending on the specialization you will also need access to entities, categories, validation messages, and so on. Here are the common steps to create post-processor and validator classes:
- Create a new Java class for the custom post-processor in your Automation Process project (it can be placed in any package).
- Extend the class from BasePostProcessor<D extends DpDocument>.
- Use the document and repository classes from your project if you have extended them; otherwise use DpDocument and DpDocumentRepository.
- Annotate the class with @PostProcessorStrategies to indicate that it contains post-processor methods, and specify in the annotation value the human task types that your post-processors support (an empty value or several htTypes means the class handles multiple types, an advanced case that is not covered here).
To bring in the OOTB methods, implement the interface from the following table:
Human Task Type | PostProcessor | Validator
---|---|---
ie | IePostProcessorBase<D extends DpDocument> | IeValidatorBase<D extends DpDocument>
classification | ClPostProcessorBase<D extends DpDocument> | ClValidatorBase<D extends DpDocument>
html-ie | IeHtmlPostProcessorBase<D extends DpDocument> | IeHtmlValidatorBase<D extends DpDocument>
html-classification | ClPostProcessorBase<D extends DpDocument> | ClValidatorBase<D extends DpDocument>

- Implement one or more post-processor methods within the class. Each method should correspond to a specific post-processing logic.
- Annotate each post-processor method with @PostProcessorMethod("myPostProcessorName"), where "myPostProcessorName" is the unique name of the custom post-processor.
- Annotate parameters with @PostProcessorParameter("paramName") for those that need to be provided via configuration in the DocumentType JSON.
Here is an example of a custom IE post-processor Java class:

    import eu.ibagroup.easyrpa.ap.dp.util.ie.postprocessing.execution.PostProcessorMethod;
    import eu.ibagroup.easyrpa.ap.dp.util.ie.postprocessing.execution.PostProcessorParameter;
    import eu.ibagroup.easyrpa.ap.dp.util.ie.postprocessing.execution.PostProcessorStrategies;

    // Post-processor class for the "ie" human task type.
    @PostProcessorStrategies("ie")
    public class MyCustomPostProcessor extends BasePostProcessor<MyDpDocument> implements IePostProcessorBase<MyDpDocument> {

        @Inject
        @Getter
        private MyDpDocumentSetRepository documentRepository;

        @PostProcessorMethod("myCustomPostProcessor")
        public void myCustomPostProcessor(@PostProcessorParameter("paramName") String paramValue) {
            // Custom post-processing logic here
            // Use paramValue as needed
        }
    }

    // Repository for the custom document class.
    public interface MyDpDocumentSetRepository extends DpDocumentRepository<MyDpDocument> {
    }

    // Custom document class with an additional column.
    @Data
    @Entity(value = "MY_DOCUMENTS")
    public class MyDpDocument extends DpDocument {

        @Column("my_column")
        private String myColumn;
    }
Here, implementing IePostProcessorBase<MyDpDocument> gives simple access to the extracted entities via the ExtractedEntities interface.
Using Custom Post-Processors in DocumentType JSON
After creating custom post-processors, developers can utilize them by configuring the DocumentType JSON. The configuration will include the custom post-processor names and the required parameters. The custom post-processor will automatically be executed during the ML post-processing phase.
Below is an example of how to specify a custom post-processor in the DocumentType JSON:

    {
      "mlPostProcessors": [
        {
          "name": "myCustomPostProcessor",
          "paramName": "paramValue"
          // Additional configuration for the custom post-processor can be added here
        }
        // Additional post-processors and their configurations can be added here
      ]
    }
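To make the relationship between a configuration entry and an annotated method more concrete, here is a purely illustrative sketch of name-based dispatch through reflection. It does not show how the platform actually resolves post-processors: @SketchMethod and @SketchParameter are hypothetical stand-ins for @PostProcessorMethod and @PostProcessorParameter, and only string parameters are handled.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Method;
import java.lang.reflect.Parameter;
import java.util.Map;

// Illustrative sketch only; the platform's real dispatch mechanism may differ.
class DispatchSketch {

    // Hypothetical stand-in for @PostProcessorMethod.
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.METHOD)
    @interface SketchMethod { String value(); }

    // Hypothetical stand-in for @PostProcessorParameter.
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.PARAMETER)
    @interface SketchParameter { String value(); }

    static class MyStrategies {
        String lastValue; // records the received parameter so the effect is observable

        @SketchMethod("myCustomPostProcessor")
        public void myCustomPostProcessor(@SketchParameter("paramName") String paramValue) {
            lastValue = paramValue;
        }
    }

    // Finds the method whose annotation equals config["name"], binds each annotated
    // parameter to the config value with the same key, and invokes the method.
    static boolean dispatch(Object strategies, Map<String, String> config) {
        for (Method method : strategies.getClass().getDeclaredMethods()) {
            SketchMethod marker = method.getAnnotation(SketchMethod.class);
            if (marker == null || !marker.value().equals(config.get("name"))) {
                continue;
            }
            Parameter[] parameters = method.getParameters();
            Object[] args = new Object[parameters.length];
            for (int i = 0; i < parameters.length; i++) {
                SketchParameter key = parameters[i].getAnnotation(SketchParameter.class);
                args[i] = key == null ? null : config.get(key.value());
            }
            try {
                method.setAccessible(true);
                method.invoke(strategies, args);
            } catch (ReflectiveOperationException e) {
                throw new IllegalStateException("Post-processor invocation failed", e);
            }
            return true;
        }
        return false; // no method registered under that name
    }
}
```

With a config map containing "name" = "myCustomPostProcessor" and "paramName" = "paramValue", dispatch invokes the annotated method with "paramValue", which mirrors how the JSON keys line up with the annotation values in the example class.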
Test Post-Processors
You can experiment with post-processors and check how they work using your own data.
Test PreProcessors and PostProcessors
Using this method you can experiment with pre-processors and post-processors.
- Upload your documents into a document set whose settings correspond to your documents.
- Then upload the test package from Nexus: https://<CS host>/nexus/repository/rpaplatform/eu/ibagroup/samples/ap/easy-rpa-test-ap/<version>/easy-rpa-test-ap-<version>-bin.zip and import the following automation processes:
- Cl PP Test Document Processor;
- Cl HTML PP Test Document Processor;
- IE PP Test Document Processor;
- IE HTML PP Test Document Processor.
These are test document processors for post-processors. They differ from the usual ones in that the ML task call is switched off, while the ML post-processors are still called.
The testing process looks as follows:
- Switch the Document Processor to the corresponding test one;
- Import your documents into the document set;
- Update the document set's document type by adding the pre-processors and post-processors to test;
- Call PREPROCESS for the document;
- Check the results of your pre-processors;
- Open the document in Human Task (using the document set button);
- Tag the document using model output (you can also tag human output if your post-processor needs it);
- Call ML; only the post-processors are called (the output you prepared for testing is kept);
- Check the results of your post-processors.
Test Validators
- Prepare your document set in the same way as described in the section above;
- Call PREPROCESS to be able to prepare the output;
- Add your validator to mlPostProcessors;
- Call ML;
- Check the corresponding output using the document set button.