Understanding Roles
Learn about the Roles
These are our two main roles:
| Data Analyst (DA) | Subject Matter Expert (SME) |
|---|---|
| This dedicated role is essential during delivery of projects which include unstructured data processing. The DA is a person assigned to help with automation of the use case in terms of documents' workflow. The main aim of the Data Analyst is to collect a high-quality data set to train the model. | This is an individual (from the customer’s side) with a deep understanding of a particular process, function, technology, machine, material, or type of equipment, who provides knowledge of business rules for automation projects and who participates in data tagging. |
The Data Analyst and the Subject Matter Expert work closely together on the use case. They collaborate in the process of collecting the data set. The Data Analyst gives training on how to tag documents in Workspace, verifies the quality of the documents for model training, and validates the result. The Subject Matter Expert provides information on documents' logic, meaning, workflow, and any corner cases. The SME learns how to use Workspace with the help of the DA and tags documents in this application for the data set.
Learn about the DA and SME Functions
The Data Analyst and the Subject Matter Expert fulfill particular functions across the stages of Data Set Collection.
| # | Data Set Collection Stages | Data Analyst | Subject Matter Expert |
|---|---|---|---|
| 1 | Analysis | Study documents' logic, quality, and distribution | Provide marked documents' samples |
| 2 | Preparation | Create tagging instructions | Share expert knowledge on fields' logic and some corner cases |
| 3 | OCR | Check documents' quality after OCR | Search for additional documents if necessary (with better quality for OCR) |
| 4 | Training | Provide training task(s) and teach how to use Workspace | Take the training task in Workspace |
| 5 | Tagging & Validation | Validate tagged data | Tag documents with the help of instructions and guidance provided by Data Analyst |
Investigate the DA and SME Workflow
| # | Step | Description |
|---|---|---|
| 1 | DA studies the requirements | In the first step, the Data Analyst should learn as much as possible about the logic and workflow of the documents, document layouts, and production distribution. DA reviews the requirements for the use case, documents' quality and format, and the fields that need to be extracted from the documents. |
| 2 | SME shares samples and marked values | SME consults with DA on different questions about the documents and explains the logic of documents' samples. |
| 3 | The questions and answers session and alignment on tagging logic | DA reviews the received documents and provides notes concerning documents' structure and quality to SME. This investigation step can be an iterative series of questions and answers, which should result in a common understanding of what values should be tagged and where they should be found in documents of different types and vendors. It's crucial to cover all the documents, formats, and templates that will be placed in the data set. |
| 4 | (optional) Split into batches | If the number of documents is overwhelming and/or we are not sure about the success of initial tagging, we can divide available documents into batches. |
| 5 | DA prepares tagging instructions and Human Task | DA records the tagging logic established at the questions and answers step in the form of instructions for SME and designs a Human Task for tagging. It's essential that tagging instructions are comprehensive and followed by SME while tagging. Any further nuances discovered later, or new documents added to the data set after alignment on tagging logic, can incur a review of the whole data set, retagging, or a redesign of the Human Task. That's why it's crucial to cover all the documents at Step 3. |
| 6 | SME has training on tagging | DA provides training tasks for SMEs. The rules of tagging are explained there, and experts study how to work with the Workspace application. Tagging rules and typical mistakes are discussed when the training task is outlined. |
| 7 | DA provides feedback | DA provides feedback on results and shows what can be improved. When enough training is done, tagging for data set collection begins. |
| 8 | SME tags documents | A Human Task is created in Workspace with the set of necessary fields. DA uploads documents and SME starts tagging with the help of instructions provided by DA. |
| 9 | DA validates the results | This is a cycle: The Subject Matter Expert tags documents and the Data Analyst validates results. |
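
The optional batching in Step 4 and the tag-and-validate cycle in Steps 8 and 9 can be pictured with a small sketch. This is only an illustration under stated assumptions: the function names `split_into_batches` and `tag_and_validate`, the `max_rounds` limit, and the `tag` and `validate` callbacks are hypothetical stand-ins for the work the SME and DA do by hand, and none of them is part of the Workspace application.

```python
from typing import Callable, Dict, Iterator, List, Sequence, Tuple

Tags = Dict[str, str]  # hypothetical shape: field name -> tagged value


def split_into_batches(documents: Sequence[str], batch_size: int) -> Iterator[List[str]]:
    """Step 4: yield consecutive batches of at most batch_size documents."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    for start in range(0, len(documents), batch_size):
        yield list(documents[start:start + batch_size])


def tag_and_validate(
    batch: Sequence[str],
    tag: Callable[[str], Tags],             # stands in for the SME tagging a document
    validate: Callable[[str, Tags], bool],  # stands in for the DA validating the result
    max_rounds: int = 3,                    # assumed limit, not taken from the course
) -> Tuple[Dict[str, Tags], List[str]]:
    """Steps 8-9: tag, validate, and send rejected documents back for retagging."""
    accepted: Dict[str, Tags] = {}
    pending = list(batch)
    for _ in range(max_rounds):
        if not pending:
            break
        rejected = []
        for doc in pending:
            tags = tag(doc)
            if validate(doc, tags):
                accepted[doc] = tags
            else:
                rejected.append(doc)
        pending = rejected
    # Documents still pending after max_rounds need another look at the instructions.
    return accepted, pending
```

Working batch by batch keeps each validation round small, so a misunderstanding in the tagging instructions surfaces before the whole data set has been tagged.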
Here is the visualized workflow:
Move further
Now that we've learned about the roles of Data Analyst and Subject Matter Expert, next we will learn about technology. In the following module, we will learn what components of the automation solution the tagging process relies on.
