Technology Assisted Review is a crucial tool within eDiscovery because data volumes keep growing and manual electronic document review takes a disproportionate amount of time and money.
What Exactly is Technology Assisted Review?
Technology Assisted Review (or TAR for short) is software that uses mathematical algorithms and statistical sampling to code documents automatically. The software is trained on a seed set of documents, all coded by an expert, to determine what is a ‘relevant’ document and what is not.
At CYFOR Legal, one of the tools we use for predictive coding is Relativity Assisted Review (RAR). RAR uses a feature called ‘Categorisation’ to arrange the documents into groups of ‘Relevant’ and ‘Not Relevant’ documents. Categorisation uses Relativity’s analytics engine to look at textual concepts within a document set. It is based on a type of textual analytics called Latent Semantic Indexing (LSI), which derives concepts from patterns of terms that occur together rather than from exact keyword matches. The analytics engine looks at the concepts within a document and identifies other documents containing similar textual content. This allows us to teach the system about the types of documents we are interested in and then let the analytics engine categorise the remaining documents accordingly.
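To make the idea of concept-based categorisation more concrete, here is a minimal sketch of LSI using NumPy. It is not Relativity’s implementation: the function names, the sample texts, and the choice of two concepts (k=2) are all assumptions made for illustration. It builds a term-document matrix, reduces it with a truncated singular value decomposition, and gives each uncoded document the category of its most similar coded example.

```python
import numpy as np


def build_term_document_matrix(texts):
    """Rows are terms, columns are documents, cells are raw term counts."""
    vocab = sorted({word for text in texts for word in text.lower().split()})
    row_of = {word: i for i, word in enumerate(vocab)}
    matrix = np.zeros((len(vocab), len(texts)))
    for col, text in enumerate(texts):
        for word in text.lower().split():
            matrix[row_of[word], col] += 1.0
    return matrix


def lsi_document_vectors(matrix, k):
    """Project every document into a k-dimensional concept space."""
    _, singular_values, vt = np.linalg.svd(matrix, full_matrices=False)
    # Keep only the k strongest concepts; result has shape (n_docs, k).
    return (np.diag(singular_values[:k]) @ vt[:k]).T


def categorise(examples, uncoded, k=2):
    """Give each uncoded text the category of its nearest coded example.

    examples: list of (text, category) pairs coded by a reviewer.
    uncoded:  list of texts with no coding yet.
    """
    texts = [text for text, _ in examples] + list(uncoded)
    vectors = lsi_document_vectors(build_term_document_matrix(texts), k)
    # Normalise so that a dot product equals cosine similarity.
    unit = vectors / np.linalg.norm(vectors, axis=1, keepdims=True)
    example_vectors = unit[: len(examples)]
    results = {}
    for offset, text in enumerate(uncoded):
        similarities = example_vectors @ unit[len(examples) + offset]
        results[text] = examples[int(np.argmax(similarities))][1]
    return results


# Hypothetical seed examples and uncoded documents.
examples = [
    ("contract breach payment invoice dispute", "Relevant"),
    ("invoice payment overdue contract terms", "Relevant"),
    ("lunch menu friday pizza office", "Not Relevant"),
    ("office party friday lunch cake", "Not Relevant"),
]
uncoded = [
    "payment dispute over contract invoice",
    "pizza and cake for friday party",
]

for text, category in categorise(examples, uncoded).items():
    print(f"{category}: {text}")
```

In this toy data set the first uncoded document is grouped with the ‘Relevant’ examples and the second with the ‘Not Relevant’ ones. A real analytics index works on far larger vocabularies and many more concepts, and typically applies term weighting and thresholds that this sketch leaves out.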
CYFOR Legal works alongside clients in the early stages of a litigation case to determine whether technology-assisted review is the best way forward for the project at hand. Once our consultants have been instructed by a client to carry out an assisted review project, the eDiscovery team guides the reviewers through the necessary steps to achieve the desired outcome.
The chosen review platform needs to be able to measure the accuracy of the assisted review project. Accuracy is measured using a control set (also called a truth set): a statistically valid random sample taken from the data set. These documents are batched out for manual review and simply coded as either ‘Relevant’ or ‘Not Relevant’. The results of this round are the benchmark for determining the F1 measure, a single score that combines precision and recall and is used to monitor the stability of the project.
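The F1 measure is the harmonic mean of precision (how many of the documents the system marked ‘Relevant’ really are relevant) and recall (how many of the truly relevant documents the system found). The sketch below shows that arithmetic on a small, made-up control set; the function name and the sample codings are assumptions for illustration, not output from any review platform.

```python
def f1_from_control_set(human_coding, system_coding, positive="Relevant"):
    """Return (precision, recall, f1) for the system against human coding.

    Both arguments are equal-length lists of category labels, one per
    control-set document, in the same document order.
    """
    pairs = list(zip(human_coding, system_coding))
    true_pos = sum(1 for h, s in pairs if h == positive and s == positive)
    false_pos = sum(1 for h, s in pairs if h != positive and s == positive)
    false_neg = sum(1 for h, s in pairs if h == positive and s != positive)

    found = true_pos + false_pos   # everything the system called relevant
    actual = true_pos + false_neg  # everything the reviewer called relevant
    precision = true_pos / found if found else 0.0
    recall = true_pos / actual if actual else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Hypothetical control set of ten documents.
human_coding = ["Relevant"] * 4 + ["Not Relevant"] * 6
system_coding = (
    ["Relevant"] * 3 + ["Not Relevant"]    # one relevant document missed
    + ["Relevant"] + ["Not Relevant"] * 5  # one false alarm
)

precision, recall, f1 = f1_from_control_set(human_coding, system_coding)
print(f"precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")
```

Here the system finds three of the four relevant documents and raises one false alarm, so precision, recall, and F1 all come out at 0.75. Recalculating the score after each round shows whether the project is stabilising.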
Pre-coded Seed Round
Training the system to code documents effectively can take time. In cases where a manual review has already been carried out on a set of documents, these documents can be used as pre-coded seeds. Provided they use the same designation field as the one created for the assisted review project, they serve as examples of ‘Relevant’ and ‘Not Relevant’ documents, and the system can use them to categorise further documents in the project.
Training rounds are carried out to teach the system how to categorise documents. During a training round, a document sample is batched out and manually reviewed, preferably by someone particularly familiar with the case. Documents can be coded simply as either ‘Relevant’ or ‘Not Relevant’. Documents can also be tagged as examples by checking a ‘Use as Example’ box. This is appropriate where a document contains a good amount of text and is deemed a good example of a ‘Relevant’ or ‘Not Relevant’ document. Alternatively, an extract of text can be copied from a document and pasted into a text box named ‘Use Text Excerpt’.
Quality Control Round
On completion of the training rounds, a quality control round is executed on the categorised documents to test how accurately the system has grouped them. A sample of the documents already categorised as a result of the training rounds is batched out for manual review. The system can then compare how many documents it categorised correctly and how many have been ‘overturned’. An ‘overturn’ is where the system has categorised a document as, for example, ‘Relevant’ and a manual reviewer has coded that document as ‘Not Relevant’. The number of ‘overturns’ can be measured and analysed to identify and correct issues within the assisted review project.
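Counting overturns comes down to comparing two codings of the same document. The following sketch shows one way to list the overturned documents in a QC sample and calculate the overturn rate; the record layout, document IDs, and function name are hypothetical and do not reflect any platform’s actual report format.

```python
def overturn_report(qc_sample):
    """Return (overturned_ids, overturn_rate) for a QC sample.

    qc_sample: list of (doc_id, system_category, reviewer_coding) tuples.
    A document is overturned when the reviewer disagrees with the system.
    """
    overturned_ids = [
        doc_id
        for doc_id, system_category, reviewer_coding in qc_sample
        if system_category != reviewer_coding
    ]
    rate = len(overturned_ids) / len(qc_sample) if qc_sample else 0.0
    return overturned_ids, rate


# Hypothetical QC sample of five categorised documents.
qc_sample = [
    ("DOC-001", "Relevant", "Relevant"),
    ("DOC-002", "Relevant", "Not Relevant"),  # overturned by the reviewer
    ("DOC-003", "Not Relevant", "Not Relevant"),
    ("DOC-004", "Not Relevant", "Not Relevant"),
    ("DOC-005", "Relevant", "Relevant"),
]

overturned_ids, rate = overturn_report(qc_sample)
print(f"Overturned: {overturned_ids} ({rate:.0%} of the sample)")
```

In this sample one document in five is overturned, a rate of 20%. Looking at which documents were overturned, and which training examples caused them to be categorised that way, is usually more informative than the rate alone.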
Training Rounds & Quality Control
In summary, the aim of the training rounds and QC rounds is to categorise as many documents as possible, as accurately as possible, in line with the target F1 score (a measure of the system’s accuracy) agreed at the start of the project. The end result is a set of coded documents which can be used for timely production. The coding can also be used to prioritise review, either by pushing the ‘Relevant’ documents to the review team first or by batching out all the ‘Relevant’ documents for review. Whatever the reason for your assisted review project, predictive coding will be utilised more and more in the future as data sizes grow and manual document review costs soar. Technology-assisted review has quickly become an essential tool in litigation, prioritising documents for review whilst reducing time and overall costs.