Early Case Assessment (ECA) has come back to the forefront of eDiscovery. For a number of years, people focused solely on speeding up the document review process and forgot about the importance of culling data to reduce the volume of material that needs to be reviewed. Given the rapidly increasing volume of electronically stored information (ESI), the demand is to get more done, in less time, for less money, without sacrificing the quality of the end product. Early Case Assessment tools to the rescue.
Email threading analysis is an extremely effective tool for reducing the volume of data that needs to be examined. Because threading keeps related messages together, teams can achieve significant increases in both efficiency and quality. Combined with some workflow improvements, impressive results can be obtained.
Think about how people generally create document review sets. They assign material by custodian to coding/review staff. The problem with this approach is that email chains may be broken when document sets are arbitrarily created in batches of “X Number” of documents. Reviewers get only a partial story, since not all of the relevant information is included in their review set. Reviewers can’t possibly understand complex chains of communication if they only have access to parts of the whole. Organizing data by complete conversation, combined with the ability to read only the “inclusive” or richest email entries, can dramatically reduce the total number of documents that must be reviewed.
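To make the batching point concrete, here is a minimal sketch of building review sets that never split a conversation. It assumes each document already carries a thread identifier from a threading engine; the function name, the `(doc_id, thread_id)` pair layout and the size cap are hypothetical choices for illustration, not any particular platform's API.

```python
from collections import defaultdict


def build_review_sets(docs, max_docs_per_set):
    """Group documents into review sets without splitting any thread.

    `docs` is a list of (doc_id, thread_id) pairs. Both fields are
    hypothetical; a real platform supplies its own thread identifier.
    """
    threads = defaultdict(list)
    for doc_id, thread_id in docs:
        threads[thread_id].append(doc_id)

    review_sets, current = [], []
    for thread_id in sorted(threads):
        members = threads[thread_id]
        # Start a new set if adding this whole thread would overflow the
        # current one. A thread larger than the cap still stays together.
        if current and len(current) + len(members) > max_docs_per_set:
            review_sets.append(current)
            current = []
        current.extend(members)
    if current:
        review_sets.append(current)
    return review_sets
```

The cap is treated as a soft limit here: a set may exceed it when a single thread is larger than the cap, on the assumption that keeping the conversation whole matters more than uniform batch sizes.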
What is an Inclusive Email?
In structured analytics, there are two types of email messages:
- Inclusive – An email that contains unique content not included in any other email and that therefore MUST be reviewed. For example, emails with no replies or forwards are by definition inclusive, as is the last email in a thread.
- Non-Inclusive – Any email whose text and attachments are fully contained in another (inclusive) email. (see footnote 1 for a complete description of email inclusions)
Focusing staff review on only the inclusive emails and removing duplicates will dramatically shorten the review process without any loss in accuracy. The role of the email threading analytics engine is to derive the email threads and determine which subset of each conversation constitutes the minimal inclusive set. Inclusiveness analysis ensures that even the smallest changes in content will not be missed by reviewers.
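The inclusive/non-inclusive distinction can be illustrated with a simplified sketch. It assumes a hypothetical model in which each email has already been parsed into its chain of message bodies (oldest first) and a set of attachment identifiers; real threading engines work on raw text and headers and apply more rules than this. An email is treated as non-inclusive only when some other email carries all of its text and attachments and also has something more.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Email:
    doc_id: str
    segments: tuple                      # message bodies in the chain, oldest first
    attachments: frozenset = frozenset() # attachment identifiers (hypothetical)


def contains(outer, inner):
    """True if `outer` carries all of `inner`'s text and attachments."""
    n = len(inner.segments)
    return (outer.segments[:n] == inner.segments
            and inner.attachments <= outer.attachments)


def inclusive_ids(thread):
    """Return the doc_ids of emails not fully contained in a richer email."""
    result = []
    for e in thread:
        covered = any(
            other is not e
            and contains(other, e)
            and not contains(e, other)  # exact duplicates do not cover each other
            for other in thread
        )
        if not covered:
            result.append(e.doc_id)
    return result


a = Email("A", ("hi",), frozenset({"report.pdf"}))  # original with attachment
b = Email("B", ("hi", "reply"))                     # reply, attachment dropped
c = Email("C", ("hi", "reply", "final"))            # last email in the thread
print(inclusive_ids([a, b, c]))  # ['A', 'C']
```

In this example B is non-inclusive because C repeats its full text, while A stays inclusive because no later email carries its attachment. Because duplicates never cover each other in this sketch, two identical emails are either both inclusive or both non-inclusive.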
Using Email Threading Analytics Improves Document Review Results
Email threading assessment during an ECA process increases awareness of the content of the document corpus by:
- Helping you understand the scope of your case more quickly, since it lets you focus on entire conversations rather than just snippets. This can be used to improve search term selection and to prioritize custodians and concept cluster choices earlier in the process.
- Improving consistency in coding decisions. Since reviewers are seeing the entire context, small snippets of a conversation may have a clearer meaning, which should result in more consistent and accurate coding calls.
- Providing a useful Quality Assessment/Quality Control tool: reviewing all the documents in a thread together exposes inconsistent coding calls. In particular, coding decisions on privilege are easier to review and validate.
- Reducing the number of individual records that have to be reviewed, since only inclusive emails need to be read, which yields obvious cost savings. It also improves reviewers’ understanding of the overall content within a single email chain.
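The quality-control use described above can be sketched as a simple consistency check. This assumes coded documents are available as dictionaries with a thread identifier and a coding field; the key names `thread_id` and `privilege` are hypothetical and would map to whatever fields your review platform exports.

```python
from collections import defaultdict


def threads_with_inconsistent_coding(coded_docs, field="privilege"):
    """Return thread IDs whose documents carry more than one value for `field`."""
    values_by_thread = defaultdict(set)
    for doc in coded_docs:
        values_by_thread[doc["thread_id"]].add(doc[field])
    # A thread with a single distinct value is consistently coded.
    return sorted(t for t, values in values_by_thread.items() if len(values) > 1)
```

A flagged thread is not necessarily miscoded, since coding can legitimately differ between messages in the same conversation. The check only points a QC reviewer at the threads worth a second look.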
In its simplest form, email threading is an improved method for organizing the data to make your review process more efficient and effective. Clustering analytics adds further automated grouping of emails by content or topic, which results in more efficient review and increased understanding. The goal of using these analytic tools before creating review sets is to provide the complete conversational thread to individual reviewers.
Using these advanced ECA techniques, you could automatically cluster email threads that discuss a particular topic such as “Prequalified Loan Packages” and then create review sets so that only a few people would focus on that topic. Each reviewer would obtain a more complete understanding of a complicated and important theme in the case, code the documents consistently and direct the information to the lead attorneys earlier in the process. With the proper implementation of email threading and clustering analytics, users can zero in on the important documents earlier in the case, and the review team will achieve higher coding quality, improved decision making and better efficiency.
Common Inclusive email definitions included in analytics
- The last email in a thread: The last email in a particular thread should always be marked inclusive, because any text added in this last email (even just a “forwarded” indication) will be unique to this email and this one alone. If there were no attachments, and no changes to the subject line or the body of the email, this would be the only type of inclusiveness.
- The end of attachments: When an email has attachments and the recipient replies, the attachments are often dropped from the reply. For this reason, the last email in the thread may not contain all of the text and attachments of the earlier emails. Structured Analytics will flag one of the emails containing the attachments as inclusive to make sure that all the important information is reviewed.
- Change of text: Email threading analytics capture any changes in the body of the email and display both versions for review. One can imagine that an employee wishing to eliminate negative information might change a word or two in the original email when replying to it or forwarding it to a third party. In this case, the Analytics engine would recognize that the email from Person A to Person B contained different text than the one from Person B to Person C, and flag both emails as inclusive. The same rule also marks two emails as inclusive when someone types their responses to questions inline, within the body of the original email text.
- Change of sender or time: If the Analytics engine finds what looks like a prior email, but the sender or time of that email doesn’t match what’s expected, it can trigger an extra inclusiveness tag. Note that there is a certain amount of tolerance built in for things like different email address display formats (“Johnson, Jeffrey” versus “Jeffrey Johnson” versus “firstname.lastname@example.org”). There is also the understanding that date stamps can be deceiving due to clock discrepancies on different email servers and time zone changes.
- Duplication: While not necessarily a reason for inclusiveness, when duplicate emails exist, either both or neither are marked inclusive by most Structured Analytics tools. Duplicates most commonly occur when person A sends an email to B and to C, and you collect data from two or more of the three people. To avoid redundant review, you should be sure to remove email duplicates from the population before creating review sets.
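The sender-format tolerance and the duplicate removal described in the last two items can be sketched together. This is an illustration under stated assumptions, not how any specific analytics engine works: the field names (`sender`, `subject`, `body`), the normalization rule and the choice of which fields identify a duplicate are all hypothetical, and the sketch ignores the timestamp tolerance mentioned above.

```python
import hashlib
import re


def normalize_sender(display):
    """Reduce a display name or address to an order-independent token string."""
    local = display.split("@")[0]  # drop the domain of an address
    tokens = re.findall(r"[a-z]+", local.lower())
    return " ".join(sorted(tokens))


def dedupe_key(email):
    """Hash of the fields assumed here to identify the same message."""
    body = " ".join(email["body"].split())  # collapse whitespace differences
    raw = "\n".join([normalize_sender(email["sender"]),
                     email["subject"].strip().lower(),
                     body])
    return hashlib.sha256(raw.encode("utf-8")).hexdigest()


def remove_duplicates(emails):
    """Keep the first copy of each message, preserving input order."""
    seen, unique = set(), []
    for email in emails:
        key = dedupe_key(email)
        if key not in seen:
            seen.add(key)
            unique.append(email)
    return unique
```

With this rule, “Johnson, Jeffrey”, “Jeffrey Johnson” and a hypothetical address such as jeffrey.johnson@example.com all normalize to the same value, so copies of one message collected from different custodians collapse to a single record. Initials, nicknames and shared mailboxes would need richer matching than this.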