IT Discovery - Intelligent, Strategic Discovery
The Problem
Email needs its own search tools, distinct from those used for other electronically stored information (ESI), because of its unique nature and its unique importance. Email differs from other electronic documents because it is exchanged among unambiguously identifiable people, within a social network that reflects activity in the enterprise. Taken as a whole, the email corpus reflects nearly everything that is going on in the enterprise. Not only does email make up the vast majority of an enterprise's electronic documentation (75 to 80 percent), but, more importantly, its content is comprehensive, up to date and deep. By a happy coincidence, the two primary difficulties inherent in searching email, its sheer volume and its "noisy" nature, are amenable to recent developments in machine learning that make the task manageable. This is what IT Discovery does.
Email: Too Much, and Too Much Noise
The sheer volume of email and its "noisy" nature make searching by any traditional means a futile task. For that and other reasons, "search" often means manual review, especially in high-stakes litigation or regulatory contexts. Reading just one year's production of email from a large enterprise would take 100 people, working 10 hours per day, 7 days per week, 52 weeks per year, 54 years, at a cost of $2 billion (see note 1). And the numbers are growing every day.
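To get a feel for the scale of that estimate, the quoted figures can be multiplied out. The short Python sketch below is only a back-of-envelope check: the staffing numbers and the $2 billion total are the ones quoted above, while the hourly rate is not stated in the cited source and is simply what those numbers imply. The function names are illustrative.

```python
# Back-of-envelope check of the manual-review estimate quoted in the text.
# All inputs are the figures quoted above; the hourly rate is derived here
# and does not appear in the cited source.

REVIEWERS = 100
HOURS_PER_DAY = 10
DAYS_PER_WEEK = 7
WEEKS_PER_YEAR = 52
YEARS = 54
TOTAL_COST_USD = 2_000_000_000


def person_hours(reviewers, hours_per_day, days_per_week, weeks_per_year, years):
    """Total reviewer hours implied by the staffing figures."""
    return reviewers * hours_per_day * days_per_week * weeks_per_year * years


def implied_hourly_rate(total_cost_usd, hours):
    """Cost per reviewer hour implied by the total cost."""
    return total_cost_usd / hours


hours = person_hours(REVIEWERS, HOURS_PER_DAY, DAYS_PER_WEEK, WEEKS_PER_YEAR, YEARS)
rate = implied_hourly_rate(TOTAL_COST_USD, hours)
print(f"{hours:,} person-hours at roughly ${rate:,.0f} per hour")
```

On these figures the review consumes just under 20 million person-hours, which puts the implied cost at a little over $100 per reviewer hour.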
Time and money aside, the review would be done poorly. A large recent study found a 51% accuracy rate on manually reviewed documents. In a famous study from the 1980s, human reviewers who were asked how well they thought they had done claimed to have found 75% of the relevant documents; in truth they had found only 20% (see note 2). For email these figures would have been much worse.
Traditional email e-discovery is broken, yet in nearly all contemporary forensic investigations involving enterprises, email has proven to be the source of the most salient discoveries. Traditional forensic tools fail to mine this information effectively because simple "keyword" search techniques are inadequate to the task, and string-matching extensions and natural-language-based approaches promise more than they deliver. The key is topic discovery: the creation of a third axis with which to search.
The Solution
IT Discovery enables search not just of the text of emails but of their topics as well. The search is intelligent because e-discovery requires context, and context is available only once one has fully recognized the unique nature of email: it combines structured and unstructured information in a unique way.
Wouldn't it be interesting to know "What are the topics discussed in these emails?" or "Which emails fall into this topic?" Add a second dimension: people (authors and recipients). Now ask "What topics did this person discuss?" or "Which people's emails were about this topic?" Lastly, add the third dimension, email keyword search, and you have a powerful search triad. Choose a topic and a person, execute a keyword search against the resulting set of emails, and you have triangulated a search, narrowing a large, noisy corpus to just a few relevant emails. "Who?", "What?", and "When?", in a single view.
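The triad can be pictured as three filters applied to the same set of emails. The Python sketch below is a minimal illustration of that idea and not IT Discovery's actual implementation: the Email record, the function names and the sample corpus are all hypothetical, and each email is assumed to arrive with a topic label already assigned by a separate topic-discovery step that is not shown here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Email:
    """Hypothetical email record; 'topic' is assumed to be assigned upstream."""
    sender: str
    recipients: tuple
    topic: str
    body: str


def triangulate(emails, topic=None, person=None, keyword=None):
    """Return the emails that pass every filter that was supplied."""
    hits = []
    for email in emails:
        if topic is not None and email.topic != topic:
            continue
        if person is not None and person != email.sender and person not in email.recipients:
            continue
        if keyword is not None and keyword.lower() not in email.body.lower():
            continue
        hits.append(email)
    return hits


def topics_for_person(emails, person):
    """'What topics did this person discuss?'"""
    return sorted({e.topic for e in triangulate(emails, person=person)})


def people_for_topic(emails, topic):
    """'Which people's emails were about this topic?'"""
    people = set()
    for e in triangulate(emails, topic=topic):
        people.add(e.sender)
        people.update(e.recipients)
    return sorted(people)


# Invented sample data, for illustration only.
corpus = [
    Email("alice", ("bob",), "pricing", "Draft discount schedule attached."),
    Email("bob", ("alice", "carol"), "pricing", "Lunch on Friday?"),
    Email("carol", ("dave",), "hiring", "Discount the second candidate's offer?"),
    Email("alice", ("dave",), "hiring", "Interview notes attached."),
]

narrowed = triangulate(corpus, topic="pricing", person="alice", keyword="discount")
print(len(narrowed), "of", len(corpus), "emails remain")
```

A keyword search for "discount" on its own matches two of the four sample emails, one of them about hiring. Adding the topic and person filters leaves only the pricing email that involves alice, which is the narrowing effect described above.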
1 See George Paul and Jason Baron, "Information Inflation: Can the Legal System Adapt?", Richmond Journal of Law and Technology, Vol. XIII, Issue 3.
2 See Anne Kershaw, "Automated Document Review Proves Its Reliability", Digital Discovery & e-Evidence, Vol. 5, No. 11.
