ALIAS Modules: Automatic, Examiner and Linguist

Each user can select which modules to use and have access to. ALIAS contains three kinds of modules, which differ in their level of automation and in the expertise in linguistics or statistics required of the user.

  • Interactive-Examiner modules require an examiner to interact with the software; this interaction involves linguistic or statistical analysis, depending on the forensic task. The Examiner does not need to be a degreed linguist, since the linguistic or statistical analytical techniques can be learned in training.
  • Interactive-Linguist modules require a linguist to interact with the software; this interaction involves linguistic or statistical analysis, depending on the forensic task. The Linguist must be degreed and have expertise that is relevant to the forensic task. ALIAS Technology linguists are all degreed linguists with specialties in the main subfields of linguistics.

In the comments on speed below, the time estimate does not include any data preparation that may be necessary in a case (e.g. converting handwritten documents to electronic format, segmenting macro-texts, bundling micro-texts, importing documents, etc.).

ALIAS contains all the ALI and ALEX modules plus the Interactive-Linguist (“ILing”) modules. The ILing modules are:

SynAID: Syntactic Author Identification

Task: Who wrote this document or text? Did the person who signed this document actually author it?

Uses: SynAID classifies documents to authors, and predicts the authorship of an unknown document based on the statistical model of known-author classification. It can be used for a pool of suspect authors and, given sufficient data, for verification.

Speed: SynAID runs an industry-grade parser for the initial linguistic analysis. Since even the best parsers can make errors in linguistic analysis, SynAID requires that linguists check and correct any errors. Finally, linguists perform the statistical analysis using SPSS software. SynAID requires a minimum of 2,000 words/100 sentences per author, and at least two authors. The usual time allotted to a SynAID analysis, with report, is 30 hours.
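The minimum data requirement above can be checked before any analysis time is committed. The following is a minimal sketch in Python, not part of ALIAS: the function names, the naive sentence splitter, and the reading of "2,000 words/100 sentences" as "either threshold is sufficient" are all assumptions made for illustration.

```python
import re

MIN_WORDS = 2000      # per-author minimum stated in the text
MIN_SENTENCES = 100   # per-author minimum stated in the text
MIN_AUTHORS = 2       # at least two authors are required


def count_words(text):
    # Whitespace-delimited tokens; a real tokenizer may count differently.
    return len(text.split())


def count_sentences(text):
    # Naive splitter (assumption): a run of . ! or ? followed by
    # whitespace or the end of the text ends a sentence.
    return len(re.findall(r"[.!?]+(?:\s|$)", text))


def check_case(known_docs):
    """known_docs maps an author name to a list of document strings.

    Returns (ok, report), where report gives per-author totals.
    """
    report = {}
    for author, docs in known_docs.items():
        words = sum(count_words(d) for d in docs)
        sentences = sum(count_sentences(d) for d in docs)
        # Assumption: meeting either threshold counts as enough data.
        enough = words >= MIN_WORDS or sentences >= MIN_SENTENCES
        report[author] = {"words": words, "sentences": sentences, "enough": enough}
    ok = len(report) >= MIN_AUTHORS and all(r["enough"] for r in report.values())
    return ok, report
```

A case in which any author falls below both thresholds, or in which only one author is present, returns `ok == False`, with the per-author totals showing where more known writing is needed.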

Notes: SynAID is the patent-pending, syntax-based author identification method that has been repeatedly admitted as expert scientific testimony in court after Daubert and Frye hearings, i.e., under both the scientific reliability and the general acceptance standards for admissibility. SynAID has been used in criminal, civil and security cases on documents ranging from text messages to legal rulings.

Accuracy: In testing independent of any litigation, on ground-truth data, SynAID has attained 95-96% accuracy. In actual cases, SynAID regularly performs better at classifying the known documents to the actual author. For each case, SynAID’s accuracy is calculated from the performance on the known documents in the case.
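One common way to compute a per-case accuracy from known documents is leave-one-out cross-validation: each known document is held out in turn, a model is built from the rest, and the held-out document is classified. The sketch below illustrates that idea only. It is not SynAID's statistical procedure (which the text says is carried out in SPSS); the nearest-centroid classifier, the function names, and the numeric feature vectors are hypothetical stand-ins.

```python
def centroid(vectors):
    # Component-wise mean of a list of equal-length feature vectors.
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def sq_dist(a, b):
    # Squared Euclidean distance between two feature vectors.
    return sum((x - y) ** 2 for x, y in zip(a, b))


def leave_one_out_accuracy(samples):
    """samples: list of (author, feature_vector) pairs for known documents.

    Each sample is held out once and assigned to the author whose
    centroid (built from the remaining samples) is closest.
    """
    correct = 0
    for i, (true_author, vec) in enumerate(samples):
        by_author = {}
        for j, (author, other) in enumerate(samples):
            if j != i:  # exclude the held-out sample from the model
                by_author.setdefault(author, []).append(other)
        predicted = min(by_author, key=lambda a: sq_dist(vec, centroid(by_author[a])))
        correct += predicted == true_author
    return correct / len(samples)


# Hypothetical feature vectors (e.g. relative frequencies of two syntactic markers).
known = [
    ("A", [1.0, 0.0]), ("A", [0.9, 0.1]),
    ("B", [0.0, 1.0]), ("B", [0.1, 0.9]),
]
accuracy = leave_one_out_accuracy(known)
```

Because every prediction is made without the held-out document, the resulting figure reflects how well the known authors separate in that particular case, rather than how well a model memorizes its own training data.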

Current Languages: English

Research-in-Progress Languages: Spanish, Arabic, Italian

 

QSynAID: Quick Syntactic Author Identification

Task: Who wrote this document or text? Did the person who signed this document actually author it? And tell me as quickly as possible.

Uses: QSynAID is a “quick” version of SynAID that does not include error checking. Therefore, we use QSynAID for investigative purposes rather than court testimony at this time.

Speed: The usual time allotted to a QSynAID analysis, with report, is 10 hours.

Notes: QSynAID does not include the error correction that is part of SynAID. While this speeds up the process, it also allows for cumulative error. This is why we recommend QSynAID as an investigative tool rather than testimonial evidence. However, ALIAS is built so that the work done in a run of QSynAID is all accessible to a SynAID run, which means that work does not have to be repeated if a case moves from QSynAID to SynAID.

Accuracy: In testing independent of any litigation, on ground-truth data, QSynAID has attained 83% accuracy. In actual cases, QSynAID has performed better at classifying the known documents to the actual author. For each case, QSynAID’s accuracy is calculated from the performance on the known documents in the case.

Current Languages: English

Research-in-Progress Languages: Spanish, Arabic, Italian

 

Profiler: Linguistic Profiling for a Specific Task

Task: Profiler covers a range of tasks and can be designed for any specific task that answers the questions: What kind of person authored this text? What kind of person spoke this utterance? The focus of the task may be on native language, native dialect, gender, or age.

Native Language Task (NLT): Native and non-native speakers of any language can be distinguished because the non-native speaker’s previous language experience and internalized grammar conflict with the grammar of the second language. These conflicts are predictable based on the first and second languages. Profiler: NLT provides a report of linguistic indications that the author is a non-native speaker of English (or any other language for which the module can be designed).

Native Dialect Task (NDT): Every language comes wrapped in a particular dialect, the home dialect of the speaker. But in most languages, social forces make one dialect, which may not be the home dialect, the socially prestigious one. Further, in mobile societies, adults move to areas with dialects different from their home dialect. Profiler: NDT provides a report of linguistic indications of the speaker’s home dialect for English (or any other language for which the module can be designed).

Uses: Profiler is used as an investigative tool to find out more about a suspect, eliminate a suspect or focus an investigation on particular types of suspects. Because Profiler is investigation-based, it can be designed for most investigative purposes, languages and dialects. ALIAS Technology linguists and developers work with the client to produce the Profiler algorithm that suits the investigation.

Speed: Algorithms that are already in place (such as those for English) can be run by a linguist with the specific expertise required for the task in about an hour, including a report. The development of the algorithm for a brand-new investigative task can take up to one month, as it includes both linguistic and programming expertise.

Notes: Profiler can be used on both textual and audiovisual data.

Accuracy: Linguists with specific expertise for the task at hand can be over 85% accurate at profiling.

Current Languages: English, Arabic, Russian

Research-in-Progress Languages: Spanish