Advanced Leveraging is a term coined by the Translation Automation User Society (TAUS) to describe the next generation of Translation Memory (TM) tools that build on, and go beyond, the functionality of traditional TM tools. This expansion of functionality is intended to overcome a known limitation of conventional TM technology: matches are limited to whole sentences (or segments), which runs counter to the way humans translate. Humans can readily identify matches not just on the sentence level but on the paragraph level and, more importantly, on the sub-segment level, where expressions and conventional phrases make up a significant part of writing. Advanced Leveraging Translation Memory (ALTM) is designed to do everything that conventional TM does and more by adding functionality to better match the ability of human translators to identify matching text in context.
Did you know:
Conventional Translation Memory (TM) systems transform inventories of past translations into a database by automatically extracting and aligning the source sentences with the target language sentences. This involves breaking apart entire documents to create a simple database of aligned sentences (usually only unique sentences) taken out of context. Since this happens without reference to context, conventional TM technology generally requires a good deal of manual maintenance by a senior linguist to validate and correct misalignments, especially 1:n and n:1 combinations that are readily apparent to a human but not to automated tools. Since conventional TM tools align these segments as discrete chunks of data, they are not able to match content on a macro or paragraph level (several sentences) or a sub-segment level (phrases and expressions). MultiTrans Prism ALTM matches at those levels, and does so without the time-consuming human intervention needed to correct alignments at the segment level. This is because the entire non-segmented document is stored in the database and its content is indexed, rather than being cut up into segments that are placed in a database. Because it retains the entire document, it can retrieve segments from the contextual memory in a way that both replicates a conventional TM and goes further. Since a human does not need to maintain this process, you will build larger translation memories and get more from your translation process.
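The idea of indexing a whole document instead of cutting it into segments can be illustrated with a toy full-text index. The following is a minimal sketch and not MultiTrans Prism's actual implementation: the `WholeDocumentIndex` class, its methods and the sample text are hypothetical names chosen for illustration. It keeps each document intact, indexes only word positions, and returns every hit together with the words around it.

```python
from collections import defaultdict


def normalize(word):
    """Lowercase a word and strip surrounding punctuation for lookup."""
    return word.lower().strip('.,;:!?"\'()')


class WholeDocumentIndex:
    """Toy store: documents are kept intact; only word positions are indexed."""

    def __init__(self):
        self.docs = {}                  # doc_id -> list of original words
        self.index = defaultdict(list)  # normalized word -> [(doc_id, position)]

    def add(self, doc_id, text):
        words = text.split()
        self.docs[doc_id] = words
        for pos, word in enumerate(words):
            self.index[normalize(word)].append((doc_id, pos))

    def find(self, phrase, context=3):
        """Return (doc_id, snippet) for each occurrence of phrase, with context."""
        target = [normalize(w) for w in phrase.split()]
        if not target:
            return []
        hits = []
        # Start from every position of the first word, then verify the rest.
        for doc_id, pos in self.index.get(target[0], []):
            words = self.docs[doc_id]
            window = [normalize(w) for w in words[pos:pos + len(target)]]
            if window == target:
                start = max(0, pos - context)
                end = pos + len(target) + context
                hits.append((doc_id, " ".join(words[start:end])))
        return hits


memory = WholeDocumentIndex()
memory.add(
    "doc1",
    "Once upon a time, there was a beautiful princess. She lived in a tall tower.",
)
hits = memory.find("beautiful princess", context=2)
print(hits)
```

Because the words are never detached from their document, the snippet can cross the sentence boundary, which a database of isolated sentence pairs could not show.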
Advanced Leveraging Translation Memory overcomes the shortcomings of conventional TM tools by aligning not only sentences but also entire documents at the full-text level, paragraphs and even sub-segments. Alignments done in this way not only allow matching on the whole-document, paragraph, segment and sub-segment levels but also, by preserving context, make it easier to identify and align 1:n and n:1 combinations. This model delivers superior alignment results and requires virtually no alignment maintenance. The granular matches delivered by ALTM mean that users are more productive because they leverage significantly more repetitions from their past translations. The result is that ALTM gives users everything conventional TM delivers, plus more matches and greater alignment reliability without human intervention.
Another advantage to ALTM is the ability to search for matches within the whole document so that the linguistic context of sub-segment (or even paragraph) matches can be reviewed for meaning, style and tone. This advanced searching capability includes the ability to find large sub-segments of text in the TM that are the same or similar, but which may be in a different order from corresponding sub-segments in the source text. It also includes the searching of several TMs at the same time to find and rank the most appropriate translations. These factors mean that ALTM can deliver higher match rates beyond the full and fuzzy match rates of conventional TM.
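One way to picture order-independent matching across several TMs is to compare sets of word n-grams. The sketch below is an assumption made for illustration, not the ranking method of any ALTM product: the function names, the trigram size, the TM names and the placeholder target strings are all hypothetical. It scores each stored source sentence by the share of the query's trigrams it contains, regardless of where they occur, and ranks candidates from all TMs together.

```python
import re


def tokens(text):
    return re.findall(r"\w+", text.lower())


def ngrams(words, n):
    """Set of all runs of n consecutive words."""
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}


def rank_across_tms(query, tms, n=3):
    """Rank (coverage, tm_name, source, target) over every TM, best first."""
    query_grams = ngrams(tokens(query), n)
    results = []
    for tm_name, entries in tms.items():
        for source, target in entries:
            shared = query_grams & ngrams(tokens(source), n)
            if shared:
                coverage = len(shared) / len(query_grams)
                results.append((coverage, tm_name, source, target))
    results.sort(key=lambda r: r[0], reverse=True)
    return results


# Hypothetical TMs; the targets are placeholders, not real translations.
tms = {
    "tm_a": [("Close the cover and then press the red button.", "<target A1>")],
    "tm_b": [
        ("Press the green button.", "<target B1>"),
        ("Press the red button.", "<target B2>"),
    ],
}
ranked = rank_across_tms("Press the red button and then close the cover.", tms)
for coverage, tm_name, source, target in ranked:
    print(f"{coverage:.0%}  {tm_name}  {source}")
```

The entry from `tm_a` ranks first even though its clauses appear in the opposite order from the query, because set comparison ignores position.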
The concept of sub-segment matching can be confusing to those familiar with conventional TM. Paragraph matching is easier to understand, as several conventional TMs mimic paragraph matching through "context" matches, although these are not "true" whole-paragraph matches. To explain how sub-segment matching works, here is a simple example. You have a TM with the following sentence and its translation:
Once upon a time, there was a beautiful princess.
You now have a new document with the following sentence:
Once upon a time, an evil dragon lived in a cave.
By conventional TM methodology this will show as a no-match, since it falls below the 75% match threshold. However, to a human reader it is readily apparent that what we have here is a sentence made up of two components, one of which is a 100% match ("Once upon a time") and the other a 0% match ("there was a beautiful princess"/"an evil dragon lived in a cave"). This is a very simple example. The same principle holds for sentences that may have been made up of clauses from different paragraphs of an earlier document. Likewise, if an earlier document is rewritten and clauses are cut or repositioned, ALTM can still recognize the components of the new text and match them to elements in the TM. This kind of cut-and-paste language is generally not caught by conventional TM tools: the sentence falls below the match threshold, so the text that is the same is never identified. With Advanced Leveraging technology, these sub-segment matches can be identified and easily reused by a translator working on the text. This means not only that the translator works faster, by not having to retranslate this sort of repetitive text, but also that consistency is better maintained.
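The princess and dragon example can be reproduced in a few lines. This is a minimal sketch under stated assumptions: it uses the similarity ratio from Python's standard `difflib` module as a stand-in for a TM tool's fuzzy score (commercial tools use their own scoring), and the function names are hypothetical. It shows the whole sentence falling below the 75% threshold while the shared sub-segment is still recoverable.

```python
import re
from difflib import SequenceMatcher

FUZZY_THRESHOLD = 0.75  # the 75% cut-off mentioned in the text


def tokens(sentence):
    return re.findall(r"\w+", sentence.lower())


def segment_score(stored, new):
    """Whole-segment similarity (difflib ratio, standing in for a fuzzy score)."""
    return SequenceMatcher(None, tokens(stored), tokens(new), autojunk=False).ratio()


def longest_shared_subsegment(stored, new):
    """Longest run of consecutive words that both sentences share."""
    a, b = tokens(stored), tokens(new)
    matcher = SequenceMatcher(None, a, b, autojunk=False)
    m = matcher.find_longest_match(0, len(a), 0, len(b))
    return " ".join(a[m.a:m.a + m.size])


stored = "Once upon a time, there was a beautiful princess."
new = "Once upon a time, an evil dragon lived in a cave."

score = segment_score(stored, new)
shared = longest_shared_subsegment(stored, new)

print("segment-level match:", score >= FUZZY_THRESHOLD)
print("shared sub-segment:", shared)
```

A segment-only lookup stops at the first result and reports no match; the second function shows the material a sub-segment lookup would still be able to offer the translator.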
A surprising amount of written language is composed of common phrases and clauses which repeat. "Once upon a time" is just one example of a fixed expression that repeats across documents written by different authors. One of our clients tried an experiment with sub-segment matching just to see how common this content was. They assembled a mass translation memory (over 6 million words) of content from all their financial clients. They then took a document from a financial company that was not a client and analyzed it against this TM. Not surprisingly, there were no matches at the 100% match level and only around 1-2% at the fuzzy (75-99%) match level. But much to their surprise, at the sub-segment level over 20% of the entire content was a match! Further experiments showed that the volume of content in the TM correlated with the degree of match: the more content, the more sub-segment matches. With the birth of cloud TMs such as MyMemory and TAUS TDA, this opens up new possibilities for TM use, where texts for new clients without dedicated TMs can still achieve significant text reuse.
ALTM features full-text indexing capabilities which allow you to search and retrieve text strings of any length, such as full and fuzzy segments, paragraphs, terms, and even sub-segments. As a result, this technology finds up to 30% more matches from previously translated documents than conventional translation memories. Because ALTM technology identifies past translations at a more granular level, users benefit from the ability to view the past translation variations of sub-segments, terms and even words. This technology will query your repository of past translations to show you the various ways that a sub-segment, term or word has been translated along with its frequency in percent. This level of analytic granularity also helps users standardize their terminology for even greater efficiencies and improved quality.
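The "frequency in percent" view of translation variants amounts to a simple count over everything the repository returns for one source expression. The sketch below assumes the lookup has already happened; the function name and the list of observed French renderings are hypothetical illustration data, not output from any real TM.

```python
from collections import Counter


def variant_frequencies(observed_translations):
    """Return [(variant, percent)] sorted from most to least frequent."""
    counts = Counter(observed_translations)
    total = sum(counts.values())
    return [
        (variant, round(100 * n / total, 1))
        for variant, n in counts.most_common()
    ]


# Hypothetical renderings of "once upon a time" found in past translations.
observed = [
    "il était une fois",
    "il était une fois",
    "il y avait une fois",
    "il était une fois",
]
report = variant_frequencies(observed)
for variant, percent in report:
    print(f"{percent:5.1f}%  {variant}")
```

A report like this is what lets a team see at a glance which rendering dominates and decide whether to standardize on it.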
ALTM provides you with the context of your past translations. Context of past translations is particularly valuable when translating more complex documents with full paragraph styles. ALTM allows you to build entire repositories of raw indexed texts while keeping the context intact for an on the fly view of how ambiguous terms, sentences and even paragraphs were previously translated. By providing context during translation, ALTM acts as an extensive "by-example" dictionary, containing usage and style references for terms and expressions. An intuitive user interface should present matching source text from past translations and the corresponding target text in separate windows with a view of each match's natural context.
Conventional TMs typically contain fewer previously translated documents because of the effort it takes to build and maintain them. If you do invest the effort, you are still 30% behind due to the improved granularity that only ALTM can deliver. ALTM should allow you to create TMs very rapidly - in fact with ALTM, you will be able to automatically create new translation memories of over 100,000 segments per hour (or more than 1 million segments per day). ALTM should align an unlimited number of documents at a success rate of 95% - this means that you can have confidence in the quality of your automated alignments and can begin using your TM immediately.
An independent benchmark study, sponsored by the TAUS Data Association (TDA) and performed by the Centre for Translation Studies of the University of Leeds, reports that ALTM increases the number of matches found from previously translated documents by an average of 30% above and beyond conventional translation memories. If your organization does not currently manage its linguistic assets at all, you could be recycling up to 50% of your past translation investments with ALTM.
MultiTrans Prism TextBase TM features full-text indexing capabilities, so users can subsequently search and retrieve text strings of any length such as terms, sub-segments, full and fuzzy segments and paragraphs. With MultiTrans Prism technology, users can also create new translation memories of over 10,000 segments in under 5 minutes - without the need for time-consuming, up-front manual verification of alignments before actually using the translation memory.
Instead of being limited to the modest size of a conventional TM system, which often contains less than 30% of relevant past translations because of the effort it takes to build, users can now easily import all of their past translations, reference documents and legacy TMs automatically. This is because MultiTrans Prism technology can align an unlimited number of documents at a success rate of 95% without human interaction. Depending on business needs, organizations may consider investing in the manpower needed to correct the remaining 5%, but this is not strictly necessary: with the TextBase TM, users can always view the texts in their full context and correct any rare misalignments on the fly, even while translating.
MultiTrans can also import existing TMX files, so users get a positive return on past translation memory investments. Because MultiTrans Prism is easy to use and aligns documents rapidly, users can feed MultiTrans Prism TextBase TM all their documents, not only a selection of them, bypassing the cost/benefit analysis associated with conventional TM tools. Having a larger pool of previously translated quality document pairs, from which more repetitions can be retrieved, improves not only productivity but also quality and terminological consistency in future translations.
MultiTrans Prism is not the only ALTM product on the market, but it was the first – and we feel the best. Its range of features is superior to competing ALTM products on the market. To give you an idea of the scope of our ALTM features compared to conventional TMs please consult the following table. It can also serve as a useful guide when considering any ALTM to make sure you are getting all the features you need in a tool.

MultiCorpora would like to share with you the advantages of ALTM. To see a live demo of ALTM in action go to our Live Demo Request page and enter your information. If your interest is piqued, let us calculate your actual return on investment with real data - you will be surprised with the results!
Robert J. Kuhns, "Advanced Leveraging: The New Generation of TMs," A TAUS Report, De Rijp, Netherlands, 2007.
Centre for Translation Studies of the University of Leeds, "Increasing leveraging from shared industry data," available at http://www.translationautomation.com/technology/increasing-leveraging-from-shared-industry-data.html