April 22, 2026
AI-Proof Issues in Software Translation
By admin-emal
Despite advances in AI and machine translation, localizing software remains difficult due to organizational issues such as content ownership, poor terminology management and silos.
- Redundant Retranslation and IP Risk
- Failure to Use “Source of Truth”: Software often uses external regulatory or standard texts. For example, accounting software might reuse IFRS texts. A frequent mistake is not verifying whether an official, approved translation already exists. Rather than adopting the “source of truth,” software companies pay for a new translation when they could simply ask the authoring organization for its translation memory, often because the decision makers don’t even know what a translation memory is.
- Compliance and Usability Issues: Deviating from officially approved translations causes major usability and compliance problems for end users. For example, if the software translates key terms such as “entity” or “jurisdiction” differently from the official translation, users can no longer reliably match what they see on screen to the wording of the standard they have to comply with.
- IP Exposure: The practice of “reverse-engineering” Translation Memories (TMs) by crawling copyrighted source PDFs is expensive and risky, exposing the client to IP issues. Many organizations even forbid third-party translations of their content.
- Terminology Chaos and Quality Gaps
- Incomplete Glossaries: Projects typically start with incomplete, hastily compiled glossaries, often because nobody bothered to write down all the terms while the software was being developed.
- Unbudgeted Research: Linguists are then forced to collect terms manually from external documents (e.g., searching through multiple PDFs), a high-effort task that should be budgeted as pre-production scoping. Even worse, if they skip this step and just use poorly edited machine translation, you won’t find out until you’ve already paid for the job.
- TM Pollution: Using a “polluted” Translation Memory (TM) with inconsistent or incorrect terms can make the resulting human translation worse than raw machine translation. In my experience, inconsistent terminology in a TM is even worse than consistently wrong terminology, because it’s much harder to clean up.
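
One way to spot the inconsistency described above is to look for source segments that map to more than one target in the TM. The following is a minimal sketch, assuming the TM has already been exported to plain (source, target) pairs; the function name and the sample entries are hypothetical and the German strings are made-up illustrations, not official translations.

```python
from collections import defaultdict


def find_inconsistent_segments(tm_pairs):
    """Return {source: sorted targets} for sources with more than one distinct target."""
    targets_by_source = defaultdict(set)
    for source, target in tm_pairs:
        # Collapse whitespace and ignore case on the source side so that
        # trivially different copies of the same segment are grouped together.
        key = " ".join(source.split()).casefold()
        targets_by_source[key].add(" ".join(target.split()))
    return {
        source: sorted(targets)
        for source, targets in targets_by_source.items()
        if len(targets) > 1
    }


# Hypothetical sample data (not taken from any real TM).
sample_tm = [
    ("Reporting entity", "Berichtendes Unternehmen"),
    ("reporting  entity", "Berichtseinheit"),
    ("Jurisdiction", "Rechtskreis"),
    ("Jurisdiction", "Rechtskreis"),
]

inconsistent = find_inconsistent_segments(sample_tm)
for source, targets in inconsistent.items():
    print(f"{source!r} has {len(targets)} translations: {targets}")
```

A check like this only flags candidates. Some segments legitimately have several translations depending on context, so a linguist still has to review the list.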
- Recommendations
To mitigate these risks, software developers should:
- Talk to People: Before contacting a translation agency, identify the official “source of truth” and contact content owners directly to secure existing glossaries or TMs. The agency won’t do it, because their business model is not built around finding existing translations but rather producing new ones.
- Scoping: Treat terminology extraction and validation as a mandatory, budgeted pre-production step. Terminology extraction is the truly expensive part of translation; the typing itself is much easier to automate.
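
The validation half of that scoping step can be partly automated once an approved glossary exists. The sketch below assumes the glossary is available as a simple source-term to approved-target mapping and that translated strings come as (id, source, target) triples; all names and sample values are hypothetical. It uses plain substring matching, so it will miss inflected forms and is only a first-pass filter.

```python
def find_glossary_violations(segments, glossary):
    """List (segment_id, source_term, approved_target) for every segment whose
    source contains a glossary term but whose translation lacks the approved target."""
    violations = []
    for segment_id, source, target in segments:
        source_folded = source.casefold()
        target_folded = target.casefold()
        for source_term, approved_target in glossary.items():
            term_in_source = source_term.casefold() in source_folded
            approved_in_target = approved_target.casefold() in target_folded
            if term_in_source and not approved_in_target:
                violations.append((segment_id, source_term, approved_target))
    return violations


# Hypothetical glossary and UI strings, for illustration only.
approved_glossary = {
    "entity": "Unternehmen",
    "jurisdiction": "Rechtskreis",
}
ui_segments = [
    ("ui.entity.label", "Select entity", "Einheit auswählen"),
    ("ui.jurisdiction.label", "Jurisdiction", "Rechtskreis"),
]

violations = find_glossary_violations(ui_segments, approved_glossary)
for segment_id, term, expected in violations:
    print(f"{segment_id}: expected {expected!r} for {term!r}")
```

Running a check like this before and after a translation job gives a rough measure of how far the delivered text deviates from the “source of truth.”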