Language detection

You can supplement anti-spam filtering with language detection, which lets you disallow messages written in specific languages. For example, if your company doesn't communicate in certain languages, you can disallow messages written in those languages to reduce unwanted email.

You can configure actions such as tagging the subject line, quarantining, or deleting these emails.

Configure language detection

To configure language detection, do as follows:

  1. In your Email Security policy, go to Settings > Inbound > Language.
  2. Select the actions you want to take. You can choose from the following options:

    • Tag subject line
    • Quarantine
    • Delete
  3. (Optional) Select Include In End User Quarantine to let your users view, release, or delete these messages themselves. For more information, see End User Quarantine.

  4. Save the policy.

Language detection is configured. Sophos Email performs the action you configured when an email is written in a disallowed language. For more details about an affected message, see the Message History report and Quarantined Messages.

Detection efficacy

Language detection relies on interpreting the meaning of a message, so misdetection occurs when that interpretation fails. To ensure accurate detection, be aware of the following factors that affect detection efficacy:

  • Language detection is triggered for the languages configured in your policy settings.
  • Short or cryptic messages, and messages that combine similar languages such as Ukrainian and Russian, may be detected inaccurately.
  • Subject lines are often too short, contain abbreviations, or include multilingual tags, so language detection ignores them to prevent misdetection.
  • Message attachments might be in a different language than the message itself, so language detection ignores attachments to prevent misdetection.

    Note

    Language detection is exclusively based on the email body. The detection disregards numbers, special characters, symbols, and URLs to minimize the risk of misdetection.

  • Emails containing multiple languages in their body may be classified as one of the languages present in the email.

  • System- or machine-generated emails pose a risk of misdetection as they use a common language, usually English, in their templates. Whenever feasible, templates are actively detected and ignored. However, the risk of misdetection due to the use of templates can't be completely eliminated.
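
To illustrate why numbers, special characters, symbols, and URLs carry no language signal, here is a minimal Python sketch of that kind of body clean-up. It is not Sophos Email's implementation: the function name `text_for_language_detection`, the URL pattern, and the rule of keeping only letters and combining marks are all assumptions made for this example.

```python
import re
import unicodedata

# Hypothetical clean-up step for illustration only; not Sophos Email's code.
# Assumption: a URL is anything starting with http://, https://, or www.
URL_PATTERN = re.compile(r"(?:https?://|www\.)\S+", re.IGNORECASE)


def _carries_language_signal(ch: str) -> bool:
    """True for letters (L*) and combining marks (M*) in any script."""
    return unicodedata.category(ch)[0] in ("L", "M")


def text_for_language_detection(body: str) -> str:
    """Strip URLs, digits, punctuation, and symbols from an email body."""
    without_urls = URL_PATTERN.sub(" ", body)
    # Replace every character that is not a letter or mark with a space.
    kept = "".join(ch if _carries_language_signal(ch) else " " for ch in without_urls)
    # Collapse the runs of whitespace left behind by the removals.
    return " ".join(kept.split())


if __name__ == "__main__":
    sample = "Visit https://example.com now! Order #12345 costs $99."
    print(text_for_language_detection(sample))
```

In this sketch, a body that consists mostly of links, order numbers, or prices leaves very little text behind, which matches the point above that short messages are harder to classify.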

Note

Although language detection is designed to prevent misdetection, it may not always succeed.