
Multilingual Services@EC: AI to support the European Commission’s multilingual services

New AI-powered European digital multilingual services available for the European Commission and for the European public administrations

The Responsible Organisation

The European Commission’s Directorate-General for Translation, also known as DGT, is the branch of the European Commission responsible for translating the Commission's written texts into the 24 official EU languages. This ensures that all citizens of the European Union can understand and access the policies, information, and legislation developed by the Commission in their own language. DG Translation plays a crucial role in promoting multilingualism, fostering communication, and ensuring transparency within the EU public sector. 

The problem

To ensure broader access to translation services, DGT set out to make advanced machine-powered language services available in all 24 official EU languages, in order to speed up the activities of all European Union institutions (EUIs). These AI-powered language services are meant to meet the growing demand for efficient and accurate multilingual communication within the EU, while also addressing the challenges posed by the diversity of the EU's linguistic landscape.

Specifically, DGT has focused on three main types of language functions: 

  • translation, to convert texts from one language into another; 
  • summarisation, to produce condensed versions, in a given language, of large documents available in another language; 
  • and briefing, to draft documents in specific languages providing relevant information and analysis of a certain topic. 

Broadening access to language services through AI-powered tools raises several challenges. First, human languages are complex and nuanced, characterised by idioms, cultural references, and context-specific meanings, so it is difficult to ensure that machines translate or summarise accurately and grasp the tone and exact meaning of a text. Second, languages evolve constantly. Preserving the cultural richness and idiomatic expressions unique to each language adds further complexity, making it hard for machines to deliver contextually appropriate and sensitive translations. Third, the privacy and security of the data being processed must be guaranteed. Last but not least, all official EU languages need to be covered so that every eligible user receives the same quality of language services. This is particularly difficult for languages with fewer speakers, which the commercial language service market tends to overlook. There is also a need to include major non-EU languages that are essential for communication, such as Chinese and Arabic. The diversity of the EU's linguistic landscape requires specific investment in language technology to ensure that the needs of all member states are met equitably.

The solution and its implementation

The European Commission implemented a range of AI-based multilingual services developed by DGT in cooperation with the Directorate-General for Communications Networks, Content and Technology (DG CNECT), first under the Connecting Europe Facility programme and now under its successor, the Digital Europe Programme.

The services made available by the European Commission cover the three language functions mentioned above (translation, summarisation and briefing). The translation tool, eTranslation, was developed first; the eSummary and eBriefing tools became available more recently. All three tools are accessible through an API connection based on SOAP/XML standards. Additional tools include Speech-to-Text, Multilingual Tweet and NLP (Natural Language Processing) tools such as Anonymisation, Classification and Named-Entity Recognition.
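To give a rough idea of what a SOAP/XML request involves, the following minimal sketch builds a SOAP 1.1 envelope for a translation request using only the Python standard library. The envelope namespace is the standard SOAP 1.1 one; the service namespace, the operation name `translateRequest` and the field names are hypothetical placeholders chosen for illustration and are not taken from the actual service definition, which should be consulted for the real message structure.

```python
import xml.etree.ElementTree as ET

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"  # standard SOAP 1.1 namespace
SVC_NS = "urn:example:language-service"  # hypothetical service namespace (assumption)


def build_request(text: str, source_lang: str, target_lang: str) -> bytes:
    """Build a SOAP 1.1 envelope carrying an illustrative translation request."""
    ET.register_namespace("soapenv", SOAP_NS)
    ET.register_namespace("svc", SVC_NS)

    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")

    # Hypothetical operation and field names, for illustration only.
    request = ET.SubElement(body, f"{{{SVC_NS}}}translateRequest")
    ET.SubElement(request, f"{{{SVC_NS}}}sourceLanguage").text = source_lang
    ET.SubElement(request, f"{{{SVC_NS}}}targetLanguage").text = target_lang
    ET.SubElement(request, f"{{{SVC_NS}}}textToTranslate").text = text

    # Serialise to UTF-8 bytes with an XML declaration, ready to be sent over HTTP.
    return ET.tostring(envelope, encoding="utf-8", xml_declaration=True)


if __name__ == "__main__":
    print(build_request("Bonjour le monde", "fr", "en").decode("utf-8"))
```

In practice a client would typically generate such messages from the service's published definition rather than by hand; the sketch only shows the envelope/body layering that SOAP imposes.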

The tools can be used by EU institutions and public administrations, including local and regional authorities, as well as by SMEs, NGOs, academia, and projects financed by the Digital Europe Programme, in the EU and in countries associated with the Digital Europe Programme.

There are 3 main tools available:

  • eTranslation is an AI-based service that can translate documents between all 24 official EU languages and other languages such as Arabic, Chinese (simplified), Icelandic, Japanese, Norwegian, Russian, Turkish and Ukrainian. The tool preserves the formatting of the original document, except for PDF files, which are returned as docx (MS Word format).  
    Short unformatted text snippets (up to 2500 characters) can also be translated on the eTranslation web interface. eTranslation was set up to provide quality machine translation for all official EU languages, including those with fewer speakers, on which market providers do not focus. The number of languages covered has grown over time and will continue to grow. It is designed to be a secure system that protects privacy and passes on Intellectual Property Rights to the requester of the machine translation. eTranslation is available to EU institutions and other eligible users in the EU and in countries associated with the Digital Europe Programme. eTranslation runs in a hybrid environment: the access point to the service is hosted in the European Commission Data Centre, while the translation engines are cloud-based, with servers located in Amsterdam.  
     
  • eBriefing is an AI-based service that uses a large language model to create initial drafts. It generates content based solely on the documents submitted by the user and does not pull in external information or context on its own. For the same reason, the service does not draw on further data from its training set, so that the AI does not incorporate or rely on additional information learned from the vast datasets it was trained on. This could be an important privacy or security feature, ensuring that the content generated purely reflects the user-submitted documents and is not influenced by potentially sensitive or unrelated material from the training data. Users can generate briefings in the EU style or in a general style, and can also request an overview of the uploaded documents. Regarding data protection, SNC (Sensitive non-classified) data is allowed, but data classified as EU Restricted or higher should not be uploaded to eBriefing. The service has a clear privacy policy, and a privacy statement is available through the web user interface. 
    The tool, like other associated tools such as eTranslation, records user information for access and statistical purposes; this information is stored for 18 months. The draft briefing is available for 72 hours and then deleted. Users can also choose to have it deleted as soon as they have picked it up. User data is not shared with third parties. From a technical perspective, eBriefing relies on a GPT-4 platform located in France. eBriefing is 100% cloud-based, running on a cloud server located in the Netherlands.    
     
  • eSummary is another AI-based service, to which users can send documents to obtain a shortened version. By connecting to eTranslation, it can handle all official EU languages, as well as the additional languages eTranslation supports. It uses AI algorithms to identify where the emphasis lies in the uploaded documents and to extract their general topic and main thrust. To summarise the documents, eSummary decides what to include and what to leave out. Once the summary has been produced, the user can review it and modify it as needed. Like eTranslation, eSummary can handle all common office formats (Word, Excel, LibreOffice) and PDF files. The outputs are MS Word documents in all cases. From a technical perspective, eSummary is based primarily on a local instance of the bart-large-cnn open-source pre-trained model. It is available both through a web user interface and through an API as a full service, running on a fully GPU-based infrastructure in a cloud environment. To date, the eSummary infrastructure is not the same as that of the eBriefing service, but evaluations are ongoing with a view to eventually upgrading to a higher-capacity LLM like the one used for eBriefing.  
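The 2500-character limit on text snippets mentioned for the eTranslation web interface means that longer unformatted text has to be cut into pieces before it can be pasted in. The following minimal sketch shows one way a user might do this, preferring sentence boundaries. The function name and the splitting strategy are assumptions made for illustration and are not part of any of the services described here.

```python
import re

MAX_SNIPPET_CHARS = 2500  # snippet limit stated for the eTranslation web interface


def split_into_snippets(text: str, limit: int = MAX_SNIPPET_CHARS) -> list[str]:
    """Split text into chunks of at most `limit` characters, preferring sentence ends."""
    # Split after '.', '!' or '?' followed by whitespace (a simple heuristic).
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    snippets: list[str] = []
    current = ""
    for sentence in sentences:
        # Hard-split any single sentence that is longer than the limit.
        while len(sentence) > limit:
            if current:
                snippets.append(current)
                current = ""
            snippets.append(sentence[:limit])
            sentence = sentence[limit:]
        candidate = f"{current} {sentence}" if current else sentence
        if len(candidate) <= limit:
            current = candidate  # the sentence still fits in the current snippet
        else:
            snippets.append(current)  # close the current snippet, start a new one
            current = sentence
    if current:
        snippets.append(current)
    return snippets


if __name__ == "__main__":
    sample = "First sentence. " * 400
    for i, snippet in enumerate(split_into_snippets(sample), start=1):
        print(i, len(snippet))
```

Splitting on sentence boundaries rather than at a fixed offset keeps each snippet self-contained, which generally gives a machine translation engine more usable context.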

Main challenges

Implementing machine-powered translation, summarisation, and briefing services inside a large public administration for all official EU languages, particularly those with fewer speakers, presents certain challenges. The main ones are the following: 

  • Firstly, developing accurate and nuanced translation algorithms for less commonly spoken languages can be complex due to the scarcity of available data for machine learning.  
  • Secondly, ensuring the cultural sensitivity and context-appropriateness of translations can be problematic, as languages are deeply intertwined with their native culture.  
  • Thirdly, the accessibility and usability of these services across all public administrations and other eligible users in the EU can be hindered by varying levels of digital literacy, infrastructure, and integration capacity.  
  • Lastly, the issues of data privacy and security also pose significant challenges. A feature that distinguishes DGT’s AI-based services from certain commercial systems is that all data is processed in a secure environment and deleted after a brief time period, specifically after 24 hours maximum in the case of eTranslation and 72 hours maximum for the other AI-based services. The Commission does not keep the users' data and will not share it with third parties and users keep ownership of their data throughout the entire process.  
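The retention limits described above (24 hours maximum for eTranslation, 72 hours maximum for the other AI-based services) can be expressed as a small lookup. The sketch below is purely illustrative: the function names are hypothetical, and it says nothing about how the Commission actually enforces deletion.

```python
from datetime import datetime, timedelta, timezone

# Maximum retention periods as stated in the text above.
RETENTION = {
    "eTranslation": timedelta(hours=24),
    "eSummary": timedelta(hours=72),   # one of "the other AI-based services"
    "eBriefing": timedelta(hours=72),  # one of "the other AI-based services"
}


def deletion_deadline(service: str, processed_at: datetime) -> datetime:
    """Return the latest moment by which data for `service` should be deleted."""
    return processed_at + RETENTION[service]


def is_expired(service: str, processed_at: datetime, now: datetime) -> bool:
    """Return True once the maximum retention period has elapsed."""
    return now >= deletion_deadline(service, processed_at)


if __name__ == "__main__":
    submitted = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
    for name in RETENTION:
        print(name, deletion_deadline(name, submitted).isoformat())
```

Using timezone-aware timestamps avoids ambiguity when the access point and the processing servers sit in different locations.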

Expected benefits

Offering machine-powered translation, summarisation, and briefing services for all official EU languages, including those spoken by smaller populations, provides numerous benefits to its users. 

  • Primarily, it fosters inclusivity and ensures no language or culture feels left behind in the European public administrations.  
  • Secondly, making these services available ensures broad access to high-quality translation services not only to European public administrations, including local and regional authorities, but also to small and medium-sized enterprises, academia, non-governmental organizations and Digital Europe Programme projects. 
  • Thirdly, users can benefit from these services to enhance their operations’ efficiency, save time, and ensure crucial information is accurately conveyed in every language.  
  • Finally, AI-powered translation services at the EU level can also support policy-making, as decisions can be better informed by a broader range of sources from various linguistic backgrounds. They can thus be expected to contribute significantly to the unity, prosperity, and growth of the European Union.


Useful Link

Language tools web page: 

These are the registration pages for external users for individual use. Users can use this link to register for all the available language services:


Detailed Information

Year: 2017 (eTranslation), 2023 (eSummary) and 2024 (eBriefing)

Status: Implemented

Responsible Organizations: European Commission’s Directorate-General for Translation (DGT)

Geographical extent: Across Countries

Country: European Union and countries associated with the Digital Europe Programme

Function of government: General Public Services

Technology: Artificial Intelligence

AI domain: Neural Machine Translation, Machine Learning, Generative AI

Interaction: Government 2 Government, Government 2 Citizens and Government 2 Businesses

