Acquisition of transcribing software for National Anti-Corruption Bureau of Ukraine

General background

The EU Anti-Corruption Initiative in Ukraine (EUACI) is the European Union’s technical support program in the area of anti-corruption in Ukraine, co-funded and implemented by the Ministry of Foreign Affairs of Denmark. The overall objective of the EUACI is to achieve significant progress in preventing and countering corruption, to ensure coherent and systemic anti-corruption activities by state and local self-government bodies, and to empower civil society and citizens to contribute to combating corruption and to the proper conduct of Ukraine’s post-war recovery. The program runs until April 2027.

The National Anti-Corruption Bureau of Ukraine (NABU) is one of the EUACI’s key partners, playing a vital role in combating high-level corruption and upholding the rule of law in Ukraine. As part of this cooperation, the EUACI supports the development of NABU’s IT capacity, ensuring that the Bureau has the necessary infrastructure and technological capabilities to operate efficiently and securely.

NABU has requested the EUACI’s support in acquiring specialized software capable of transcribing all voice and speech information into text documents. The proposed solution should meet the requirements outlined in these Terms of Reference (TOR).

NABU is the beneficiary of the procurement. The contracting authority is the Ministry of Foreign Affairs of Denmark, EUACI.

Objective

The objective of this procurement is to provide NABU with specialized transcribing software for processing linguistic information, including licensing and technical support services. The acquisition is intended to enhance detectives’ automated workplaces by integrating transcribing capabilities. This, in turn, will strengthen NABU’s overall IT capacity and support the agency in fulfilling its mandate more efficiently while maintaining a stronger operational track record.

Deliverables

The subject of the tender is a software license, including technical support for a period of 12 months.

The software should be in compliance with the requirements specified in Annex 1 of the TOR, including software functionality, compatibility with NABU infrastructure, and security standards.

Budget, payments, and timeframe

The budget covers the cost of a software license, including technical support for a period of 12 months.

The delivery of the license will be carried out via email to the designated contact person at NABU and confirmed through an Acceptance Certificate signed between the EUACI and the Supplier.

The supplier may request an advance payment of up to 30% of the total cost upon contract signing. The remaining 70% will be paid after the license has been successfully transferred to NABU.

Timeframe:

Task	Date	Time
Issuing the Request for Bid	28 April 2025
Deadline for submission of bids	9 May 2025	17:00 Kyiv time
Evaluation of the bids (provisional)	13 May 2025
Notification of award to the successful Supplier (provisional)	14 May 2025
Signature of the contract (provisional)	20 May 2025
Delivery of software (provisional)	1 June 2025

How to apply

The deadline for submitting the proposal is May 9, 2025, 17:00 Kyiv time.

The financial bid should be submitted by the above deadline to [email protected], with a copy (cc) to [email protected], indicating the subject line: “NABU transcribing software”.

Prices must be quoted in EUR, including costs of delivery to the place of destination, all duties and taxes applicable, and excluding VAT. The EUACI has a VAT exemption as an international technical assistance program.

The EUACI Procurement Plan and Registration Card are available on the official website of the Cabinet of Ministers of Ukraine.

You will receive an auto-reply from the [email protected] mailbox when the offer has been received. If you do not receive an auto-reply, your offer was not received and you should contact the EUACI by phone.

Bidding language: English.

Any clarification questions regarding the terms of reference should be addressed to [email protected] no later than May 2, 2025, 17:00 Kyiv time.

Evaluation criteria

The main selection criteria will be:

Full compliance with the requirements specified in Annex 1 of the TOR, including software functionality, compatibility with NABU infrastructure, and security standards.
Best price, including licensing and technical support services.

Annex 1. TECHNICAL REQUIREMENTS

#	Feature	Requirement
Software Requirements
1.1	Software architecture	Client-server
1.2	Access to external resources for file processing	Access to external resources for file processing is prohibited. The system will operate autonomously within the customer’s isolated computer network.
1.3	Licensing	There are no limitations on the number of clients served simultaneously. However, an increase in the number of clients will affect the processing (transcription) wait time.
1.4	Supported file types	Supported file types include, but are not limited to: audio files (wav, mp3); video files (avi, mp4; only audio tracks are processed).
1.5	Automatic file processing	Priority functionality: transcription of audio files (conversion to text); extraction of audio from video files; splitting of long audio files into parts for parallel transcription. Next priority functionality: segmentation (separation) of transcribed text by speakers; language identification. Subsequent priority functionality: noise reduction of the audio track for operator listening; voice recognition for speaker identification; automatic object detection; automatic summary of conversations; report generation.
1.6	Management of the Automatic Processing	Interactive with operator involvement. The operator, using the client application, uploads one or more files, specifies the types of services, configures the parameters, and starts the processing task. Activation of necessary services on the server is performed automatically (without operator involvement) after the file is uploaded and the processing task is activated in the client application. Once the task processing begins on the server, the client application receives a textual result of the processing, with the option to download both the converted and original files from the server and save them to a storage device. Audio file processing is carried out on the server according to the queue formed by the operators of the system.
1.7	Operation log management	The operators are authorized through Active Directory. The actions of operators within the system and tasks related to file processing on the server should be recorded in the log. Logging of actions and operations in the client application is not required.
1.8	Sequence of file processing when forming a task	Tasks for the automatic processing of audio and video files are created by the operators’ client applications. Processing requests are recorded in the technological database of the system, from which a queue is formed. The queue is processed according to the FIFO (First In, First Out) principle and based on priority. The processing of the queue is carried out automatically by server services, taking into account the language settings defined by the operator for each file. After processing, audio and text files can be deleted by the user from the server. A task for automatic processing in the client application can include single files or a group of files. The operator receives the results of automatic processing as follows: the processed files are stored in the technological database and are automatically loaded into the application as soon as it is launched, for evaluation, transcription correction, and report preparation. Notes: In case of insufficient audio quality, when transcription is not performed automatically, the operator has the option to transcribe manually in the client application. It is also possible to re-transcribe a file with different processing parameters. Tasks created by the operator, together with the processed audio and text files, will be available for listening and viewing only to the owner (the operator who created the processing task).
1.9	Queue management	Queue management will be implemented later, if necessary, as a separate role of “Administrator.”
1.10	User management	The administrator can use the Active Directory service to manage roles and users within the system.
1.11	Saving of Automatic Processing Results	Files that were uploaded and added to the processing (transcription) task can be downloaded by the operator through the client application after task completion, along with all derived files. The results of an automatic processing task may include: The original audio/video file (uploaded, unconverted); An audio file extracted from a video file; A transcoded audio file for transcription; A raw transcribed text file (as-is); An operator-edited and auto-saved transcribed text file (post-editing); A noise-cleaned transcoded audio file (if activated, for comfortable listening); A supporting text file with metadata (sections such as processing parameters and their values, errors and descriptions). After reviewing and editing the transcribed text, the operator can export all or selected files to a chosen storage. Additionally, the operator should be able to delete selected files either before or after automatic processing.
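The queue behaviour described in requirement 1.8 (FIFO ordering combined with priority) can be illustrated with a minimal sketch. This is not the Supplier's implementation: the class and field names (`Task`, `TranscriptionQueue`, `priority`, `seq`) are hypothetical, and the sketch assumes that a lower priority number is served first and that tasks of equal priority are served in submission order.

```python
import heapq
import itertools
from dataclasses import dataclass, field


@dataclass(order=True)
class Task:
    # Ordering key: lower priority number first, then submission order (FIFO).
    priority: int
    seq: int
    # Fields below are excluded from ordering; names are illustrative only.
    owner: str = field(compare=False)
    filename: str = field(compare=False)
    language: str = field(compare=False)


class TranscriptionQueue:
    """In-memory stand-in for the queue formed from the technological database."""

    def __init__(self):
        self._heap = []
        self._counter = itertools.count()  # monotonically increasing submission index

    def submit(self, owner, filename, language, priority=1):
        """Register a processing request; priority 0 is assumed to be most urgent."""
        task = Task(priority, next(self._counter), owner, filename, language)
        heapq.heappush(self._heap, task)
        return task

    def next_task(self):
        """Return the next task for a server service, or None if the queue is empty."""
        return heapq.heappop(self._heap) if self._heap else None

    def __len__(self):
        return len(self._heap)


queue = TranscriptionQueue()
queue.submit("operator1", "call_a.wav", "uk")
queue.submit("operator2", "meeting.mp4", "en")
queue.submit("operator1", "urgent.mp3", "ru", priority=0)

processing_order = []
while len(queue):
    processing_order.append(queue.next_task().filename)
print(processing_order)
```

The submission counter is what preserves FIFO order among tasks of the same priority; without it, two tasks with equal priority would have no defined order.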
Requirements for the server part (backend)
2.1	The number of audio/video files that must be transcoded and transcribed within a 24-hour period	The expected volume is at least 375 hours of audio per day (including both video files with audio tracks and standalone audio files).
2.2	Horizontal scaling to increase audio processing performance.	By adding new processing servers.
2.3	Maximum duration of audio/video files.	The software must provide automatic file splitting based on its maximum duration.
2.4	Audio payload in the file	Types: duplex (separate speakers in separate channels, digital telecommunications); simplex (multiple speakers in a single channel).
2.5	Audio speech recognition	Automated or manual
2.6	Expected acceleration for transcoding, noise reduction, and audio transcription (with simultaneous speaker segmentation)	Typical values: transcoding acceleration – up to 32x; noise reduction – up to 10x; transcription acceleration – up to 16x. Note: Transcription acceleration can range from 9x FTRT to 25x FTRT (faster than real-time). The exact value depends on the language, the percentage of speech in the audio, and the noise levels.
2.7	Languages for identification and transcription	The priority languages are Ukrainian, Russian, and English.
2.8	Audio speaker segmentation (separation)	Supported for simplex audio with several speakers.
2.9	Timestamping	Generation of timestamps for sentence blocks after transcription.
2.10	Spelling and punctuation	Logical segmentation into sentences and punctuation placement.
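For orientation, the daily volume in requirement 2.1 can be related to the acceleration figures in requirement 2.6 and the horizontal scaling in requirement 2.2 with simple arithmetic. The sketch below rests on assumptions that are not stated in the TOR: that the quoted acceleration applies per processing server, that each server handles one transcription stream at a time, and that transcription dominates total processing time. The function names and the `utilization` parameter are hypothetical.

```python
import math

DAILY_AUDIO_HOURS = 375  # minimum daily volume from requirement 2.1
HOURS_PER_DAY = 24


def required_aggregate_speedup(daily_audio_hours=DAILY_AUDIO_HOURS):
    """Combined faster-than-real-time factor needed to clear one day's audio in 24 hours."""
    return daily_audio_hours / HOURS_PER_DAY


def servers_needed(per_server_speedup, daily_audio_hours=DAILY_AUDIO_HOURS, utilization=1.0):
    """Estimate the number of processing servers (assumption: speedup is per server).

    utilization is an assumed fraction of each day a server spends transcribing.
    """
    if per_server_speedup <= 0 or not 0 < utilization <= 1:
        raise ValueError("speedup must be positive and utilization must be in (0, 1]")
    effective = per_server_speedup * utilization
    return math.ceil(required_aggregate_speedup(daily_audio_hours) / effective)


print(required_aggregate_speedup())  # 375 / 24 = 15.625
for speedup in (9, 16, 25):  # low, typical, and high values from requirement 2.6
    print(speedup, servers_needed(speedup), servers_needed(speedup, utilization=0.7))
```

Under these assumptions, 375 hours per day corresponds to an aggregate factor of 15.625x real time, so a single server at the typical 16x would have almost no headroom, while at the low end of 9x at least two servers would be needed. Actual sizing depends on the Supplier's architecture and hardware.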