UNDP : Setting Up a Transcription and NLP Pipeline for Capturing Stories – Istanbul

  • Location:
  • Salary:
    negotiable / YEAR
  • Job type:
    OTHER
  • Posted:
    3 months ago
  • Category:
    Language and Writing Services
  • Deadline:
    06/04/2025

JOB DESCRIPTION

Mission and objectives

As the United Nations lead agency on international development, UNDP works in 170 countries and territories to eradicate poverty and reduce inequality. We help countries develop policies, leadership skills, partnering abilities, and institutional capabilities, and build resilience to achieve the Sustainable Development Goals. Our work is concentrated in three focus areas: sustainable development, democratic governance and peace building, and climate and disaster resilience.

Context

To contribute to knowledge-sharing efforts and the documentation of lived experiences, we collect video and audio interviews capturing personal stories from diverse individuals. These interviews are a rich source of qualitative data for understanding lived realities, cultural contexts, and social dynamics, and they contribute to evidence-based dialogue and informed decision-making for sustainable development.

However, the volume of recordings collected has created a bottleneck in processing and analysis. Each recording needs to be transcribed, translated (where necessary), and analyzed with natural language processing (NLP) techniques to extract key insights. Given the sensitivity of the recordings, all processing must be conducted locally to ensure data security. The processing must accommodate multiple languages, starting with Setswana, English, and Russian. To ensure accurate processing, the language of each recording should be either detected automatically or set explicitly by the user.

This assignment is aimed at establishing a pipeline for:

  • Batch transcription of recordings.
  • (Optional) Translation into English.
  • (Optional) Basic NLP processing (e.g., named entity recognition and keyword extraction).

This initiative will contribute to advancing innovative methods for qualitative analysis while preserving data privacy and security. Volunteers will have the opportunity to use their skills to create a tool that enables meaningful insights and promotes dialogue for sustainable development.
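The choice between detected and user-enforced language could be handled along the following lines. This is a minimal sketch, not a prescribed design: the names `SUPPORTED_LANGUAGES` and `resolve_language` are hypothetical, the `detect` argument stands in for whichever locally running language detector the volunteers choose, and the use of ISO 639-1 codes (`tn`, `en`, `ru`) is an assumption.

```python
from typing import Callable, Optional

# Assumption: languages are identified by ISO 639-1 codes.
# These are the three starting languages named in the posting.
SUPPORTED_LANGUAGES = {"tn": "Setswana", "en": "English", "ru": "Russian"}


def resolve_language(user_choice: Optional[str], detect: Callable[[], str]) -> str:
    """Return the language code to use for one recording.

    A user-enforced setting takes priority; the detector is only
    called when no language was specified.
    """
    if user_choice is not None:
        code = user_choice.strip().lower()
        if code not in SUPPORTED_LANGUAGES:
            raise ValueError(f"Unsupported language: {user_choice!r}")
        return code

    # No user setting: fall back to the (hypothetical) local detector.
    code = detect().strip().lower()
    if code not in SUPPORTED_LANGUAGES:
        raise ValueError(f"Detected unsupported language: {code!r}")
    return code
```

Rejecting unsupported codes early, rather than passing them on to transcription, keeps a misdetected recording from silently producing a poor transcript.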

Task Description

The online volunteers will create a pipeline for transcribing audio and video recordings and applying NLP to the transcripts. More specifically, this includes:

  1. Develop a Python-based pipeline for processing video interviews that includes:
     a. Batch transcription of audio and video files.
     b. Language detection or user-enforced language selection.
     c. (Optional) Translation of non-English transcripts into English.
     d. (Optional) Basic NLP processing, including named entity recognition and keyword extraction.
  2. Ensure the solution can handle multiple file formats (e.g., MP4, MKV, MP3, WAV).
  3. Ensure all processing is conducted locally to protect data security.
  4. Build an interface or script settings to enable user configuration (e.g., enable/disable translation, language selection).
  5. Test the pipeline with sample videos in Setswana, English, and Russian.

Deliverables:

  • A Python-based pipeline script for local processing.
  • Documentation on how to install and use the pipeline.
  • Sample outputs demonstrating successful transcription, translation (optional), and NLP analysis.
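The overall shape of such a pipeline might look like the sketch below. It is illustrative only and uses just the Python standard library: `PipelineConfig`, `discover_media`, and `run_batch` are hypothetical names, and the `transcribe` and `translate` callables are placeholders for whichever locally running models the volunteers select. No specific speech-recognition or translation library is assumed, and the optional NLP step is left out for brevity.

```python
from dataclasses import dataclass
from pathlib import Path
from typing import Callable, List, Optional

# File formats listed in the task description (matched case-insensitively).
MEDIA_EXTENSIONS = {".mp4", ".mkv", ".mp3", ".wav"}


@dataclass
class PipelineConfig:
    """User-configurable settings (hypothetical field names)."""
    language: Optional[str] = None  # None means "detect automatically"
    translate: bool = False         # translate transcripts into English


def discover_media(input_dir: Path) -> List[Path]:
    """Return supported media files directly inside input_dir, sorted by name."""
    return sorted(
        p for p in input_dir.iterdir()
        if p.is_file() and p.suffix.lower() in MEDIA_EXTENSIONS
    )


def run_batch(
    input_dir: Path,
    output_dir: Path,
    config: PipelineConfig,
    transcribe: Callable[[Path, Optional[str]], str],
    translate: Optional[Callable[[str], str]] = None,
) -> List[Path]:
    """Transcribe every media file and write one text file per recording."""
    output_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for media in discover_media(input_dir):
        text = transcribe(media, config.language)
        if config.translate and translate is not None:
            text = translate(text)
        # Keep the original extension in the name so that, for example,
        # a.mp4 and a.mp3 do not overwrite each other's transcripts.
        out_path = output_dir / (media.name + ".txt")
        out_path.write_text(text, encoding="utf-8")
        written.append(out_path)
    return written
```

Passing the models in as callables keeps the batch logic testable without any model installed, and makes it straightforward to confirm that nothing in the loop itself sends data off the machine.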

Competencies and values

Living conditions and remarks

Level of Education: Bachelor's degree

Work Hours: 8

Experience in Months: No requirements

This job has expired.