This paper describes the development of an automatic broadcast data transcription system for Lithuanian. The system performs fully automatic transcription of broadcast media recordings, including speech/non-speech detection, speaker diarization, speech-to-text conversion, and automatic punctuation restoration. It was developed in collaboration with the Baltic Media Monitoring Group (BMMG) and is currently used in production for various broadcast speech monitoring tasks.