Junior Computational Linguist
7 days ago
Donostia / San Sebastián
About the role

Verbio by Capacity is hiring a Junior Computational Linguist to join the Capacity delivery team. This role involves developing and deploying advanced voice solutions within a fast-paced, multidisciplinary environment. The core focus is on ensuring the linguistic quality and robustness of our voice-enabled products, particularly for the Spanish and LATAM markets, so Spanish is a must for the position. The successful candidate will address dialectal variations, linguistic/acoustic challenges and NLU problems to deliver culturally appropriate solutions to millions of users.

Key responsibilities will include:
• Data Annotation and Curation: Processing, cleaning, and annotating large datasets of linguistic data to train and test our Natural Language Processing (NLP) and Automatic Speech Recognition (ASR) models.
• Model Evaluation: Rigorously evaluating the performance of ASR and NLU (Natural Language Understanding) models, identifying linguistic errors, and pinpointing areas for improvement.
• Lexicon and Grammar Development: Assisting in the maintenance and expansion of language-specific lexicons, grammars, and language models for specific project needs.
• Cross-functional Collaboration: Providing linguistic insights and support to engineers and project managers to ensure projects' success.
• Focus on Spanish and LATAM Markets: Leveraging specialized linguistic expertise in Spanish, including various LATAM dialects and co-official languages from Spain, to guarantee product accuracy and optimize user satisfaction within these critical markets.

We most definitely want you if...

We are seeking a highly motivated individual with a strong background in Linguistics and a passion for technology to join our team.
What We're Looking For:
• A quick learner who thrives in multidisciplinary, intercultural environments.
• Prior experience with NLU (Natural Language Understanding) and ASR (Automatic Speech Recognition) projects.
• Familiarity with a Linux environment.
• Previous experience with chatbots and AI agents.

You will…
• Analyze, review, and prepare large text corpora for language technology product development, with a focus on Spanish-speaking regions (Spain and LATAM).
• Support computational linguists and engineers in making technical decisions related to data preparation and processing for new products.
• Contribute to the development, testing, and maintenance of linguistic resources such as grammars, models, and dictionaries.
• Collaborate in the deployment and quality assurance of new language models.
• Apply expertise in computational linguistics to enhance product performance across different language and dialect variations.

What you will do in your first 3 months (main tasks)
• Analyze and prepare language data for product development, focusing on the needs of Spain and LATAM projects.
• Support computational linguists and engineers with your linguistic expertise so they can make sound decisions for product development.
• Review and improve existing linguistic resources (dictionaries, grammars, and models) relevant to ongoing projects.
• Learn the day-to-day work tools and procedures.

Skills and Experience (mandatory)
• Master's degree in Linguistics or Computational Linguistics.
• Spanish: native.
• English: B2 or higher.
• Experience working with large text corpora.
• Familiarity with Linux-based systems.
• Strong organizational skills, reliability, and a can-do attitude.
• Team spirit, responsiveness, flexibility, and proactivity.
• Capacity and willingness to learn and face new professional challenges.
Nice to have (not mandatory)
• Additional languages besides English (English is mandatory for the job).
• Some experience in Python programming.
• Experience working in Linux and programming in Bash.
• Experience with other programming languages (Go, C++, Rust...).
• Experience using Docker.
• Experience in prompt engineering.
• Experience configuring chatbots and AI agents.
• Experience in AI programming.
• Knowledge of regular expressions.
• Some knowledge of data analytics.
• Background in phonetics.
• Background in semantics.