Senior Software Engineer - LLM Evaluation - RibbitZ
2 days ago
New York
Job Description

Our client RibbitZ is looking for a Senior Software Engineer - LLM Evaluation to work remotely. As a software engineering evaluator, you will create cutting-edge datasets for training, benchmarking, and advancing large language models, collaborating closely with researchers. This includes curating code examples, providing precise solutions, and making corrections in Python, JavaScript (including ReactJS), C/C++, Java, Rust, and Go; evaluating and refining AI-generated code for efficiency, scalability, and reliability; and working with cross-functional teams to enhance enterprise-level AI-driven coding solutions.

What Does a Typical Day Look Like?

• Work on AI model training initiatives by curating code examples, building solutions, and correcting code in Python, JavaScript (including ReactJS), C/C++, Java, Rust, and Go.
• Evaluate and refine AI-generated code to ensure that it is efficient, scalable, and reliable.
• Collaborate with cross-functional teams to enhance AI-driven coding solutions against industry performance benchmarks.
• Build agents that can verify the quality of the code and identify error patterns.
• Hypothesize on steps in the software engineering cycle (prototyping, architecture design, API design, production implementation, launch, experiments, monitoring, operational maintenance) and evaluate model capabilities on them.

• 5+ years of software engineering experience, including 2+ years of continuous full-time experience at a top-tier product company (e.g., Google, Stripe, Amazon, Apple, Meta, Netflix, Microsoft, Datadog, Dropbox, Shopify, PayPal, IBM Research).
• Strong expertise in building full-stack applications and deploying scalable, production-grade software using modern languages and tools.
• Deep understanding of software architecture, design, development, debugging, and code quality/review assessment.
• Software engineering profiles only.
• Candidates must be based in the US.
• 5+ years of relevant experience.

Google (Alphabet), Apple, Amazon, Meta (Facebook), Netflix, Microsoft, Tesla, NVIDIA, Adobe, Salesforce, GitHub, Atlassian, HashiCorp, Databricks, Snowflake, Cloudflare, DigitalOcean, MongoDB, Elastic, Confluent, Airbnb, Dropbox, Stripe, Palantir, Uber, Lyft, Square (Block), Twilio, Snap Inc., Pinterest, Figma, Oracle, Cisco, PayPal, DoorDash, Rivian, Reddit, Coinbase, Splunk, Spotify, Goldman Sachs, Morgan Stanley, JPMorgan Chase, Capital One, Plaid, Shopify, Intuit, Workday, ServiceNow, Hugging Face, VMware, Brex, Wise, Epic Games, Unity Technologies, Activision Blizzard, Riot Games, Valve, Huawei, Bloomberg, ByteDance, Alibaba, Baidu, Notion, Klarna, Instacart, Zillow.