Lead AI Engineer (FM Hosting, LLM Inference) (Hiring Immediately)
2 days ago
New York
Invent and introduce state-of-the-art LLM optimization techniques to improve the performance — scalability, cost, latency, and throughput — of large-scale production AI systems (LLM Inference, Similarity Search and Vector DBs, Guardrails, Memory) using Python, C++, C#, Java, or Golang.