Lead AI Engineer (FM Hosting, LLM Inference)
3 days ago
New York
LLM Inference, Similarity Search and VectorDBs, Guardrails, Memory) using Python, C++, C#, Java, or Golang. Invent and introduce state-of-the-art LLM optimization techniques to improve the performance (scalability, cost, latency, throughput) of lar