Senior AI Inference Engineer (llama.cpp specialist) - 100% Remote
8 days ago
You’ll work on the C++ layer that powers local AI, porting and enhancing inference engines such as llama.cpp, ONNX, and similar to run efficiently on low-end devices. Your focus is on the runtime: making models load faster, run leaner, and perform well across different hardware. You’ll ensure that th