Site Reliability Engineer - Hedge Fund, Prediction Markets
12 hours ago
Santander
Full Time, Barcelona Office
careers.bluewalker.capital

About the Company

BlueWalker Capital is a hedge fund developing systematic and quantitative trading strategies native to prediction markets. We log hundreds of millions of trades and order book ticks per day, isolate market pairs from tens of millions of combinations, and continuously monitor for a range of arbitrage opportunities that we execute on. It's still early days for us: we are deepening a moat across trading infrastructure, datasets, and modeling.

We trade algorithmically using our own balance sheet, generating daily returns exceeding 1% with the capacity to rotate hundreds of thousands of euros in daily volume. Our mission is to become the leading hedge fund in this emerging asset class, delivering investment returns while making prediction markets efficient as the world's most accurate source of truth.

About the Role

We are a small engineering team, which means your responsibilities scale rapidly and your impact is clear and foundational. Much of our infrastructure is still greenfield, and entire systems can be yours to design and own. You'll ensure the reliability, observability, and operational robustness of systems handling hundreds of thousands of euros in daily trade volume.

Responsibilities

- Own uptime and reliability of trading infrastructure: execution systems, market data pipelines, and risk monitoring.
- Build and maintain observability across the stack (Prometheus, Grafana, distributed tracing) with alerting tuned to trading-specific failure modes.
- Define SLOs and error budgets for latency-sensitive services; lead incident response and post-mortems.
- Harden infrastructure against data feed outages, exchange connectivity failures, and clock drift.
- Manage and automate deployments, environment configuration, and secrets management (Kubernetes, Terraform, CI/CD, HashiCorp Vault, GCP Secret Manager).
- Capacity-plan and performance-tune systems under real trading load, identifying bottlenecks before they become incidents.

Who we are looking for

Hard Skills

- 2+ years in a production SRE or DevOps role with ownership of critical, low-tolerance systems.
- Proficiency in Python or Go for tooling, automation, and service development.
- Hands-on experience with Kubernetes, Terraform, and CI/CD pipelines.
- Strong Linux fundamentals: networking, kernel tuning, process isolation, and time synchronization (PTP/NTP).
- Experience with observability stacks and building alerting logic that minimizes noise without missing critical events.

Soft Skills

- Fluent English.
- Sharp and methodical under incident pressure; rigorous in post-mortems.
- Self-starter who takes ownership of problems end-to-end with limited guidance.

Preferred Skills

- Familiarity with trading infrastructure: FIX protocol, market data (ITCH, OPRA), exchange connectivity.
- Experience with high-throughput messaging systems, ranging from durable, replayable pipelines (Kafka) to low-latency in-memory streams with consumer-group semantics (Redis Streams) to brokerless, microsecond-range IPC (ZeroMQ).
- Understanding of latency profiling and performance optimization in time-critical systems.

What it’s like here

- Fast-paced environment.
- We value strong integrity. We’re direct and honest with each other.
- 5 days a week, in person. Full immersion.
- Project ownership and autonomy, flat company structure.
- We value people who take initiative and are able to figure things out.
- Company culture is informal, non-hierarchical, ambitious, collaborative, and entrepreneurial.

Compensation package

Competitive compensation, tailored to ability and experience.

Benefits

- Team offsite
- Flexible holidays
- Gym membership
- 3-minute walk to a 3,200 m² gym
- 5-minute walk to FCG train station
- Cutting-edge infra and technology

Links

- careers.bluewalker.capital
- LinkedIn
- X