We're looking for a Senior Platform Engineer (3-month initial contract, with the option to extend) to join our distributed team at Graphy. You'll help build infrastructure and tooling around two core products: our AI API + SDK, where you'll build the foundation, and our web app, where you'll maintain the legacy system.
We're managing both a legacy system serving 300k users and a rapidly evolving AI API + SDK. We need your help to transition to modern, maintainable infrastructure while ensuring reliability at scale.
What we look for
Must-haves:
- Someone who can own infrastructure problems end-to-end.
- A problem solver who evaluates trade-offs and proposes solutions.
- Experience designing infrastructure that product engineers can understand and maintain long-term.
Nice-to-haves:
- Modern tooling expertise: Vercel, PlanetScale, Langfuse, serverless patterns.
- AI infrastructure experience (e.g. LLM observability tools).
- Experience with infrastructure-as-code tools such as SST or Pulumi.
What you’ll do
SDK + AI API (primary focus)
- Lay the foundation for a robust, scalable infrastructure.
- Build a new CI pipeline from scratch:
  - Fast deploys: ideally under 5 minutes (a must for hotfixes).
  - True preview branches for e2e testing before merge.
  - Partial / incremental deployments (to be explored).
  - Reliable blue-green deployments (we have this for the web app but want to level up).
- Design for scale across Vercel / serverless platforms (we’re exploring Vercel Functions and Fluid compute but are flexible).
- Build resilience by designing fallback and failover paths (e.g. if Vercel or OpenAI goes down, switch to AWS Lambda or Anthropic).
- Safe and reliable migrations: help build team confidence around database migrations, rollbacks and releases.