Apicworld

Site Reliability Engineer (SRE)

6.0/10

Not specified
Remote
mid
about 3 hours ago
devtech, Linux, PostgreSQL, MongoDB, ClickHouse, Redis, RabbitMQ, Kafka, GitLab, VictoriaMetrics, Prometheus

AI Summary

The vacancy provides clear responsibilities and tech stack but lacks compensation details and company information.

Description

We’re looking for a Site Reliability Engineer (SRE) to join Apicworld, either on-site in Cyprus or remotely.

In this role, you will be responsible for maintaining the stability and reliability of our production environment.

## What you'll do

  • Ensure the stability of production and development infrastructure
  • Develop and improve monitoring, alerting, and observability (metrics, logs, tracing)
  • Configure and optimize metrics and logging systems
  • Analyze incidents and prevent their recurrence
  • Work with alerts and improve their quality
  • Increase service reliability and fault tolerance
  • Optimize system performance and stability

## Conditions

  • Work remotely or from our office in Limassol
  • Compensation for English or Greek classes
  • Health insurance (only for Cyprus)
  • Office lunches (only for Cyprus)
  • Flexible start of the working day

## Requirements

  • Strong understanding of Linux
  • Experience as an SRE / DevOps / System Engineer
  • Solid experience with monitoring and alerting tools (Prometheus, Grafana, or similar)
  • Understanding of observability (metrics, logs, tracing)
  • Experience with Kubernetes and containerization
  • Experience in incident analysis and production troubleshooting
  • Automation skills (Bash, Python)
  • Understanding of networking, performance, and fault tolerance
  • Experience with GCP is a plus