Job Description
- Contribute to stability engineering, with responsibility for system R&D covering planning, resource diagnosis and isolation, server and OS/kernel configuration, and resource management
- Integrate real-time anomaly detection and fault prediction data, establish rotation scheduling capabilities for large-scale clusters, and build a data-driven, end-to-end fault isolation and periodic rotation system
- Manage resources and configuration for the underlying systems of cloud products
Position Requirements
Basic Qualifications:
- Solid computer science fundamentals and proficiency in at least one programming language such as C/C++, Java, Golang, Rust, or Python
- Familiarity with Unix/Linux and the ability to diagnose common problems
- Thorough understanding of common algorithms and the ability to independently analyze business problems and decompose them into effective engineering solutions
- Good communication and teamwork skills, and a willingness to summarize and share knowledge
Preferred Qualifications:
- Familiarity with the container, Kubernetes (K8s), and service mesh ecosystem
- Hands-on experience with cloud platforms, or internship experience in cloud computing
- Experience in data analysis