Site Reliability Engineering
GPTKB entity
Statements (49)
Predicate | Object |
---|---|
gptkbp:instanceOf | gptkb:academic |
gptkbp:abbreviation | SRE |
gptkbp:appliesTo | software systems |
gptkbp:coinedBy | gptkb:Ben_Treynor_Sloss |
gptkbp:emphasizes | blameless postmortems; collaboration between development and operations; proactive engineering |
gptkbp:focusesOn | automation; reliability; scalability |
gptkbp:goal | maximize system availability; minimize downtime; reduce toil |
gptkbp:hasConcept | service level agreements; service level indicators; service level objectives |
gptkbp:hasRole | Site Reliability Engineer |
https://www.w3.org/2000/01/rdf-schema#label | Site Reliability Engineering |
gptkbp:originatedIn | gptkb:Google |
gptkbp:principle | incident response; monitoring; capacity planning; accept failure as normal; automate wherever possible; automation of operations; balance change and stability; eliminate toil; embrace risk; error budgets; implement gradual change; learning from failure; leverage tooling and automation; measure everything; measure service health; monitoring distributed systems; postmortems; prioritize reliability; reduce organizational silos; share ownership |
gptkbp:publishedBy | gptkb:O'Reilly_Media |
gptkbp:publishedIn | gptkb:Site_Reliability_Engineering:_How_Google_Runs_Production_Systems |
gptkbp:relatedTo | gptkb:DevOps; software engineering; IT operations |
gptkbp:uses | monitoring tools; automation tools; incident management tools |
gptkbp:bfsParent | gptkb:DevOps |
gptkbp:bfsLayer | 5 |