YourToolsKit
🛠️ SRE & DevOps Tools — Free, Browser-based

Free SRE & DevOps Tools for Site Reliability Engineers

Nine free, browser-based tools for SRE and DevOps engineers — SLA uptime calculator, error budget tracker, CIDR subnet calculator, curl builder, HTTP status codes, P99 latency percentiles, bandwidth calculator and more. All client-side. No sign-up. No data sent to servers.

🛠️ Try My SRE Assistant →
Browse all 72 tools
✦ AI-POWERED — Ask in plain English, get exact numbers
My SRE Assistant
"Our SLA is 99.9% and we've been down 28 minutes this month — how much error budget is left?"
Open Assistant →

All SRE & DevOps Tools

9 tools · All free · Client-side only · No sign-up required

📊
SLA / Uptime Calculator

Calculate allowed downtime for 99.9%, 99.95%, 99.99%, and 99.999% SLAs. Daily, weekly, monthly, yearly breakdowns with error budget context.

SLA · Reliability
🎯
Error Budget Calculator

Track error budget consumption and remaining budget for any SLA. Visual burn rate indicator — red when you are burning too fast.

SRE · SLO
🌐
CIDR / Subnet Calculator

Network address, broadcast, subnet mask, wildcard mask, first and last usable IP, and host count from any CIDR notation.

Networking · AWS
Curl Command Builder

Build curl commands visually — select method, add headers, set auth, paste body. Generates a ready-to-run curl command.

API · Debugging
💾
Byte & Bandwidth Calculator

Convert between bytes, KB, MB, GB, TB and PB instantly. Calculate file transfer times at any bandwidth speed.

Capacity · Storage
🔌
Port Reference

Searchable reference of 40+ common network ports — SSH, HTTP, MySQL, Redis, Kafka, Kubernetes, Elasticsearch and more.

Networking · Reference
🔢
HTTP Status Codes

Complete reference with plain-English descriptions and fix guidance for every HTTP status code. Filter by 1xx through 5xx.

API · Debugging
🔄
YAML ↔ JSON Converter

Convert JSON to YAML for Kubernetes manifests, Docker Compose, and GitHub Actions — or YAML back to JSON. Instant, browser-based.

Config · Kubernetes
📈
Percentile Calculator

Calculate P50, P75, P90, P95, P99, P99.9 from any latency dataset. Outlier detection flags tail latency issues automatically.

Latency · SLO

What are SRE Tools and Why Do You Need Them?

Site Reliability Engineering (SRE) is a discipline that applies software engineering principles to infrastructure and operations problems. SRE teams are responsible for the availability, latency, performance, efficiency, and reliability of production services.

Every SRE team needs to calculate SLAs, track error budgets, plan network capacity, analyse latency distributions, and debug API failures. These calculations are typically done in spreadsheets or with mental arithmetic; our tools make them instant, accurate, and accessible from any browser.

Whether you are in the middle of an incident calculating your remaining error budget, planning a new VPC with the right CIDR blocks, or debugging a 502 Bad Gateway error at 2am — these tools give you the correct numbers immediately without opening a spreadsheet or writing code.

All tools are browser-based. No data leaves your machine. No sign-up or account required. Use them on any device — desktop, tablet, or mobile.

Common SRE Workflows

🚨
During an Incident
  1. Open Error Budget Calculator
  2. Enter your SLA % and window (30 days)
  3. Enter minutes of downtime so far
  4. See remaining budget and burn rate
  5. Or use My SRE Assistant for all in one chat
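
The arithmetic behind steps 2 to 4 can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the calculator's actual implementation: the function name `budget_status` is hypothetical, and burn rate is assumed to mean the share of budget consumed divided by the share of the window elapsed.

```python
def budget_status(target_percent, window_days, downtime_minutes, days_elapsed):
    """Return (total, remaining, burn_rate) for a downtime-based error budget.

    Assumption: burn rate = share of budget consumed / share of window elapsed,
    so 1.0 means the budget would run out exactly at the end of the window.
    """
    if not 0 < days_elapsed <= window_days:
        raise ValueError("days_elapsed must be within the window")
    total = window_days * 24 * 60 * (100 - target_percent) / 100
    remaining = total - downtime_minutes
    burn_rate = (downtime_minutes / total) / (days_elapsed / window_days)
    return total, remaining, burn_rate


# Hypothetical incident: 99.9% target, 30-day window, 28 minutes down by day 10.
total, remaining, burn_rate = budget_status(99.9, 30, 28, 10)
print(f"budget {total:.1f} min, remaining {remaining:.1f} min, burn rate {burn_rate:.2f}x")
```

With these made-up inputs the budget is 43.2 minutes, 15.2 minutes remain, and the burn rate is roughly 1.94, so the budget would run out around the middle of the window if the same pace continued.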
🏗️
Planning a New VPC
  1. Open CIDR / Subnet Calculator
  2. Enter your VPC CIDR (e.g. 10.0.0.0/16)
  3. Plan subnets per availability zone
  4. Verify usable host counts
  5. Use /24 for standard subnets (254 usable hosts; 251 in AWS, which reserves five addresses per subnet)
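
To double-check the numbers from steps 2 to 4, Python's standard `ipaddress` module can report the same fields. The helper name `describe` and the `aws_reserved` flag are hypothetical; the sketch assumes IPv4 prefixes of /30 or shorter, where subtracting reserved addresses makes sense.

```python
import ipaddress


def describe(cidr, aws_reserved=False):
    """Summarise an IPv4 network given in CIDR notation."""
    net = ipaddress.ip_network(cidr)  # raises ValueError if host bits are set
    total = net.num_addresses
    # Classic subnetting reserves the network and broadcast addresses (2).
    # AWS reserves five addresses in every subnet instead.
    usable = total - (5 if aws_reserved else 2)
    return {
        "network": str(net.network_address),
        "broadcast": str(net.broadcast_address),
        "netmask": str(net.netmask),
        "wildcard": str(net.hostmask),
        "total": total,
        "usable": usable,
    }


print(describe("10.0.1.0/24"))
print(describe("10.0.1.0/24", aws_reserved=True)["usable"])
```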
📊
Analysing API Latency
  1. Collect latency samples from your APM
  2. Open Percentile Calculator
  3. Paste values (one per line or comma separated)
  4. Read P50, P95, P99, P99.9 instantly
  5. Check outlier warning for tail latency issues
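
For readers who want to see what steps 3 and 4 compute, here is a minimal sketch using the nearest-rank method. The Percentile Calculator may use a different method (for example, interpolation), so results on small datasets can differ; `parse_samples` and `percentile` are hypothetical names and the sample values are made up.

```python
import math


def parse_samples(text):
    """Accept values separated by newlines and/or commas."""
    return [float(token) for token in text.replace(",", "\n").split()]


def percentile(samples, p):
    """Nearest-rank percentile: smallest value with at least p% of samples at or below it."""
    if not samples:
        raise ValueError("no samples")
    ordered = sorted(samples)
    rank = math.ceil(p / 100 * len(ordered))
    return ordered[max(rank, 1) - 1]


raw = "120, 95, 110\n130\n2400, 105, 98, 101, 115, 99"
samples = parse_samples(raw)
for p in (50, 95, 99, 99.9):
    print(f"P{p}: {percentile(samples, p)} ms")
```

In this ten-sample dataset the median is 105 ms, while P95 and above all land on the single 2400 ms outlier, which is the kind of tail a warning in step 5 would point at.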
🔍
Debugging API Errors
  1. Open HTTP Status Code Reference
  2. Search for the status code you are seeing
  3. Read plain-English description and fix guidance
  4. Use Curl Builder to reproduce the failing request
  5. Check Port Reference for firewall rules
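
Step 4 amounts to assembling a shell-safe command from a method, URL, headers, and body. The sketch below shows one way to do that in Python; it is not the Curl Builder's code, `build_curl` is a hypothetical helper, and the URL is a placeholder. It relies only on the standard curl options `-i`, `-X`, `-H`, and `--data`.

```python
import shlex


def build_curl(method, url, headers=None, body=None):
    """Return a curl command string with every argument shell-quoted."""
    args = ["curl", "-i", "-X", method.upper(), url]  # -i prints response headers
    for name, value in (headers or {}).items():
        args += ["-H", f"{name}: {value}"]
    if body is not None:
        args += ["--data", body]
    return shlex.join(args)


cmd = build_curl(
    "post",
    "https://api.example.com/v1/orders",  # placeholder URL
    headers={"Content-Type": "application/json"},
    body='{"id": 1}',
)
print(cmd)
```

Quoting with `shlex.join` keeps JSON bodies and header values intact when the command is pasted into a POSIX shell.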

Frequently Asked Questions

What is an SRE tool?

SRE (Site Reliability Engineering) tools help engineers measure, track, and improve service reliability. Common SRE tools calculate SLA uptime allowances, track error budget consumption, analyse latency percentiles (P50/P95/P99), and plan network capacity. These calculators are essential for teams following Google's SRE practices of setting SLOs, measuring SLIs, and managing error budgets.

What is an error budget and how do I calculate it?

An error budget is the maximum allowed unreliability for a service within a rolling time window. If your availability target is 99.9% over 30 days, your error budget is 43.2 minutes of downtime (0.1% of 43,200 minutes); the commonly quoted 43.8 minutes applies to an average calendar month of about 30.44 days. Error budget consumed = downtime experienced / total budget × 100%. When the budget is exhausted, SRE teams typically freeze feature releases and focus on reliability work. Use our Error Budget Calculator to track consumption in real time.
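
The consumption formula above translates directly into code. This is a small sketch with hypothetical function names; the 28 minutes of downtime is an illustrative figure.

```python
def error_budget_minutes(target_percent, window_days=30):
    """Total allowed downtime in minutes for the window."""
    return window_days * 24 * 60 * (100 - target_percent) / 100


def consumed_percent(downtime_minutes, target_percent, window_days=30):
    """Error budget consumed = downtime experienced / total budget x 100%."""
    return downtime_minutes / error_budget_minutes(target_percent, window_days) * 100


# 30 days = 43,200 minutes, so a 99.9% target leaves 43.2 minutes.
budget = error_budget_minutes(99.9)
used = consumed_percent(28, 99.9)
print(f"budget {budget:.1f} min, consumed {used:.1f}%, exhausted: {used >= 100}")
```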

What is the difference between SLA, SLO, and SLI?

SLI (Service Level Indicator) is a quantitative measure of service behaviour, e.g., request success rate or latency. SLO (Service Level Objective) is an internal reliability target, e.g., 99.9% of requests succeed within 200ms. SLA (Service Level Agreement) is an external contract with consequences, e.g., customer refunds if availability drops below 99.5%. Error budgets are derived from SLOs, not SLAs.

Why do SRE teams use P99 instead of average latency?

Average latency hides the worst user experiences. A service with 200ms average latency might have a P99 latency of 8 seconds, meaning 1 in 100 requests is severely slow. P99 is the 99th percentile: 99% of requests complete at or below this value. For an API handling 1,000 requests/second, about 10 requests every second are at or beyond P99. SRE teams typically set latency SLOs on P95 or P99, not on averages.
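
A quick way to see the effect is to compare the mean, median, and P99 of a skewed dataset. The numbers below are synthetic (980 fast requests and 20 very slow ones), chosen only to illustrate the point.

```python
import statistics

# Synthetic data, not real measurements: 98% fast, 2% very slow.
latencies_ms = [120] * 980 + [8000] * 20

mean = statistics.mean(latencies_ms)
median = statistics.median(latencies_ms)
p99 = statistics.quantiles(latencies_ms, n=100)[98]  # last of the 99 cut points

requests_per_second = 1000
beyond_p99_per_second = requests_per_second * (100 - 99) / 100

print(f"mean {mean:.1f} ms, median {median:.1f} ms, P99 {p99:.1f} ms")
print(f"requests per second at or beyond P99: {beyond_p99_per_second:.0f}")
```

Here the mean is 277.6 ms and looks healthy, yet P99 sits at 8,000 ms.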

How do I calculate subnets for an AWS VPC?

For an AWS VPC, start with a /16 CIDR (65,536 IPs) and divide it into /24 subnets per availability zone. Each /24 holds 256 IPs: 254 are usable in classic IPv4 subnetting, but only 251 in AWS, which reserves five addresses in every subnet. A common pattern: one public /24 and one private /24 per AZ, giving 6 subnets across 3 AZs. Use our CIDR Calculator to verify network addresses, broadcast addresses, and usable host counts before creating subnets in the AWS console.
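
The public/private-per-AZ pattern can be laid out with Python's standard `ipaddress` module. This is a sketch: the VPC range, the AZ suffixes, and the subnet names are placeholders, and the subnets are simply taken in order from the start of the /16.

```python
import ipaddress

vpc = ipaddress.ip_network("10.0.0.0/16")
az_suffixes = ["a", "b", "c"]          # placeholder AZ names
blocks = vpc.subnets(new_prefix=24)    # generator yielding the 256 /24 blocks in order

plan = {}
for az in az_suffixes:
    plan[f"public-{az}"] = next(blocks)
    plan[f"private-{az}"] = next(blocks)

for name, subnet in plan.items():
    print(f"{name:<10} {subnet}  ({subnet.num_addresses} addresses)")
```

This uses 6 of the 256 available /24 blocks, leaving the rest of the /16 free for later growth.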

What is a good SLA target for a production API?

99.9% (Three Nines) is standard for most production APIs: it allows about 43.8 minutes of downtime per average month (30.44 days). 99.95% allows about 21.9 minutes per month and is appropriate for business-critical APIs with paying customers. 99.99% (Four Nines) allows only about 4.38 minutes per month and requires redundancy, automated failover, and zero-downtime deployments. Five Nines (99.999%) is rare and typically reserved for telecom infrastructure.
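
The monthly figures above can be reproduced, and extended to other periods, with a short script. It assumes an average year of 365.25 days and a month of one twelfth of that; `allowed_downtime` is a hypothetical helper name.

```python
MINUTES_PER_YEAR = 365.25 * 24 * 60  # assumption: average year, leap days included

PERIOD_MINUTES = {
    "day": 24 * 60,
    "week": 7 * 24 * 60,
    "month": MINUTES_PER_YEAR / 12,  # about 30.44 days
    "year": MINUTES_PER_YEAR,
}


def allowed_downtime(sla_percent):
    """Map each period name to the downtime (in minutes) the SLA tolerates."""
    allowed_fraction = (100 - sla_percent) / 100
    return {name: minutes * allowed_fraction for name, minutes in PERIOD_MINUTES.items()}


for sla in (99.9, 99.95, 99.99, 99.999):
    row = allowed_downtime(sla)
    print(f"{sla}%: " + ", ".join(f"{name} {mins:.2f} min" for name, mins in row.items()))
```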

What HTTP status codes mean there is a server problem?

5xx status codes indicate server-side errors. 500 Internal Server Error means an unexpected bug or crash on the server. 502 Bad Gateway means the upstream server returned an invalid response to the proxy — check your backend service health. 503 Service Unavailable means the server is overloaded or in maintenance. 504 Gateway Timeout means the upstream did not respond in time — check for slow database queries or network issues.
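
As an illustration, the guidance above can be turned into a small lookup built on Python's standard `http.HTTPStatus`. The `triage` function and the hint wording are hypothetical and paraphrase the text; they are not an exhaustive runbook.

```python
from http import HTTPStatus

# First things to check, paraphrased from the descriptions above.
FIRST_CHECKS = {
    500: "look for an unexpected bug or crash in the server logs",
    502: "check the health of the backend service behind the proxy",
    503: "check for overload or an active maintenance window",
    504: "look for slow database queries or network issues upstream",
}


def triage(code):
    """Return a one-line summary for an HTTP status code."""
    status = HTTPStatus(code)  # raises ValueError for unknown codes
    if not 500 <= code <= 599:
        return f"{code} {status.phrase}: not a server-side error"
    hint = FIRST_CHECKS.get(code, "no specific hint here; start with the server logs")
    return f"{code} {status.phrase}: {hint}"


for code in (500, 502, 503, 504, 404):
    print(triage(code))
```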

Related Developer Tools

SRE and DevOps engineers also use these tools regularly — JSON formatting, diff checking, regex testing, JWT decoding, and more. All free, all browser-based.

JSON Formatter · Diff Checker · Regex Tester · JWT Decoder · Base64 Encoder · Cron Generator · Unix Timestamp · UUID Generator · Hash Generator

Want to use all these tools in one conversation?

My SRE Assistant lets you ask any reliability question in plain English and chains these tools automatically.

🛠️ Open My SRE Assistant
View all AI Agents