For a company that provides API products, a key performance metric to track and iterate on is response time.
What is Response Time?
It’s the amount of time it takes from the moment an application or system receives an initial request to when it returns the results.
Why do Response Times Matter?
We care about this metric because customers have different response time needs based on their workflows and business problems. The faster APIs are, the more workflows they can be integrated into and the more business problems they can solve.
We measure this as the time it takes to parse inputs, make upstream requests, apply logic and computation, and form and return the response. Response times should be consistent whether a request is initiated internally or by a customer.
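The per-stage breakdown above can be sketched as follows. This is a minimal illustration, not a real service: the stage names and the `handle_request` handler are placeholders, and the "upstream request" is simulated.

```python
import time

def handle_request(raw_body):
    """Toy request handler that records how long each stage takes:
    parse inputs, call upstream, apply logic and form the response."""
    timings = {}

    start = time.perf_counter()
    parsed = raw_body.strip()                     # parse inputs
    timings["parse"] = time.perf_counter() - start

    start = time.perf_counter()
    upstream = {"data": parsed}                   # stand-in for an upstream request
    timings["upstream"] = time.perf_counter() - start

    start = time.perf_counter()
    response = {**upstream, "status": "ok"}       # apply logic, form the response
    timings["respond"] = time.perf_counter() - start

    timings["total"] = sum(timings.values())
    return response, timings

response, timings = handle_request("  hello  ")
print(response)        # {'data': 'hello', 'status': 'ok'}
print(sorted(timings)) # ['parse', 'respond', 'total', 'upstream']
```

Recording each stage separately is what later makes it possible to see where time is actually spent.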
On top of the internal processing time, customers will ultimately want to focus on the overall response time, which includes network latency: the time it takes to transfer data between a client and server, including activities such as establishing a trusted (TLS) connection. Depending on the physical location of the customer’s data center, the quality of their connection, and how the customer has architected the integration to our APIs, this can add anywhere from 5ms to 500ms or more to a response time.
How are Response Times Measured?
They are often measured in percentiles, which indicate the percentage of requests processed within a certain timeframe. This gives a more accurate representation of the overall user experience than an average response time, which can be skewed by outliers and may not be indicative of most requests.
For example, knowing the average response time is 500ms does not give nearly the same amount of information and confidence as knowing that the 99th percentile (p99) is 500ms, i.e. that 99% of all requests were returned in 500ms or less.
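A short sketch makes the difference concrete. This uses the nearest-rank definition of a percentile on simulated response times (the numbers are invented for illustration): a single slow outlier inflates the average, while p50 and p99 describe what most requests actually experienced.

```python
import math

def percentile(samples, p):
    """Nearest-rank percentile: the smallest sample value such that
    at least p% of all samples are less than or equal to it."""
    ordered = sorted(samples)
    rank = math.ceil(p / 100 * len(ordered))  # 1-based rank
    return ordered[rank - 1]

# 100 simulated response times in ms: almost all fast, one slow outlier
times_ms = [50] * 98 + [500, 3000]

average = sum(times_ms) / len(times_ms)
print(average)                   # 84.0 -> skewed upward by the outlier
print(percentile(times_ms, 50))  # 50   -> the typical request
print(percentile(times_ms, 99))  # 500  -> 99% of requests finished in 500ms or less
```

The average (84ms) describes no actual request here, whereas p99 gives a bound you can put in an SLA.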
How can I Improve Response Time?
To improve response times, there are a number of areas to look at:
- Connection settings – for example, make sure that connections are kept alive between requests rather than re-established with each new request, which avoids repeating TCP and TLS handshakes.
- Application profiling – profile the steps involved in executing a request, analyze each task in the waterfall, and identify where time is spent and whether those activities can be parallelized or engineered more efficiently.
To stay ahead of the market curve, it’s imperative to continually enhance the performance of API response times. At Ekata, we take pride in making sure our APIs have fast response times and low latency, so our customers and their own customers can have a seamless, frictionless experience. Contact us if you have any questions.