When you browse the web, your browser is constantly chatting back and forth with web servers over HTTP. Once the browser has finished sending its request, though, the server often takes a while to think about its response. This period between the end of the request and the start of the response is known as the Time to First Byte (TTFB), and keeping it as small as possible is key to making your website feel fast.
Generally, the Time to First Byte is the sum of two things: how long the web server takes to get the response ready for the client, and how long the packets take to travel across the network between the server and the client.
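To make that sum concrete, here is a minimal sketch of measuring the TTFB from the client side using Python's standard library. The function name `measure_ttfb` and its parameters are illustrative assumptions, not part of any existing tool. It handles plain HTTP only, and it approximates the "first byte" by the moment the status line and headers have been read, so treat the result as a rough figure.

```python
import http.client
import time


def measure_ttfb(host: str, port: int = 80, path: str = "/", timeout: float = 10.0) -> float:
    """Return the approximate TTFB in seconds for one plain-HTTP GET request."""
    conn = http.client.HTTPConnection(host, port, timeout=timeout)
    try:
        # Open the TCP connection up front so connection setup is not counted.
        conn.connect()
        # With no request body, the whole request has been sent when this returns.
        conn.request("GET", path)
        start = time.perf_counter()
        # Blocks until the status line and headers have arrived from the server.
        response = conn.getresponse()
        elapsed = time.perf_counter() - start
        response.read()  # drain the body so the connection closes cleanly
        return elapsed
    finally:
        conn.close()
```

Calling `measure_ttfb("example.com")` returns a single sample in seconds. The value includes both the server's processing time and the network transit time, so it is worth taking several samples from different locations before drawing conclusions.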
Requests for static content are generally the easiest for a server to handle, because it only has to read a file from disk, and so they often result in the shortest Time to First Byte. If the web server is loading the data from a slow source, though, such as a mechanical hard drive or a network share, responsiveness can still suffer.
Dynamic responses, on the other hand, can sometimes take a while for the server to prepare, particularly if data has to be fetched from a database or otherwise processed. A notable example is WordPress, where by default all posts and pages are generated on the fly from data stored in an SQL database, resulting in a lot of work for the server on every request.
Finally, if the responding server is far away from the client, or otherwise has poor connectivity to the client’s ISP, the laws of physics dictate that it will take a certain amount of time for the first byte to make it across the Internet.
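As a rough illustration of that physical limit, the sketch below estimates the minimum round-trip time over a fibre path. It assumes light in optical fibre travels at roughly 200,000 km/s (about two-thirds of its speed in a vacuum) and ignores routing detours, queuing and processing delays, so real figures will be higher. The 10,000 km distance is a hypothetical example.

```python
# Assumption: light in optical fibre travels at roughly 200,000 km/s,
# about two-thirds of its speed in a vacuum.
FIBRE_KM_PER_SECOND = 200_000.0


def min_round_trip_ms(distance_km: float) -> float:
    """Lower bound, in milliseconds, on the round trip over a fibre path of this length."""
    return 2 * distance_km / FIBRE_KM_PER_SECOND * 1000


# A hypothetical 10,000 km path costs about 100 ms per round trip
# before the server has done any work at all.
print(f"{min_round_trip_ms(10_000):.0f} ms")
```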
Ensuring the TTFB is as low as possible is crucial to make sure that people visiting a website get the best experience. If it takes the server too long to start responding with the requested page once someone clicks on a link, people will often give up and go elsewhere.
For this reason, the TTFB can also affect search rankings: search engines often prioritise snappy websites over slower ones.
One of the best ways to reduce the TTFB is to use a CDN to cache content on servers closer to the end user. Often, there’s no reason something like a blog homepage needs to be re-generated on every request. Storing a copy of it at various points of presence around the world means it can reach the end user as quickly as possible, and it also reduces the load on your server for when it does need to regenerate the page!
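One common way to tell a CDN that a page may be cached is the standard `Cache-Control` response header. The sketch below is a minimal WSGI application that marks a hypothetical blog homepage as cacheable. The app name, the page body and the lifetimes (60 seconds for browsers, 300 seconds for shared caches via `s-maxage`) are illustrative assumptions, and whether a given CDN honours these values depends on how it is configured.

```python
def blog_homepage_app(environ, start_response):
    """Minimal WSGI app that serves a page and allows shared caches to store it."""
    body = b"<h1>Blog homepage</h1>"  # stand-in for an expensively generated page
    headers = [
        ("Content-Type", "text/html; charset=utf-8"),
        ("Content-Length", str(len(body))),
        # max-age applies to browsers; s-maxage applies to shared caches such as CDNs.
        ("Cache-Control", "public, max-age=60, s-maxage=300"),
    ]
    start_response("200 OK", headers)
    return [body]
```

With headers like these, a CDN edge that respects them can answer repeat requests itself for up to five minutes, and only falls back to the origin server once its copy has expired.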
The other way is to make sure that dynamic content is generated efficiently, by optimising database queries and processing, and by implementing server-side caching wherever possible.
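Server-side caching can take many forms. As one minimal sketch, the class below keeps rendered pages in memory for a fixed time so that repeat requests skip the expensive work. The names `TTLCache` and `get_or_render` are hypothetical, and a production setup would more likely rely on an existing cache layer than on hand-written code like this.

```python
import time
from typing import Callable, Dict, Tuple


class TTLCache:
    """Keep rendered pages in memory and reuse them until they expire."""

    def __init__(self, ttl_seconds: float, clock: Callable[[], float] = time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock  # injectable so that expiry can be tested without waiting
        self._entries: Dict[str, Tuple[float, str]] = {}

    def get_or_render(self, key: str, render: Callable[[], str]) -> str:
        now = self.clock()
        entry = self._entries.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]  # still fresh: skip the expensive render
        body = render()  # missing or expired: do the work once and remember it
        self._entries[key] = (now + self.ttl, body)
        return body
```

A request handler could then call something like `cache.get_or_render("/", render_homepage)`, where `render_homepage` is whatever function runs the database queries and builds the page.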
Along with Round-Trip Time (RTT), the Time to First Byte (TTFB) is a very important measure of web performance. It’s key to making a good first impression on end users of websites and web applications, as well as ensuring web services are delivered efficiently.
Latency is a measure of how long something takes to occur from the time it is requested. For Internet traffic it is usually measured in milliseconds (ms).
Round-Trip Time is a measure of how long a packet takes to reach a host and for the reply to come back.
The Time to First Byte is the period between the end of a client request and the moment the client starts receiving the server's response.
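A simple way to approximate the round-trip time without special privileges is to time a TCP connection, since the client's connect call completes after roughly one round trip. The sketch below does this with Python's standard library. The function name `estimate_rtt` is an assumption for illustration, and if the host is given as a name rather than an IP address, the result also includes DNS lookup time.

```python
import socket
import time


def estimate_rtt(host: str, port: int, timeout: float = 5.0) -> float:
    """Estimate the round-trip time in seconds from the duration of a TCP connect."""
    start = time.perf_counter()
    # create_connection returns once the TCP handshake has completed on the client side.
    with socket.create_connection((host, port), timeout=timeout):
        return time.perf_counter() - start
```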