"What happens after a user enters a URL in the browser address bar?" This question is a classic for us front-end developers: it is fundamental to front-end development, a common topic in job interviews, and a basis for performance optimization. However, this article focuses not on what happens afterward but on what happens beforehand. It asks two questions: what steps does our code go through to become a webpage that internet users can access, and how do we update web pages sensibly?
The first question involves development and deployment, while the second involves release. Below, I will explain these four parts: web entry, development, deployment, and release.
Part 1 Web Entry
This part will briefly introduce what constitutes the webpage seen by users and what work the browser does to present these components to the user. First, here is the main page of bilibili:
A content-rich, aesthetically pleasing, and user-friendly webpage relies on the front-end trio: HTML, CSS, JS, as well as resource files like images and fonts:
- HTML determines the content of the webpage and serves as the entry point for users visiting any website. CSS and JS code can be written directly in HTML or in separate files that are referenced in HTML.
- CSS is responsible for the webpage's styling.
- JS enables user interaction.
<!-- Basic structure of the web entry HTML -->
<!DOCTYPE html>
<html>
<head>
<title>Webpage Title, displayed on the browser tab</title>
<meta name="keywords" content="webpage keywords, SEO"/>
<meta name="description" content="webpage description, SEO"/>
<!-- Inline CSS in HTML -->
<style>
.foo {
color: red;
}
</style>
<!-- External CSS file referenced in HTML -->
<link rel="stylesheet" href="https://s.alicdn.com/@g/msite/msite-rax-detail-cdn/1.0.73/web/screen.css"/>
</head>
<body>
<!-- Webpage content -->
<div class="foo">
Page Content
</div>
<!-- Inline JS script in HTML -->
<script>
function log(param) {
console.log(param)
}
log('Parsing and executing this JS code')
</script>
<!-- External JS file referenced in HTML -->
<script src="https://s.alicdn.com/@g/msite/msite-rax-detail-cdn/1.0.73/web/screen.js"></script>
</body>
</html>
Before a user accesses any website, they must first enter a valid address in the address bar. The browser then sends a request to the server to retrieve the corresponding webpage entry file, "xxx.html". Opening the Network panel in the browser's developer tools shows that this is the first response the browser receives.
Next, the browser parses the HTML, discovers the other resources it references, and initiates further requests. By loading, parsing, and (where applicable) executing these resources, the browser gradually builds up the complete page visible to the user. This brings us to the CRP (Critical Rendering Path), the series of critical steps the browser takes to convert HTML, JS, and CSS code into pixels visible on the screen:
- Download HTML over the network and parse the HTML code to construct the DOM.
- Download CSS over the network and parse the CSS code to construct the CSSOM.
- Download JS over the network, parse and execute the JS code, which may modify the DOM or CSSOM.
- Once the DOM & CSSOM are "shaped," the browser constructs the Render Tree based on the DOM and CSSOM.
- Layout (also called reflow) calculates the position and size of each node in the Render Tree.
- Paint (also called repaint) draws the actual pixels on the screen.
At this point, the webpage is presented to the user for further browsing and interaction.
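To get a rough feel for where the time goes along this path, the milestones recorded by the browser's Navigation Timing entry can be grouped into a few coarse phases. The following is a minimal sketch: `summarizeNavigation` is a hypothetical helper name, and the three-phase grouping is an illustrative assumption rather than an official breakdown of the CRP.

```javascript
// Group the milestones of a PerformanceNavigationTiming-like entry into
// coarse phases. `summarizeNavigation` is a hypothetical helper, not a browser API.
function summarizeNavigation(entry) {
  return {
    // Time until the last byte of the HTML document arrived.
    htmlDownloaded: entry.responseEnd - entry.startTime,
    // Time from the end of the download until the parser finished the document.
    domParsed: entry.domInteractive - entry.responseEnd,
    // Time from the parsed DOM until the load event finished
    // (stylesheets, scripts, images and other subresources).
    fullyLoaded: entry.loadEventEnd - entry.domInteractive,
  };
}

// In a browser, the real entry can be read like this:
//   const [nav] = performance.getEntriesByType("navigation");
//   console.log(summarizeNavigation(nav));

// Hand-written sample values in milliseconds, for illustration only.
const sample = { startTime: 0, responseEnd: 120, domInteractive: 300, loadEventEnd: 900 };
console.log(summarizeNavigation(sample));
```

The field names (`startTime`, `responseEnd`, `domInteractive`, `loadEventEnd`) are the ones exposed by `PerformanceNavigationTiming`; the sample numbers are made up.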
Part 2 Development Phase
After the previous part, you should have a better understanding of how the browser fetches and renders webpages. This part briefly introduces the modern webpage development process.
Code Writing
As webpage content becomes richer and functionality more complex, the amount of HTML, CSS, and JS code has grown significantly. Clearly, keeping all the CSS and JS in a single HTML file is no longer practical. Instead of writing HTML, CSS, and JS in the traditional way, we use UI frameworks (like React, Vue, and Angular) for component-based development and CSS preprocessors (like Sass, Less, and Stylus) to write styles.
Engineering Capabilities
We use front-end build tools (like webpack, Vite, and Rollup) to organize the various types of files. They provide modularization, automation, optimization, and transpilation for both local development and production packaging.
Modularization deserves a closer look. It lets us treat files of different types as modules during development. Modules are first-class citizens in the module system and can reference each other, while the differences between file types are handled by the build tool.
import '@/common/style.scss' // Importing SCSS
import arrowBack from '@/common/arrow-back.svg' // Importing SVG
import { loadScript } from '@/common/utils.js' // Importing a function from JS
For production, build tools go beyond what is needed during development: they can compress and uglify the business source code, tree-shake unused code, transpile it for browser compatibility, and extract code into separate files, producing optimized output suitable for production environments. The built production JS looks like this:
!function(){"use strict";function t(t){if(null==t)return-1;var e=Number(t);return isNaN(e)?-1:Math.trunc(e)}function e(t){var e=t.name;return/(\.css|\.js|\.woff2)/.test(e)&&!/(\.json)/.test(e)}function n(t){var e="__";return"".concat(t.protocol).concat(e).concat(t.name).concat(e).concat(t.decodedBodySize).concat(e).concat(t.encodedBodySize).concat(e).concat(t.transferSize).concat(e).concat(t.startTime).concat(e).concat(t.duration).concat(e).concat(t.requestStart).concat(e).concat(t.responseEnd).concat(e).concat(t.responseStart).concat(e).concat(t.secureConnectionStart)}var r=function(){return/WindVane/i.test(navigator.userAgent)};function o(){return r()}function c(){return!!window.goldlog}var i=function(){return a()},a=function(){var t=function(t){var e=document.querySelector('meta[name="'.concat(t,'"]'));if(!e)return;return e.getAttribute("content")}("data-spm"),e=document.body&&document.body.getAttribute("data-spm");return t&&e&&"".concat(t,".")......
The built production CSS looks like this:
@charset "UTF-8";.free-shipping-block{-webkit-box-orient:horizontal;-webkit-box-direction:normal;-webkit-box-align:center;-ms-flex-align:center;-webkit-align-items:center;align-items:center;background-color:#ffe8da;background-position:100% 100%;background-repeat:no-repeat;background-size:200px 100px;border-radius:8px;display:-webkit-box;display:-webkit-flex;display:-ms-flexbox;display:flex;-webkit-flex-direction:row;-ms-flex-direction:row;flex-direction:row;margin-top:24px;padding:12px}.free-shipping-block .content{-webkit-box-flex:1;-ms-flex-positive:1;color:#4b1d1f;-webkit-flex-grow:1;flex-grow:1;font-size:14px;margin-left:8px;margin-top:0!important}.free-shipping-block .content .desc img{padding-top:2px;vertical-align:text-top;width:120px}.free-shipping-block .co.....
The build tool outputs HTML code that automatically references JS and CSS resources:
<!doctype html><html><head><script defer="defer" src="/build/xxx.js"></script><link href="/build/xxx.css" rel="stylesheet"></head><body><div id="root"></div></body></html>
Part 3 Code Deployment
At this point, we have all the resources needed for the webpage entry (HTML and corresponding CSS, JS, and other static resources). We can simply double-click the HTML file to open it in the browser for local access to our page. Ha! Front-end development is that simple!
So what’s the next step? We need to ensure that testers, product managers, operations, and global users on the internet can access our page, right? Just running it locally won’t work (doge), we need to upload all these resources to the internet.
During development, the page is served by a local development server, typically at IP 127.0.0.1 with a custom port, and accessed via IP + Port + Path. To make it reachable by others, one way is to manually upload the resources to a server, so that others can access the page using that server's IP + Port + Path (applying for a domain name and binding it to the server is omitted here). Another way is to automate the entire process through a dedicated publishing platform, which does the following:
- Run checkpoints: branch commit information, mandatory configuration, dependency compliance, and so on.
- Start a cloud build that runs the pre-configured commands to install project dependencies and package the production build artifacts (in simple terms, this is like cloning the project locally, initializing it, and building it).
- Upload the build artifacts to the CDN.
At this point, users can enter the URL in their browsers to access our page. The server returns HTML, which references resources from the CDN, and the browser renders the page.
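The important property of such a platform is that the steps act as gates: nothing is uploaded unless the checks and the build succeed. A minimal sketch of that control flow follows. `runPipeline` and the three placeholder steps are hypothetical; a real platform would call the version control system, a package manager, and a CDN client, and would run the steps asynchronously.

```javascript
// Run release steps in order and stop at the first failure, so a failed
// check or build never reaches the upload step.
function runPipeline(steps) {
  const completed = [];
  for (const step of steps) {
    try {
      step.run();
      completed.push(step.name);
    } catch (err) {
      return { ok: false, failedAt: step.name, reason: err.message, completed };
    }
  }
  return { ok: true, failedAt: null, reason: null, completed };
}

// Placeholder steps (hypothetical): each body stands in for real work.
const steps = [
  { name: "check", run: () => { /* branch info, required config, dependency compliance */ } },
  { name: "build", run: () => { /* install dependencies, package production artifacts */ } },
  { name: "upload", run: () => { /* push build artifacts to the CDN */ } },
];

console.log(runPipeline(steps).completed.join(" -> ")); // check -> build -> upload
```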
Part 4 External Release
Iterative Updates
For pages with tens of thousands (or millions) of DAU, the high traffic and demanding performance targets mean that, before an iteration of the page officially goes live, we must think about how to release it safely without hurting the user experience.
.foo {
background-color: red;
}
Take the index.css above as an example. If users had to request the file every time they opened the page, it would waste bandwidth and make them wait longer for the download. We can use strong caching, part of HTTP caching, to cache static resources in the browser so that users see the page sooner (the speedup comes from the browser reading the file directly from its memory or disk cache, eliminating download time).
<!-- Set cache validity period -->
Cache-Control: max-age=2592000,s-maxage=86400
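The values in this header are in seconds, which makes them hard to read at a glance. The sketch below parses the header value and converts the two lifetimes to days. `parseCacheControl` is a hypothetical helper written for this illustration, and it only handles simple `name` or `name=value` directives.

```javascript
// Parse a Cache-Control header value into a map of directives.
// Hypothetical helper: handles only `name` and `name=value` forms.
function parseCacheControl(value) {
  const directives = {};
  for (const part of value.split(",")) {
    const [name, arg] = part.trim().split("=");
    if (!name) continue;
    if (arg === undefined) {
      directives[name.toLowerCase()] = true; // e.g. "no-cache"
    } else {
      const num = Number(arg);
      directives[name.toLowerCase()] = Number.isNaN(num) ? arg : num;
    }
  }
  return directives;
}

const SECONDS_PER_DAY = 24 * 60 * 60;
const cc = parseCacheControl("max-age=2592000,s-maxage=86400");

console.log(cc["max-age"] / SECONDS_PER_DAY);  // 30
console.log(cc["s-maxage"] / SECONDS_PER_DAY); // 1
```

So this header lets the browser reuse the file for 30 days, while `s-maxage` gives shared caches (such as a CDN) a separate lifetime of 1 day.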
For static resources, servers often set a very long cache expiration time to make full use of caching, so the browser doesn’t need to send requests. However, if the browser isn’t sending requests, what do we do if we have updates/bug fixes for the page? A common solution is to append a version number to the resource URL, such as:
<!-- Update via version number -->
<!doctype html>
<html>
<head>
<script defer="defer" src="https://s.alicdn.com/build/foo.js?t=0.0.1"></script>
<link href="https://s.alicdn.com/build/index.css?t=0.0.1" rel="stylesheet">
</head>
<body>
<div class="foo"></div>
</body>
</html>
On the next update, changing the version number forces the browser to send a new request:
<!-- Iteration version 0.0.2 -->
<!doctype html>
<html>
<head>
<script defer="defer" src="https://s.alicdn.com/build/foo.js?t=0.0.2"></script>
<link href="https://s.alicdn.com/build/index.css?t=0.0.2" rel="stylesheet">
</head>
<body>
<div class="foo"></div>
</body>
</html>
However, this approach has a problem: if the HTML references multiple files and only one of them is changed in an iteration, the method of uniformly adding version numbers will invalidate the local cache of the other files!
To solve this problem, we need cache control at the granularity of individual files. A natural fit is a message digest (hash) algorithm, like those used for integrity checks in HTTPS, which derives a hash value from the file content. If a file is unchanged, its hash stays the same, and in practice different content yields a different hash, so each file can be cached precisely:
<!-- Control updates via file content digest -->
<!doctype html>
<html>
<head>
<!-- foo.js remains unchanged, continue using cache -->
<script defer="defer" src="https://s.alicdn.com/build/foo.js"></script>
<!-- index.css has changed styles, needs to request the updated file and cache -->
<link href="https://s.alicdn.com/build/index_1i0gdg6ic.css" rel="stylesheet">
</head>
<body>
<div class="foo"></div>
</body>
</html>
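The naming scheme behind a URL like `index_1i0gdg6ic.css` can be sketched in a few lines. This is only an illustration of the idea: `hashedFileName` is a hypothetical helper, and the small FNV-1a-style hash is a dependency-free stand-in for the cryptographic digest (for example MD5 or SHA-256) that a real build would normally use. Build tools usually do this for you; webpack, for instance, offers a `[contenthash]` placeholder in output file names.

```javascript
// FNV-1a-style 32-bit hash over the string's code units.
// Stand-in for a real digest such as MD5 or SHA-256.
function simpleDigest(text) {
  let hash = 0x811c9dc5;
  for (let i = 0; i < text.length; i++) {
    hash ^= text.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0; // keep it an unsigned 32-bit value
  }
  return hash.toString(16).padStart(8, "0");
}

// Insert the content digest between the base name and the extension.
function hashedFileName(fileName, content) {
  const dot = fileName.lastIndexOf(".");
  const base = dot === -1 ? fileName : fileName.slice(0, dot);
  const ext = dot === -1 ? "" : fileName.slice(dot);
  return `${base}_${simpleDigest(content)}${ext}`;
}

const before = hashedFileName("index.css", ".foo { background-color: red; }");
const rebuilt = hashedFileName("index.css", ".foo { background-color: red; }");
const changed = hashedFileName("index.css", ".foo { background-color: blue; }");

console.log(before === rebuilt); // true: same content, same URL, cache stays valid
console.log(before === changed); // false: new content gets a new URL
```

Because the name depends only on the content, rebuilding an unchanged file never invalidates its cached copy, no matter how many other files changed in the same iteration.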
Alternatively, we can put the iteration version number in the resource path:
<!-- Control updates via resource path -->
<!doctype html>
<html>
<head>
<!-- Resource path updated, request new resources -->
<script defer="defer" src="https://s.alicdn.com/0.0.2/build/foo.js"></script>
<!-- Resource path updated, request new resources -->
<link href="https://s.alicdn.com/0.0.2/build/index.css" rel="stylesheet">
</head>
<body>
<div class="foo"></div>
</body>
</html>
Separation of Static and Dynamic Content
Modern front-end deployment solutions often upload static resources (JS, CSS, images, etc.) to a CDN closer to users. These resources rarely change, so they should make full use of caching to improve the cache hit rate. Dynamic pages (HTML), by contrast, are tailored to individual user data and rendered with SSR for SEO, so they are often hosted close to the business servers for faster data retrieval and injection.
With two types of resources distributed in different locations, static resources are referenced in HTML via CDN links. However, a question arises: when updating the page, should we release static resources first or the page itself?
Releasing the page first and then the resources:
<!-- New page, old resources -->
<!doctype html>
<html>
<head>
<!-- Resources not fully published yet -->
<script defer="defer" src="https://s.alicdn.com/0.0.1/build/foo.js"></script>
<link href="https://s.alicdn.com/0.0.1/build/index.css" rel="stylesheet">
</head>
<body>
<!-- Page has been modified -->
<div class="bar"></div>
</body>
</html>
Before the static resources are published, users may receive the new page structure while the static resources are still the old ones. They might see a page with broken styles, or hit errors because the old JS cannot find the element nodes it expects, which is not acceptable 🙅.
Releasing the resources first and then the page:
<!-- Old page, new resources -->
<!doctype html>
<html>
<head>
<!-- Resources have been published -->
<script defer="defer" src="https://s.alicdn.com/0.0.2/build/foo.js"></script>
<link href="https://s.alicdn.com/0.0.2/build/index.css" rel="stylesheet">
</head>
<body>
<!-- Page has not been published yet -->
<div class="foo"></div>
</body>
</html>
Before the page is published, the page structure is still the old one while the resources are new. Users who visited the page before and still have the old resources cached locally will see a normal page. Everyone else gets the old page with the new resources and runs into the same issues as before: broken styles, or JS execution errors that lead to a blank screen. This is not acceptable either 🙅.
Thus, neither option works! This is why, in the past, developers had to work late at night to deploy projects during low traffic periods, as it minimizes the impact. However, large companies do not have absolute low-traffic periods, only relatively low-traffic periods. Even during these times, for those of us who pursue perfection, it is unacceptable!
The root cause is the overlay release, in which the resources being released overwrite resources that are already published. The solution is a non-overlay release, achieved by putting a version number or hash in the file path. New resources never overwrite old ones: we first publish all the static resources, then gradually roll out the page until it is fully released, which solves the problem.
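The ordering can be simulated in a few lines. The sketch below uses an in-memory `AssetStore` as a stand-in for the CDN; the class, the `stylesheetPath` helper, and the shortened paths are all hypothetical and exist only to show why old and new pages both keep working during the rollout.

```javascript
// In-memory stand-in for a CDN bucket that never overwrites a published path.
class AssetStore {
  constructor() {
    this.files = new Map();
  }
  publish(path, content) {
    if (this.files.has(path)) {
      throw new Error(`refusing to overwrite ${path}`);
    }
    this.files.set(path, content);
  }
  has(path) {
    return this.files.has(path);
  }
}

// Extract the stylesheet path a page references (enough for this sketch).
function stylesheetPath(html) {
  const match = html.match(/<link href="([^"]+)"/);
  return match ? match[1] : null;
}

const cdn = new AssetStore();
const oldPage = '<link href="/0.0.1/build/index.css" rel="stylesheet"><div class="foo"></div>';
const newPage = '<link href="/0.0.2/build/index.css" rel="stylesheet"><div class="bar"></div>';

// Version 0.0.1 is live.
cdn.publish("/0.0.1/build/index.css", ".foo { background-color: red; }");

// Step 1: publish the new resources under a new path. Users still receive
// oldPage, and the stylesheet it references is untouched.
cdn.publish("/0.0.2/build/index.css", ".bar { background-color: red; }");
console.log(cdn.has(stylesheetPath(oldPage))); // true

// Step 2: roll out newPage gradually. During the rollout some users receive
// oldPage and some newPage; each finds the resources it was built against.
console.log(cdn.has(stylesheetPath(newPage))); // true
```

Since no path is ever overwritten, there is no moment at which a page can be paired with resources from a different version.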
Therefore, regarding static resource optimization, we should aim to:
- Configure long cache expiration times to improve cache hit rates and save bandwidth.
- Use content digests or versioned file paths as the basis for cache updates to achieve precise cache control.
- Deploy static resources on a CDN to shorten the network path and reduce response times.
- Update resources using non-overlay releases for smooth transitions.
At this point, the code painstakingly written by front-end developers has gone through continuous iterations, (cloud) builds, resource deployments, and external releases, allowing global users to experience our products and surf the internet happily!