I created this joke-y "framework" to build a website, or a blog if you will, from markdown files. The idea is to (ab)use Apache's default directory listing. The "index" page reads Apache's HTML listing of a given directory and produces a list of articles. When the user clicks one, the markdown file (corresponding to an article) is converted to HTML on the fly.
The result: a blog with no build process. To publish a new article, you just drop a new MD file.
Obviously this is a half-joke and the SEO of such a site is in the trash, because the whole site is JS-generated. But does SEO matter anymore anyway?
Lizzy.js source code is here github.com/stoyan/Lizzy.js
And an example site, fully powered by this "framework", is highperformancewebfonts.com
Source code highlighting is supported.
Drafts are supported too: just prefix the .md file name with a _ and you can preview the way it renders, but it's not listed on the index page.
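To make the index-from-listing idea concrete, here's a minimal sketch. It is not Lizzy.js's actual code: the helper name `listArticles` and the sample listing markup are made up for illustration, and real Apache listings vary with server configuration.

```javascript
// Hypothetical helper: pull article links out of a directory-listing HTML string.
function listArticles(listingHtml) {
  const hrefs = [...listingHtml.matchAll(/<a\s+href="([^"]+)"/gi)].map((m) => m[1]);
  return hrefs
    .filter((href) => href.toLowerCase().endsWith('.md')) // markdown files only
    .filter((href) => !href.startsWith('_')); // drafts stay off the index
}

// Made-up sample of what a listing might contain
const sampleListing = `
<a href="?C=N;O=D">Name</a>
<a href="/">Parent Directory</a>
<a href="hello-world.md">hello-world.md</a>
<a href="_draft-post.md">_draft-post.md</a>
<a href="second-post.md">second-post.md</a>`;

const articles = listArticles(sampleListing);
console.log(articles); // [ 'hello-world.md', 'second-post.md' ]
```

The draft is still reachable by its direct URL, it just never makes it into the list.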
All you have to do is copy-paste the library and configure it like so:
window.__lizzyconf = {
  index: '/posts/', // where the MD files live
  root: document.getElementById('root'), // where to render it all
  read: '/read/', // for bookmarking URLs
}
There's some extra stuff to make it all more useful, you can head out to the docs.
I'm working on a new site at https://highperformancewebfonts.com/ where I'm doing everything wrong. E.g. using a joke-y client-side-only rendering of articles from .md files (Hello Lizzy.js)
Since there's no static generation, there was no RSS feed. And since someone asked, I decided to add one. But in the spirit of learning-while-doing, I thought I should do the feed generation in Rust—a language I know nothing about.
Here are my first steps in Rust, for posterity. BTW the end result is https://highperformancewebfonts.com/feed.xml
rustup tool. This page https://www.rust-lang.org/tools/install has the instructions:
$ curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh
Next, restart the terminal shell or run:
$ . "$HOME/.cargo/env"
Check if installation was ok:
$ rustc --version
rustc 1.84.0 (9fc6b4312 2025-01-07)
2. A new project
A tool called Cargo seems like the way to go. Looks like it's a package manager, an NPM of Rust:
$ cargo new russel && cd russel
(The name of my program is "russel", from "rusty", from "rust". Yeah, I'll see myself out.)
Cargo.toml looks like a config file similar in spirit to package.json and similar in syntax to a php.ini. Since I'll need to write an RSS feed, a package called rss would be handy.
[package]
name = "russel"
version = "0.1.0"
edition = "2021"

[dependencies]
rss = "2.0.0"
Running $ cargo build after a dependency update seems necessary.
To explore packages, a crates.io site looks appropriate e.g. https://crates.io/crates/rss as well as docs.rs, e.g. https://docs.rs/rss/2.0.11/rss/index.html
cargo new russel from the previous step created a hello-world program. We can test it by running:
$ cargo run
This should print "Hello, world!"
Nice!
Open src/main.rs, look at this wonderful function:
fn main() {
    println!("Hello, world!");
}
Replace the string with "Bello, world". Save. Run:
$ cargo run
If you see the "Bello", the setup seems to be working. Rejoice!
Go Rust in peace! https://en.wikipedia.org/wiki/Rust_in_Peace
When you use a bog-standard WordPress install, the caching header in the HTML response is
Cache-Control: max-age=600
OK, cool, this means cache the HTML for 10 minutes.
Additionally these headers are sent:
Date: Sat, 07 Dec 2024 05:20:02 GMT
Expires: Sat, 07 Dec 2024 05:30:02 GMT
These don't help at all, because they instruct the browser to cache for 10 minutes too, which the browser already knows. These can actually be harmful in cases of clocks that are off. But let's move on.
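A quick sanity check that Date and Expires really do encode the same 10 minutes as max-age=600. This is just arithmetic on the two header values above, nothing browser-specific:

```javascript
// The two header values from the response above
const date = new Date('Sat, 07 Dec 2024 05:20:02 GMT');
const expires = new Date('Sat, 07 Dec 2024 05:30:02 GMT');

// Difference in seconds between Expires and Date
const lifetimeSeconds = (expires - date) / 1000;
console.log(lifetimeSeconds); // 600
```

Unlike max-age, Expires is an absolute timestamp, which is why a clock that's off can throw the calculation.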
WP Super Cache
This is a plugin I installed, made by WP folks themselves, so joy, joy, joy. It saves the generated HTML from the PHP code on the disk and then gives that cached content to the next visitor. Win!
However, I noticed it adds another header:
Cache-Control: max-age=3, must-revalidate
And actually now there are two cache-control headers being sent, the new and the old:
Cache-Control: max-age=3, must-revalidate
Cache-Control: max-age=600
What do you think happens? Well, the browser goes with the more restrictive one, so the wonderfully cached (on disk) HTML is now stale after 3 seconds. Not cool!
A settings fix
Looking around in the plugin settings I see there is no way to fix this. There's another curious setting though, disabled by default:
[ ] 304 Browser caching. Improves site performance by checking if the page has changed since the browser last requested it. (Recommended)
304 support is disabled by default because some hosts have had problems with the headers used in the past.
I turned this on. It means that instead of a full new request after 3 seconds, the repeat visit will send an If-Modified-Since header, and since 3 seconds is a very short time, the server will very likely respond with a 304 Not Modified response, which means the browser is free to use the copy from the browser cache.
Better, but still... it's an HTTP request.
A config fix
Then I had to poke around the code and saw this:
// Default headers.
$headers = array(
    'Vary'          => 'Accept-Encoding, Cookie',
    'Cache-Control' => 'max-age=3, must-revalidate',
);
// Allow users to override Cache-control header with WPSC_CACHE_CONTROL_HEADER
if ( defined( 'WPSC_CACHE_CONTROL_HEADER' ) && ! empty( WPSC_CACHE_CONTROL_HEADER ) ) {
    $headers['Cache-Control'] = WPSC_CACHE_CONTROL_HEADER;
}
Alrighty, so there is a way! All I needed to do was define the constant with the header I want.
The new constant lives in wp-content/wp-cache-config.php - a file that already exists, created by the cache plugin.
I opted for:
define( 'WPSC_CACHE_CONTROL_HEADER', 'max-age=600, stale-while-revalidate=100' );
Why 600? I'd do it for longer but there's this other Cache-Control 600 coming from who-knows-where, so 600 is the max I can do. (TODO: figure out that other Cache-Control and ditch it)
Why stale-while-revalidate? Well, this lets the browser use the cached response after the 10 minutes while it's re-checking for a fresher copy.
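Here's a simplified model of what 'max-age=600, stale-while-revalidate=100' means for a cached response of a given age. The function name and the three labels are made up for illustration; real browser caches have more going on (heuristics, validators, etc.).

```javascript
// Simplified model: classify a cached response by its age in seconds
function cacheState(ageSeconds, maxAge = 600, staleWhileRevalidate = 100) {
  if (ageSeconds < maxAge) {
    return 'fresh'; // served from cache, no request
  }
  if (ageSeconds < maxAge + staleWhileRevalidate) {
    return 'stale-but-usable'; // served from cache while re-checking in the background
  }
  return 'stale'; // must go to the network first
}

console.log(cacheState(30)); // fresh
console.log(cacheState(650)); // stale-but-usable
console.log(cacheState(800)); // stale
```

So the 100 seconds are a grace window after the 10 minutes, not an extension of the 10 minutes.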
Some WebPageTest tests
1. The repeat visit as-is, meaning the default less-than-ideal WP Super Cache behavior:
https://www.webpagetest.org/result/241207_AiDcHR_1QT/
Here you can see a new request for a repeat view, because 3 seconds have passed.
You can see a request being made that gets a 304 Not Modified response
Here you can see no more requests for HTML, just one for stats. No static resources either (CSS, images, JS are cached "forever"). So the page is loaded completely from the browser cache.
Animated gifs are fun and all but they can get big (in filesize) quickly. At some point, maybe after just a few low-resolution frames it's better to use an MP4 and an HTML <video> element. You also preferably need a "poster" image for the video so people can see a quick preview before they decide to play your video. The procedure can be pretty simple thanks to freely available amazing open source command-line tools.
Step 1: an MP4
For this we use ffmpeg:
$ ffmpeg -i amazing.gif amazing.mp4
Step 2: a poster image
Here we use ImageMagick to take the first frame in a gif and export it to a PNG:
$ magick "amazing.gif[0]" amazing.png
... or a JPEG, depending on the type of video (photographic vs more shape-y)
Step 3: video tag
```
<video src="amazing.mp4" poster="amazing.png" controls preload="none"></video>
```
Step 4: optimize the image
... with your favorite image-smushing tool e.g. ImageOptim
Comments
I did this for a recent Perfplanet calendar post and the 2.5MB gif turned to 270K mp4. Another 23MB gif turned to 1.2MB mp4.
I dunno if my ffmpeg install is to blame but the videos didn't play in QuickTime/FF/Safari, only in Chrome. So I ran them through HandBrake and that solved it. Cuz... ffmpeg options are not for the faint-hearted.
Do use preload="none" so that the browser doesn't load the whole video unless the user decides to play. In my testing, without preload="none", Chrome and Safari send range requests (like an HTTP header Range: bytes=0-) for 206 Partial Content. Firefox just gets the whole thing.
There is no loading="lazy" for poster images.
You've seen these UIs in recent AI tools that stream text, right? Like this:
I peeked under the hood of ChatGPT and meta.ai to figure how they work.
Server-sent events
Server-sent events (SSE) seem like the right tool for the job. A server-side script flushes out content whenever it's ready. The browser listens to the content as it's coming down the wire with the help of EventSource() and updates the UI.
(aside:) PHP on the server
Sadly I couldn't make the PHP code work server-side on this here blog, even though I consulted Dreamhost's support. I never got the "chunked" response to flush progressively from the server, I always get the whole response once it's ready. It's not impossible though, it worked for me with a local PHP server (like $ php -S localhost:8000) and I'm pretty sure it used to work on Dreamhost before they switched to FastCGI.
If you want to make flush()-ing work in PHP, here are some pointers to try in .htaccess
<FilesMatch "\.php$">
    SetEnv no-gzip 1
    Header always set Cache-Control "no-cache, no-store, must-revalidate"
    SetEnv chunked yes
    SetEnv FcgidOutputBufferSize 0
    SetEnv OutputBufferSize 0
</FilesMatch>
And a test page to tell the time every second:
<?php
header('Cache-Control: no-cache');
@ob_end_clean();
$go = 5;
while ($go) {
  $go--;
  // Send a message
  echo sprintf(
    "It's %s o'clock on my server.\n\n",
    date('H:i:s', time()),
  );
  flush();
  sleep(1);
}
In this repo stoyan/vexedbyalazyox you can find two PHP scripts that worked for me.
BTW, the server-side partial responses and flushing is pretty old as web performance techniques go.
A bit about the server-sent messages
(I'll keep using PHP to illustrate for just a bit more and then switch to Node.js)
In their simplest form, server-sent events (or messages) are pretty sparse, all you do is:
echo "data: I am a message\n\n";
flush();
And now the client can receive "I am a message".
The events can have event names, anything you make up, like:
echo "event: start\n";
echo "data: Hi!\n\n";
flush();
More on the message fields is available on MDN. But all in all, the stuff you spit out on the server can be really simple:
event: start
data:

data: hello

data: foo

event: end
data:
Events can be named anything, "start" and "end" are just examples. And they are optional too.
data: is not optional. Even if all you need is to send an event with no data.
When event: is omitted, it's assumed to be event: message.
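The rules above fit in a tiny serializer. This is a sketch in JavaScript rather than PHP, and `formatEvent` is a made-up helper name, not part of any library:

```javascript
// Hypothetical helper: serialize one server-sent event as text
function formatEvent({ event, data = '' }) {
  let out = '';
  if (event) {
    out += `event: ${event}\n`; // optional; the client assumes "message" otherwise
  }
  // data: is always sent, even when empty; multi-line data gets one field per line
  for (const line of String(data).split('\n')) {
    out += line === '' ? 'data:\n' : `data: ${line}\n`;
  }
  return out + '\n'; // a blank line ends the message
}

console.log(JSON.stringify(formatEvent({ data: 'hello' }))); // "data: hello\n\n"
console.log(JSON.stringify(formatEvent({ event: 'start' }))); // "event: start\ndata:\n\n"
```

On the server you'd write the returned string to the response and flush.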
The client's JavaScript
To get started you need an EventSource object pointed to the server-side script:
const evtSource = new EventSource(
  'https://pebble-capricious-bearberry.glitch.me/',
);
Then you just listen to events (messages) and update the UI:
evtSource.onmessage = (e) => {
  msg.textContent += e.data;
};
And that's all! You have optional event handlers should you need them:
evtSource.onopen = () => {};
evtSource.onerror = () => {};
Additionally, you can listen to any events with names you decide. For example I want the server to signal to the client that the response is over. So I have the server send this message:
event: imouttahere
data:
And then the client can listen to the imouttahere event:
evtSource.addEventListener('imouttahere', () => {
  console.info('Server calls it done');
  evtSource.close();
});
Demo time
OK, demo time! The server side script takes a paragraph of text and spits out every word after a random delay:
$txt = "The zebra jumps quickly over a fence, vexed by...";
$words = explode(" ", $txt);
foreach ($words as $word) {
  echo "data: $word \n\n";
  usleep(rand(90000, 200000)); // Random delay
  flush();
}
The client side sets up EventSource and, on every message, updates the text on the page. When the server is done (event: imouttahere), the client closes the connection.
Try it here in action. View source for the complete code. Note: if nothing happens initially, that's because the server-side Glitch has gone to sleep and needs to wake up.
One cool Chrome devtools feature is the list of events under an EventStream tab in the Network panel:
Now, what happens if the server is done and doesn't send a special message (such as imouttahere)? Well, the browser thinks something went wrong and re-requests the same URL and the whole thing repeats. This is probably desired behavior in many cases, but here I don't want it.
Try the case of a non-terminating client.
The re-request will look like the following... note the error and the repeat request:
Alrighty, that just about clarifies SSE (Server-Sent Events) and provides a small demo to get you started.
In fact, this is the type of "streaming" ChatGPT uses when giving answers, take a look:
In the EventStream tab you can see the messages passing through. The server sends stuff like:
event: delta
data: {json: here}
This should look familiar now, except the chosen event name is "delta" (not the default, optional "message") and the data is JSON-encoded.
And at the end, the server switches back to "message" and the data is "[DONE]" as a way to signal to the client that the answer is complete and the UI can be updated appropriately, e.g. turn the STOP button back into SEND (arrow pointing up).
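To see how a client could deal with such a stream, here's a sketch with a simplified event-stream parser. The JSON payload shape ({"text": ...}) is made up, the real one is not shown here, and in a browser EventSource does the parsing for you.

```javascript
// Simplified parser for an event-stream body (ignores id:, retry: and comments)
function parseEventStream(text) {
  const events = [];
  for (const block of text.split('\n\n')) {
    if (!block.trim()) continue;
    let event = 'message'; // the default when event: is omitted
    const data = [];
    for (const line of block.split('\n')) {
      if (line.startsWith('event:')) {
        event = line.slice(6).trim();
      } else if (line.startsWith('data:')) {
        data.push(line.slice(5).replace(/^ /, '')); // drop one leading space
      }
    }
    events.push({ event, data: data.join('\n') });
  }
  return events;
}

// Made-up payloads for illustration
const raw =
  'event: delta\ndata: {"text":"Hel"}\n\n' +
  'event: delta\ndata: {"text":"lo"}\n\n' +
  'data: [DONE]\n\n';

let answer = '';
let done = false;
for (const { event, data } of parseEventStream(raw)) {
  if (event === 'delta') {
    answer += JSON.parse(data).text; // append the next piece of the answer
  } else if (event === 'message' && data === '[DONE]') {
    done = true; // time to flip STOP back to SEND
  }
}
console.log(answer, done); // Hello true
```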
OK, cool story ChatGPT, let's take a gander at what the competition is doing over at meta.ai
XMLHttpRequest
Asking meta.ai a question, I don't see an EventStream tab, so it must be something else. Looking at the Performance panel for UI updates I see:
All of these pinkish, purplish vertical almost-lines are updates. Zooming in on one:
Here we can see XHR readyState change. Aha! Our old friend XMLHttpRequest, the source of all things Ajax!
Looks like with similar server-side flushes meta.ai is streaming the answer. On every readyState change, the client can inspect the current state of the response and grab data from it.
Here's our version of the XHR boilerplate:
const xhr = new XMLHttpRequest();
xhr.open(
  'GET',
  'https://pebble-capricious-bearberry.glitch.me/xhr',
  true,
);
xhr.send(null);
Now the only thing left is to listen to onprogress:
xhr.onprogress = () => {
  console.log('LOADING', xhr.readyState);
  msg.textContent = xhr.responseText;
};
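Note that xhr.responseText holds everything received so far, not just the latest chunk, which is why the handler assigns with = and not +=. If you need only the new bit (say, to animate it), a small sketch like this works. `makeChunkReader` is a made-up helper, and the calls below simulate successive responseText values:

```javascript
// Hypothetical helper: return only the text that arrived since the last call
function makeChunkReader() {
  let seen = 0; // how many characters we have already handled
  return (responseText) => {
    const chunk = responseText.slice(seen);
    seen = responseText.length;
    return chunk;
  };
}

const readNew = makeChunkReader();
// Simulate three successive xhr.responseText values
const c1 = readNew('The ');
const c2 = readNew('The zebra ');
const c3 = readNew('The zebra jumps ');
console.log([c1, c2, c3]); // [ 'The ', 'zebra ', 'jumps ' ]
```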
Like before, for a test page, the server just flushes the next chunk of text after a random delay:
$txt = "The zebra jumps quickly over a fence, vexed ...";
$words = explode(" ", $txt);
foreach ($words as $word) {
  echo "$word ";
  usleep(rand(20000, 200000)); // Random delay
  flush();
}
XHR client demo page
Differences between XHR and SSE
First, HTTP header:
```
# XHR
Content-Type: text/plain
# SSE
Content-Type: text/event-stream
```
Second, message format. SSE requires a (however simple) format of "event:" and "data:" where data can be JSON-encoded or however you wish. Maybe even XML if you're feeling cheeky. XHR responses are completely free for all, no formatting imposed, and even XML is not required despite the unfortunate name.
And lastly, and most importantly IMO, is that SSE can be interrupted by the client. In my examples I have a "close" button:
document.querySelector('#close').onclick = function () {
  console.log('Connection closed');
  evtSource.close();
};
Here close() tells the server that's enough and the server takes a breath. No such thing is possible in XHR. And you can see inspecting meta.ai that even though the user can click "stop generating", the response is sent by the server until it completes.
Node.js on the server
Finally, here's my Node.js that I used for the demos. Since I couldn't get Dreamhost to flush() in PHP, I went to Glitch as a free Node hosting to host just this one script.
The code handles requests / for SSE and /xhr for XHR. And there are a few ifs based on XHR vs SSE:
const http = require("http");

const server = http.createServer((req, res) => {
  if (req.url === "/" || req.url === "/xhr") {
    const xhr = req.url === "/xhr";
    res.writeHead(200, {
      "Content-Type": xhr ? "text/plain" : "text/event-stream",
      "Cache-Control": "no-cache",
      "Access-Control-Allow-Origin": "*",
    });
    if (xhr) {
      res.write(" ".repeat(1024)); // for Chrome
    }
    res.write("\n\n");
    const txt = "The zebra jumps quickly over a fence, vexed ...";
    const words = txt.split(" ");
    let to = 0;
    for (let word of words) {
      to += Math.floor(Math.random() * 200) + 80;
      setTimeout(() => {
        if (!xhr) {
          res.write(`data: ${word} \n\n`);
        } else {
          res.write(`${word} `);
        }
      }, to);
    }
    if (!xhr) {
      setTimeout(() => {
        res.write("event: imouttahere\n");
        res.write("data:\n\n");
        res.end();
      }, to + 1000);
    }
    req.on("close", () => {
      res.end();
    });
  } else {
    res.writeHead(404);
    res.end("Not Found\n");
  }
});

const port = 8080;
server.listen(port, () => {
  console.log(`Server started on port ${port}`);
});
Note the weird-looking line:
res.write(" ".repeat(1024)); // for Chrome
In the world of flushing, there are many foes that want to buffer the output. Apache, PHP, mod_gzip, you name it. Even the browser. Sometimes it's required to flush out some emptiness (in this case 1K of spaces). I was actually pleasantly surprised that not too much of it was needed. In my testing this 1K buffer was needed only in the XHR case and only in Chrome.
That's all folks!
If you want to inspect the endpoints here they are:
Once again, the repo stoyan/vexedbyalazyox has all the code from this blog and some more too.
And the demos one more time:
Small update: honorable mention for Web Sockets
Web Sockets are yet another alternative for streaming content. Probably the most complex of the three in terms of implementation. Perplexity.ai and MS Copilot seem to have gone this route:
While at the most recent performance.now() conference, I had a little chat with Andy Davies about fonts and he mentioned it'd be cool if, while subsetting, you can easily create a second subset file that contains all the "rejects". All the characters that were not included in the initially desired subset.
And as the flight from Amsterdam is pretty long, I hacked on just that. Say hello to a new script, available as an NPM package, called...
inverse-subset
Initially I was thinking to wrap around Glyphhanger and do both subsets, but decided that there's no point in wrapping Glyphhanger to do what Glyphhanger already does. So the initial subset is left to the user to do in any way they see fit. What I set out to do was take The Source (the complete font file) and The Subset and produce an inversion, where
The Inverted Subset = The Source - The Subset
This way if your subset is all Latin characters, the inversion will be all non-Latin characters.
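The subtraction above is plain set difference on code points. Here's a sketch with a toy "font"; it only illustrates the arithmetic and is not how the inverse-subset package works internally. The name `invertSubset` and the toy data are made up.

```javascript
// Hypothetical helper: code points in the source that the subset doesn't cover
function invertSubset(sourceCodePoints, subsetCodePoints) {
  const subset = new Set(subsetCodePoints);
  return sourceCodePoints.filter((cp) => !subset.has(cp));
}

// Toy "font": printable ASCII plus à (U+00E0) and € (U+20AC)
const source = [];
for (let cp = 0x20; cp <= 0x7e; cp++) source.push(cp);
source.push(0xe0, 0x20ac);

// The subset keeps only U+0020-007E
const latinSubset = source.filter((cp) => cp >= 0x20 && cp <= 0x7e);

const inverted = invertSubset(source, latinSubset);
console.log(inverted.map((cp) => 'U+' + cp.toString(16).toUpperCase().padStart(4, '0')));
// [ 'U+00E0', 'U+20AC' ]
```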
When you craft the @font-face declaration, you can use the Unicode range of the subset, like
@font-face {
  font-family: "Oxanium";
  src: url("Oxanium-subset.woff2") format("woff2");
  unicode-range: U+0020-007E;
}
(Unicode generated by wakamaifondue.com/beta)
Then for the occasional character that is not in this range, you can let the browser load the inverted subset. But that should be rare, otherwise an oft-needed character will be in the original subset.
Save on HTTP requests and bytes (in 99% of cases) and yet, take care of all characters your font supports for that extra special 1% of cases.
Unicode-optional
Wakamaifondue can generate the Unicode range for the inverted subset too, but it's not required (it's too long!) as long as the inverted declaration comes first. In other words if you have:
@font-face {
  font-family: "Oxanium";
  src: url("Oxanium-inverse-subset.woff2") format("woff2");
}
@font-face {
  font-family: "Oxanium";
  src: url("Oxanium-subset.woff2") format("woff2");
  unicode-range: U+0020-007E;
}
... and only Latin characters on the page, then Oxanium-inverse-subset.woff2 is NOT going to be downloaded, because the second declaration overwrites the first.
Test page is here
If you flip the two @font-face blocks, the inversion will be loaded because it claims to support everything. And the Latin will be loaded too, because the inversion proves inadequate.
If you cannot guarantee the order of @font-faces for some reason, specifying a scary-looking Unicode range for the inversion is advisable:
@font-face {
  font-family: "Oxanium";
  src: url("Oxanium-inverse-subset.woff2") format("woff2");
  unicode-range: U+0000, U+000D, U+00A0-0107, U+010C-0113, U+0116-011B, U+011E-011F, U+0122-0123, U+012A-012B, U+012E-0131, U+0136-0137, U+0139-013E, U+0141-0148, U+014C-014D, U+0150-015B, U+015E-0165, U+016A-016B, U+016E-0173, U+0178-017E, U+0192, U+0218-021B, U+0237, U+02C6-02C7, U+02C9, U+02D8-02DD, U+0300-0304, U+0306-0308, U+030A-030C, U+0326-0328, U+03C0, U+1E9E, U+2013-2014, U+2018-201A, U+201C-201E, U+2020-2022, U+2026, U+2030, U+2039-203A, U+2044, U+2070, U+2074, U+2080-2084, U+20AC, U+20BA, U+20BD, U+2113, U+2122, U+2126, U+212E, U+2202, U+2206, U+220F, U+2211-2212, U+2215, U+2219-221A, U+221E, U+222B, U+2248, U+2260, U+2264-2265, U+25CA, U+F000, U+FB01-FB02;
}
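Wakamaifondue generates such a range for you, but in case you're curious how a list of code points collapses into that syntax, here's a sketch. `toUnicodeRange` is a made-up helper and the input is a handful of code points picked from the range above.

```javascript
// Hypothetical helper: turn code points into a unicode-range value
function toUnicodeRange(codePoints) {
  const sorted = [...new Set(codePoints)].sort((a, b) => a - b);
  const hex = (cp) => cp.toString(16).toUpperCase().padStart(4, '0');
  const parts = [];
  let start = sorted[0];
  let prev = sorted[0];
  for (let i = 1; i <= sorted.length; i++) {
    const cp = sorted[i]; // undefined on the last pass, which closes the final run
    if (cp === prev + 1) {
      prev = cp; // still inside a consecutive run
      continue;
    }
    parts.push(start === prev ? `U+${hex(start)}` : `U+${hex(start)}-${hex(prev)}`);
    start = prev = cp;
  }
  return parts.join(', ');
}

console.log(toUnicodeRange([0x0, 0xd, 0xa0, 0xa1, 0xa2, 0x192]));
// U+0000, U+000D, U+00A0-00A2, U+0192
```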
What embarrassment looks like
If you don't load the extended characters and someone uses your CMS to add a wee bit of je ne sais quoi, you get a fallback font:
Test page is here
(Note the à shown in a fallback font)
But if you do load the inversion, all is fine with the UI once again.
Test page
Thank you!
... and happy typesetting, subsetting, and inverse subsetting!
Here's a view of the tool in action:
This is part 4 of an ongoing study of web font file sizes, subsetting, and file sizes of the subsets.
I used the collection of freely available web fonts that is Google Fonts.
Now, instead of focusing on just regular or just weight-variable fonts, I thought let's just do them all and let you, my dear reader, do your own filtering, analysis and conclusions.
One constraint I kept was just focusing on the LATIN subset (see part 1 as to what LATIN means) because as Boris Shapira notes: "...even with basic high school Chinese, we would need a minimum of 3,000 characters..." which is an order of magnitude larger than Latin and we do need to keep some sort of apples-to-apples here.
The study
First download all Google fonts (see part 1).
Then subset all of them fonts to LATIN and drop all fonts that don't support at least 200 characters. 200 and a bit is what the average LATIN font out there supports. This resulted in excluding fonts that focus mostly on non-Latin, e.g. Chinese characters. But it also dropped some fonts that are close to 200 Latin characters but not quite there. See part 1 for the "magic" 200 number. So this replicates part 1 and part 3 but this time for all available fonts.
This 200-LATIN filtering leaves us with 3277 font files to study and 261 font file "rejects". The full list of rejects is rejects.txt
Finally, subset each of the remaining fonts, 10 characters at a time, to see how they grow. This replicates part 2 for all fonts, albeit a bit more coarsely (10 characters at a time as opposed to 1). Hey, it still took over 24 hours while running 10 threads simultaneously, meaning 10 copies of the subsetting script! The subsets are 1 character, 10 characters, 20... up to 200. I ended up with 68,817 font files.
((10 to 200 = 20) + 1) * 3277 files
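Spelling that arithmetic out, using only the numbers from above:

```javascript
// Subset sizes: 1 character, then 10, 20, ... up to 200
const sizes = [1];
for (let n = 10; n <= 200; n += 10) sizes.push(n);

const fonts = 3277; // fonts left after the 200-LATIN filtering

console.log(sizes.length); // 21
console.log(sizes.length * fonts); // 68817
```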
Data
LATIN
The LATIN subset data is available in CSV (latin.csv) and HTML (latin.html)
Subsets
The subset data is available as CSV (stats.csv) and Google spreadsheet
Some observations
* The data set contains 3277 different font files, each being subset 21 times
* 588 are variable fonts
* 429 are variable only on the weight axis
* 196 are variable with more than one axis, e.g. [wdth,wght] or [FLAR,VOLM,slnt,wght]
* 63 use the [opsz] axis (it's been suggested this is the "expensive" one in terms of file size)
Conclusions
I'd love to hear your analysis on the data! I hope this data can be useful and I'm looking forward to any and all insights.
I've been crafting a nice font-face fallback, something like this:
@font-face {
  font-family: fallback;
  src: local('Helvetica Neue');
  ascent-override: 85%;
  descent-override: 19.5%;
  line-gap-override: 0%;
  size-adjust: 106.74%;
}
It works well, however Safari doesn't yet support ascent-override, descent-override, nor line-gap-override in @font-face blocks. It does support size-adjust though.
Since my code requires all 4, the results with size-adjust-only look bad. Worse than no overrides. Easy-peasy I thought, I'll target Safari and not give it any of the 4.
I wanted to use @supports in CSS to keep everything nice and compact. No JavaScript, no external CSS, all this is for a font fallback, so it should be loaded as early in the page as possible, together with the @font-face.
Unfortunately, turns out that for example both
@supports (ascent-override: normal) {/* css here */}
and
@supports (size-adjust: 100%) {/* css here */}
end up with the "css here" not being used.
In fact even the amazing font-display: swap is not declared as being @support-ed.
Using the JavaScript API I get this in Chrome, Safari and Firefox:
console.log(CSS.supports('font-stretch: normal')); // true
console.log(CSS.supports('font-style: normal')); // true
console.log(CSS.supports('font-display: swap')); // false
console.log(CSS.supports('size-adjust: 100%')); // false
console.log(CSS.supports('ascent-override: normal')); // false
Huh? Am I using @supports incorrectly? Or did browsers forget to update this part of the code after adding a new feature? But what are the chances that all three make the same error?
It's not like anything in @font-face is not declared @support-ed, because font-style and font-stretch are.
Clearing out my confusion
Ryan Townsend pointed out that font-style and font-stretch work because they double as properties, not only as font descriptors. So it turns out font descriptors are not supported by @supports. Darn!
Noam Rosenthal pointed out this github issue, opened in 2018, to add support for descriptors too.
For now I came up with 2 (imperfect) solutions. One that uses JavaScript to check for a property, like
'ascentOverride' in new FontFace(1,1); // true in Chrome, FF, false in Saf
Not ideal because it's JavaScript.
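If you do go the JavaScript route, the check can be wrapped so it doesn't blow up where FontFace is missing. A minimal sketch: `supportsFontFaceProperty` and `FakeFontFace` are made-up names, and the fake class only stands in for a browser without the override descriptors so the example runs anywhere.

```javascript
// Hypothetical helper: does a FontFace instance expose the given property?
function supportsFontFaceProperty(prop, FontFaceCtor = globalThis.FontFace) {
  if (typeof FontFaceCtor !== 'function') {
    return false; // no FontFace at all, e.g. outside a browser
  }
  return prop in new FontFaceCtor('probe', 'local("Arial")');
}

// Stand-in for a browser that lacks the override descriptors (illustration only)
class FakeFontFace {
  constructor(family, source) {
    this.family = family;
    this.display = 'auto';
  }
}

console.log(supportsFontFaceProperty('display', FakeFontFace)); // true
console.log(supportsFontFaceProperty('ascentOverride', FakeFontFace)); // false
```

In a real page you'd call supportsFontFaceProperty('ascentOverride') and let it pick up the browser's own FontFace.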
The other one is to target non-Safari browsers in CSS with a different property used as a "proxy". Using the wonderful Compare Browsers feature of CanIUse.com I found a good candidate:
@supports (overflow-anchor: auto) {
  @font-face {
    /* works in Chrome, Edge, FF, but not in Safari */
  }
}
It's not ideal to test one thing (overflow-anchor) and use another (ascent-override), but at least no JavaScript is involved.
In this post, I talked about the letter frequency in English presented in Peter Norvig's research. And then I thought... what about my own mother tongue?
So I got a corpus of 5000 books (832,260 words), a mix of Bulgarian authors and translations, and counted the letter frequency. Here's the result in CSV format: letters.csv
Here are the results (in alphabetical order) in a graph:
And another graph, with data sorted by the frequency of letters:
ChatGPT gives a different result, even startlingly so (o is the winner at ~9.1% and a is third with 7.5%), which makes me like my letter count research even more.
TL;DR:
For context see part 1 and part 2.
After publishing part 2 of my ongoing web fonts file size study, I got feedback on Mastodon to the effect of hey, what about variable fonts?
Good question! I speculated in part 2 that there may be savings if we can combine font variants (bold, italic) in a single file, sprite-style. And that's just what a variable font is (and more!)
Rerun them scripts
Following the process described in part 1, I grabbed only fonts from Google fonts that have [wght] in the name and subset them to the LATIN subset, throwing away those with fewer than 200 characters. Also I removed all fonts with "Italic" in the name.
Why [wght] only and not stuff like AdventPro[wdth,wght]?
I wanted to keep only one variable dimension so we can see apples-to-apples as much as possible. And [wght] seems to be the most popular dimension by far.
Why no Italic?
I wanted to keep fonts kinda diverse. Chances are AlbertSans-Italic[wght].ttf and AlbertSans[wght].ttf are designed by the same person (or people). So they are using similar techniques, optimizations and so on. And I'm looking for what's "out there" in general.
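The file name filtering boils down to two string checks. A sketch with a few sample names (Roboto-Regular.ttf is just a made-up example of a non-variable file name; this is not the actual script used for the study):

```javascript
// Sample file names for illustration
const files = [
  'AlbertSans[wght].ttf',
  'AlbertSans-Italic[wght].ttf',
  'AdventPro[wdth,wght].ttf',
  'Aleo[wght].ttf',
  'Roboto-Regular.ttf',
];

// Keep weight-only variable fonts, drop the italics
const keep = files.filter(
  (name) => name.includes('[wght]') && !name.includes('Italic'),
);

console.log(keep); // [ 'AlbertSans[wght].ttf', 'Aleo[wght].ttf' ]
```

AdventPro[wdth,wght].ttf drops out on its own because its name doesn't contain the literal [wght].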
Results
Here are the results in HTML and in CSV format.
And just a taste of what the results look like...
| Num chars | Num glyphs | Bytes | File | Font name |
| --- | --- | --- | --- | --- |
| 235 | 378 | 21400 | Afacad[wght]-subset.woff2 | Afacad |
| 217 | 243 | 34688 | Aleo[wght]-subset.woff2 | Aleo |
| ... | ... | ... | ... | ... |
| 241 | 609 | 61456 | YsabeauOffice[wght]-subset.woff2 | Ysabeau Office |
| 241 | 621 | 62552 | Ysabeau[wght]-subset.woff2 | Ysabeau |
| 241 | 584 | 58688 | YsabeauInfant[wght]-subset.woff2 | Ysabeau Infant |
Overall stats:
Conclusions?
* In part 1 one of the conclusions was: the median file size of a regular web font with Latin-extended subset of characters is 19092 bytes. Where "regular" means no bolds, no italics, etc.
* Here we see that the median file size of a variable web font with Latin-extended subset of characters is 34744 bytes
* The sum is smaller than the parts. A variable font that has both normal and heavy (bold) weight (and also everything in between) is slightly smaller than two regular fonts. Assuming that a bold font file is as big as a regular (we'll check on that assumption later), then 19092 * 2 = 38,184 is greater than 34,744
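To put a number on "slightly smaller", here's the same comparison as arithmetic. It leans on the same assumption as above, that a bold file is as big as a regular one:

```javascript
const regularMedian = 19092; // bytes, from part 1
const variableMedian = 34744; // bytes, from this study

// Assumption: bold is as big as regular, so two files cost twice the median
const twoRegular = regularMedian * 2;
const savedBytes = twoRegular - variableMedian;
const savedPercent = ((savedBytes / twoRegular) * 100).toFixed(1);

console.log(twoRegular); // 38184
console.log(savedBytes); // 3440
console.log(savedPercent + '%'); // 9.0%
```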
The file size difference is not big but we can still see a saving, probably because of duplicate metadata and some other similar elements in two files vs one. And then there's also the delivery saving: 2 HTTPS requests vs one.
Potential skew-age?
1. Smaller subset: here we're looking at the median file size amongst 335 files vs 1009 files in the original study.
2. Uneven number of characters: the median number of characters here is 222 where in the original study it was 219. Not a big difference but still... Also overall the total number of characters is random (but over 200) in both studies. We can control for this (in a followup) by comparing only 200-char subsets for example.
3. Google fonts only: well yeah, that's an easy corpus of fonts to download and mess around with.
Next?
In the spirit of part 2 I'd like to study the sizes when incrementing the number of characters in a subset (as opposed to a catch-all LATIN). This will address potential skew #2 above. Probably not increments of 1 but of 50 to save some processing.
I'd also like to experiment with ALL the fonts available. So far I've been looking at "Regular" and [wght] only. But I should just do it all and then have people smarter than me (such as yourself, my dear reader) slice the results and draw conclusions any way you want.
The zebra jumps quickly over a fence, vexed by a lazy ox. Eden tries to alter soft stone near it. Tall giants often need to rest, and open roads invite no pause. Some long lines appear there. In bright cold night, stars drift, and people watch them. A few near doors step out. Much light finds land slowly, while men feel deep quiet. Words run in ways, forward yet true. Look ahead, and things form still, yet dreams stay hidden. Down the path, close skies come, forming hard arcs. High above, quiet kites drift, fast on pure wind, yanking joints.
What's so special about the nonsense paragraph above? It's attempting to match the average distribution of letters in texts written in the English language.
This article by Peter Norvig discusses a 2012 study of letter frequency using the Google Books data set. And the distribution looks like so:
For font-fallback matching purposes (more on this later) I want a shorter paragraph, representing roughly similar distribution. One can, of course, just create a paragraph like "Zzzzzzzzz" (9 Zs), followed by 12 Qs and so on, all the way to 1249 Es. But where's the fun in that? Plus texts have spaces and punctuation too.
So after some tweaking and coaching AI, this is a paragraph that came out that looks more realistic and matches the letter frequency pretty well.
Here's a CSV that shows:
Letter,Norvig,Tall giants
E,12.49%,12.26%
T,9.28%,8.73%
A,8.04%,7.55%
O,7.64%,7.08%
I,7.57%,6.60%
N,7.23%,7.55%
S,6.51%,6.84%
R,6.28%,6.13%
H,5.05%,4.01%
L,4.07%,4.48%
D,3.82%,5.42%
C,3.34%,1.89%
U,2.73%,2.36%
M,2.51%,2.12%
F,2.40%,2.83%
P,2.14%,2.59%
G,1.87%,2.12%
W,1.68%,2.12%
Y,1.66%,2.12%
B,1.48%,0.94%
V,1.05%,0.94%
K,0.54%,1.18%
X,0.23%,0.47%
J,0.16%,0.47%
Q,0.12%,0.71%
Z,0.09%,0.47%
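If you want to count letters in your own text, a sketch like the following will do. It's not the script used to produce the numbers above; `letterFrequency` is a made-up name and it only counts the 26 letters A-Z, ignoring spaces, punctuation and everything else.

```javascript
// Count A-Z (case-insensitive) and report each letter as a percentage
function letterFrequency(text) {
  const counts = {};
  let total = 0;
  for (const ch of text) {
    const isLetter = (ch >= 'a' && ch <= 'z') || (ch >= 'A' && ch <= 'Z');
    if (!isLetter) continue; // skip spaces, punctuation, digits, etc.
    const letter = ch.toUpperCase();
    counts[letter] = (counts[letter] || 0) + 1;
    total++;
  }
  const freq = {};
  for (const [letter, n] of Object.entries(counts)) {
    freq[letter] = ((n / total) * 100).toFixed(2) + '%';
  }
  return freq;
}

console.log(letterFrequency('aab!')); // { A: '66.67%', B: '33.33%' }
console.log(letterFrequency('Tall giants often need to rest.'));
```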
Here's the same data represented graphically:
Well, what's the point of this?
Similar to the nonsense etaoin shrdlu used by typesetters, this paragraph can be used to find out the average character width of a font.
Just render the paragraph in a non-wrapping inline-block DOM element, measure the width of the element and divide by the length of the text.
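Here's a minimal sketch of that measurement, assuming a browser. The function name `averageCharWidth`, the 100px font size and the `doc` parameter (which is only there so the function can be tried outside a real page) are my own picks for illustration.

```javascript
// Render the text in a non-wrapping inline-block element and
// return the measured width divided by the number of characters
function averageCharWidth(text, fontFamily, doc = document) {
  const el = doc.createElement('span');
  el.textContent = text;
  el.style.cssText = [
    'display: inline-block',
    'white-space: nowrap',
    'position: absolute', // keep it out of the page flow
    'visibility: hidden', // still measurable, just not visible
    'font-size: 100px', // assumption: a large size means less rounding noise
    `font-family: ${fontFamily}`,
  ].join(';');
  doc.body.append(el);
  const width = el.getBoundingClientRect().width;
  el.remove();
  return width / text.length;
}

// usage in a browser console (px per character at a 100px font size):
// averageCharWidth('The zebra jumps quickly over a fence, vexed by a lazy ox.', 'Georgia');
```

Run it once for the web font and once for the fallback, and the ratio of the two numbers is a starting point for size-adjust.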
How is this useful? Welp, to set the size-adjust CSS property of a fallback font to match a custom web font. Further write up is coming, stay tuned!
Close enough
As you can see in the graph, the two lines do not match exactly. I think this is OK. It's extremely unlikely that any text on your page will have the exact average distribution of letters in it. So we're talking about an approximation to begin with. May also be site-dependent. E.g. in an adult site maybe the X character will occur more often than the average book.
Also Norvig's analysis doesn't mention spaces and punctuation. In my paragraph, these exist, maybe making it possible to match the average text on a web page just a little bit closer.
Aside: why not just Lorem Ipsum
Well, it doesn't attempt to match the character distribution in English. (Duh, it's not even English!)
Here's what it looks like in the same diagram:
Note: no K, J, Z, W or Y. Barely any H.
Here are the stats in CSV and .numbers for your perusal.
May "The zebra jumps quickly over a fence, vexed by a lazy ox" be always in your favor!
Earlier this year I wondered how many KB is "normal" for a web font file size (spoiler 20-ish KB). I finished the post questioning how much subsetting really helps, meaning how much do you save from painstakingly choosing which characters should stay in the subset as opposed to just broad strokes (ASCII vs Latin vs Cyrillic, etc)
So here's a little follow-up study of filesizes where I subset'd 1009 font files found on Google fonts' GitHub, one character at a time ending up with 222,557 WOFF2 files.
Constraints
I had to put some constraints, so that the study is not too broad, but yet is representative. The previous post has the reasoning in detail, but here are the highlights:
So. Now we have 1009 TTF files each containing 200 characters or more. The full list is available with the data later in the post.
Step 1: What's in a font?
We start with a Node script that takes each font and prints a text file with each character supported by the font on a new line. The results look like so:
```
$ less ZenAntique-Regular.txt
U+20
U+21
U+22
U+23
U+24
U+25
U+26
U+27
U+28
U+29
U+2A
U+2B
U+2C
U+2D
U+2E
U+2F
U+30
U+31
U+32
U+33
U+34
U+35
U+36
....
```
U+20 is a space, U+31 is the number 1 and so on...
The script uses the fontkit library for font introspection. Prrrrrty cool.
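Roughly, such a script can look like the sketch below. This is not the actual script, just the idea: `toUnicodeLines` and `listCharacters` are names made up for this example, and it assumes fontkit is installed (`npm install fontkit`) and exposes `create()` and `characterSet`, the same calls used elsewhere in these posts.

```javascript
const fs = require('fs');

// Turn an array of code points (numbers) into lines like "U+20"
function toUnicodeLines(codePoints) {
  return codePoints
    .map((codePoint) => 'U+' + codePoint.toString(16).toUpperCase())
    .join('\n');
}

// fontkit is required lazily so the helper above works without it
function listCharacters(fontPath) {
  const fontkit = require('fontkit');
  const font = fontkit.create(fs.readFileSync(fontPath));
  return toUnicodeLines(font.characterSet);
}

// usage:
// fs.writeFileSync('ZenAntique-Regular.txt', listCharacters('ZenAntique-Regular.ttf'));
```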
In this directory you can see and inspect all the 1009 txt files.
Step 2: Subsetting
Using Glyphhanger we can now subset each font adding one character at a time. So the first subset is only a space character. The second subset is space and exclamation. The third subset is space, ! and ". And so on. The last subset contains all the characters supported by the font.
Here's an example of the 3-character subset of "Are you serious" font inspected and visualized in wakamaifondue.com
Scrolling further down we see the characters (space, ! and "):
Same subset font inspected by another wonderful tool fontdrop.info shows glyphs. (Remember glyph !== character)
Time to write the script to do the work! The full script is available here, but here's the gist: for each font file, read the corresponding txt file (full o' unicode characters) and keep running glyphhanger to subset the font, adding each new character to the new subset.
```javascript
const fs = require('fs');
const path = require('path');
const { execSync } = require('child_process');

// placeholder: the directory holding the TTF files and their .txt character lists
const fontDirectory = './fonts';

// read the list of files in the font directory
fs.readdir(fontDirectory, (err, files) => {
  if (err) throw err;
  // process each one
  files.forEach((file) => {
    const fontPath = path.join(fontDirectory, file);
    // check if the file is a TTF
    if (file.toLowerCase().endsWith('.ttf')) {
      const fontName = path.basename(file, path.extname(file));
      const txtFilePath = path.join(fontDirectory, `${fontName}.txt`);
      // read the Unicode characters from the corresponding .txt file
      const unicodeCharacters = fs
        .readFileSync(txtFilePath, 'utf-8')
        .split('\n')
        .map((line) => line.trim())
        .filter(Boolean);
      // the growing subset (assumed here to start empty for every font)
      const subsetList = [];
      // for each character in the txt, run glyphhanger
      unicodeCharacters.forEach((unicodeCharacter, index) => {
        subsetList.push(unicodeCharacter);
        const subsetString = subsetList.join(',');
        // glyph!hang!
        const command = `glyphhanger --formats=woff2 --subset="${fontPath}" --whitelist=${subsetString}`;
        execSync(command);
        // (copying the result elsewhere is left out, see the full script)
      });
    }
  });
});
```
Running the script is pretty intensive, by which I mean slow. I had to make sure I can run it in parallel and at some point I had a bunch of instances. I'm pretty sure it still took about a day.
Step 3: Wrapping up and verifying continuity
One feature request I have for glyphhanger is to let you choose the output file name (or maybe I've missed it). As far as I've tested it always creates the output font file in the same directory as the source and with a "-subset" appendix. That's why my subsetting script copies the results to the results directory. At this stage the last thing to do is rename the WOFF2s so that a subset with one character lives in 1.woff2, the one with 23 characters is in 23.woff2 and so on. And while at it, double-check that all subsetting was successful, there are no gaps in the sequence, e.g. there's no missing 24.woff2. (I'm happy to report I found no gaps)
Here's the full Node script, nothing interesting in it that is worth highlighting.
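For the curious, the continuity check boils down to something like this sketch. It is not the linked script; `findGaps` and the directory name in the usage comment are made up for illustration.

```javascript
// Given file names like "1.woff2", "23.woff2", return the numbers
// missing from the sequence 1..max
function findGaps(fileNames) {
  const present = new Set(
    fileNames
      .map((name) => /^(\d+)\.woff2$/.exec(name))
      .filter(Boolean) // ignore anything that isn't N.woff2
      .map((match) => Number(match[1]))
  );
  const max = Math.max(0, ...present);
  const gaps = [];
  for (let n = 1; n <= max; n++) {
    if (!present.has(n)) gaps.push(n);
  }
  return gaps;
}

// usage with a hypothetical results directory for one font:
// findGaps(require('fs').readdirSync('results/Armata-Regular'));
```

Note this only finds holes below the largest number present. To catch subsets missing at the end, compare the largest number to the line count of the font's txt file.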
(BTW, in this and in the previous post I'm trying my best to include all scripts and instructions so anyone can recreate the experiment maybe with a different corpus of fonts, or italics/bolds, etc.)
At the end, we end up with a directory structure like so:
Step 4: to CSV
Time to get some data into a CSV file. Last and final script is available here. It just writes out a big ol' CSV file with one row per font and one column per subset.
The resulting stats CSV (1.3MB) is here, and here's a preview:
```
Name,1,2,3,4,5,6,7,8,9,10...
AreYouSerious-Regular,532,676,764,964,1424,1680,1964,1968,2060,...
Arizonia-Regular,2352,2360,2528,2624,3040,3356,3648,4100,4124,...
Armata-Regular,1632,1628,1652,1632,1632,1632,1644,1632,1652,1640,...
Arsenal-Regular,2184,2196,2196,2296,2348,2468,2668,2836,3040,3072,...
Artifika-Regular,2292,...
...
```
Analysis
That's a lot of data points, over 200 thousand. What to do, what to do... Some sort of median trend analysis is what I decided on.
(BTW, this is where I need your help, dear reader. Does looking at this data inspire some more analysis?)
I took the median of each column from 1 character to 200 characters and plotted it. Then tried a few trendline options. The polynomial trendline seemed to make the most sense. Here's the result:
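If you'd rather do the per-column medians in code than in a spreadsheet, here's a small sketch. The function names are made up for this example, and `rows` is assumed to be the CSV already parsed into arrays of numbers (one array per font, without the Name column).

```javascript
function median(numbers) {
  const sorted = [...numbers].sort((a, b) => a - b);
  const mid = Math.floor(sorted.length / 2);
  return sorted.length % 2 ? sorted[mid] : (sorted[mid - 1] + sorted[mid]) / 2;
}

// rows: one array of byte sizes per font, index 0 is the 1-character subset
function columnMedians(rows, columns = 200) {
  const medians = [];
  for (let col = 0; col < columns; col++) {
    medians.push(median(rows.map((row) => row[col])));
  }
  return medians;
}

// the first two columns of four fonts from the CSV preview
const preview = [
  [532, 676], // AreYouSerious-Regular
  [2352, 2360], // Arizonia-Regular
  [1632, 1628], // Armata-Regular
  [2184, 2196], // Arsenal-Regular
];
console.log(columnMedians(preview, 2));
```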
This is my .numbers (Apple's Excel) file if that data inspires you.
What do we see? Adding more characters to a font makes the file size increase fairly linearly. And at the 100-130 character mark, the linear increase maaaybe slows down a bit. Maybe. If you squint hard enough.
Wild speculation follows... past the 95 characters of US ASCII (see previous post for the 95 number) some of the additional Latin characters may be easier to draw based on existing strokes already in the file. E.g. drawing À, Á, Â, Ã, Ä and Å can probably reuse some of the A work. This is just a guess, as I'm yet to design a font.
Additional observations
* (min) In the 1009 fonts, the one to take the least bytes to draw a space used 408 bytes
* (max) The one that used the most bytes was 3.37K for a space character. Wild difference.
* At the 100 characters point we have min 2.4K, max 327K, median 13K.
* At the 200 characters point we have min 2.6K, max 333K, median 18K. That median agrees with the previous analysis that if your Latin font "costs" a lot more than 20K, you may want to look at it again
How much does one character cost?
This was my original question and I tried a few median/average ways to get to that answer (e.g. the average of the median of 1 to 200 characters, the median to add the 201st character and so on). No matter how you slice it, it appears the cost of one character is usually about 0.1K.
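One of the simpler ways to slice it, as a sketch: take the series of per-column medians and average the step from one subset to the next. The function name is made up for this example and the input is assumed to be an array of median sizes in bytes.

```javascript
// medians[i] is the median file size (bytes) of the subset with i + 1 characters.
// The average of all the consecutive differences collapses to (last - first) / steps.
function averageCostPerCharacter(medians) {
  return (medians[medians.length - 1] - medians[0]) / (medians.length - 1);
}
```

As a sanity check against the numbers above: a median of 18K at 200 characters works out to 18 / 200 = 0.09K per character.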
As a takeaway for me... I think it's worth subsetting chunks of alphabets, e.g. Latin vs Cyrillic. If your UI is in English, go ahead and subset and remove all Cyrillic characters. If there's user-generated content, still do the same, but use the unicode-range in CSS to conditionally load the Cyrillic subset if the content requires it.
But if you're pondering should I keep À but remove Å... just keep it. 0.1K win is not worth the embarrassment of having an occasional Å look too alien (in a fallback font).
In fact sometimes it may be beneficial to have a font face collection (e.g. regular + bold) in the same file download. Sprite-style. This is not supported by WOFF nor WOFF2 but I think it should be. (Maybe a different file type?) Because if you have a median, well-behaved Latin Regular font at 20K and its italics 20K, it may be better to download 40K at once, rather than risk a Mitt Romney Font Problem.
This is a different version of a font on/off toggler bookmarklet. I did one such bookmarklet earlier this year, which works by messing up font-family inside @font-face.
This new version works by messing up url() inside @font-face blocks. By "messing up" I mean changing the string "woff" to "woof" which means making the font files unavailable.
The motivation is to be able to toggle on/off and judge the "new school" of doing font fallbacks, namely using fallbacks such as local(). In short, the previous bookmarklet breaks @font-face blocks that use local(), this version does not. This version only looks for url().
Drawback: looking for "woof" files causes 404s during toggling but I think this is not a biggie.
For a demo and other info check the original post
Install
Drag this link to a bookmark toolbar near you:
Font toggle v2
Full and unminified source
```javascript
const errors = [];
for (let s = 0; s < document.styleSheets.length; s++) {
  let rules = [];
  try {
    rules = document.styleSheets.item(s).rules;
  } catch (_) {
    errors.push(s);
  }
  for (let i = 0; i < rules.length; i++) {
    const rule = rules.item(i);
    if (rule.constructor.name === 'CSSFontFaceRule') {
      if (rule.style.oldSrc) {
        rule.style.src = rule.style.oldSrc;
        delete rule.style.oldSrc;
      } else {
        const src = rule.style.src;
        if (src.includes('url(')) {
          rule.style.oldSrc = src;
          rule.style.src = src.replaceAll('.woff', '.woof');
        }
      }
    }
  }
}
if (errors.length && !window.__fontfacetogglererror) {
  window.__fontfacetogglererror = true;
  const msg = ['Could not access these stylesheets:'];
  errors.forEach(idx => msg.push(document.styleSheets.item(idx).href));
  alert(msg.join('\n\n'));
}
```
Today I gave a talk on memory leaks in web apps at the wonderful dotJS conference in Paris' Folies Bergère theater. It was 5 minutes, so not much time for links and such. Here's more or less what I said including the links.
Bonjour à toutes et à tous, who's excited about... Memory Leaks!?
How many of you have seen this crash? It says "Aw Snap! Out of memory." Rarely, right? We have powerful computers we use for development.
And do you think anyone ever sees this while using your app? Never! How dare you?
You may be surprised.
A couple of years ago Nolan Lawson used his tool appropriately named fuite to check the top 10 most popular SPAs.
What do you know, 10 out of 10 apps had leaks.
And these are "the top" 10 apps with good developers working on them.
So it's not that we are lazy or sloppy, it's just that leaks are easy to create.
A while ago I was working at a famous social media site. And we figured we have a problem. We crash people's browsers.
Because we leak memory.
Where do we leak? Who knows, it's a big project.
So what was the solution back when we didn't have good debugging tools?
After 15 or so soft navigations, just fully reload the page.
Give the browser a chance to start over.
Turn the SPA into an MPA just for a moment.
How embarrassing!
Today things are very different. Today we have tools.
The first step in addressing a problem is admitting you have a problem, right?
How do you know you leak memory?
Check out the... Reporting API
It allows you to get data back from the browser when your users see OOM crashes in real life.
You can stop guessing and get an idea of the real problems.
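As a rough sketch of the receiving end: browsers that support crash reporting POST a JSON array of reports to the endpoint you name in the `Reporting-Endpoints` response header. The helper below picks out the out-of-memory ones. It assumes each crash report has `type: 'crash'` and a `body.reason` of `'oom'`, which is the shape I'd expect from the spec, so do check what your own endpoint actually receives. The function name is made up for illustration.

```javascript
// reports: the parsed JSON array that a browser POSTs to your reporting endpoint
function oomCrashUrls(reports) {
  return reports
    .filter((report) => report.type === 'crash' && report.body && report.body.reason === 'oom')
    .map((report) => report.url);
}

// sample payload, made up for this example
const sampleReports = [
  { type: 'crash', url: 'https://example.com/feed', body: { reason: 'oom' } },
  { type: 'crash', url: 'https://example.com/chat', body: { reason: 'unresponsive' } },
  { type: 'deprecation', url: 'https://example.com/feed', body: {} },
];
console.log(oomCrashUrls(sampleReports));
```

Counting these per URL over time is enough to tell which parts of the app deserve a closer look.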
And let's say you discover that you do leak memory? What to do about it?
Option A: call a friend. This is a person who knows all the secrets of the Universe and can dive into your app and unearth... The leak!
And then you fix it. And then all is fine. No.
The thing is there's usually not a single leak. And if you fix The One, the next is just around the corner.
Option B: take memory snapshots and look at them. Here's the sequence.
Load your app, make the browser perform garbage collection to free memory and take snapshot 1.
Then “navigate” or interact in some way with the app, garbage collect again, and take snapshot 2.
Finally go back to the old state, GC, take snapshot number 3.
Then you look at what changed between snapshots 1 and 3 and find any JavaScript or DOM objects that are retained but shouldn't be.
If this sounds hard to you, that's because it often is. If only there was a tool that can help...
Enter Memlab. It's a command-line open-source tool by Facebook.
It's been used to plug memory leaks including some in React itself.
It does the three stages: initial load, interaction, and back to the initial state.
Here is an example, these bars represent memory consumption.
Here not only memory is lost on interaction. But even more when going back.
The main goal of Memlab is to do an intelligent diff of the snapshots and point to the source of the leak.
Here it reports that 1 leak was found that leaks 1000 similar objects and gives you the path to the first one of those objects: a DIV element.
This is a real example of a popular Maps app.
You load the page (bar 1), click a button to show hotels nearby (bar 2) and then unclick the thing.
And you see, memory was leaked again.
Memlab uses Puppeteer to drive the browser and needs a so-called Scenario file.
This is simply a JavaScript module that implements three functions:
* function url(){} for the initial load
* function action(){} for the interaction
* function back(){} to go back to the initial state
If you rarely use Puppeteer it can be a bit of a curve to get off the ground.
But to get you from 0 to 1, I've published a Chrome extension:
Memlab Scenario Recorder (webstore, code). It allows you to just click around and have the scenario JavaScript file generated for you.
Alright, let's quickly touch on the source of leaks.
In general they are caused by objects that are no longer needed but some reference, some variable is pointing to them.
The solution is to just assign null to that variable.
This signals to the GC that the object is no longer needed and can be safely cleaned up.
In web applications, we sometimes have event listeners
that keep on listening even after the DOM element is no longer around.
Let's see an example using a React class.
Nothing out of the ordinary, right?
Well, when this component is removed from the DOM, the event listener is not.
And that's a memory leak.
The fix? Clean up when we're done.
In this case it means remove the event listener using componentWillUnmount() or whatever API your framework offers.
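Here's the pattern as a sketch. To keep it runnable anywhere it's a plain class and not a real React component, it just borrows React's lifecycle method names. `ResizeWatcher` and the resize listener are made up for illustration.

```javascript
// A stand-in for a React class component (no React import,
// only the same lifecycle method names)
class ResizeWatcher {
  constructor(target) {
    this.target = target; // e.g. window in a browser
    this.resizeCount = 0;
    // keep one stable function reference so it can be removed later
    this.handleResize = () => {
      this.resizeCount++;
    };
  }

  componentDidMount() {
    this.target.addEventListener('resize', this.handleResize);
  }

  // without this, the listener (and everything it references)
  // sticks around after the component is gone
  componentWillUnmount() {
    this.target.removeEventListener('resize', this.handleResize);
  }
}
```

The detail that's easy to miss: removeEventListener() needs the very same function reference that was passed to addEventListener(), which is why the handler is stored on the instance.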
To wrap up: memory leaks...
I don't mean to sound paranoid, but they're everywhere, man.
As you can see leaks are hard to find, easy to fix.
Use the tools we have available today and find your first leak. And make your users, if not happy (you can't guarantee happiness), at least less frustrated.
Video by Eunjae Lee
Do you hate it when sites open new tabs, for example from search results? Yeah, me too. So I thought a good idea would be to have a right-click "Open in this tab" similar to "Open in a new tab" option. Voilà, a new Chrome extension.
You can install it from the store.
How it works?
Two options:
1. Right-click a link and use the context menu
2. Hold the "t" key (as in "Tab") and click the link
Development
I took https://extension.js.org/ for a spin. It's nice when auto-refresh works. But it doesn't always. Overall a good experience, saves some time and it's not overly abstracted to a point of confusion.
Code... is on github: https://github.com/stoyan/openinthesametab.
All the "business" is in service_worker.js for the context menu and content.js for the hold-t-and-click feature.
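To give a flavor of the hold-t-and-click part, here's a sketch of how a content script could do it. This is not the code from the repo, just one way to get the same behavior, and `createKeyTracker` is a name made up for this example.

```javascript
// Tracks whether a given key is currently held down
function createKeyTracker(key = 't') {
  let held = false;
  return {
    down(pressed) {
      if (pressed === key) held = true;
    },
    up(released) {
      if (released === key) held = false;
    },
    isHeld() {
      return held;
    },
  };
}

// Wiring for a content script, only runs where there's a page
if (typeof document !== 'undefined') {
  const tracker = createKeyTracker();
  document.addEventListener('keydown', (e) => tracker.down(e.key));
  document.addEventListener('keyup', (e) => tracker.up(e.key));
  document.addEventListener('click', (e) => {
    const link = e.target.closest && e.target.closest('a[href]');
    if (link && tracker.isHeld()) {
      e.preventDefault();
      location.href = link.href; // same tab, whatever the link's target says
    }
  });
}
```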
Cross-browser?
I took one small step in x-browserness but that's about it:
```javascript
if (typeof browser !== 'object') {
  browser = chrome;
}
```
Feedback?
My wife thinks it's the most useless thing ever because she likes to open the search results in new tabs so she can switch. But she also has over 500 open tabs at any time (no exaggeration) and wouldn't install an extension so not my target audience anyway. What do you think?
Why minimal?
I like "minimum-viable"s of all sorts. As a performance enthusiast I'm fascinated by anything minimal. So here goes a minimum viable SVG favicon.
Why favicon?
Welp, browsers will look for one and if you don't have it, enjoy the 404s!
Why SVG?
It could be tiny, almost as tiny as a CDN URL, it's scalable and it's all inline.
Why all the "why"s, just show me?!
Ok, the first one is:
<link rel="icon" href="data:image/svg+xml,<svg xmlns='http://www.w3.org/2000/svg'/>">
It's based on prior research into Minimum viable no-image image src and also the research into encoding of SVGs in data URIs. Long story short: you don't need to encode anything except #. So careful if you're adding hex colors to the SVGs.
Having an icon that's nothing is kinda trippy and worth checking out. Now let's go for something.
Next example. You like rectangles? fill-ed with salmon?
<link rel="icon" href="data:image/svg+xml,<svg xmlns='http://www.w3.org/2000/svg'><rect width='50' height='50' fill='salmon'/></svg>">
As you can see we're keeping the SVGs readable and free of obfuscating encoding by using ' instead of ".
Next, who likes circles?
<link rel="icon" href="data:image/svg+xml,<svg xmlns='http://www.w3.org/2000/svg' viewBox='0 0 2 2'><circle cx='1' cy='1' r='1' fill='tomato'/></svg>">
A triangle?
<link rel="icon" href="data:image/svg+xml,<svg xmlns='http://www.w3.org/2000/svg' viewBox='0 0 2 2'><polygon points='0,2 1,0 2,2' fill='salmon'/></svg>">
Alrighty, you get the idea. This is where I leave you to your SVGizing of your favicons!
Note: this SVG icon post is inspired by an example by Barry Pollard who hates missing favicons, that's why my favicon is his profile photo, flipped, because why not.
Updates
1. Safari doesn't support SVG icons (thanks Joseph Scott)
2. You can use emoji, a bit more useful than simple shapes (thanks Barry and Lea). Here's my minimal example:
<link rel="icon" href="data:image/svg+xml,<svg xmlns='http://www.w3.org/2000/svg' viewBox='0 0 1 1'><text font-size='1' y='.9'>????</text></svg>">
(replace ???? with your favorite emoji, cuz my WordPress DB tables are not the unicode-iest of them all, sorry)
Here's how that looks when you have a basketball emoji in place of ????:
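And if you'd rather set the emoji icon from JavaScript (say, to change it at runtime), a small sketch could look like this. The helper name `emojiFaviconHref` is made up for this example and the DOM part in the comments is browser-only.

```javascript
// Build the same data URI as in the <link> above, for any emoji
function emojiFaviconHref(emoji) {
  const svg =
    "<svg xmlns='http://www.w3.org/2000/svg' viewBox='0 0 1 1'>" +
    `<text font-size='1' y='.9'>${emoji}</text></svg>`;
  return 'data:image/svg+xml,' + svg;
}

// browser-only usage, \u{1F3C0} is the basketball:
// const link = document.createElement('link');
// link.rel = 'icon';
// link.href = emojiFaviconHref('\u{1F3C0}');
// document.head.append(link);
```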
I recently saw someone sharing a blog post on social media using a video that just scrolls through the blog post. I wondered if a video like this can be created easily and automatically. Using a simple bookmarklet. Turns out yes! I ended up with two bookmarklets because they do different and independent things: one captures video (not only of a web page, could be the whole screen) and one scrolls the page stopping at every h* tag. All in JavaScript, all in a bookmarklet, using web APIs. Here's what the end result looks like when running on a post in this here blog:
Bookmarklet 1: capture
Using MediaDevices part of the Media Stream API we can get access to the user's screen, or window, or a tab within a browser window.
We combine this with the MediaStream Recorder API which lets us take the captured data and display it in a hidden <video> element.
The last step is the ole a href where the href points to the video's src. And we add a download attribute and an auto-click to the link.
May sound complicated but it's in fact not too much code. Here it is in its entirety:
```javascript
const video = document.createElement('video');
video.style = 'display: none';
document.body.append(video);
navigator.mediaDevices.getDisplayMedia({
  video: true,
  selfBrowserSurface: "include",
}).then(stream => {
  go(stream);
});

function go(stream) {
  const mediaRecorder = new MediaRecorder(stream);
  const chunks = []
  mediaRecorder.addEventListener('dataavailable', (e) => {
    chunks.push(e.data)
  })
  mediaRecorder.addEventListener('stop', () => {
    const blob = new Blob(chunks, { type: chunks[0].type });
    video.src = URL.createObjectURL(blob);
    const a = document.createElement('a');
    a.href = video.src;
    a.download = 'video.webm';
    a.click();
  })
  mediaRecorder.start();
}
```
This code is general purpose, you can capture any old screen. But for my purpose I wanted to scroll through a page. So..
Bookmarklet 2: the heading scroller
This one is easier to explain, even though it ended up just as long (in terms of lines of code) as the first one. Here we want to select all heading elements. Then scroll to each one, waiting for 2 seconds before moving on. When done, scroll to the bottom of the page. Then scroll back to the top. Here's the code:
```javascript
const headers = Array.from(document.querySelectorAll('h1, h2, h3, h4, h5, h6'));
let currentHeaderIndex = 0;
const chillFor = 2000; // wait this long before moving on

function scrollDown() {
  const currentHeader = headers[currentHeaderIndex];
  const top = currentHeader.offsetTop;
  window.scrollTo({ top, behavior: 'smooth' });
  currentHeaderIndex++;
  if (currentHeaderIndex < headers.length) {
    setTimeout(scrollDown, chillFor);
  } else {
    setTimeout(() => {
      window.scrollTo({ top: document.body.scrollHeight, behavior: 'smooth' });
      setTimeout(() => {
        window.scrollTo({ top: 0, behavior: 'smooth' });
      }, chillFor);
    }, chillFor);
  }
}
scrollDown();
```
Downloads and usage
Drag these links to your bookmarks:
Then go to a page you like, click the first bookmarklet to start capturing. Click the second to auto-scroll. Stop capturing and the video should be in your downloads.
Parting words
This should work x-browser, although I tested only Firefox and Chrome. The browsers behave differently, for example Firefox includes the browser's bookmarks toolbar and devtools (if you have them open) while Chrome doesn't.
I didn't futz about with codecs and such, left it to the browser to decide. Seems like both browsers decide WEBM as a default format. If your destination of choice doesn't support WEBM, you may need to convert the video to e.g. MP4. I personally use HandBrake for this purpose.
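If you do want to futz about, MediaRecorder has a static isTypeSupported() check and accepts a mimeType option. Here's a sketch of asking for a preferred format and falling back to the browser's default. The helper name and the candidate list are just examples, and whether a given browser can record a given type is something to test, not assume.

```javascript
// Returns the first type the browser says it can record,
// or '' to let the browser decide
function pickMimeType(
  candidates,
  isSupported = (type) => MediaRecorder.isTypeSupported(type)
) {
  return candidates.find((type) => isSupported(type)) || '';
}

// browser usage inside go(stream):
// const mimeType = pickMimeType(['video/mp4', 'video/webm']);
// const mediaRecorder = new MediaRecorder(stream, mimeType ? { mimeType } : {});
```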
Like 'em bookmarks? Want to improve them? LMK!
tl;dr: You can stop worrying and URL-encode only the # character.
What?
So you want to have an SVG image in a CSS stylesheet. Yup, using data URIs (hey lookie, a 2009 post). There are a number of reasons not to embed images in CSS to begin with (caching, reuse), but hey, sometimes you're not in a position to make that particular call.
Base64-encode
One way to go about it is to base64-encode the SVG, like:
background: url('data:image/svg+xml;base64,PHN2ZyB4bWxucz0iaHR0cDovL3d3dy53My5vcmcvMjAwMC9zdmciLz4=')
Drawbacks: Base64 makes the content larger by about a third (every 3 bytes become 4 characters). And also a human reading the code cannot tell what's in the image. (A nice feature of SVG is being able to tell what's in an image, roughly)
URL-encode
Another way to include an SVG is to use URL encoding:
background: url('data:image/svg+xml,%3Csvg%20xmlns%3D%22http%3A%2F%2Fwww.w3.org%2F2000%2Fsvg%22%2F%3E')
Drawback: the image payload is even larger. All these %20 (spaces) and %3C (<) characters quickly add up. The SVG is a bit more readable though.
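To put numbers on it, here's a quick sketch that measures the same tiny SVG unencoded, base64-encoded and URL-encoded. It uses the built-in btoa() and encodeURIComponent(); the variable names are mine.

```javascript
const svg = '<svg xmlns="http://www.w3.org/2000/svg"/>';

const sizes = {
  raw: svg.length, // the unencoded SVG
  base64: btoa(svg).length,
  urlEncoded: encodeURIComponent(svg).length,
};

console.log(sizes);
```

For this SVG that's 41 characters raw, 56 in base64 and 65 URL-encoded.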
As-is
A quick bit of testing in modern browsers suggests that the browser can understand the unencoded SVG perfectly well, so how about a new solution: SVG as-is.
background: url('data:image/svg+xml,<svg xmlns="http://www.w3.org/2000/svg"></svg>')
It all works fine for this particular type of MVP SVG but, as I quickly found out, as soon as you add a color like fill="#bad" to the SVG, the # acts like a URL hash and everything after it is no longer part of the SVG. So the image appears broken.
Selective URL-encoding
I was curious what other characters may break the image and I looked and I asked around. In the webperf slack, Radu pointed to Sass/Bootstrap that's been around forever, it's battle-tested, (m)old-browser-verified and so on. Sass escapes these characters:
* < becomes %3c
* > becomes %3e
* # becomes %23
* ( becomes %28
* ) becomes %29
I see the #hash is there, but the other characters? Digging through github I saw that the encoding was added 5 years ago containing initially only the characters <>#. The parentheses were added later to avoid a bug in a CSS minifier.
If we use a decent CSS minifier, we can forget about )( and we're left with <>#. I cannot find a reputable source for <> but the rumor has it it's for IE support. Well, IE is no more. We're left with only # to worry about.
For folks using old IEs, the worst that can happen is no background image. Meh, so be it. It's not like all of CSS is broken.
# encoding
And here's the conclusion: in this day and age, encode your # (replace with %23) and enjoy small payloads, readable SVGs and modern browser support.
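If your SVGs pass through a build step, the whole encoding job fits in a one-liner. A sketch, with a function name made up for this example:

```javascript
// Encode only the # character, leave everything else readable
function svgDataUri(svg) {
  return 'data:image/svg+xml,' + svg.replaceAll('#', '%23');
}

console.log(svgDataUri('<svg xmlns="http://www.w3.org/2000/svg" fill="#bad"/>'));
```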
Future-proof?
The only nagging thing is that MDN will tell you to URL-encode. That's the right way. The fact that browsers are tolerant may be only temporary. But, if browsers suddenly decide to be strict, that'd be a Web-breaking change. Because a zillion web pages have partially encoded SVGs, I mean Sass/Bootstrap is popular. And browsers go to great lengths to avoid breaking the web. So I think it's safe to assume this minimal #-encoding will work for a looong time.
p.s. And, of course, escape the quotes you use in the url(). I'd suggest using single quotes, so the SVG itself can use double quotes. All of these should be ok though:
```css
background: url('data:image/svg+xml,<svg xmlns=\'http://www.w3.org/2000/svg\'></svg>');
background: url("data:image/svg+xml,<svg xmlns=\"http://www.w3.org/2000/svg\"></svg>");
background: url('data:image/svg+xml,<svg xmlns="http://www.w3.org/2000/svg"></svg>');
background: url("data:image/svg+xml,<svg xmlns='http://www.w3.org/2000/svg'></svg>");
```
p.p.s. Apologies about the cheesy "the Truth" in the title. I'm aiming to be a tad obnoxious to trick people into proving me wrong. The Truth is what I'm seeking even if I'm wrong temporarily.
TL;DR: If your font file is significantly larger than 20K you may ask yourself "How did I get here?".
For images I think we (web developers) have a sense of how many bytes we can expect an image we see on a page to be. A JPEG photo? 100-ish K is ok for a decent quality. Less is nice. How about 200K? Hmmm..., ok. Half a meg? This must be a Hero of some sort. 2 megs? That better be a downloadable hi-res photo of Neptune or something.
But file sizes of web fonts? I personally don't have a gut feeling how much is too much and how much is to be expected. So here's my attempt to find out.
Data set
Turns out one can download all Google Fonts from GitHub. Under a gigabyte of stuff, lots of fonts. For my purposes I decided to only look into regular fonts (no bold, italics), which is still plenty. I took only the TTF files that have "Regular" in the name and that's 1128 files.
find /gfonts -type f -iname "*regular*" -print0 | xargs -0 cp -t ../regulars
Tools I used
Glyphhanger is a nice and easy Nodejs library and CLI that uses Python's fonttools and makes it trivial to subset fonts, while also converting to WOFF2 which is the format that will end up on the web.
Fontkit is also a Nodejs library that can inspect a font file and tell you some meta data such as number of characters, number of glyphs (those two are not synonymous, turns out). And there's also a nice crisp web UI on top of fontkit for all your font introspection needs.
US ASCII subset
Because I was sure some of these fonts may be wild (big sizes, tons of glyphs), I thought I'd level the playing field by subsetting each font only to the 95 characters in basic English, so no umlauts and so on. This is the unicode range U+0020-007E, also conveniently called US_ASCII in Glyphhanger.
Converting all fonts is a one-liner:
$ glyphhanger --subset="*.ttf" --US_ASCII --formats=woff2
Randomly inspecting some fonts I saw some have just a handful of characters, not the expected 95. Reason is some, say Japanese-only, have very few characters in the US_ASCII unicode range. So I thought I should filter only those that have 95 characters.
The complete script is available, but the salient parts are just looping all files, reading the content and passing each one to fontkit for introspection:
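```javascript
const fs = require('fs');
const path = require('path');
const fontkit = require('fontkit');

// all files (fontDirectory is defined in the complete script)
fs.readdir(fontDirectory, (err, files) => {
  files.forEach((file) => {
    const fontPath = path.join(fontDirectory, file);
    fs.readFile(fontPath, (err, fontBuffer) => {
      const font = fontkit.create(fontBuffer);
      // and now some handy properties are available:
      font.familyName
      font.numGlyphs
      font.characterSet
    });
  });
});
```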
font.characterSet.length lets us only work with the fonts that have 95 characters and discard the rest. This results in a total of 1074 files for us to draw general conclusions from. And here are the results...
Results
* Average File Size: 19751.88 bytes
* Median File Size: 12380 bytes
* Average Glyph Count: 144.92
* Median Glyph Count: 107
* Number of font files: 1074
As you can see there are usually a few more glyphs than there are characters.
And so, a conclusion: the median font file with English-only subset of characters should be around 12K. If you look at your network requests and your font is much larger, well there's work for you to do.
Stats
The full stats are available here in CSV format but here's a taste...
| Num chars | Num glyphs | Bytes | File | Font name |
| --- | --- | --- | --- | --- |
| ... | ... | ... | ... | ... |
| 95 | 175 | 40260 | GreatVibes-Regular-subset.woff2 | Great Vibes |
| 95 | 96 | 4248 | Gudea-Regular-subset.woff2 | Gudea |
| 95 | 116 | 16088 | GreyQo-Regular-subset.woff2 | Grey Qo |
| 95 | 96 | 47676 | Griffy-Regular-subset.woff2 | Griffy |
| 95 | 123 | 14660 | Gruppo-Regular-subset.woff2 | Gruppo |
| 95 | 107 | 13760 | Gupter-Regular-subset.woff2 | Gupter |
| 95 | 156 | 17964 | Gulzar-Regular-subset.woff2 | Gulzar |
| 95 | 116 | 24364 | Gwendolyn-Regular-subset.woff2 | Gwendolyn |
| 95 | 213 | 14468 | HachiMaruPop-Regular-subset.woff2 | Hachi Maru Pop |
| 95 | 98 | 10452 | Halant-Regular-subset.woff2 | Halant |
| 95 | 98 | 6648 | Habibi-Regular-subset.woff2 | Habibi |
| 95 | 96 | 10736 | HammersmithOne-Regular-subset.woff2 | Hammersmith One |
| 95 | 96 | 10696 | Handlee-Regular-subset.woff2 | Handlee |
| 95 | 107 | 34260 | Hanalei-Regular-subset.woff2 | Hanalei |
| 95 | 107 | 16448 | HanaleiFill-Regular-subset.woff2 | Hanalei Fill |
| 95 | 96 | 8356 | Gurajada-Regular-subset.woff2 | Gurajada |
| 95 | 96 | 14912 | HeadlandOne-Regular-subset.woff2 | HeadlandOne |
| ... | ... | ... | ... | ... |
Outliers
What about some font files on the outer edges of the median?
Some small files (2K) are hardly useable:
Others (also 2K) are perfectly fine, though simple:
And even 3k can "buy" you a fine font that makes your visitors say, hey this website is not like the others:
On the larger side (250K) we have
(what happened to the capital F?)
and
I suspect more hole-y fonts are more complicated to draw and therefore weigh more, compared to simple strokes, like an old-timey digital watch.
LATIN
Alright, 95 characters is fine and all, but you're one Voilà! away from embarrassment, because your font doesn't have an à. So how about a more character-complete LATIN subset. Glyphhanger's LATIN is a more involved set of unicode ranges:
```
U+0000-00FF
U+0131
U+0152-0153
U+02BB-02BC
U+02C6
U+02DA
U+02DC
U+2000-206F
U+2074
U+20AC
U+2122
U+2191
U+2193
U+2212
U+2215
U+FEFF
U+FFFD
```
I'm not going to pretend I understand why this is the range, but I can tell you these are 385 characters in total, I checked.
```javascript
let count = 13; // single chars: U+0131, U+02C6, etc
for (let codePoint = 0x0000; codePoint <= 0x00FF; codePoint++) {
  count++;
}
for (let codePoint = 0x0152; codePoint <= 0x0153; codePoint++) {
  count++;
}
for (let codePoint = 0x02BB; codePoint <= 0x02BC; codePoint++) {
  count++;
}
for (let codePoint = 0x2000; codePoint <= 0x206F; codePoint++) {
  count++;
}
console.log(count); // 385
```
Subsetting to LATIN is just as easy as US_ASCII:
$ glyphhanger --subset="*.ttf" --LATIN --formats=woff2
With US_ASCII we had 95 characters in most fonts and removed the ones with fewer characters to keep it all equal. Here there's rarely, if ever, a font that has all 385 characters. Most have a little over 200. So I somewhat randomly picked 200 as the number under which a font is not considered for the comparison. We still have over 1000 font files to compare, but that's a little caveat: not all fonts support the same characters. (I did keep the number of characters in the stats, see below)
Results
* Average File Size: 29045.30 bytes
* Median File Size: 19092 bytes
* Average Glyph Count: 287.03
* Median Glyph Count: 236
* Number of font files: 1009
Conclusion: the median font file with a Latin-extended subset of characters should be a little under 20K. If you look at your network requests and your font is much larger, well, there's work for you to do.
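For anyone who wants to redo the math on the CSV, here's a rough sketch of the filter-then-summarize step. It is not the script used for the numbers above; the first five rows are copied from the stats table, the last row is invented to show the 200-character cutoff, and the field names are assumptions:

```javascript
// Five rows from the stats table, plus one made-up row (the last one)
// that falls under the 200-character cutoff.
const rows = [
  { file: 'Arya-Regular-subset.woff2', chars: 262, bytes: 15884 },
  { file: 'Arizonia-Regular-subset.woff2', chars: 224, bytes: 32052 },
  { file: 'Armata-Regular-subset.woff2', chars: 235, bytes: 17488 },
  { file: 'Arvo-Regular-subset.woff2', chars: 209, bytes: 16920 },
  { file: 'Asar-Regular-subset.woff2', chars: 228, bytes: 23044 },
  { file: 'Hypothetical-Regular-subset.woff2', chars: 150, bytes: 9000 },
];

const MIN_CHARS = 200; // fonts with fewer characters are not compared

function average(numbers) {
  return numbers.reduce((sum, n) => sum + n, 0) / numbers.length;
}

function median(numbers) {
  const sorted = [...numbers].sort((a, b) => a - b);
  const mid = Math.floor(sorted.length / 2);
  return sorted.length % 2 ? sorted[mid] : (sorted[mid - 1] + sorted[mid]) / 2;
}

const sizes = rows.filter((r) => r.chars >= MIN_CHARS).map((r) => r.bytes);
console.log(sizes.length, average(sizes).toFixed(2), median(sizes));
// 5 21077.60 17488
```

Even in this tiny sample the average sits above the median, pulled up by the one heavier file, which is why the median is the friendlier number to compare your own font against.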
Stats
The full stats are available here in CSV format but here's a taste...
| Num chars | Num glyphs | Bytes | File | Font name |
| --- | --- | --- | --- | --- |
| 262 | 315 | 15884 | Arya-Regular-subset.woff2 | Arya |
| 224 | 260 | 32052 | Arizonia-Regular-subset.woff2 | Arizonia |
| 224 | 247 | 40712 | AreYouSerious-Regular-subset.woff2 | Are You Serious |
| 235 | 236 | 17488 | Armata-Regular-subset.woff2 | Armata |
| 209 | 210 | 16920 | Arvo-Regular-subset.woff2 | Arvo |
| 228 | 233 | 23044 | Asar-Regular-subset.woff2 | Asar |
| 216 | 217 | 24424 | Artifika-Regular-subset.woff2 | Artifika |
| 231 | 350 | 23464 | Arsenal-Regular-subset.woff2 | Arsenal |
| 231 | 348 | 21244 | AsapCondensed-Regular-subset.woff2 | Asap Condensed |
| 230 | 261 | 20792 | Athiti-Regular-subset.woff2 | Athiti |
| ... | ... | ... | ... | ... |
| 221 | 340 | 12504 | ZenKakuGothicAntique-Regular-subset.woff2 | Zen Kaku Gothic Antique |
| 216 | 229 | 15872 | ZenLoop-Regular-subset.woff2 | Zen Loop |
| 227 | 921 | 107016 | YujiMai-Regular-subset.woff2 | Yuji Mai |
| 221 | 340 | 12516 | ZenKakuGothicNew-Regular-subset.woff2 | Zen Kaku Gothic New |
| 226 | 350 | 15928 | ZenKurenaido-Regular-subset.woff2 | Zen Kurenaido |
| 226 | 348 | 15564 | ZenMaruGothic-Regular-subset.woff2 | Zen Maru Gothic |
| 221 | 341 | 34128 | ViaodaLibre-Regular-subset.woff2 | Viaoda Libre |
| 226 | 350 | 19696 | ZenOldMincho-Regular-subset.woff2 | Zen Old Mincho |
| 225 | 590 | 43104 | ZillaSlab-Regular-subset.woff2 | Zilla Slab |
| 227 | 921 | 94288 | YujiSyuku-Regular-subset.woff2 | Yuji Syuku |
| 216 | 317 | 32700 | ZenTokyoZoo-Regular-subset.woff2 | Zen Tokyo Zoo |
| 229 | 595 | 43912 | ZillaSlabHighlight-Regular-subset.woff2 | Zilla Slab Highlight |
Next time...
So here it is, folks, a web font file that supports extended Latin characters, your Às and your Ás and Â, Ã, Ä, Å... should weigh around 20K. Anything a little over (or a lot over) 20K is up to you to decide. Is the font worth it, can it be subset, etc, etc.
That's, of course, just, like, my opinion. Curious to see other folks' thoughts and/or further experimentation.
As a follow up I want to just try to see how much subsetting really helps. Stay tuned.
Ever wanted to look at your page and turn Web Fonts on and off? Experience the layout shift repeatedly, like some sort of UX torture? Look no further, here comes the handy bookmarklet.
Install
Drag this link to a bookmark toolbar near you:
toggle fonts
Use
Go to a page with web fonts and click to toggle. Like so:
Source
The idea is you go through all stylesheets and mess with the fontFamily of @font-face rules. Simple. Stash them as fontFamiliar for ease of toggling back.
const errors = [];
for (let s = 0; s < document.styleSheets.length; s++) {
  let rules = [];
  try {
    rules = document.styleSheets.item(s).rules;
  } catch (_) {
    errors.push(s);
  }
  for (let i = 0; i < rules.length; i++) {
    const rule = rules.item(i);
    if (rule.constructor.name === 'CSSFontFaceRule') {
      if (rule.style.fontFamiliar) {
        rule.style.fontFamily = rule.style.fontFamiliar;
        delete rule.style.fontFamiliar;
      } else {
        rule.style.fontFamiliar = rule.style.fontFamily;
        rule.style.fontFamily = '';
      }
    }
  }
}
if (errors.length && !window.__fontfacetogglererror) {
  window.__fontfacetogglererror = true;
  const msg = ['Could not access these stylesheets:'];
  errors.forEach(idx => msg.push(document.styleSheets.item(idx).href));
  alert(msg.join('\n\n'));
}
Limitations
When the stylesheets are not accessible by the document.styleSheets API (e.g. on a third party domain, what!?), we cannot mess with them. But we can report their URLs nonetheless. Only the first time though, too much otherwise.
Layout shift helper
Want a layout shift monitor to go with your toggle? Paste this into the console before you go toggling:
new PerformanceObserver((list) => {
  let cls = 0;
  for (const entry of list.getEntries()) {
    cls += entry.value;
  }
  console.log(cls);
}).observe({type: 'layout-shift'});
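The snippet above logs the sum of each batch of entries on its own. If you'd rather watch a running total while toggling, a small variation could look like the sketch below. createShiftTracker is a made-up name, and the browser wiring at the bottom is skipped when there is no document:

```javascript
// Hypothetical helper: keeps a running layout-shift total across callbacks.
function createShiftTracker() {
  let total = 0;
  return function addEntries(entries) {
    for (const entry of entries) {
      total += entry.value;
    }
    return total;
  };
}

// Browser-only wiring; skipped when there is no document (e.g. in Node).
if (typeof document !== 'undefined' && typeof PerformanceObserver !== 'undefined') {
  const addEntries = createShiftTracker();
  new PerformanceObserver((list) => {
    console.log('total shift so far:', addEntries(list.getEntries()));
  }).observe({ type: 'layout-shift', buffered: true });
}
```

The buffered flag asks the browser to also deliver shifts that happened before the observer was created, so the total covers the initial load too.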
You know the pattern: spit out some markup, probably server-side, but hide it for later. On-demand features (not to overwhelm the UI), dialogs waiting to pop, and so on.
content here...
And what happens when the "content here..." includes resources, such as images? Is the browser going to download them? Let's check.
What to test
* image in a div hidden with visibility: hidden. The theory is that this is more likely to be downloaded because invisible content still takes up a section of the page and the browser needs to calculate the geometry of the content, so an image (esp. one that has no width/height) will be required to load
* image in a div hidden with display: none
* image in a non-expanded HTML element
* same three things above but with loading="lazy" on the contained images
* all of the 6 scenarios above but below the fold
Test page
/files/display/test.html, you can see the full source and play with showing/hiding and scrolling. At the top of the page all downloaded images are listed (thanks to their respective load events)
Results
X-browser!
I tested Firefox, Safari and Chrome and they all behave exactly the same.
More details
1. All non-lazy images are loaded
* no matter if they are above or below the fold
* no matter how their containers are hidden
2. A lazy image above the fold inside a visibility: hidden loads too
3. Lazy images inside a display: none or a non-expanded element do not load
4. Scrolling all the way down loads another image, the lazy one inside visibility: hidden
5. Expanding all the containers (while above the fold) loads the lazy images above the fold
6. Finally expanding all the containers (and scrolling all the way down) loads the lazy images below the fold
Discussion
So what does all that mean? Assuming we prefer to not load images in hidden content...
* visibility: hidden is the worst, as expected; avoid
* display: none and the non-expanded element (both behave the same) are slightly better if you use loading=lazy on the images; prefer
And so... loading=lazy on images in hidden content
The old commented out technique
This is a technique that was somewhat preferred in the recent past when we worried about low-powered devices that take a moment to parse large chunks of HTML, when such HTML is not initially required. Since we always need to worry about low-powered devices, it makes sense to refresh on this technique again.
Say this is the hidden content you want to unhide when appropriate:
```
Stuff
Integer luctus metus eros, ...
```
1. *Step 0:* make sure there are no HTML comments inside the hidden content
2. *Step 1:* wrap in HTML comments and put in a container:
3. *Step 2:* unhide in an opportune moment:
```
const trimmed = commented.innerHTML.trim();
commented.innerHTML = trimmed.substring(4, trimmed.length - 3);
/* alternatively...
commented.innerHTML = commented.innerHTML.replace('<!--', '').replace('-->', '');
*/
```
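The 4 and the 3 in that substring call are the lengths of the opening and closing comment markers. A slightly more defensive take, as a sketch only: uncomment is a made-up helper name, and the "commented" id in the usage comment is an assumption about how the container is named.

```javascript
// Hypothetical helper: strips one wrapping HTML comment from a string.
function uncomment(html) {
  const trimmed = html.trim();
  const wrapped =
    trimmed.length >= 7 && trimmed.startsWith('<!--') && trimmed.endsWith('-->');
  if (!wrapped) {
    return html; // not wrapped in a comment, leave untouched
  }
  // 4 = length of "<!--", 3 = length of "-->"
  return trimmed.substring(4, trimmed.length - 3);
}

console.log(uncomment('  <!--<p>Hidden stuff</p>-->  ')); // <p>Hidden stuff</p>

// In a page, assuming a container element with id "commented":
// const el = document.getElementById('commented');
// el.innerHTML = uncomment(el.innerHTML);
```

The guard means calling it twice is harmless: after the first call the content no longer starts with a comment marker, so the second call returns it unchanged.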
Happy new 2024!
Feb 9 update
Added experimentation with content-visibility: hidden. It's only supported in Chromes and also didn't help, behaved just like display: none. Sad trombone.
Remember spacer.gif? Yeah, "good" old days...
We may now have all the CSS features to make everything better but sometimes the ghost of spacer gif rears its transparent head. And that's an HTTP request. A request that's better devoted to something useful. Like an LCP image or something, I dunno.
So anyway, sometimes a simple change is not that simple and things need to happen gradually. So here's the first line of defence. Replace the spacer's src with a tiny little data URL. A minimum-viable src that doesn't make a request and doesn't break the app.
data:image/svg+xml,<svg xmlns='http://www.w3.org/2000/svg'/>
As in
<img src="data:image/svg+xml,<svg xmlns='http://www.w3.org/2000/svg'/>" ...>
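If the spacer shows up in a lot of templates, the swap can be done on the markup before it's sent to the browser (doing it client-side would be too late, the request is already out). A rough sketch, assuming double-quoted src attributes and a file literally named spacer.gif; inlineSpacers is a made-up name:

```javascript
// A minimum-viable src that makes no HTTP request.
const EMPTY_SVG = "data:image/svg+xml,<svg xmlns='http://www.w3.org/2000/svg'/>";

// Hypothetical helper: rewrites double-quoted src attributes whose URL
// ends in spacer.gif. A regex is a shortcut here, not a real HTML parser.
function inlineSpacers(html) {
  return html.replace(/src="[^"]*spacer\.gif"/g, () => `src="${EMPTY_SVG}"`);
}

console.log(inlineSpacers('<img src="/img/spacer.gif" width="1">'));
```

Any file whose name merely ends in spacer.gif gets swapped too, so tighten the pattern if your image names call for it.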
Demo:
The best request is the one never made, amiright?
Hello, dear reader and web performance enthusiast!
It’s time to sit down and write an article for the performance calendar.
Here are some more details.
Or if you're not feeling like writing, look around you and recruit the person you think should share their knowledge with the world.
What can you write about? Just share something interesting that happened this year. Research, insights, a case study, a story, bad experience, good experience, war stories, metrics, thoughts, ideas, questions...
Here's also a list of stories from a while ago that are still popular and need an update. Evergreens, if you will.
Or do you know of a nice new-ish Web Platform API?... A chunk of HTML and/or CSS that can save the world a bunch of downloading, parsing and executing JavaScript? That's always and forever appreciated! Here's an example.
Patiently awaiting your contributions,
Ever yours,
Stoyan
Couple days ago I found out about a tool called pdfcpu, a PDF processor. Among its features I saw "optimize" so I had to take it out for a spin and see how much of an optimization we're talking about. Here's a quick study of optimizing a random-ish sample of PDF files.
Source data
I thought the easiest collection of samples is to just use all the PDFs I happen to have on my computer. Not the most random sample but certainly the most immediate.
I ran (on a Mac laptop):
find / -type f -name "*.pdf" > pdf_paths.txt
... and voila, a nice collection of PDF file paths (sample).
Next step: deduplication, because it's likely that I have the same file(s) in multiple locations. So add MD5 hashes of the file contents to the pdf_paths.txt (sample), sort and dedup (sample). The scripts I used are on GitHub for your own experimentation.
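The scripts linked above are what was actually used. Purely as an illustration of the hash-and-dedup idea, here's a Node.js sketch; dedupeByMd5 and the shape of its input are assumptions made up for this example:

```javascript
const crypto = require('node:crypto');

// Hypothetical helper: keeps the first path seen for each distinct content hash.
// `files` is a list of { path, content }, where content is a Buffer or a string.
function dedupeByMd5(files) {
  const seen = new Map(); // md5 hex digest -> first path with that content
  for (const { path, content } of files) {
    const hash = crypto.createHash('md5').update(content).digest('hex');
    if (!seen.has(hash)) {
      seen.set(hash, path);
    }
  }
  return [...seen.values()];
}

// Reading real files from the path list would look something like:
// const fs = require('node:fs');
// const paths = fs.readFileSync('pdf_paths.txt', 'utf8').split('\n').filter(Boolean);
// const unique = dedupeByMd5(paths.map((path) => ({ path, content: fs.readFileSync(path) })));
```

For gigabytes of PDFs you'd want to hash one file at a time instead of reading them all into memory first, but the idea is the same.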
After dedup, turns out I have close to 5GB worth of 3853 unique PDFs lying about on my hard drive. That's quite a bit more than I would've guessed but a decent sample size for experimentation.
Optimize
Again, the actual bash script I used is on GitHub, but in general the command is:
./pdfcpu optimize input.pdf output.pdf
As part of the optimization script, I also get the filesize before and after the optimization. The stats are here as CSV and as Mac's Numbers file.
Results
Of the total 3853 input files, 140 of the "after" files were 0 bytes, so I assume pdfcpu choked on these. We ignore them.
82 files were bigger so pdfcpu actually de-optimized file size. Ignore these too.
Of the rest, the median savings were 4.7% (9.4% average savings).
Among the outliers, the file with the largest 98.64% filesize reduction (from 167k down to 2k) was some sort of resource in Mac OS's Numbers app. Here it is as a 187 bytes PNG. Cute, innit?
Parting words
So... pdfcpu -optimize yey or nay? Well, the average/median savings were not bad. How would that affect web performance... that depends on how many PDFs you serve. The average site certainly serves way more images than PDFs so time is better spent optimizing those. For comparison, if you simply run MozJPEG with jpegtran -copy none you can expect over 11% median savings (2022 study).
But there are all kinds of use cases out in the wild. If a non-trivial quantity of PDF serving happens on your site, I'd say look into pdfcpu.
And in any optimization script you write, do check for 0 byte results because it might happen (3.6% of the time in my test) and for cases where pdfcpu writes a larger file than the original (2.1% in my case) and keep the original in both these cases.
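That keep-the-original rule is short enough to spell out. A sketch with made-up function names, taking file sizes in bytes (how you obtain them is up to your script):

```javascript
// Hypothetical helper: decide which file to keep after an optimization run.
function chooseFile(originalBytes, optimizedBytes) {
  if (optimizedBytes === 0) return 'original'; // the optimizer choked
  if (optimizedBytes >= originalBytes) return 'original'; // no savings
  return 'optimized';
}

// Savings as a percentage of the original size.
function savingsPercent(originalBytes, optimizedBytes) {
  return ((originalBytes - optimizedBytes) / originalBytes) * 100;
}

console.log(chooseFile(1000, 0)); // original
console.log(chooseFile(1000, 1200)); // original
console.log(chooseFile(1000, 750), savingsPercent(1000, 750)); // optimized 25
```

Only the files where chooseFile answers 'optimized' should count toward your savings stats, same as in the results above.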
Are there any other PDF optimizers worth checking out? Please let me know.
Let's see how to set up and run cjxl (and its sibling djxl) on a simple shared hosting provider so you can encode and decode JPEG-XL (aka JXL) images.
How
There are better ways to install libjxl and its command-line tools but they require you to have sufficient privileges on your computer or server. With an inexpensive shared host such as Dreamhost (buy hosting with my affiliate link) you're limited in what you can do. Worry not, it's still possible!
Get a precompiled binary from JXL releases page. What you want is the linux-static build such as jxl-linux-x86_64-static-v0.8.1.tar.gz (at the time of writing, it's the latest available)
Download and uncompress
Copy cjxl (and optionally djxl, for decoding JXL files) to your home directory (e.g. via FTP), in a /jxl directory maybe, like /home/username/jxl
ssh and change permissions so these files are executable
chmod +x cjxl
chmod +x djxl
(Or don't ssh and use your FTP program to make the files executable)
To test go to any directory with an image and try for example:
$ ~/jxl/cjxl guitar-pedals.jpg guitar-pedals.jxl
Lo! Hello, 21st century!
How do you know the JXL is not corrupt in any way? Well...
Test in Firefox nightly
1. Download Firefox nightly from here
2. Go to about:preferences#experimental
3. Check the box "Media: JPEG XL"
Ok, WHY?!
* Why JXL? It's the future!
* Why now? iOS 17 is around the corner. Suddenly the browser support will be much larger, including desktop Safari and iOS Safari and non-Safari browsers on iOS.
* Why me? Chrome folks need a little nudge. They want to see adoption before they re-add support to Chrome (this is a speculation, based on their reasons to remove it in the first place). So we, the Web developers, need to demonstrate interest.
* Why shared host? Shared hosting still powers a large population of sites out there (another speculation, based on personal experience). It's affordable and there are no surprise bills. People use it. I use it for this here site and all my sites actually.
Next?
I think I'll try my luck with a WordPress plugin that helps with the adoption. Stay tuned.
So I woke up yesterday being scolded by Google. An email from Google Search Console Team with subject "Core Web Vitals INP issues detected on your site". Huh?!
It was about this WordPress-powered site that you're reading now (phpied.com). The Interaction to Next Paint metric (INP for short) was in the "Needs improvement" category as the average of all interactions Google knows about from 138 pages (only?!) is 255ms, which is over the good threshold of 200ms. Wait, what? This is a static blog, no onclick event listeners or anything, how can it be?
Yes, WordPress has an onload/DOMContentLoaded event listener that initializes emojis (of all things!). But it's hardly possible that it takes this long and people click on links 0ms after they appear. I was baffled.
I asked for help in the Web Performance Slack. Barry Pollard offered guidance and a WP/INP-related pointer and Juan Ferreras jumped to debug.
Turns out Chrome (emulating a mobile browser) takes about 250ms to "process" a simple click on the page. With no event listeners attached. How is this possible? Tried an even simpler page. Same thing. Tried google.com... 20ms. Huh?
Then Juan had an insight. Is this the ole delay when the mobile browser doesn't know about clicks? But it translates taps to clicks after a bit of a wait to see whether the user meant a double-tap to zoom. Turns out, yes! A page can opt out of the wait by using a viewport tag. However the viewport tag I had on the blog was 0.9 scale for whatever reason. Which disqualifies it from the opt out.
So the fix:
And the result after the fix:
Finally, telling the Google search console that I fixed the problem and now we wait.
Conclusions and remarks
1. This is the easiest INP fix you can ever do. If you don't have a viewport tag on your mobile site (or regular site loaded by mobile browsers), go add one today. Better INP, happier users. 250ms+ interactions down to 20ms or less.
You can use the tag from the doc:
<meta name="viewport" content="width=device-width,user-scalable=no">
... or the one I used:
<meta name="viewport" content="width=device-width,initial-scale=1.0">
The one mentioned in the doc is not user-friendly as it prevents users from zooming.
The web.dev doc talks about a 300ms delay. Looks like this should be updated to 200ms in many cases.
Note how I was looking at the second click for debugging. This is a tip from Barry, the first click may have some profiler overhead. Didn't matter in my case but there is a small diff.
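If you want to sweep a bunch of pages for this, the checks from this post fit in a few lines. This is a sketch of my observations only, not of the browsers' actual heuristics; the function names and warning texts are made up:

```javascript
// Hypothetical helper: parse a viewport content string into key/value pairs.
function parseViewport(content) {
  const settings = {};
  for (const pair of content.split(',')) {
    const [key, value = ''] = pair.split('=').map((s) => s.trim().toLowerCase());
    if (key) settings[key] = value;
  }
  return settings;
}

// Returns a list of warnings. The rules are assumptions taken from this post:
// have a viewport tag, use device-width, keep the scale at 1, allow zooming.
function checkViewport(content) {
  if (!content) return ['no viewport tag'];
  const warnings = [];
  const v = parseViewport(content);
  if (v['width'] !== 'device-width') warnings.push('width is not device-width');
  if ('initial-scale' in v && parseFloat(v['initial-scale']) !== 1) {
    warnings.push('initial-scale is not 1');
  }
  if (v['user-scalable'] === 'no') warnings.push('user-scalable=no hurts a11y');
  return warnings;
}

console.log(checkViewport('width=device-width,initial-scale=1.0')); // []
console.log(checkViewport('width=device-width, initial-scale=0.9'));
// [ 'initial-scale is not 1' ]
```

In a page you could feed it document.querySelector('meta[name=viewport]')?.content and see what comes back.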
Updates
1. Scratch the user-scalable=no for a11y purposes
Chromium bug to start gathering data in the wild https://bugs.chromium.org/p/chromium/issues/detail?id=1464498
Lighthouse bug to warn folks of non-1 scale https://github.com/GoogleChrome/lighthouse/issues/15266
Inspired by Harry Roberts' research and work on ct.css and Vitaly Friedman's Nordic.js 2022 presentation, Rick Viscomi hacked up a tool (a JS snippet) called capo.js that can do what Harry says. Next logical step is to test the results of the tool in a no-code experimental setting and see if the results make sense for your particular case. And then implement on your site and improve its performance.
OK, a little more context, please
Harry has been saying for a while that there is an order, there are ways of putting code in the head of a page that are more optimal than others. E.g. where blocking scripts should go vs any meta tags. Because the normal state of development is... we tend to pile these things up in any old fortuitous and adventurous order.
He has even created a clever tool (all in CSS!) that can warn you when tags in the head are less than optimal.
Now Rick has gone a step further and built a tool that reorders the tags in the head for us. Amazing, zero effort for a web developer! Just do as the tool says.
Or do you?
WebPageTest.org no-code experiments
Before making any code changes or even filing a bug or arguing with your coworkers and project managers whether this is an optimization that you should do, wouldn't it be lovely to just do it, with near 0 effort and be able to tell whether a change is good for you? That's where WPT experiments enter the picture.
(WPT experiments are a "Pro" paid feature but you can try it out for free by testing The Metric Times website)
How
Setup capo.js
Instructions
Run it
Run it on the page you want to improve and then expand the "Sorted" head
Copy the head
Right-click on the head DOM element created by capo.js, copy the outer HTML.
(Capo means head in Italian, funny, eh?)
Experiment
Run a WPT test on the page you're considering improving.
Important: check the box "Save response bodies" in the Advanced options.
Then select "Opportunities" from the results navigation.
Scroll to (or search the page for) "Edit Response HTML". Click the text box and replace the whole head with the one you copied from capo.js.
Run the experiment.
Admire the results
Did things improve? Nice! Time to implement it on the live site.
The Circle
Can we generalize anything from this whole effort? Yup, I think so, and I'd like to call it the circle of web performance innovation.
Step 1: research. We have The Web full of sites built in various ways. We look at them and poke around and form various hypotheses about how these can load faster.
Step 2: We reach conclusions and generalizations. E.g. Harry's ideas for "getting our head in order".
Step 3: advocacy. We tell people what we've found in blog posts and talks. E.g. Harry and Vitaly's presentations.
Step 4: tools. We build tools as an extension to the advocacy. Often "show" works better than "tell". E.g. Harry and Rick's tools.
Step 5: experiment. Try the research and tools' results on our pages. No-code experiments, a relatively new kid on the block, are even better.
Step 6: once proven, the changes can go live so we can start looking at them and repeat the process.
Makes sense? LMK.
Details from a talk to NYWebPerf meetup will go here after the talk. For now there's: github.com/stoyan/progressive with the code and slides
"When I was younger, so much younger than today" and upset and full of vinegar about the state of the world, I'd say things like "CSS is the worst" (not really). Now, half a year later, older and wiser and more accepting, I'd agree to mellow down to "CSS is render-blocking". Un-render-blocking CSS What this […]
Lately I've been rediscovering the joy of PHP. Also been helping an MVP get off the ground which uses a lot of nocode/lowcode bits and pieces and miraculously puts them together. Anyway, I had to write to an Airtable table with PHP and some quick googling didn't find an example to copy-paste so I'm writing […]
Helloooo, dear reader and web performance enthusiast! It’s time to sit down and write an article for the performance calendar. Here are some more details. Or if you're not feeling like writing, look around you and recruit the person you think should share their knowledge with the world. What do you want to write about? […]
Thomas Steiner has a brilliant idea for this year's Perfplanet calendar edition: what if we revisit some of the best articles from the past. "Best" is subjective but how about "still popular"? So here's a list of the 31 most visited articles in the past year in reverse chronological order of publication. (31 as the […]
tsia
ffmpeg -loop 1 -i image.png -i audio.mp3 -c:a copy -c:v libx264 -shortest result.mp4
I wanted to create a video that is a 3x2 grid of 6 other videos. This one to be precise: I was hoping I can use ffmpeg, because the thought of using a proper video editing software gives me the chills. In fact at some point I thought things will require iMovie and went to […]
One of my esteemed professors from Santa Monica College, Dr. Driscoll asked for an opinion on how one can use a sheet of music and reshuffle some measures to generate a unique exercise for each student. This turned out more fun than anticipated and here's a solution I came up with using the free notation […]
tl;dr: Add data-lazy="true" to your Facebook social plugins that are below the fold and reap the benefits. In code: // before
// after
The following 18 seconds video demonstrates the difference. Where currently your visitors load Facebook iframe content even if it's way down the page, after you […]
Yesterday was my last day at Facebook. After 9 1/2 years it was high time for a change. I dropped the news on twitter/fb and thought now it would be nice to answer the question of "what's next?" that friends are wondering. The trajectory I myself have wondered sometimes what the life after Facebook could […]
I'm not the one who philosophizes often in public, but indulge me this thought on the types of work we do as programmers and feel free to add your own dimensions. I've thought about how sometimes I like to work on user-facing products and sometimes on developer-facing ones. Real products that my mom can see […]
Previously on "Deep Note via WebAudio": intro play a sound 2.1. boots and cats loop and change pitch multiple sounds nodes In part 4 we figured out how to play all 30 sounds of the Deep Note at the same time. The problem is that's way too loud. Depending on the browser and speakers you […]