ahrm’s blog

Google AI previews helped me in Iran’s internet shutdown of 2025

2025-06-20T11:15:21+00:00

Disclaimer: Most of this post was written Jun 20th, 2025, but some links and images were added later.

Today, as I type these words, is Friday, Jun 20th, 2025, though in all likelihood that is not the date they will be published. Almost exactly a week ago Israel attacked Iran, igniting the 40-year-old cold war between the two nations. This is probably the most serious existential threat to the Islamic Republic since its inception in 1979. In trying times like these, the Islamic Republic shows that it truly believes in something, something that has always saved them in the direst situations and always protected them from the most formidable adversaries. Yes: I am, of course, talking about internet censorship.

To understand what the internet in Iran is like now, let me first paint you the picture of the internet in “normal” times: All non-state affiliated news networks are blocked. All blogger/wordpress weblogs are blocked. All messenger apps like telegram or whatsapp are blocked (whatsapp was briefly unblocked for a few months but it is blocked again now). Facebook, twitter, youtube, twitch, reddit, instagram, tiktok, feedly, discord are blocked.

What is not blocked you ask? Well most google services are not blocked (e.g. search, gmail, maps). Though our friends in silicon valley sure try to make us feel at home:

This is what you see when you try to enter google developer console with an Iranian IP address. It is not blocked from Iran’s side but it is unavailable due to sanctions (that’s all you know? I bet you know a little more than that!). Chatgpt? 403. Google AI Studio? 403. Most online videogames? 403. Android developer documentation (for fuck’s sake)? 403.

Now you might say such an internet is completely unusable. And you would be correct. That’s why 81% of Iranians use VPNs to access the free internet (and remember, this is a survey conducted by Iranian Parliament Research Center in a country where VPNs are technically illegal, so the actual number might be much higher than this).

So that was the internet in peacetime. What does the internet look like in wartime, when access to information is most vital? Well, it appears that the regime has completely shut off Iran from the rest of the world. No incoming or outgoing traffic can cross the borders.

Their excuses for doing so are the following:

There were some attacks on some banks
One of the largest cryptocurrency exchanges in Iran was hacked and more than 100 million dollars was stolen
There were claims that Israeli drones were using Iranian SIM card internet to operate

I am not going to comment on the validity of these arguments, they may be true. But remember that the internet also was shut down during the 2019 and 2022 protests and there were no cyberattacks then.

There is one external website that for some reason is not blocked though: google. It seems like the ip address 216.239.38.120 which belongs to google is specifically whitelisted.

So I can for example search for recent news about iran, but I can only view the title and the short content preview in the google search results, I can not open the articles.

Since I can’t get any real work done (even internal ssh connections are blocked now!) I decided to do some work on voil (my vscode extension which is similar to oil.nvim), which I have been putting off for a while. It is a fun way to distract myself and stop stressing about the news. The problem is, I am not really experienced when it comes to vscode extensions, which means a lot of documentation lookups are necessary. And you might guess what the problem is: while I can search for the documentation, I have no way of actually reading the contents … or do I? This is where the AI preview feature comes in. While this feature has been hated on a lot since its introduction, I can’t deny that it really did help keep my sanity during this time.

Unfortunately there is no way to force an AI preview answer (and it is not even deterministic: for the same query sometimes there is an AI answer and sometimes there isn’t). Forming the query as a question significantly increases the probability of an answer, though of course it is not 100%. I found it to be less useful for reading news, but it was very effective for reading documentation.

Anyway, while it is fashionable these days to hate on everything AI-related, I thought it was salient to mention this small way that AI managed to help me. Of course google used to have a cache feature where you could view the cached version of websites. It was a highly useful feature, which, of course, means it was removed a while back. If that feature was still available, all these shenanigans would have been unnecessary, though I suspect if that was the case then the google IP would not have been whitelisted in the first place. Also it would have been very useful to have a way to force/control the AI preview.

Three-eyed forehead in Stable Diffusion

2023-01-02T11:15:21+00:00

Today I saw an interesting post on hackernews where the author tried to remake an old game by recreating the pixelated art using some AI image generation models for example:

It worked reasonably well for most of the images, but there is one image which could not be easily created using the models, not even with stable diffusion inpainting:

Apparently it was impossible to recreate the three eyes on the forehead. I have a few theories why this is the case:

Probably there are a lot of “normal” looking humanoid portraits in the dataset, so the model is probably heavily biased towards producing “normal” humanoids.
These models usually have trouble with numbers, so even when there are eyes in the forehead, it is rarely exactly three eyes

I was wondering if, using the advanced inpainting of my Stable Diffusion desktop frontend, we could achieve the elusive three-eyed forehead.

Before we begin, let me give you a quick overview of how stable diffusion inpainting (and my advanced inpainting implementation) work. I assume you are already familiar with the basics of how diffusion models work, if you are not, there are excellent resources on the web.

In order to inpaint, first the masked portion of the image is filled in using a “dumb” inpainting algorithm (for example color each pixel with the closest non-masked pixel’s color). Then we use the encoder to encode this image to the latent representation of the diffusion model. Then we add some noise to the masked part of the image and run the normal diffusion process.
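To make the “dumb” fill step concrete, here is a minimal pure-Python sketch of the nearest-pixel example mentioned above. It is only an illustration: the function name nearest_fill and the list-of-lists image format are made up for this example, and it is a brute-force version rather than the code used by the frontend or by stable diffusion itself.

```python
def nearest_fill(image, mask):
    """Fill every masked pixel with the value of the closest non-masked pixel.

    image: list of rows, each row a list of pixel values (any type)
    mask:  same shape, True where the pixel should be filled in
    """
    height, width = len(image), len(image[0])
    # coordinates of all pixels we are allowed to copy from
    known = [(r, c) for r in range(height) for c in range(width) if not mask[r][c]]
    out = [row[:] for row in image]  # do not modify the input
    if not known:
        return out  # nothing to copy from
    for r in range(height):
        for c in range(width):
            if mask[r][c]:
                # brute-force nearest neighbour by squared Euclidean distance
                nr, nc = min(known, key=lambda p: (p[0] - r) ** 2 + (p[1] - c) ** 2)
                out[r][c] = image[nr][nc]
    return out


image = [[1, 0, 0, 9]]
mask = [[False, True, True, False]]
print(nearest_fill(image, mask))  # [[1, 1, 9, 9]]
```

The result of a fill like this is what gets encoded and noised before the diffusion process runs on the masked area.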

Using advanced inpainting, we modify the first part of this process, so instead of using an algorithm to inpaint the missing parts, we could manually specify an initial image in the masked area. This heavily guides the diffusion process to generate something resembling the initial image. Here is a demo of this method:

Here is how I approached this problem: First I downloaded a random eye image from the web and used advanced inpainting to create a version with just one eye:

Prompt: Demonic red eye on the forehead

Negative Prompt: Eyelashes

Generated using advanced inpainting by pasting an eye image from the web on the forehead

I didn’t bother making this look good, because we will have to inpaint over it anyway to generate the three eyes. I just needed something reasonable. Now we paste this eye on the forehead to create an initial image for the advanced inpainting:

Now we mask the three eyes, but use the original image as the initial image. This will guide the diffusion process to put three eyes in the masked location. We can even repeatedly apply this process, using the same mask each time but using the newer images (undoing changes if the new images were not as good), we can gradually guide it to generate something that we want (we can even change the prompt and parameters each time). Here is a sequence of generated images:

You may notice the border around the masked area, but we could fix that with normal inpainting:

And here is the final result:

Of course it is not a masterpiece, but it was a very fun experiment. And it has the potential to be way more fun: I was running it on an old 1070 and each inpainting took about 20 seconds, which was quite annoying. But I could envision a future where generation is basically real-time: imagine navigating through possible generations using the mouse wheel, tweaking the parameters and seeing the effects in real-time. With the supposed improvements in stable diffusion, this future might not be far away.

Sioyek 2.0 Release Notes

2022-12-12T11:15:21+00:00

Sioyek is an open-source, cross-platform PDF viewer, optimized for research papers and textbooks. These are the release notes for the recently released sioyek 2.0. If you are not familiar with sioyek, here is a video tutorial.

super_fast_search

We now have a super_fast_search option which can be enabled in prefs_user.config like so:

super_fast_search 1

When enabled, sioyek indexes document texts for extremely fast search:

For a more thorough benchmark and comparison with other viewers see this blog post.

The reason that it is not enabled by default is that the index slightly increases memory usage (about 50MB for every 1000 pages). It should not be a big deal for most users though, so I recommend enabling it unless you have <2GB RAM.

When super_fast_search is enabled, we have a regex_search command which uses regular expressions to search the document. For example searching for [0-9] finds all the digits in the document.

Scrolling between pages in overview window

Sioyek allows you to open a quick overview of references (even when they are not linked in the PDF file). Previously you could scroll in this window but only in the original page. Now, we allow you to scroll to other pages in the overview window:

Search results in an overview

Using overview_next_item and overview_prev_item commands, you can now open an overview to search results instead of jumping to them:

overview_to_portal

Added a new overview_to_portal command which opens a quick overview to the closest portal. Previously portals were mostly useful for users with multiple monitors, but now they should be beneficial for all users. See the portal section of tutorial video for a brief introduction to what portals are, as well as a demo of this feature.

Macros

You can now define macros in your prefs_user.config file, which can be used to execute multiple commands. For example:

new_macro _goto_top_right goto_top_of_page;goto_right

Note that macro names must start with an underscore so as not to be confused with built-in sioyek commands. The commands in the list are separated using a semicolon.

Source other config files

Added a source command which allows you to include another config file in your prefs_user.config. Can be used like this:

source /path/to/other/file.config

Which is quite useful for easier installation of extensions and themes. For example see the dracula theme here: https://draculatheme.com/sioyek.

Improved extensions

The official python module now uses a much faster communication method with the running sioyek process. Moreover, we have added some new variables which can be used in extensions, for example %{selection_begin_document} and %{selection_end_document} which expand to the current selection locations, and %{selected_rect} which expands to the current selected rectangle using the new select_rect command.

For example here is an extension that uses these new options to add text annotations to sioyek:

Other changes

Upgrade to MuPDF 1.20.
New keybind parsing method with support for non-standard layouts and unicode characters
Add a smooth scroll mode.
Add ability to select single words using keyboard_select command
Add a scrollbar which can be enabled using toggle_scrollbar command
Add commands to set configuration options at runtime.
Add prerendered_page_count option which configures how many pages sioyek prerenders
Add an option to show the closest bookmark in the statusbar
Add an option to indicate whether we are close to a portal in the statusbar
Add an option to highlight using middle click instead of pressing a button. See https://github.com/ahrm/sioyek/commit/7390a40dec98b829c8beacd5d3997b00d2072ec7.
Add ability to specify colors in config files using hexadecimal strings. For example instead of 1 1 0 you can now use #ffff00.
Many bug-fixes and quality of life improvements
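As a small illustration of the hexadecimal color option in the list above, here is how a string like #ffff00 maps to the three 0-1 components of the older notation. This is only a sketch in python (the function name hex_to_floats is made up for this example), not sioyek's actual parsing code.

```python
def hex_to_floats(color):
    """Convert a '#rrggbb' string to three floats between 0 and 1."""
    digits = color.lstrip('#')
    if len(digits) != 6:
        raise ValueError('expected a color of the form #rrggbb')
    # each pair of hex digits is one channel in the range 0..255
    return tuple(int(digits[i:i + 2], 16) / 255 for i in (0, 2, 4))


print(hex_to_floats('#ffff00'))  # (1.0, 1.0, 0.0)
```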

PDF viewer text search speed comparison

2022-09-11T11:15:21+00:00

Recently, I implemented a super fast search index in sioyek which accelerates normal search and also enables regular expression search. It is not yet released in a stable sioyek build, but if you want to try it out, there are experimental builds here. It is not enabled by default (because it slightly increases memory consumption) but can be enabled by adding this to the prefs_user.config file:

super_fast_search 1

In order to test it, I decided to find all instances of letter ‘a’ in a 730-page document. See the result in this video:

However, I didn’t think this benchmark was good enough for multiple reasons:

Some very popular PDF viewers are missing. The reason is that many of them don’t report the number of matches (for example sumatra just jumps to the next match and firefox just finds the first 1000 matches). Therefore we could not compare those readers.
Finding all instances of ‘a’ might not be a very useful search in practice
Sioyek finds the results so fast that we can not get an accurate measure of its time

So I decided to find a harder PDF file and do another benchmark on that. Now the previous file already wasn’t that small (it was 730 pages), but I needed a significantly larger file, and I didn’t want to create a file myself because I wanted the result to be as authentic as possible. In my search of a big-ass book I came across this behemoth:

It’s 4100 pages of tightly packed, two-column, small-font text. And it is the perfect test subject for us.

But before doing the new tests, let’s repeat our old test (finding all instances of ‘a’) in this book, just to get a sense of how large it is. I tested it only with the viewers that found all the results (sioyek, zathura, mendeley, zotero, chrome, and edge). Here are the results:

Well, that’s not really useful. Let’s remove chrome and edge which seem to be outliers, hopefully now it will be more informative:

Fuck it. Here is the raw data:

program	time (s)
sioyek	0.9
zathura	50
mendeley	65
zotero	202
chrome	5500
edge	15000

Note that chrome and edge took so long that I terminated them after 10 minutes and extrapolated the final time based on the results found in those ten minutes. This is very generous, because chrome was showing clear signs of non-linear behavior, which means that the true time might be even larger than this.
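For reference, a linear extrapolation of this kind can be sketched as follows. The function name and the numbers in the example are hypothetical (they are not the match counts from my tests); the only assumption is that the viewer keeps finding matches at the same rate as in the part that was measured.

```python
def extrapolate_total_time(elapsed_seconds, matches_found, total_matches):
    """Linearly extrapolate the time needed to find all matches.

    Assumes the search keeps finding matches at the same rate as it did
    during the measured part, which is optimistic for non-linear viewers.
    """
    if matches_found <= 0:
        raise ValueError('need at least one match to extrapolate')
    return elapsed_seconds * total_matches / matches_found


# Hypothetical example: after 600 seconds a viewer has found
# 250 out of 1000 known matches.
print(extrapolate_total_time(600, 250, 1000))  # 2400.0
```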

Main benchmark

Okay now we get to the main benchmark. In order to reduce the variance and also effects of particular algorithms (for example some viewers search from the beginning, and so they perform better if the query is in the first pages, some other viewers start from current page, etc.) we used the following process:

10 pages in the document were chosen completely randomly
A string was chosen from each page, such that this string does not appear on any other page of the document
In order to test a viewer, we open the viewer on the first page of the document and search for the chosen strings, one by one, and we don’t change the pages (so we are on the page of ith result when starting the search for (i+1)th result)
We report the average and the median of search times
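The last step of the procedure above reports both the average and the median because the two react very differently to a single slow search. Here is a small sketch using python's statistics module; the timings are made up for illustration and are not measurements from any viewer.

```python
from statistics import mean, median

# Hypothetical timings (in seconds) of ten searches in one viewer:
# one search is much slower than the rest.
timings = [25.0, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5]

average = mean(timings)   # pulled up by the single slow search
middle = median(timings)  # ignores the single slow search

print(average, middle)
```

With these numbers the average is almost six times the median, even though nine out of ten searches took the same time.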

Here are the results:

Again, let’s remove the large values:

Sigh, here are the raw numbers:

program	average time (s)
sioyek	0.03
sumatra	3.0
zathura	4.8
firefox	6.2
edge	15.7
zotero	16.5
chrome	22.2
foxit	35.4
mendeley	68.1
okular	72.8
acrobat	98.7

Now I must admit, the reason sioyek is so fast is because it creates a search index when you open the document. In these tests I have waited until this search index is built (with the justification that the index-building is fast enough that by the time the user wants to do a search in the document it is done). Some other PDF viewers (namely, zathura, firefox and sumatra) seem to create indices to speed up searches too, however, instead of creating it when the document is opened, they create it the first time you perform a search, which causes the first search to take unusually long time but the subsequent searches are much faster. I don’t think this comparison is unfair, because it accurately reflects the time that the users have to wait for their search results (in fact, I think it is a little generous because 10 searches is probably above-average number of searches, and the fewer searches we have, the more pronounced the effect of first search indexing becomes). But to be completely fair, I also computed the median search time which is not affected by the indexing in the first search. Here are the results:

You can see there is a visible gap between the programs that do the indexing and those that don’t. Here is a comparison of the programs that do the indexing:

Indexing Time

One more important factor for the programs that do the indexing is the time it takes to create the index. Here are the results:

Finally sioyek has been dethroned, although it is very close (30 seconds vs 28 seconds). I think the reason is that we don’t just index the text of the document during the index procedure. We also try to find all the figures, references, equations, etc. which enables the smart jump feature.

How does the indexing work?

I wish I could tell you that I made some genius optimizations to make the search fast, however, the truth is that the index is extremely trivial: we just concatenate the text of all the pages, and also create some backward indices so that we can find the page and location of a match in the document given its location in the concatenated string. That’s it. In fact almost all the credit goes to the writers of c++’s standard library functions std::find and std::regex_search.
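To show how little is involved, here is a sketch of the same idea in python. This is not sioyek's code (which is c++); the class and method names are made up for this example, and str.find and re.finditer stand in for the standard library functions mentioned above.

```python
import bisect
import re


class PageIndex:
    """Concatenated text of all pages plus the start offset of each page."""

    def __init__(self, pages):
        self.text = ''.join(pages)
        self.starts = []  # starts[i] is the offset of page i in self.text
        offset = 0
        for page in pages:
            self.starts.append(offset)
            offset += len(page)

    def locate(self, offset):
        """Map an offset in the concatenated text back to (page, offset in page)."""
        page = bisect.bisect_right(self.starts, offset) - 1
        return page, offset - self.starts[page]

    def find_all(self, query):
        """Plain substring search over the whole document."""
        results = []
        pos = self.text.find(query)
        while pos != -1:
            results.append(self.locate(pos))
            pos = self.text.find(query, pos + 1)
        return results

    def regex_search(self, pattern):
        """Regular expression search over the whole document."""
        return [self.locate(m.start()) for m in re.finditer(pattern, self.text)]


index = PageIndex(['hello world', 'foo bar', 'world peace'])
print(index.find_all('world'))       # [(0, 6), (2, 0)]
print(index.regex_search(r'[0-9]'))  # []
```

One consequence of plain concatenation is that a match can straddle a page boundary; this sketch reports such a match at the page where it starts.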

So after all, sioyek’s speed is not that impressive. What I would say is impressive though is how slow some programs manage to be. For example the average search time in acrobat, the program created by adobe, the multi-billion dollar company that created the PDF format and employs more than 25000 people, is more than 3000 times slower than the average search time in sioyek. It is even more than 3 times slower than the time it takes sioyek to build the entire search index! Now that’s impressive.

Reading textbooks with lots of references using sioyek

2022-08-30T11:15:21+00:00

This post is an overview of main features of sioyek, a PDF viewer optimized for reading research papers and textbooks.

Suppose you are reading a textbook with a lot of references. Something like this:

Imagine how much time and context you lose by scrolling back and forth every time you see a reference. Sioyek automatically detects the reference targets (even if the document doesn’t have links, which is the case for the document in this example) and jumps to references. You can also mark your location before the jump so that you don’t lose your context when you come back:

But wait, there is more! You don’t even have to jump to the references because sioyek can show a preview of the referenced location:

But wait, there is more! The marker used in the first video to mark the line can also be moved to highlight the current line being read. This has many advantages:

Makes the current line stand out, which makes it more readable, especially for people with dyslexia
You never lose the context of which line you were reading (e.g. when someone calls you)
Automatically handles multicolumn documents
Since there is usually only one reference on the current line, we can automatically detect it and show the destination just by pressing a button, without even needing to click on the reference.

But wait, there is more! You can search the papers in google scholar just by middle-clicking on their name. Or you can download them directly from google scholar and scihub by control+clicking on their name. And the beautiful thing is that last feature (downloading papers from google scholar and scihub) is not a built-in feature but it is implemented using an extension, and you can create similar extensions of your own! This is the documentation on how to build your own extensions, and these are some of the extensions that I have built (including the one that downloads from google scholar and scihub).

Implementing text to speech for sioyek PDF viewer

2022-07-05T11:15:21+00:00

Note: the scripts in this post were tested on windows and do have some windows-specific code, but they can easily be ported to other operating systems.

Here is the final result (enable audio):

Introduction

One of the main new features in sioyek 1.4 is the ability to execute external scripts and the ability to control sioyek from the command line. In this post, we show how to combine these features to implement a simple (yet completely functional) screen reader for sioyek.

Sioyek has the ability to execute scripts, for example consider the following script which creates an OCRed version of a PDF file using ocrmypdf and then opens the result:

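import subprocess
import sys
from pathlib import Path

if __name__ == '__main__':
    file_path = Path(sys.argv[1])
    # write the OCRed copy next to the original (paper.pdf -> paper_new.pdf);
    # using the stem keeps dots elsewhere in the path from truncating the name
    new_path = file_path.with_name(file_path.stem + '_new.pdf')
    # argument lists avoid quoting problems with spaces in paths
    subprocess.run(['ocrmypdf', str(file_path), str(new_path)], check=True)
    subprocess.run(['sioyek', str(new_path)], check=True)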

You can run it from sioyek by running the execute command and entering the following:

python /path/to/script.py "%1"

Here the %1 expands to the path of the current file in sioyek. Note that the quotation marks are necessary if the path contains spaces. There are other expanded variables other than %1, here is the complete list:

%1 expands to the path of the current file
%2 expands to just the file name of the current file
%3 expands to the selected text
%4 expands to the current page number
%5 expands to an input text which is received from the user using a text prompt
%6 expands to the text of the current line in sioyek’s visual line mode
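To illustrate what “expands” means for the variables listed above, here is a rough python sketch of the substitution. It is not sioyek's implementation (the function name and the dictionary of values are made up for this example); it only shows the idea of replacing each placeholder with its value before the command is run.

```python
import re


def expand_command(template, values):
    """Replace %1..%6 in a command template with the given values.

    values maps the digit (as a string) to its replacement text;
    placeholders without a value are left untouched.
    """
    def substitute(match):
        return values.get(match.group(1), match.group(0))

    return re.sub(r'%([1-6])', substitute, template)


command = expand_command(
    'python /path/to/script.py "%1"',
    {'1': 'D:\\my papers\\paper.pdf'},  # hypothetical current file
)
print(command)
```

The example path contains a space, which is why the quotation marks around %1 in the template matter.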

Here is how it looks in action:

Of course, typing this command every time is not a good solution, you can predefine commands in your prefs_user.config file:

execute_command_o python /path/to/script.py "%1"

Now instead of typing the command, you can run the execute_predefined_command command in sioyek (which itself can be bound to a key) and then press o (o is the name of the predefined command, you can have 26 predefined commands with names a-z). Or you could directly bind a key to execute execute_command_o in your keys_user.config file:

execute_command_o <S-r>

Note that the o is just the name of the command and doesn’t have anything to do with its keybinding, for example here we have bound it to shift+r.

Here is another sample script which translates the highlighted text into french:

import sys
from googletrans import Translator
from tkinter import messagebox
import tkinter

if __name__ == '__main__':
    text = sys.argv[1]
    translator = Translator()
    translation = translator.translate(text, dest='fr')
    root = tkinter.Tk()
    root.withdraw()
    messagebox.showinfo("translation", translation.text)

prefs_user.config:

execute_command_t python D:\sioyek-scripts\translate.py "%6"

keys_user.config:

execute_command_t

Here is how it looks in action:

Here is a very simple text to speech script (works only on windows, can easily be ported to other operating systems by replacing windows text to speech with alternatives):

import os
import sys

def escape(s):
    temp = "".join(c for c in s if ord(c) < 127)
    return temp.replace("'", "''")

if __name__ == '__main__':
    os.system('''PowerShell -Command "Add-Type -AssemblyName System.Speech; (New-Object System.Speech.Synthesis.SpeechSynthesizer).Speak('{}');"'''.format(escape(sys.argv[1])))

prefs_user.config:

execute_command_t python D:\sioyek-scripts\tts.py "%6"

This is, of course, very basic and requires the user to manually trigger the reading of every line, but it is a good base and can easily be extended to include more advanced features. I implemented a more sophisticated version here which is too long to include in this post, but here is how it works at a high level:

Instead of generating speech line-by-line (which would not flow very well), we concatenate all the lines and create an audio file for the whole page
First, we generate a low-quality but fast audio file using windows tts and while that is playing we use mozilla’s tts to generate a more high-quality sample. When the high-quality sample is ready, we swap it in.
We align the audio and text using aeneas. When the user requests a read command from a specific line, we use this alignment to find the location of that line within the audio file
We automatically highlight the current line as it is being read
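The last two points boil down to two lookups in the alignment: from a line to a position in the audio, and from a position in the audio back to a line. Here is a minimal sketch, assuming the alignment is available as a list of (begin, end, text) tuples with one entry per line, sorted by time. That structure and the function names are assumptions made for this example; they are not the format produced by aeneas nor the code of the actual scripts.

```python
import bisect


def start_time_of_line(fragments, line_index):
    """Where in the audio (seconds) should playback start for this line?"""
    return fragments[line_index][0]


def line_at_time(fragments, seconds):
    """Which line is being read at this playback position?"""
    starts = [begin for begin, _end, _text in fragments]
    # last fragment that starts at or before the given time
    return max(0, bisect.bisect_right(starts, seconds) - 1)


# Hypothetical alignment of a three-line page
fragments = [
    (0.0, 2.5, 'first line'),
    (2.5, 4.0, 'second line'),
    (4.0, 7.0, 'third line'),
]
print(start_time_of_line(fragments, 2))  # 4.0
print(line_at_time(fragments, 3.0))      # 1
```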

Here is the relevant prefs_user.config file:

# start reading from highlighted line
execute_command_a python \path\to\server_read.py "%1" %4 "%6"
# stop reading
execute_command_b python \path\to\server_stop.py
# keep highlighting the current line being read
execute_command_c python \path\to\server_follow.py
# stop highlighting the current line being read
execute_command_d python \path\to\server_unfollow.py
# start the tts server (should be running before executing previous commands)
execute_command_e python \path\to\manager_server.py

and keys_user.config file:

execute_command_a r
execute_command_b 
execute_command_e >
# when we manually move the line, stop following it
move_visual_mark_down;execute_command_d j
move_visual_mark_up;execute_command_d k

Notes

Unfortunately mozilla tts is prone to something that I call “Spontaneous Stroke Syndrome”, which is shown in the video below. I am not sure exactly what causes it; if someone has any ideas on what I may be doing wrong, I would appreciate any help.

Using Language Models to (probably) Read Faster

2022-04-14T11:15:21+00:00

Idea

A couple of weeks ago I saw this hackernews article about a method of text rendering to increase text readability. The algorithm is pretty simple: highlight the first few characters of each word (how many characters depends on the size of the word). Here is a screenshot of what it looks like from its website:

That got me thinking: what if instead of using a heuristic method to determine how many characters to highlight, we used a language model? Specifically we highlight the character only when the language model fails to predict the character given its preceding context. Presumably if a language model is smart enough to predict the character, so are we!

Implementation

First of all, we need a character-based language model. I used a single-character version of reformer fine-tuned on the enwik8 dataset which is available on huggingface (as I will mention in the notes section, this is a huge overkill but whatever, this is just an experiment ;) ). Let’s test it:

import torch
from transformers import ReformerModelWithLMHead

model = ReformerModelWithLMHead.from_pretrained("google/reformer-enwik8")

# removed for brevity, you can find them on the huggingface repo homepage
def encode(list_of_strings, pad_token_id=0): ...
def decode(outputs_ids): ... 

def generate_next_char(text, n_chars=1):
    return decode(model.generate(encode([text])[0],
                  max_length=len(text)+n_chars))

>>> generate_next_char("This is a ")
"This is a s"
>>> generate_next_char("This is a p")
"This is a pr"
>>> generate_next_char("This is a pr")
"This is a pro"
>>> generate_next_char("This is a pre")
"This is a prec"
>>> generate_next_char("This is a pred")
"This is a prede"
>>> generate_next_char("This is a predi")
"This is a predic"
>>> generate_next_char("This is a predic")
"This is a predict"
>>> generate_next_char("This is a predict")
"This is a predicti"
>>> generate_next_char("This is a predicti")
"This is a predictio"
>>> generate_next_char("This is a predictio")
"This is a prediction"

If we wanted to highlight the word “prediction” using the language model, only the characters which the language model got wrong would be highlighted; in the session above those are p, e, d and i. I implemented this in sioyek PDF reader and the results look like this (if it looks blurry, open the image in a new tab and zoom in):

I find sudden highlights in the middle of a word a little off-putting, so let’s change it so that a word is highlighted from the beginning until the last mispredicted character. Using this scheme, the highlighted part of prediction becomes its first five characters, predi (I call this process refinement). It looks like this in sioyek:

It looks much better, but still words like continued annoy me. If I have already read most of the word, there is little benefit in hiding the rest. So I changed it such that if more than 50% of a word is highlighted, we highlight the entire word (I call this process filling). It looks like this:

Here is a comparison of different highlight modes and the original (bionic) heuristic:

For performance reasons, instead of feeding the entire page from the beginning to the point where I want to predict, I only feed the last n characters before the prediction point. Here is a comparison of the results for different values of n:

It seems that we reach diminishing returns at about 30 characters.
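Putting the pieces of this section together, here is a compact sketch of the highlighting logic: mark the mispredicted characters using a limited context, then optionally apply refinement and filling. It is a simplified illustration and not the code of the sioyek server script. The function names are made up, and predict_next is a hypothetical stand-in for the language model (any function that takes the preceding context and returns one character).

```python
def mispredictions(text, predict_next, context_size=30):
    """flags[i] is True when the model fails to predict text[i]
    from the (at most) context_size characters before it."""
    flags = []
    for i, ch in enumerate(text):
        context = text[max(0, i - context_size):i]
        flags.append(predict_next(context) != ch)
    return flags


def highlight_mask(word_flags, refine=True, fill=True):
    """Decide which characters of a single word to highlight.

    word_flags: one boolean per character, True if it was mispredicted.
    """
    mask = list(word_flags)
    if refine and any(mask):
        # refinement: highlight from the start of the word
        # up to the last mispredicted character
        last = len(mask) - 1 - mask[::-1].index(True)
        mask = [i <= last for i in range(len(mask))]
    if fill and sum(mask) > len(mask) / 2:
        # filling: if more than 50% is highlighted, highlight the whole word
        mask = [True] * len(mask)
    return mask


# "prediction" with p, e, d and i mispredicted, as in the session above
flags = [True, False, True, True, True, False, False, False, False, False]
print(highlight_mask(flags, refine=False, fill=False))
print(highlight_mask(flags, refine=True, fill=False))
print(highlight_mask(flags, refine=True, fill=True))
```

For this word, refinement highlights the first five characters, and filling changes nothing because exactly half of the word is highlighted, which is not more than 50%.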

Enabling in Sioyek

If you want to try these out on a PDF file, you can download the latest experimental version of sioyek. Here are the relevant configurations:

text_summary_url: The url of the server which provides the summary. I did not include the server in sioyek itself because I didn’t want to bundle the entire pytorch with sioyek for an experimental feature. Instead I created a python script which runs a local server providing this feature. You can find the script here. The default value is http://localhost:5000/ which is the default port of the script, so if you don’t change the script you don’t have to set this value.
text_summary_should_refine: 1 if you want refinement and 0 otherwise
text_summary_should_fill: 1 if you want filling and 0 otherwise
text_summary_context_size: number of characters in context for next character prediction

For example here are the relevant parts in my prefs_user.config:

text_summary_should_refine 1
text_summary_should_fill 1
text_summary_context_size 40

Of course we have default values for all of these configs so you don’t have to change anything if you are comfortable with the default settings.

Now, in order to use this feature, run the summary_highlight_server.py script and then enable highlights in sioyek by executing toggle_fastread command (press : and type toggle_fastread, it may take a few seconds to compute highlights depending on your GPU).

Notes and Improvements

I don’t have any data on whether this actually improves reading speed or not. But in my own subjective experience, I think it does.
Currently this is too GPU-intensive to be deployed. Of course using a full-fledged language model for this task is overkill. Also, as mentioned in huggingface repo page, this model is not optimized for language generation. Probably the best option would be a relatively small RNN language model, however, I could not find any decent pre-trained character-based RNN language models and I don’t have the resources to train it myself. Even simpler non-neural network models are probably good enough.
One limitation of this approach is that we don’t consider the future context to determine whether to remove a word. For example consider the snippet “task-specific training examples” (in our examples all three words were highlighted). But maybe if we knew that we were going to include both “task-specific” and “examples” then a language model could predict that the middle word in “task-specific [MASK] examples” is “training” with high probability and we could unhighlight the word “training”. However, this is probably too computationally intensive to be worth it.
Is it possible to use language models that use non-character tokens for this task? That would help a lot since most pre-trained language models are not character-based.
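
To make the "simpler non-neural models" note above a bit more concrete, here is a minimal sketch of a character n-gram predictor. It assumes that a word deserves a highlight when at least one of its characters is not predicted from the preceding context_size characters. That rule and all the function names are assumptions made for this sketch; it is not how summary_highlight_server.py works.

```python
from collections import Counter, defaultdict


def train_char_model(corpus, context_size):
    """Count which character follows each context of `context_size` characters."""
    counts = defaultdict(Counter)
    for i in range(context_size, len(corpus)):
        counts[corpus[i - context_size:i]][corpus[i]] += 1
    return counts


def predict_next(counts, context):
    """Return the most frequent follower of `context`, or None if it was never seen."""
    followers = counts.get(context)
    if not followers:
        return None
    return followers.most_common(1)[0][0]


def words_to_highlight(counts, text, context_size):
    """Return the words of `text` that contain at least one mispredicted character."""
    highlighted = []
    position = 0
    for word in text.split(" "):
        start, end = position, position + len(word)
        position = end + 1  # skip the separating space
        if not word:
            continue
        # A character counts as predictable only if enough context exists
        # and the model's best guess matches it.
        predictable = all(
            i >= context_size
            and predict_next(counts, text[i - context_size:i]) == text[i]
            for i in range(start, end)
        )
        if not predictable:
            highlighted.append(word)
    return highlighted
```

A model this small only memorizes its training text, so it would need a large corpus to be useful, but it shows that the highlighting decision itself does not require a GPU.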

Using LF file manager on windows

2022-04-02T11:15:21+00:00

lf is an extremely fast and customizable terminal file manager for windows, mac and linux. By default lf doesn’t have many features that you might expect from a file manager (for example archiving and unarchiving files), but it provides a powerful interface for the user to add these features themselves. Of course, the documentation has a long list of recipes for the most common features that a user might want to add. However, the documentation and the tooling surrounding lf are mostly focused on linux, and setting it up on windows requires some modifications to the recipes provided in the documentation. In this post I will explain my journey to make lf the perfect file manager on windows. You can download all of the files and scripts detailed in this post here. Note that these scripts are not meant to be copied verbatim (for example, they contain some hard-coded paths which you may need to modify).

Prerequisites

I will not describe the basics of using lf in this post, since the documentation has already done an excellent job of that. I assume you are already familiar with the basics of lf.

In order to run all of the commands and scripts in this post, you will need python3, fzf, 7zip and msys2. You will also need the following packages for python:

Pillow
mupdf
tkinter
tkinterdnd2

You can install these packages using pip, except for tkinter, which is normally bundled with the standard Python installer on windows.

Navigating the drives

As far as I know, by default you have to type something like :cd 'D:\' to navigate to another drive in lf, which is a lot of keystrokes for something this common. So I placed a mark with the same name as the drive letter at the root of each drive. For example, after navigating to drive D, I marked it by pressing md, and now I can jump to it by pressing 'd.

Basic Utilities

Here we configure renaming, quick reloading and other utilities (put the contents into your lfrc file):

set filesep " "

# quick rename using r
cmd rename %sh -c 'mv -i %f% $0'
map r push :rename

# reload config file using f5
map <f-5> push :source<space>C:/Users/Lion/AppData/Local/lf/lfrc<enter>

# use a and A to create files and directories
cmd createfile %sh -c 'touch $0'
cmd createdir %sh -c 'mkdir $0'
map a push :createfile
map A push :createdir

# open explorer in current directory
map S push &start<space>.<enter>

# copy file path
map Y %echo %fx% | clip 

# open file in nvim
map V &nvim-qt %f%

# archive management
cmd zip %sh -c '7z a $0 %fx%'
cmd extract_here %sh -c '7z e %f%'
cmd extract_to %sh -c '7z e %f% -o$0'

Fuzzy file search using fzf

The lf wiki has a section on fzf integration; however, the commands specified there are for linux and need some modification in order to work on windows. I have made those modifications, and the resulting scripts are available here. Just copy findfzf.bat and fzfpy.py to your system and add the following to your lfrc.

# use c-f to fuzzy search
cmd fzf_jump push $python<space>D:/lf_scripts/fzfpy.py<space>%id%<enter>
map <c-f> :fzf_jump

Of course, you have to replace D:/lf_scripts/fzfpy.py with the location of the file on your system (note that you probably need to edit findfzf.bat and specify the correct path of find.exe in your msys2 installation).
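
To give an idea of what a script in this role has to do, here is a minimal sketch. It is not the actual fzfpy.py: the function names are hypothetical, it lists only the current directory with pathlib instead of calling find.exe through findfzf.bat, and it assumes the lf id is passed as the first argument (as %id% is in the lfrc line above). The chosen path is sent back to lf with lf -remote, using cd for directories and select for files.

```python
import subprocess
import sys
from pathlib import Path


def build_remote_command(lf_id, chosen_path):
    """Build the `lf -remote` call that moves the given lf instance to a path."""
    # Directories are entered with `cd`; files are focused with `select`.
    action = "cd" if Path(chosen_path).is_dir() else "select"
    # Forward slashes avoid backslash handling inside the quoted argument
    # (an assumption about lf's quoting rules, chosen to be safe).
    normalized = str(chosen_path).replace("\\", "/")
    return ["lf", "-remote", f'send {lf_id} {action} "{normalized}"']


def pick_with_fzf(candidates):
    """Let the user pick one entry with fzf; return None if nothing was chosen."""
    result = subprocess.run(
        ["fzf", "--reverse", "--header=Jump to location"],
        input="\n".join(candidates),
        stdout=subprocess.PIPE,  # only stdout is captured so fzf can draw its UI
        text=True,
    )
    choice = result.stdout.strip()
    return choice or None


def main(lf_id):
    """List the current directory, ask fzf for a choice, and tell lf to go there."""
    entries = [str(entry) for entry in Path(".").iterdir()]
    choice = pick_with_fzf(entries)
    if choice:
        subprocess.run(build_remote_command(lf_id, choice))


if __name__ == "__main__" and len(sys.argv) == 2:
    main(sys.argv[1])
```

Keeping the command construction in its own function makes it easy to check the cd versus select decision without launching fzf or lf.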

Drag and Drop

Being a terminal application, lf does not support drag and drop. However, some applications are impossible or very inconvenient to use without drag and drop. I wrote a script to add drag and drop functionality; just copy the script to your system and add the following to your lfrc (again, you have to modify the paths to point to your file locations).

# drag and drop
cmd drag push &python<space>D:/lf_scripts/drag.py<space>multi<space>%fx%<enter>

# close the drag window after one use
cmd dragonce push &python<space>D:/lf_scripts/drag.py<space>once<space>%fx%<enter>
map D push :dragonce

Here is what it looks like:

File Preview

lf is a three-panel file manager: the left panel shows the parent directory, the middle panel shows the current directory and the right panel shows the contents of the selected directory. If the selected item is a file instead of a directory, lf can show a preview of the file in the right panel. By default it does so only for text files. I have written a preview script that displays some useful information for other file types. This includes:

File size and last modified date for all files
Image dimensions for image files
Number of pages and text content of the first page for PDF files

It looks like this:

In order to activate it, you need to download lf_preview.py and preview.bat and add the following to your lfrc.

# custom file preview
set previewer "D:\\lf_scripts\\preview.bat"
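
For reference, here is a minimal sketch of the kind of work such a previewer does. It is not the actual lf_preview.py: the function name is hypothetical, the PDF part is left out, and Pillow is treated as optional. It relies on lf passing the path of the selected file as the first argument to the previewer and showing whatever the previewer prints.

```python
import os
import sys
import time


def describe_file(path):
    """Return preview lines: size, last modified date, and image dimensions if any."""
    info = os.stat(path)
    modified = time.strftime("%Y-%m-%d %H:%M:%S", time.localtime(info.st_mtime))
    lines = [f"Size: {info.st_size} bytes", f"Modified: {modified}"]
    try:
        from PIL import Image  # Pillow; optional in this sketch

        with Image.open(path) as image:
            width, height = image.size
        lines.append(f"Dimensions: {width}x{height}")
    except Exception:
        # Pillow is missing, or the file is not an image it can read.
        pass
    return lines


if __name__ == "__main__" and len(sys.argv) > 1:
    # The first argument is the file to preview; any further arguments are ignored.
    print("\n".join(describe_file(sys.argv[1])))
```

A wrapper like preview.bat would then only need to forward its arguments to python with the path of this script.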