Sunday, 15 April 2018

Duh - Saves you the trouble of correcting your command


For me, this is no longer just an expression but a command, thanks to the hack I have been working on for the past couple of days.

What's it about? Well, here it goes.

How often do we screw up commands on the terminal?
A typo, a syntax mistake or jumbled-up arguments. The command doesn't run, and then we spend time retyping it, making sure everything is in place this time.
Quite time-consuming, eh?

My laziness simply refused to put up with this. So I coded up a PowerShell cmdlet that does it for me.
Now if I mess up a command, I just have to type 'Duh' and the right command is displayed on the prompt for me to check and execute (press ENTER).

How does Duh operate internally?
Well, guess what. The answer lies in the "tries".
We have a trie of correct commands and find the closest match using Levenshtein distance.
In short, how do we figure out how close two strings are? Count the minimum number of letters you need to remove, insert or replace to get string 2 from string 1.
This is what is used to find the closest correct command to the input command.
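To make the idea concrete, here is a minimal sketch in Python (not the C# from the actual cmdlet) of a trie search that carries one row of the Levenshtein table down each branch. The names TrieNode, build_trie and closest_command are made up for illustration, and the sample commands are just examples.

```python
class TrieNode:
    def __init__(self):
        self.children = {}   # next character -> TrieNode
        self.word = None     # full command if one ends at this node


def build_trie(commands):
    """Insert every command into a character trie and return the root."""
    root = TrieNode()
    for cmd in commands:
        node = root
        for ch in cmd:
            node = node.children.setdefault(ch, TrieNode())
        node.word = cmd
    return root


def closest_command(root, typed):
    """Return (command, distance) with the smallest Levenshtein distance to typed."""
    best = (None, float("inf"))
    # Row for the empty prefix: turning "" into typed[:i] costs i insertions.
    first_row = list(range(len(typed) + 1))

    def walk(node, ch, prev_row):
        nonlocal best
        # row[i] = edit distance between typed[:i] and the prefix ending at this node
        row = [prev_row[0] + 1]
        for i in range(1, len(typed) + 1):
            row.append(min(
                row[i - 1] + 1,                          # skip a typed letter
                prev_row[i] + 1,                         # skip a trie letter
                prev_row[i - 1] + (typed[i - 1] != ch),  # match or replace
            ))
        if node.word is not None and row[-1] < best[1]:
            best = (node.word, row[-1])
        # Deeper rows never go below min(row), so prune hopeless branches.
        if min(row) < best[1]:
            for next_ch, child in node.children.items():
                walk(child, next_ch, row)

    for ch, child in root.children.items():
        walk(child, ch, first_row)
    return best


trie = build_trie(["git status", "git stash", "git commit -m", "get-history"])
print(closest_command(trie, "git comit -m"))  # ('git commit -m', 1)
```

Sharing prefixes in the trie means the table rows for "git " are computed once for all git commands, instead of once per candidate.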
Hey, but wait: what is the set of right commands that make up the tries? Where do we get these from?
For now I am using two things:
1. A list of standard git commands (with placeholders for params) scraped from a website.
2. The history of the currently open PowerShell session.
Why use the history of commands from PowerShell, you may ask.
Here's the catch: within a session there are a lot of commands that we use and reuse. Using the latest PowerShell history lets us easily correct the cases where we mess up a recently used command (along with the right set of param values), since those commands already exist in the trie.

How to get the history? Use "get-history". Lol.
But this command gives me the history of all the commands typed in PowerShell, including the wrong ones. And we for sure don't want to put those in the trie. After all, the program's motto is not: "We will return another wrong command in exchange for the wrong command you wrote, well, just because it's chic, and also beauty lies in imperfection".

There's a need to filter out these commands and pick only the right ones.
How to figure out which ones are right from the dumped output? I thought about drawing some heuristics from sample data.
Is there a pattern in the execution time of failed commands? Is the time fixed? Or is it the shortest?
Stupid questions, as I think of them now. There were definitely valid commands that executed much faster than failed ones. Also, the logic that fails a command is not the same for every command and takes its own time. Most importantly, I believe the execution time of the same command may also differ based on how busy your CPU is.
So, how to filter them out? Presently I am assuming that wrong commands are more likely to appear infrequently, especially because the chances of repeating a command wrongly in exactly the same manner are really low. Well, this is not very definitive, and for sure there are a good number of drawbacks to this approach: we might lose out on many potentially correct commands, and if a wrong command is repeated we will add it to the trie and offer it as a suggestion the next time, a totally corrupted experience.
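The frequency heuristic can be sketched in a few lines of Python (again, not the cmdlet's C#). The function name likely_correct and the min_count threshold of 2 are assumptions made for this example; the post does not say what cutoff the cmdlet uses.

```python
from collections import Counter


def likely_correct(history, min_count=2):
    """Keep commands that appear at least min_count times in the history.

    min_count is a hypothetical cutoff chosen for illustration.
    """
    counts = Counter(line.strip() for line in history if line.strip())
    # Counter keeps first-seen order, so the result follows the history order.
    return [cmd for cmd, n in counts.items() if n >= min_count]


history = ["git status", "gti status", "git status",
           "git pull", "git pull", "git psuh"]
print(likely_correct(history))  # ['git status', 'git pull']
```

Both drawbacks show up right away: a correct command typed only once is dropped, and a typo typed twice would slip through.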
But well, this is what I picked for now.
There's a better approach where we maintain a map of standard commands with their params and then find the closest match; this will also help us handle flags and new params gracefully. So I will be adding this soon.

So that is what runs in the cmdlet code (in C#).
To use this command you need to import the dll as a module and also set the alias duh for ease of use. I have added a PowerShell profile.ps1 to take care of that.
We need to place that file in the System32 PowerShell directory (C:\Windows\System32\WindowsPowerShell\v1.0) and we should be all set.

Haah, so that's what the hack was all about.
I did it only for PowerShell for now, since that is what I am using these days, but the code can be extended to work with other shells too; essentially only the IO logic will change.
If you are interested, you can check out the source code at:

There are a lot of issues and TODOs in this project; I will open them up on GitHub shortly!
Feel free to drop any comments/queries/suggestions, I would be glad to get the conversation going.

On a parallel note, how exactly does this differ from thefuck, a popular project on GitHub which seems to do the same?
thefuck is a "magnificent app" which is totally rule-based and quite robust, I believe. It handles cases like this: if your git push fails, it will set an upstream and then do git push, which is an additional layer of intelligence. But it's overkill for my specific use: first, it needs the rules to be explicitly coded up; second, it takes a decent amount of time evaluating the right matches and then generating the command. Duh isn't doing anything remotely resembling this; it's just 300 lines of code, not dependent on any third-party libs, that gives me the closest match from a trie. The distinguishing feature is that it takes the user's behaviour on the terminal into consideration. If I use a set of 10-15 commands in a PowerShell session, I am likely to mess up one of those commands; since I have a corresponding trie for them, I can get the right one faster without coding any explicit rule into the repo. Thus the number of commands you can check is not constrained.

See yu in the nex\t postt soooon!

See you in the next post soon! :)

Till then,
Happy hacking

1 comment:

  1. This sounds like a tool I'd like to try out… if only there was a zsh version.


Outreachy experience and application tips

One of the best experiences of my student life was to make it to this list: