Emacs


Getting Emacs 29 to Automatically Use Tree-sitter Modes

Recently, /u/casouri posted a guide to getting started with the new built-in tree-sitter capabilities for Emacs 29. In that post, they mention that there will be no automatic major-mode fallback for Emacs 29. That means I would have to use M-x python-ts-mode manually, or change the entry in auto-mode-alist to use python-ts-mode, in order to take advantage of the new tree-sitter functionality. Of course, that would still leave the problem of when the Python tree-sitter grammar isn’t installed, in which case python-ts-mode is going to fail.

To solve this issue, I wrote a very small package that adjusts the new major-mode-remap-alist variable based on what grammars are ready on your machine. If a language’s tree-sitter grammar is installed, it will use that mode. If not, it will use the original major mode. Simple as that!

For the impatient: `treesit-auto.el`

The package I wound up with is available on GitHub and MELPA as treesit-auto.el. So long as MELPA is on your package-archives list like this:

(add-to-list 'package-archives '("melpa" . "https://melpa.org/packages/") t)

Then you can use M-x package-refresh-contents followed by M-x package-install RET treesit-auto.
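
If you would rather have your init file take care of the installation, something like the following should do it. This is a minimal sketch, assuming MELPA has already been added to package-archives as shown above; it only uses the standard package.el functions package-installed-p, package-refresh-contents, and package-install.

```elisp
;; Sketch: install treesit-auto on first start-up if it is missing.
;; Assumes MELPA is already in `package-archives'.
(require 'package)
(unless (package-installed-p 'treesit-auto)
  (package-refresh-contents)
  (package-install 'treesit-auto))
```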

If you also like having a local copy of the git repository itself, then package-vc-install is a better fit:

M-x package-vc-install RET https://github.com/renzmann/treesit-auto.el

Then, in your configuration file:

(use-package treesit-auto
  :demand t
  :config
  (global-treesit-auto-mode))

See the README on GitHub for all the goodies you can put in the :config block.

Origins of `treesit-auto.el`

The recommendation in Yuan’s article was to use define-derived-mode along with treesit-ready-p. In the NEWS (C-h n), however, I noticed a new variable, major-mode-remap-alist, which at a glance appeared suitable for the same purpose. For my Emacs configuration, I had two things I wanted to accomplish:

Set all of the URLs for treesit-language-source-alist up front, so that I need only run M-x treesit-install-language-grammar RET python RET, instead of writing out everything interactively
Use the same list of available grammars to remap between tree-sitter modes and their default fallbacks

Initially, I tried Yuan’s suggested approach with define-derived-mode, but I didn’t want to repeat code for every major mode I wanted a fallback for. Generating the modes in a loop wound up unwieldy: expanding the names properly for the define-derived-mode macro was too challenging for my current skill level with Emacs Lisp, and the result cluttered the global namespace more than I liked when auto-completing through M-x. Instead, I decided to take a two-step approach:

Set up treesit-language-source-alist with the grammars I’ll probably use
Loop over the keys in this alist to define the association between a tree-sitter mode and its default fallback through major-mode-remap-alist

This makes the code we actually need to write a little simpler: an association like python-mode to python-ts-mode can be automatic because the two share a name, and a customizable alist can specify the edge cases, such as toml-ts-mode falling back to conf-toml-mode.

To start with, I just had this:

(setq treesit-language-source-alist
  '((bash "https://github.com/tree-sitter/tree-sitter-bash")
    (c "https://github.com/tree-sitter/tree-sitter-c")
    (cmake "https://github.com/uyha/tree-sitter-cmake")
    (common-lisp "https://github.com/theHamsta/tree-sitter-commonlisp")
    (cpp "https://github.com/tree-sitter/tree-sitter-cpp")
    (css "https://github.com/tree-sitter/tree-sitter-css")
    (csharp "https://github.com/tree-sitter/tree-sitter-c-sharp")
    (elisp "https://github.com/Wilfred/tree-sitter-elisp")
    (go "https://github.com/tree-sitter/tree-sitter-go")
    (go-mod "https://github.com/camdencheek/tree-sitter-go-mod")
    (html "https://github.com/tree-sitter/tree-sitter-html")
    (js . ("https://github.com/tree-sitter/tree-sitter-javascript" "master" "src"))
    (json "https://github.com/tree-sitter/tree-sitter-json")
    (lua "https://github.com/Azganoth/tree-sitter-lua")
    (make "https://github.com/alemuller/tree-sitter-make")
    (markdown "https://github.com/ikatyang/tree-sitter-markdown")
    (python "https://github.com/tree-sitter/tree-sitter-python")
    (r "https://github.com/r-lib/tree-sitter-r")
    (rust "https://github.com/tree-sitter/tree-sitter-rust")
    (toml "https://github.com/tree-sitter/tree-sitter-toml")
    (tsx . ("https://github.com/tree-sitter/tree-sitter-typescript" "master" "tsx/src"))
    (typescript . ("https://github.com/tree-sitter/tree-sitter-typescript" "master" "typescript/src"))
    (yaml "https://github.com/ikatyang/tree-sitter-yaml")))

At this point, I can just use M-x treesit-install-language-grammar RET bash to get the Bash grammar, and similarly for other languages.
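
Installing the grammars one at a time gets old quickly, so here is a minimal sketch for fetching everything in the list at once. It assumes Emacs 29 built with tree-sitter support, git and a C/C++ compiler on the machine, and that each key in the alist names the grammar exactly as tree-sitter expects it; any entry where that doesn't hold would need adjusting first.

```elisp
;; Sketch: install every grammar named in `treesit-language-source-alist'
;; that is not loadable yet.
(require 'treesit nil t)
(when (and (fboundp 'treesit-available-p) (treesit-available-p))
  (dolist (entry treesit-language-source-alist)
    (let ((lang (car entry)))
      ;; Skip grammars that are already installed and loadable.
      (unless (treesit-language-available-p lang)
        (treesit-install-language-grammar lang)))))
```

Each installation clones and compiles a repository, so expect this to take a while the first time it runs.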

Then, I made an alist of the “weird” cases:

(setq treesit-auto-fallback-alist
      '((toml-ts-mode . conf-toml-mode)
        ;; I don't actually know if the future tree-sitter mode for HTML will be
        ;; called html-ts-mode or mhtml-ts-mode, but if it's the former I'd include this
        (html-ts-mode . mhtml-mode)
        ;; See the note in their README: https://github.com/emacs-typescript/typescript.el#a-short-note-on-development-halt
        (typescript-ts-mode . nil)
        (tsx-ts-mode . nil)))

Setting the CDR to nil explicitly means I didn’t want any type of fallback to be attempted whatsoever for a given tree-sitter mode, even if something similarly named might be installed.
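
The difference between "no entry" and "an entry whose CDR is nil" is what assq lets us detect. The helper below is purely illustrative (my/fallback-kind is a hypothetical name, not part of the package); it just spells out the three cases the alist can express.

```elisp
(defun my/fallback-kind (ts-mode fallbacks)
  "Classify the entry for TS-MODE in the alist FALLBACKS.
Hypothetical helper, for illustration only."
  (let ((entry (assq ts-mode fallbacks)))
    (cond ((null entry) 'guess-from-name)    ; no entry: derive foo-mode from foo-ts-mode
          ((cdr entry) 'explicit-fallback)   ; entry names a mode: use that mode
          (t 'no-fallback))))                ; entry with nil CDR: never fall back

;; The three cases, using a shortened version of the alist above.
(let ((fallbacks '((toml-ts-mode . conf-toml-mode)
                   (tsx-ts-mode . nil))))
  (list (my/fallback-kind 'toml-ts-mode fallbacks)
        (my/fallback-kind 'tsx-ts-mode fallbacks)
        (my/fallback-kind 'python-ts-mode fallbacks)))
```

Evaluating the last form should give (explicit-fallback no-fallback guess-from-name).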

Finally, I had a simple loop that constructed the symbols for the base mode and the tree-sitter mode via intern and concat, and checked whether the tree-sitter version is available through treesit-ready-p. If it is, we remap the base mode to the tree-sitter one in major-mode-remap-alist. If it isn’t ready, then we do the opposite: remap the tree-sitter mode to the base version.

(dolist (language-source treesit-language-source-alist)
  (let* ((name (car language-source))
         (name-ts-mode (intern (concat (symbol-name name) "-ts-mode")))
         (fallback-assoc (assq name-ts-mode treesit-auto-fallback-alist))
         (fallback-name (cdr fallback-assoc))
         (name-mode (or fallback-name
                        (intern (concat (symbol-name name) "-mode"))))
         (name-mode-bound-p (fboundp name-mode))
         (skip-remap-p (and fallback-assoc
                            (not (cdr fallback-assoc)))))
    (and (not skip-remap-p)
         (fboundp name-ts-mode)
         (if (treesit-ready-p name t)
             (add-to-list 'major-mode-remap-alist `(,name-mode . ,name-ts-mode))
           (when name-mode-bound-p
             (add-to-list 'major-mode-remap-alist `(,name-ts-mode . ,name-mode)))))))

Of course, the actual code has a bit more wrapped around it, but the core idea is more or less the same.
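
For a single language, the same idea fits in a few lines. This is a sketch of the concept rather than the package's code, and it assumes the grammar is registered under the name python.

```elisp
;; Sketch: the remapping idea for one language only.
(require 'treesit nil t)
(when (boundp 'major-mode-remap-alist)   ; the variable is new in Emacs 29
  (if (and (fboundp 'treesit-ready-p) (treesit-ready-p 'python t))
      ;; Grammar is ready: requests for python-mode get python-ts-mode.
      (add-to-list 'major-mode-remap-alist '(python-mode . python-ts-mode))
    ;; Grammar is missing: requests for python-ts-mode get python-mode.
    (add-to-list 'major-mode-remap-alist '(python-ts-mode . python-mode))))
```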

The `Completions` Buffer Gets a Big Upgrade in Emacs 29

There’s been a lot of talk about how eglot and tree-sitter will be distributed with Emacs 29, but I’ve seen less buzz around the new functionality coming to the vanilla ∗Completions∗ buffer. Now, I’ve been an ardent vertico + orderless + marginalia + corfu user since seriously picking up Emacs over the summer, and when initially looking for options I found Prot’s MCT pretty alluring. I didn’t choose it since he had already decided to discontinue development given upcoming changes in Emacs 29, and as of writing even he opted for vertico and corfu.

There is still that tempting, bitter fruit on the horizon though - maximizing everything I can out of the vanilla Emacs experience. Getting to that mythical “vanilla extract” that keeps my muscle memory nearly entirely intact between emacs -Q and my config (check out “Goals” in my .emacs.d to see the reasoning behind why I would want this).

Now that treesit.el, use-package, and eglot are all merged into the emacs-29 branch, I finally decided to give our good old friend the ∗Completions∗ buffer another try, so that you don’t have to.

(Some verbiage below is taken directly from C-h n (view-emacs-news))

New ‘visible’ and ‘always’ values for ‘completion-auto-help’

There are two new values to control the way the “∗Completions∗” buffer behaves after pressing a ‘TAB’ if completion is not unique.

The (old) default value t always hides the completion buffer after some completion is made.

(setq completion-auto-help t)

[Screen recording: auto-help-t.gif]

The value ‘always’ updates or shows the ∗Completions∗ buffer after any attempt to complete, including the first time we press TAB. Compared with the recording above, notice that the buffer pops up as soon as I complete ~/.emacs.d/. Before, I had to start another completion by typing tra<TAB>. Also, after completing transient/, the buffer once again updates with the contents of that directory.

(setq completion-auto-help 'always)

[Screen recording: auto-help-always.gif]

The value ‘visible’ is like ‘always’, but only updates the completions if they are already visible. The main difference in this one is that we don’t get the ∗Completions∗ buffer on the first TAB for ~/.emacs.d/:

(setq completion-auto-help 'visible)

[Screen recording: auto-help-visible.gif]

If your goal is to reduce visual noise because you already know how a chain of TABs is going to complete, then ‘visible’ seems like a good option.

The ∗Completions∗ buffer can now be automatically selected.

This was my biggest gripe with ∗Completions∗ and what made it downright unusable for completion-at-point. Here’s what the current behavior looks like with completion in a buffer:

(setq completion-auto-select nil)

In the minibuffer, we’ve always had M-v to switch to ∗Completions∗, but there was no analogue for completion-in-region. Now, in Emacs 29, we can set completion-auto-select to either t or second-tab to enable automatic selection of the “∗Completions∗” buffer.

(setq completion-auto-select t)

If the value is ‘second-tab’, then the first TAB will display “∗Completions∗”, and the second one will switch to the “∗Completions∗” buffer.

(setq completion-auto-select 'second-tab)

With ‘second-tab’, I can use the “∗Completions∗” buffer a lot like how I would use corfu: type a bit, request completion with TAB, examine the list, keep typing to narrow the candidates, and request completion again. If I see the option I like, I just hit TAB a few times to get it.

New commands for navigating completions from the minibuffer.

M-<up> and M-<down> for minibuffer-previous-completion and minibuffer-next-completion
M-RET to choose active candidate
C-u M-RET to insert active candidate without exiting minibuffer
C-x <up> (minibuffer-complete-history) is like minibuffer-complete but completes on the history items instead of the default completion table.
C-x <down> (minibuffer-complete-defaults) is like minibuffer-complete, but completes on the default items instead of the completion table.

The first two also work for completion-at-point (in-buffer completion).

[Screen recording: completion-nav-commands.gif]

Some may find the arrow keys an unfortunate choice, though, and bind something more convenient:

;; Up/down when completing in the minibuffer
(define-key minibuffer-local-map (kbd "C-p") #'minibuffer-previous-completion)
(define-key minibuffer-local-map (kbd "C-n") #'minibuffer-next-completion)

;; Up/down when completing in a normal buffer
(define-key completion-in-region-mode-map (kbd "C-p") #'minibuffer-previous-completion)
(define-key completion-in-region-mode-map (kbd "C-n") #'minibuffer-next-completion)

My apologies to Mohamed Suliman, since I was also not able to figure out a fix for eshell that permits the use of M-<up> and M-<down> with M-RET. The issue there, it seems, is that eshell uses its own pcomplete instead of completion-at-point, which comes from minibuffer.el. I have, however, had success simply using TAB and BACKTAB with RET, by setting completion-auto-select to 'second-tab, as shown above.

New user option ‘completions-sort’.

Much like how oantolin’s live-completions gave us a way to sort candidates in ∗Completions∗, we now have a built-in method for specifying the sorting function. I took inspiration from Prot’s MCT documentation here to put candidates I use frequently near the top, followed by the length of their name.

(defun renz/sort-by-alpha-length (elems)
  "Sort ELEMS first alphabetically, then by length."
  (sort elems (lambda (c1 c2)
                (or (string-version-lessp c1 c2)
                    (< (length c1) (length c2))))))

(defun renz/sort-by-history (elems)
  "Sort ELEMS by minibuffer history.
Use `renz/sort-by-alpha-length' if no history is available."
  (if-let ((hist (and (not (eq minibuffer-history-variable t))
                      (symbol-value minibuffer-history-variable))))
      (minibuffer--sort-by-position hist elems)
    (renz/sort-by-alpha-length elems)))

(defun renz/completion-category ()
  "Return completion category."
  (when-let ((window (active-minibuffer-window)))
    (with-current-buffer (window-buffer window)
      (completion-metadata-get
       (completion-metadata (buffer-substring-no-properties
                             (minibuffer-prompt-end)
                             (max (minibuffer-prompt-end) (point)))
                            minibuffer-completion-table
                            minibuffer-completion-predicate)
       'category))))

(defun renz/sort-multi-category (elems)
  "Sort ELEMS per completion category."
  (pcase (renz/completion-category)
    ('nil elems) ; no sorting
    ('kill-ring elems)
    ('project-file (renz/sort-by-alpha-length elems))
    (_ (renz/sort-by-history elems))))

(setq completions-sort #'renz/sort-multi-category)

Other Niceties

completions-max-height limits the height of the “∗Completions∗” buffer
completions-header-format is a string to control the heading line to show in the “∗Completions∗” buffer before the list of completions
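
Both options are plain variables, so trying them out takes two lines. The values here are just examples I picked for illustration, not recommendations from the NEWS file.

```elisp
;; Sketch: example values for the two options above (Emacs 29+).
(setq completions-max-height 15)       ; cap the *Completions* window at 15 lines
(setq completions-header-format nil)   ; nil drops the heading line entirely
```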

Do We Stick With Vanilla Extract?

Now the fun part - let’s tally pros and cons to see if I should abandon everything for the Vanilla behavior:

property	score
Consistent minibuffer + CAP	+1
Vanilla GUI + TTY support	+1
No marginalia for sole completion	-0.5
Extra key press to cycle/complete	-0.5
Candidates not buffered until requested	-2
Eyes shift focus to another part of screen for CAP	-0.5
Total	-1.5

In my typical day, I need to have a working TTY and GUI version of Emacs, so when something just works for both, that’s a +1 for me. Corfu does have corfu-terminal, but it’s maintained separately. Also, having a consistent interface for both the minibuffer and completion-at-point shrinks the configuration domain, making it easier to maintain my config over time.

Unfortunately, in the case that there’s only one completion candidate, marginalia isn’t triggered, so I don’t get to see a key binding or flavor text alongside the candidate I choose. Vanilla Emacs will remind me about what key combination I could have used, which I can check any time with C-h e (the ∗Messages∗ buffer), and I can use C-h f directly from the minibuffer, so this only gets -0.5. The fact that I need extra keystrokes compared to something like Corfu’s Tab-N-Go is an annoyance, but just requires a bit of muscle memory change. The real impasse here, though, is that candidates aren’t shown until requested. I think Prot summed it up best here:

Vertico has official extensions which can make it work exactly like MCT without any of MCT’s drawbacks. These extensions can also expand Vertico’s powers such as by providing granular control over the exact style of presentation for any given completion category (e.g. display Imenu in a separate buffer, show the switch-to-buffer list horizontally in the minibuffer, and present find-file in a vertical list—whatever the user wants).

So will I stick with just ∗Completions∗? No, probably not. But these changes do put the default completion system squarely in the “usable” category, which I’m not sure I could have said before Emacs 29. I will give it an honest chance to see just how far I can push it, much in the spirit of MCT, before switching Vertico and Corfu back on.

Moving My Emacs Configuration to a Literate Programming Document

I’ve got a (relatively) stable version of my Emacs configuration as a literate document now. It’s easy to read either on my GitHub or my website. The website version may lag behind my GitHub version a bit, but they should be pretty close. Many thanks to the maintainers of ox-hugo for making it possible.

Virtual Environments with Eglot, Tramp, and Pyright

Motivation

My most reliable setup for developing Python projects on remote hosts with LSP support so far has been with eglot and pyright. I’ve also tried lsp-mode with pyright, and both lsp-mode and eglot with the python-lsp-server, but I’ve landed on eglot + pyright for a few reasons:

eglot requires zero configuration to work over Tramp, unlike lsp-mode.
Fewest number of Tramp hangs. This could just be a symptom of my particular setup, though.
eglot will have built-in support in future Emacs versions. This may or may not be worth a damn to other Emacs users.
pyright has been strictly faster at error checking and diagnostic updates than python-language-server on the machines I’m using.

One hiccup remained though: pyright is typically a system or user installation, not something you install per virtual environment. Getting pyright to see the virtual environment of my choosing and correctly report which dependencies are installed was a bit of a hassle. My favorite solution so far has been to configure the virtual environment through the pyrightconfig.json file at the root of my project, and just have this file ignored by git. Typically, pyrightconfig.json looks like this:

{
    "venvPath": "/absolute/path/to/dir/",
    "venv": ".venv"
}

I’m pretty happy with the other default configurations for pyright, so I leave those be, and just configure the virtual environment path this way. What was annoying me, though, is that I’d need to write out this absolute path for each machine I clone a project into, since relative paths and shortcuts using ~ aren’t supported. Much better if we can just have Emacs do it for us.

In the spirit of other Emacs/Python tools like pythonic and pyvenv for activating virtual environments, I wanted something that would just prompt for a directory using completing-read, and then populate the contents of pyrightconfig.json automatically based on my selection.

Getting Functions That Write `pyrightconfig.json`

Edit 2022-11-20: Thanks to Mickey Petersen of Mastering Emacs for pointing out that json-encode exists. I originally had my own function pyrightconfig--json-contents here, but I’ve modified the function below to use this built-in version instead.

We really just need to do three things:

Prompt for a directory that houses a Python virtual environment
Break the result into an absolute parent path + base name, cleaning any Tramp prefix in the process
Write the JSON payload built by json-encode from the previous result to a file in the version control root.

It’s worth mentioning that we must put this file in the VC root, otherwise eglot just won’t pick it up. For my purposes, the VC system will always be git, so I’m going to make an assumption here and use vc-git-root instead of something more generic.

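(defun pyrightconfig-write (virtualenv)
  "Write pyrightconfig.json at the git root, pointing pyright at VIRTUALENV.
VIRTUALENV is the virtual environment directory, e.g. /path/to/.venv/."
  (interactive "DEnv: ")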

  (let* (;; file-truename and tramp-file-local-name ensure that neither `~' nor
         ;; the Tramp prefix (e.g. "/ssh:my-host:") wind up in the final
         ;; absolute directory path.
         (venv-dir (tramp-file-local-name (file-truename virtualenv)))

         ;; Given something like /path/to/.venv/, this strips off the trailing `/'.
         (venv-file-name (directory-file-name venv-dir))

         ;; Naming convention for venvPath matches the field for
         ;; pyrightconfig.json.  `file-name-directory' gets us the parent path
         ;; (one above .venv).
         (venvPath (file-name-directory venv-file-name))

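         ;; Grabs just the `.venv' off the end of the venv-file-name.
         ;; `file-name-nondirectory' keeps any dots in the name, whereas
         ;; `file-name-base' would turn something like "py3.11" into "py3".
         (venv (file-name-nondirectory venv-file-name))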

         ;; Eglot demands that `pyrightconfig.json' is in the project root
         ;; folder.
         (base-dir (vc-git-root default-directory))
         (out-file (expand-file-name "pyrightconfig.json" base-dir))

         ;; Finally, get a string with the JSON payload.
         (out-contents (json-encode (list :venvPath venvPath :venv venv))))

    ;; Emacs uses buffers for everything.  This creates a temp buffer, inserts
    ;; the JSON payload, then flushes that content to final `pyrightconfig.json'
    ;; location
    (with-temp-file out-file (insert out-contents))))
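
To see what the path handling and json-encode produce without writing any file, here is a small sketch. The helper my/venv-path-parts and the /home/me/project/ path are hypothetical, made up for illustration, and the Tramp handling is left out.

```elisp
(require 'json)

(defun my/venv-path-parts (virtualenv)
  "Split VIRTUALENV into a (:venvPath PARENT :venv NAME) plist.
Hypothetical helper mirroring the path handling above, minus Tramp."
  (let ((venv-file-name (directory-file-name virtualenv)))
    (list :venvPath (file-name-directory venv-file-name)
          ;; `file-name-nondirectory' keeps dots, so "py3.11" stays intact.
          :venv (file-name-nondirectory venv-file-name))))

;; Encode the plist, then read it back to see which keys land in the file.
(let ((parsed (json-read-from-string
               (json-encode (my/venv-path-parts "/home/me/project/.venv/")))))
  (list (alist-get 'venvPath parsed)
        (alist-get 'venv parsed)))
```

The last form should evaluate to a list holding "/home/me/project/" and ".venv", which are the two values pyright reads from the file.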

Here’s a quick demo where I interactively choose a virtual environment directory, write the pyrightconfig.json, launch eglot, and use M-. to leverage the LSP’s jump-to-definition of a library, then show that the library we jumped to is indeed inside the virtual environment.

Follow-ups

Feel free to take this package and modify it to suit your needs. Over time I might make some modifications to it:

Maybe integrate with the variety of activate functions? So activating or setting a venv root for use with run-python automatically sets this.
Support other VC roots than just git
I’d love to get to VSCode-like intelligence about common venv locations and just prompt for those automatically through completing-read, instead of going through the pathing processing myself. Maybe that would become a function like pyrightconfig-suggest.
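
For the second item, one possible direction is the backend-agnostic vc-root-dir. This is only a sketch: my/pyrightconfig-root is a hypothetical name, and falling back to default-directory outside version control is a choice I made for the example.

```elisp
(defun my/pyrightconfig-root ()
  "Return the directory that should hold pyrightconfig.json.
Hypothetical helper: asks whichever VC backend is in use, then falls
back to `default-directory' when no VC root is found."
  (or (vc-root-dir)
      default-directory))
```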

Python in Emacs: Vanilla is a powerful flavor

Intro

There are a lot of great guides on getting set up with Python in Emacs. Many of them have titles like “Emacs as a Python IDE” and start off by installing pyvenv for virtual environment management, eglot or lsp-mode for autocomplete/error checking, and maybe a host of other non-python things, like the helm or projectile packages.

This is not that guide.

This guide is for picky @#$%!s like me who want to exhaust every builtin capability before reaching out to external dependencies. Dependencies that, in turn, I will also have to learn and manage. Once I really understand what pyvenv is solving, then, and only then, will I add it to my package-selected-packages.

Despite the excellent swath of materials both new and old on how to get IDE-like performance for Python out of Emacs, the collected materials on just running “vanilla extract” are fairly scant. The builtin python.el documentation is thorough and the keybindings easily discoverable, but not all documentation is collated into a single place. This guide started out as just my working notes as I began primarily working in emacs for my Python projects, and has grown into a workflow guide using nothing but the builtin capabilities of Emacs 28.1+. With that in mind, the examples and walkthroughs presented here are designed for emacs -q - i.e. starting emacs without any user configuration or your distribution’s default.el.

Editing

Let’s get our feet wet by bopping around some Python buffers first. I’m going to start a new Python file with C-x C-f, naming it editing.py. Rather than the typical one-line “Hello, world!”, I’ll add a couple of functions and a “main” section right away.

# These functions are a little basic and silly right now, but we'll use
# them to showcase some Emacs features later on.
def hello_text():
    """Just gives back 'Hello'"""
    return "Hello"


def world_text():
    """Just gives back 'world!'"""
    return "world!"


if __name__ == "__main__":
    # Emacs 28.1+ has f-string syntax highlighting built in
    print(f"{hello_text()}, {world_text()}")

By visiting this file, Emacs automatically goes into python-mode, which turns on a lot of Python-specific functionality. If you’re impatient like me and want to see everything that’s available right away, I’d start with C-c C-h from the editing.py buffer to see key commands specific to python-mode, and also use C-h a python to see every command involving the word “python” in some way. Out of the box we also get syntax highlighting, including within f-strings.

Useful `C-c` commands

Emacs typically has commands that are specific to the active major mode bound to C-c C-<letter>. What each <letter> does will depend on the buffer you’re currently in and what major mode is active. In our case, that’s python-mode, which has a lot of handy shortcuts already mapped out. For any of the keyboard shortcuts you can always use C-h k, or C-h f for the function names (prefixed by M-x below) to get the official documentation.

`C-c C-p` or `M-x run-python` to start a python REPL

This boots up what Emacs calls an “inferior Python shell”. “Inferior” here just means that Python is running as a subprocess of Emacs; not that there’s some other, “superior” method of running a Python process. If you need to control the exact command Emacs runs to start the shell, you can use the universal C-u prefix before either C-c C-p or M-x run-python to edit the command Emacs runs. Based on the previous article, what I’m frequently doing is holding down the Ctrl key with my left little finger, then rapidly typing u, c, and p to get C-u C-c C-p, bringing up a minibuffer prompt like this:

Run Python: python3 -i█

Where █ is point (my cursor). I then use C-a to move point back to the start and add a poetry run:

Run Python: poetry run █python3 -i

Emacs is typically smart enough to figure out what to do even if we leave off the -i, but generally it’s good to leave it in there.
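
If the same command is needed every time, the prompt can be skipped by setting the interpreter in the init file. A minimal sketch using the two standard python.el options; the values shown are examples and may match your defaults already.

```elisp
;; Sketch: persist the interpreter choice instead of editing the prompt
;; on every `C-u C-c C-p'.
(setq python-shell-interpreter "python3"
      python-shell-interpreter-args "-i")
```

The same two variables can also be set per project in a .dir-locals.el file.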

`C-c C-z` jumps to python REPL if already running

Once the REPL is running, this is a very handy one for swapping back and forth between a file I’m actively editing and a running Python process.

`C-c C-{c,e,r}` for sending chunks to the REPL

A handy complement to C-c C-z, these commands take pieces of Python that I’m actively editing and send them to the Python buffer all at once: C-c C-c sends the whole buffer, C-c C-e the statement at point, and C-c C-r the active region.

`C-c C-v` or `M-x python-check`

`C-c C-t ...` or `python-skeleton-...`

Using C-c C-t d and C-c C-t c it’s easy to insert new def and class statements (think t for “template”, d for “def”, and c for “class”). After invoking one of these, Emacs will guide us through the process of filling out each part needed to define a new function or class via the minibuffer. Using C-g at any point while editing the template will revert the buffer back to its original state, as if you never started filling out the skeleton.

# editing.py
# --snip--
# Here we use `C-c C-t d` and follow the prompts to design a new
# function signature.
def whatever(my_string: str = hello_text(), my_integer: int = 0):
    """Whatever, man"""
    return f"{my_string}, {my_integer}"

# Next, `C-c C-t c` to make a new class
class MyGuy:
    """My guy is ALWAYS there for me"""
    pass
# --snip-- "__main__"

`C-c C-j` or `M-x imenu`

The nimble, builtin imenu is a way to quickly navigate between major symbol definitions in the current buffer - especially those off screen. In our editing.py we now have three functions, hello_text(), world_text(), and whatever(), and one class MyGuy. If we use C-c C-j, a minibuffer menu like this comes up:

1/5 Index item: █
*Rescan*
MyGuy.(class)
whatever.(def)
world_text.(def)
hello_text.(def)

My minibuffer displays a vertical preview of the options because I’ve set (fido-mode) and (fido-vertical-mode) in my init.el, both of which are included in Emacs 28.1 or later. Then, if I partially type out a result the list will filter down to possible completions:

1/1 Index item: My█
MyGuy.(class)

imenu is very, very handy across Emacs, not just for Python, so it’s worth trying in a variety of major modes.

Running

Now it’s time to actually start executing some code. Before getting to all the complexity of virtual environments, we’ll start simply by just invoking the system Python for our script. Once that feels comfortable, we’ll throw in all the venv goodies.

As a script with `M-x compile`

Compilation mode has built-in error parsing, so it’s the best way to run a script for real if we want to quickly navigate any traceback messages that come up. Conversely, the M-& async shell command does not have error parsing, so it’s not the right tool for launching processes we have to debug. Same goes for booting up a shell and running Python from there. Taking our script from the previous section, if we run M-x compile and give it an argument of python3 editing.py, up pops the *compilation* buffer, with the starting time, output of our program, and finish time.

-*- mode: compilation; default-directory: "~/repos/renzmann.github.io/content/posts/006_emacs_2_python/" -*-
Compilation started at Sun Aug 14 13:50:39

python3 editing.py
Hello, world!

Compilation finished at Sun Aug 14 13:50:39

Now, let’s try a different script, with an error in it:

# hello_error.py
print("Not an error yet!")
fdafdsafdsafdsa
print("Shouldn't make it here...")

Now, M-x compile will error out:

-*- mode: compilation; default-directory: "~/repos/renzmann.github.io/content/posts/006_emacs_2_python/" -*-
Compilation started at Sun Aug 14 13:53:26

python3 hello_error.py
Not an error yet!
Traceback (most recent call last):
  File "/home/robb/repos/renzmann.github.io/content/posts/006_emacs_2_python/hello_error.py", line 4, in <module>
    fdafdsafdsafdsa
NameError: name 'fdafdsafdsafdsa' is not defined

Compilation exited abnormally with code 1 at Sun Aug 14 13:53:26

Emacs will parse the error message, so that after “compiling”, we can use M-g M-n and M-g M-p to move between error messages, or just click the link provided by the *compilation* buffer directly.
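
Typing the command each time is optional. Here is a minimal sketch that makes M-x compile default to running the file being visited; my/python-default-compile-command is a hypothetical name, and python3 being on the PATH is an assumption.

```elisp
(defun my/python-default-compile-command ()
  "Make `M-x compile' default to running the current file with python3.
Hypothetical hook function, for illustration."
  (when buffer-file-name
    (setq-local compile-command
                (concat "python3 "
                        (shell-quote-argument
                         (file-name-nondirectory buffer-file-name))))))

(add-hook 'python-mode-hook #'my/python-default-compile-command)
```

M-x compile still shows the command in the minibuffer first, so it can be edited before running.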

If just parsing Python tracebacks doesn’t excite you, mypy is also supported out of the box. Assuming mypy is already installed, M-x compile with mypy hello_error.py as the command results in this:

-*- mode: compilation; default-directory: "~/repos/renzmann.github.io/content/posts/006_emacs_2_python/" -*-
Compilation started at Sun Aug 14 14:02:03

.venv/bin/mypy hello_error.py
hello_error.py:4: error: Name "fdafdsafdsafdsa" is not defined
Found 1 error in 1 file (checked 1 source file)

Compilation exited abnormally with code 1 at Sun Aug 14 14:02:04

The hello_error.py:4: error: ... message will be a functional link, just as before. mypy is much more suitable for general error-checking though, so as scripts (and bugs) grow, the M-x compile command can keep up:

# errors.py
import typing

import requests
import aaaaaaa

foo
print(typing.fdafdsafdsafdsafdsafdsafdsa)


def whatever(x: str) -> str:
    """Here's a docstring!"""
    return x + 1

M-x compile RET mypy errors.py

-*- mode: compilation; default-directory: "~/repos/renzmann.github.io/content/posts/006_emacs_2_python/" -*-
Compilation started at Sun Aug 14 14:06:55

.venv/bin/mypy errors.py
errors.py:6: error: Cannot find implementation or library stub for module named "aaaaaaa"
errors.py:6: note: See https://mypy.readthedocs.io/en/stable/running_mypy.html#missing-imports
errors.py:8: error: Name "foo" is not defined
errors.py:9: error: Module has no attribute "fdafdsafdsafdsafdsafdsafdsa"
errors.py:14: error: Unsupported operand types for + ("str" and "int")
Found 4 errors in 1 file (checked 1 source file)

Compilation exited abnormally with code 1 at Sun Aug 14 14:06:55

Now, we can use M-g M-n and M-g M-p to quickly navigate between the errors in our code, even after navigating away from the original errors.py buffer - Emacs will remember what’s going on in the *compilation* buffer so we can hop all around the code base while addressing errors one at a time.

Interactively with the Python shell

python-mode centers heavily around the use of an active, running Python session for some of its features, as we’ll see in the next section. Its documentation recommends regular use of C-c C-c, which sends the entire buffer to the active inferior Python process. That means actually executing Python code, which may feel a bit dangerous for those of us who grew up with static analysis tools. So, to avoid accidentally kicking off our whole script, the first thing to do is make sure the main part of our program is properly ensconced in an if __name__ == "__main__" block.

# editing.py
# --snip--
if __name__ == "__main__":
    print(f"{hello_text()}, {world_text()}!")
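For reference, a complete file following this pattern might look like the sketch below. The bodies of hello_text and world_text are assumptions on my part, since the snippet above elides them, and the main function is just one way to organize the guarded code.

```python
# editing.py (sketch; the helper bodies below are assumed, not the original ones)


def hello_text() -> str:
    """Return the greeting half of the message."""
    return "Hello"


def world_text() -> str:
    """Return the subject half of the message."""
    return "world"


def main() -> None:
    print(f"{hello_text()}, {world_text()}!")


# Only runs when the file is executed as a script, not when its
# definitions are merely loaded into another session.
if __name__ == "__main__":
    main()
```

With this layout, loading the definitions into a running session makes hello_text and world_text available without printing anything, and the script still works when run directly.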

Code Completion

Emacs uses the currently running *Python* process for looking up symbols to complete. As such, python.el recommends using C-c C-c to send the entire buffer’s contents to the Python shell periodically. Code inside an if __name__ == "__main__" block does not execute when using C-c C-c. To send all code in the current buffer, including the __main__ block, we must instead use C-u C-c C-c.

Another awkward default in Emacs is that what we typically know of as “tab-complete” is bound to M-TAB, or the equivalent C-M-i (C-i and TAB are the same thing). On most Windows and Linux desktops, Alt+Tab changes the active window, and C-M-i is much too cumbersome to be a reasonable completion shortcut. I prefer just being able to hit TAB to invoke completion-at-point, so I use this snippet in my init.el:

;; init.el
;; Use TAB in place of C-M-i for completion-at-point
(setq tab-always-indent 'complete)

Now to demonstrate this new completion power. In our python file editing.py, I know we have a function called hello_text(). Within the main block, I might have been typing something that looked like this:

if __name__ == "__main__":
    print(f"{hell█

Where █ is point. Attempting a completion-at-point using C-M-i (or just TAB as I have re-bound it above) will yield … nothing. Maybe the indentation cycles, or it says “No match”, or just - no response. What we require is a running inferior Python process, which will look up completion symbols. After booting up Python with C-c C-p and sending all the current buffer contents with C-c C-c, hitting TAB completes the hell into hello_text:

if __name__ == "__main__":
    print(f"{hello_text█

In the case that the completion is ambiguous, a *completions* buffer will pop up, prompting for input on how to continue. Another nice thing about this completion method is that it respects your completion-styles setting. Personally, I keep mine globally set to include the flex style, which closely mimics fuzzy matching styles like you get in VSCode, JetBrains, or fzf:

;; init.el
(setq completion-styles '(flex basic partial-completion emacs22))

This allows me to type something like hltx, hit TAB and it completes to hello_text.
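If the idea behind flex matching is unfamiliar, the core of it is subsequence matching: the typed characters must all appear in the candidate, in order, but not necessarily next to each other. The sketch below is only an approximation in Python with a made-up flex_match helper. It is not how Emacs implements the flex style, and it ignores the scoring Emacs uses to rank candidates.

```python
import re


def flex_match(pattern: str, candidate: str) -> bool:
    """Return True when the characters of pattern appear in candidate, in order."""
    # "hltx" becomes the regex "h.*l.*t.*x"
    regex = ".*".join(re.escape(char) for char in pattern)
    return re.search(regex, candidate) is not None


# Hypothetical completion candidates
candidates = ["hello_text", "world_text", "help"]
matches = [name for name in candidates if flex_match("hltx", name)]
print(matches)  # ['hello_text']
```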

Debugging

If by running our Python code we encounter the breakpoint() builtin, Emacs will automatically break into pdb/ipdb (depending on your PYTHONBREAKPOINT environment variable), jump to the breakpoint in the code, and put an arrow at the next line to execute.
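The reason PYTHONBREAKPOINT matters is that breakpoint() does not call pdb directly. It calls sys.breakpointhook, and the default hook consults that environment variable to decide which debugger to start. Here is a small sketch of that indirection, where recording_hook and buggy_sum are names I made up for the demonstration. The hook is swapped out so the example runs without stopping in a debugger.

```python
import sys

calls = []


def recording_hook(*args, **kwargs):
    """Stand-in for a debugger: just note that a breakpoint was reached."""
    calls.append("hit")


def buggy_sum(values):
    total = 0
    for value in values:
        if value < 0:
            breakpoint()  # dispatches to whatever sys.breakpointhook is set to
        total += value
    return total


sys.breakpointhook = recording_hook
result = buggy_sum([1, -2, 3])
# Restore the default hook, which is the one that honors PYTHONBREAKPOINT
sys.breakpointhook = sys.__breakpointhook__

print(calls, result)
```

In normal use you would leave the hook alone and set PYTHONBREAKPOINT instead, for example to ipdb.set_trace when ipdb is installed, or to 0 to disable breakpoints entirely.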

`M-x pdb`

M-x pdb simply populates the command to run with python -m pdb. The command can be configured with the variable gud-pdb-command-name.

The `poetry` + `pyright` stack

The stack I use most frequently (for now) consists of:

python3.10 as the Python runtime
poetry for dependency and environment management[fn:poetry]
pyright for error checking[fn:pyright]
emacs for everything else

Each component should, in theory, be easy to replace. That is, if I want conda as a package manager and flake8 or mypy for linting/type checking, it should be easy to do a drop-in replacement for them.

For those who haven’t heard the good news of poetry, it takes care of a lot of headaches that every pythonista regularly deals with. It manages your virtual environment (creation and update), pyproject.toml specification, and a poetry.lock file that serves as a replacement for requirements.txt, housing exact dependency version numbers for project collaborators to install. All of these are automatically kept in sync, so you never have the case like with conda where someone does a conda or pip install into their environment but never bothers to update the setup.py, environment.yml, requirements.txt or whatever.

Earlier we mentioned that running our Python scripts via the M-& async shell command interface wasn’t a great use case for it. However, using it to set up a poetry environment is a fantastic example of when it is appropriate.

Async shell command: poetry init -n --python=^3.10

Assuming the poetry command ran without error, it plopped down the pyproject.toml in the same directory as errors.py. In a similar vein, we can add project dependencies using M-&

Async shell command: poetry add pyright requests

The *Async Shell Command* buffer will update as poetry runs and installs the required dependencies. Following this, we should have the pyright CLI installed to the virtual environment poetry set up for us. As a sanity check, I’ll start up either M-x shell or M-x eshell (whichever happens to be behaving better that day) to just get a simple cross-platform shell running where I can try it out:

~/tmp $ # using the same `errors.py` as in the earlier sections
~/tmp $ poetry run pyright errors.py
No configuration file found.
pyproject.toml file found at /home/robb/repos/renzmann.github.io/content/posts/006_emacs_2_python.
Loading pyproject.toml file at /home/robb/repos/renzmann.github.io/content/posts/006_emacs_2_python/pyproject.toml
Pyproject file "/home/robb/repos/renzmann.github.io/content/posts/006_emacs_2_python/pyproject.toml" is missing "[tool.pyright]" section.
stubPath /home/robb/repos/renzmann.github.io/content/posts/006_emacs_2_python/typings is not a valid directory.
Assuming Python platform Linux
Searching for source files
Found 1 source file
/home/robb/repos/renzmann.github.io/content/posts/006_emacs_2_python/errors.py
  /home/robb/repos/renzmann.github.io/content/posts/006_emacs_2_python/errors.py:5:8 - error: Import "aaaaaaa" could not be resolved (reportMissingImports)
  /home/robb/repos/renzmann.github.io/content/posts/006_emacs_2_python/errors.py:7:1 - error: "foo" is not defined (reportUndefinedVariable)
  /home/robb/repos/renzmann.github.io/content/posts/006_emacs_2_python/errors.py:7:1 - warning: Expression value is unused (reportUnusedExpression)
  /home/robb/repos/renzmann.github.io/content/posts/006_emacs_2_python/errors.py:8:14 - error: "fdafdsafdsafdsafdsafdsafdsa" is not a known member of module (reportGeneralTypeIssues)
  /home/robb/repos/renzmann.github.io/content/posts/006_emacs_2_python/errors.py:13:12 - error: Operator "+" not supported for types "str" and "Literal[1]"
    Operator "+" not supported for types "str" and "Literal[1]" when expected type is "str" (reportGeneralTypeIssues)
  /home/robb/repos/renzmann.github.io/content/posts/006_emacs_2_python/errors.py:4:8 - warning: Import "requests" could not be resolved from source (reportMissingModuleSource)
4 errors, 2 warnings, 0 informations
Completed in 1.033sec

Emacs actually has a couple ways of running error-checking tools like this. The typical one is M-x compile, which we saw earlier, but there’s also C-c C-v for M-x python-check. The latter will automatically check for tools like pyflakes or flake8, but can be configured with the python-check-command variable to pre-populate the command to run. Like M-x compile, M-x python-check will use a buffer that looks identical to *compilation* in every way except name: it will be called the *Python check: <command you ran>* buffer.

For me, that means I typically have something like

(setq python-check-command "poetry run pyright")

and then C-c C-v from a Python buffer will prompt like this while errors.py is my active buffer:

Check command: poetry run pyright errors.py

Adding error parsing to the pyright compile output

Unlike the mypy output, the error messages from pyright aren’t links, and we can’t hop between messages using M-g M-n and M-g M-p like before. In order to gain this functionality, we need to add a regex that can parse pyright messages. There are two objects of interest to accomplish this:

compilation-error-regexp-alist
compilation-error-regexp-alist-alist

Here’s the formal description from C-h v compilation-error-regexp-alist:

Alist that specifies how to match errors in compiler output.
On GNU and Unix, any string is a valid filename, so these
matchers must make some common sense assumptions, which catch
normal cases.  A shorter list will be lighter on resource usage.

Instead of an alist element, you can use a symbol, which is
looked up in ‘compilation-error-regexp-alist-alist’.

In not so many words, this says we should modify the *-alist-alist version, and simply add a symbol to the *-alist variable. Examining the current value via C-h v compilation-error-regexp-alist-alist, it’s easy to see that we’re after an expression a bit like this,

(add-to-list 'compilation-error-regexp-alist-alist
             '(pyright "regexp that parses pyright errors" 1 2 3))

eventually replacing the string in the middle with an actual Emacs regexp. Thankfully, Emacs has the M-x re-builder built in for doing exactly that! Since *Python check: poetry run pyright errors.py* is a buffer like any other, we can hop over to it, and run M-x re-builder to piece together a regex that extracts file name, line number, and column number from each message.

Clearly, there are some errors in the regexp so far, but as we edit the text in the *RE-Builder* buffer, the highlighting in the *compilation* buffer will update live to show us what would be captured by the regexp we’ve entered. After fiddling with the contents in the bottom buffer to get the highlighting correct, we’ve got this regular expression:

"^[[:blank:]]+\\(.+\\):\\([0-9]+\\):\\([0-9]+\\).*$"

Now we just need to add this into the compilation-error-regexp-alist-alist in our init.el:

;; init.el
(require 'compile)
(add-to-list 'compilation-error-regexp-alist-alist
             '(pyright "^[[:blank:]]+\\(.+\\):\\([0-9]+\\):\\([0-9]+\\).*$" 1 2 3))
(add-to-list 'compilation-error-regexp-alist 'pyright)
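As a sanity check outside of Emacs, the same pattern can be tried against a few lines of the pyright output from earlier. The sketch below is my Python translation of the regexp, with [[:blank:]] swapped for [ \t] since Python’s re module has no POSIX character classes. The names PYRIGHT_LINE and parse_pyright_line are made up for the example.

```python
import re

# Python translation of the Emacs regexp above; groups are file, line, column
PYRIGHT_LINE = re.compile(r"^[ \t]+(.+):([0-9]+):([0-9]+).*$")

# Three lines copied from the pyright output shown earlier
SAMPLE = [
    '  /home/robb/repos/renzmann.github.io/content/posts/006_emacs_2_python/errors.py:5:8 - error: Import "aaaaaaa" could not be resolved (reportMissingImports)',
    '    Operator "+" not supported for types "str" and "Literal[1]" when expected type is "str" (reportGeneralTypeIssues)',
    "4 errors, 2 warnings, 0 informations",
]


def parse_pyright_line(line):
    """Return (file, line, column) for a diagnostic line, or None otherwise."""
    match = PYRIGHT_LINE.match(line)
    if match is None:
        return None
    return match.group(1), int(match.group(2)), int(match.group(3))


for line in SAMPLE:
    print(parse_pyright_line(line))
```

Only the first line yields a file, line, and column. The indented continuation line and the summary line are ignored, which is what we want from the entry in compilation-error-regexp-alist-alist as well.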

After restarting Emacs with the modified alist, we get error parsing from pyright output.

Virtual Environments

Since I use poetry so frequently, and I can prefix all of the Emacs or shell commands with poetry run, it’s pretty rare that I have to invoke specific virtual environments. That said, this guide would have a pretty large hole in it if we didn’t mention the vanilla virtual environment experience.

Most folks tend to run a slightly different virtual environment workflow from one another. What I’m showing off below is the one I think fits most easily with the flavor of vanilla already presented in this article, with some added knowledge about how .dir-locals.el works (coming up shortly).

Create a virtual environment

Keeping a .venv folder at the top level of a project is one valid way to organize things, but (vanilla) Emacs isn’t going to make it easy for us to use it that way. Instead, I’d recommend keeping all virtual environments in a central place. For me, that looks like this:

M-! python3 -m venv ~/.cache/venvs/website

This builds a virtualenv named website, under the ~/.cache directory on Unix, for the Python utilities that help build my blog. To use this virtualenv explicitly for shell utilities, I can always run commands like this:

M-! ~/.cache/venvs/website/bin/python -m pip install mypy
M-! ~/.cache/venvs/website/bin/mypy errors.py

Of course, adding the prefix ~/.cache/venvs/website/bin every time is a bit cumbersome, especially for frequent commands like M-x python-check.

`.dir-locals.el` for setting virtual environment

One quick way to reduce some typing is to add entries in a project file called .dir-locals.el. This is a special data file that Emacs will read, if it exists, and apply to all new buffers within the project. For our needs, we want to apply a couple changes to python-mode specifically to use the virtual environment instead of system python. The two easy ones are the python-check-command and python-shell-virtualenv-root:

;; .dir-locals.el
((python-mode . ((python-check-command . "%HOME%\\.cache\\venvs\\website\\Scripts\\python.exe -m mypy")
                 (python-shell-virtualenv-root . "~/.cache/venvs/website"))))

I’ve included a quirk of working on Microsoft Windows here: the python-check-command needs to run through your shell, which is cmd.exe by default, and hence requires Windows-style paths. The python-shell-virtualenv-root, however, is evaluated by Emacs, and can use tilde-expansion and Unix-style paths. Changing default shell commands to run through pwsh on Windows would likely alleviate this issue, but it’s worth calling out for cmd.exe users.

It’s also worth mentioning here that M-x add-dir-local-variable provides an easy interactive interface to editing the .dir-locals.el file.

The python-shell-virtualenv-root part only affects running Python as a shell within Emacs; it does not affect things like PATH, async commands, or M-x compile. To demonstrate this, once we’ve set up .dir-locals.el as above, and we either revert a Python buffer with C-x x g or open a new Python buffer in the same project, a popup like this appears:

The local variables list in c:/Users/robbe/repos/renzmann.github.io/content/posts/006_emacs_2_python/
contains values that may not be safe (*).

Do you want to apply it?  You can type
y  -- to apply the local variables list.
n  -- to ignore the local variables list.
!  -- to apply the local variables list, and permanently mark these
      values (*) as safe (in the future, they will be set automatically.)
i  -- to ignore the local variables list, and permanently mark these
      values (*) as ignored

  * python-check-command : "%HOME%\\.cache\\venvs\\website\\Scripts\\python.exe -m mypy"
  * python-shell-virtualenv-root : "~/.cache/venvs/website"

Responding with y will set the python-check-command and python-shell-virtualenv-root for just the current session, while ! will add both of these values to the custom section in either init.el or wherever you’ve set your custom-file. This is another reason for using a common, central spot for virtual environments, since across workstations I can use the same path relative to my $HOME directory. After confirming, and using C-c C-p, we can check which Python executable we’re using in the *Python* buffer now:

Python 3.10.6 (tags/v3.10.6:9c7b4bd, Aug  1 2022, 21:53:49) [MSC v.1932 64 bit (AMD64)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import sys; sys.executable
'c:\\Users\\robbe\\.cache\\venvs\\website\\Scripts\\python.exe'
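Besides eyeballing sys.executable, there is a quick programmatic check for whether the interpreter is running inside a virtual environment. This is a minimal sketch that assumes an environment created with python -m venv (or a recent virtualenv), where sys.prefix points at the environment and sys.base_prefix at the Python installation it was created from. The in_virtualenv helper name is my own.

```python
import sys


def in_virtualenv() -> bool:
    """True when the running interpreter belongs to a virtual environment."""
    # Inside a venv, sys.prefix is the environment directory, while
    # sys.base_prefix still points at the base installation.
    return sys.prefix != sys.base_prefix


print(sys.executable)
print(in_virtualenv())
```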

Keep in mind, the values provided in .dir-locals.el are evaluated on a per-buffer basis, so attempting to set a relative path like (python-shell-virtualenv-root . ".venv/website") will only work when executing run-python in the same directory as .dir-locals.el and .venv/.

The various compile and shell commands will not respect the virtualenv we’ve set via .dir-locals.el. On *nix, M-x compile RET which python3 will still bring back some variant of /usr/bin/python3, as will M-& which python or M-! which python. In a follow-up article we might explore how it is possible to take care of all this via .dir-locals.el and the special exec variable, but it’s not very elegant.
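To see what is missing here, it helps to remember what “activating” a virtual environment amounts to: putting the environment’s bin (or Scripts) directory at the front of PATH, setting VIRTUAL_ENV, and clearing PYTHONHOME. The sketch below builds such an environment for a subprocess in Python. The activated_env helper and the /tmp/venvs/website path are made up for illustration.

```python
import os
from pathlib import Path


def activated_env(venv_root, base_env=None):
    """Return a copy of the environment with the given venv 'activated'."""
    env = dict(os.environ if base_env is None else base_env)
    root = Path(venv_root).expanduser()
    # Executables live in Scripts on Windows and in bin everywhere else
    bin_dir = root / ("Scripts" if os.name == "nt" else "bin")
    env["VIRTUAL_ENV"] = str(root)
    env["PATH"] = str(bin_dir) + os.pathsep + env.get("PATH", "")
    # Activation scripts also clear PYTHONHOME so it cannot override the venv
    env.pop("PYTHONHOME", None)
    return env


# Hypothetical location; pass the result as env= to subprocess.run
env = activated_env("/tmp/venvs/website", {"PATH": "/usr/bin"})
print(env["PATH"])
```

None of this happens for M-x compile or M-& just because python-shell-virtualenv-root is set, which is why which python3 still finds the system interpreter.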

All things considered: `pyvenv`

pyvenv is a very lightweight package, clocking in at around 540 source lines of code, designed specifically around the challenge of ensuring the correct Python virtual environment is at the front of PATH when running (async) shell commands, M-x eshell, M-x shell, M-x term, M-x python-check, M-x compile, and more. When written, it was based around virtualenv and virtualenvwrapper.sh, and some of the language it uses will reflect that. Although virtualenv has mostly fallen out of favor, the core functionality of pyvenv is still very relevant. If you adopt a central store of virtual environments, as above, you can point the WORKON_HOME environment variable (“workon” is terminology held over from virtualenvwrapper.sh) at the directory that all your virtual environments sit under, so that it’s easy to select one with the pyvenv-workon function. When using poetry, that usually looks like this:

(if (eq system-type 'windows-nt)
    ;; Default virtualenv cache directory for poetry on Microsoft Windows
    ;; The trailing t asks setenv to expand $LOCALAPPDATA in the value
    (setenv "WORKON_HOME" "$LOCALAPPDATA/pypoetry/Cache/virtualenvs" t)
  ;; Default virtualenv cache directory for poetry on *nix
  (setenv "WORKON_HOME" "~/.cache/pypoetry/virtualenvs"))
(pyvenv-mode)

Setting WORKON_HOME to ~/.cache/venvs as in the previous examples is another valid option. Doing it this way also plays nice with .dir-locals.el, since pyvenv exposes a way to set a project-level venv with a single variable:

;; .dir-locals.el
((python-mode . ((pyvenv-workon . "website"))))

Also of use for folks who frequently swap between different projects is (pyvenv-tracking-mode), which will automatically change the active python virtual environment when you navigate to a different buffer.

And, of course, if the whole “workon” and virtualenvs grouped together under ~/.cache/venvs isn’t to taste, there’s always M-x pyvenv-activate, which lets you choose a virtual environment anywhere on your system. So, all-in-all, I’ll probably stick with pyvenv in my configuration, because setting all the different utility PATHs without it is just such a pain.

Next: Notebooking

Believe it or not, we’ve only scratched the surface. org-mode and org-babel together provide a fully-functional “notebooking” (technically “literate programming”) experience out of the box with recent versions of Emacs. The next article will focus exclusively on Python and data science in Org as a near-complete Jupyter replacement.

Footnotes

[fn:pyright] https://github.com/microsoft/pyright#command-line
[fn:poetry] https://python-poetry.org/docs/#installation
[fn:ddavis-workon] https://ddavis.io/posts/emacs-python-lsp/

Fully Remote Python Development in Emacs

Lay of the land

TRAMP vs. SSH + TTY

Remote virtual environments

Remote LSP

Inline `matplotlib` Images in Org-mode

Less intuitive than we might hope.

python3 -m venv .venv
.venv/bin/python -m pip install matplotlib

Without running a session, here is the most basic functional example:

from random import random

import matplotlib.pyplot as plt

xs = [random() for _ in range(100)]
ys = [random() for _ in range(100)]

fig, ax = plt.subplots(figsize=(5, 5))
ax.scatter(xs, ys)
ax.set_title("A Random Scatterplot")
fig.savefig("demo.png")
return "demo.png"

It’s annoying how we have to specify the file name twice, so we can tangle in a variable as a header arg too:

from random import random

import matplotlib.pyplot as plt

xs = [random() for _ in range(100)]
ys = [random() for _ in range(100)]

fig, ax = plt.subplots(figsize=(5, 5))
ax.scatter(xs, ys)
ax.set_title("A Random Scatterplot")
fig.savefig(fp)
return fp
