Sunt igitur duae memoriae: una naturalis, altera artificiosa. (There are, then, two kinds of memory: one natural, the other artificial.)
Given all the recent conversation about agentic AI, I wanted to get a better first-hand understanding of what is currently possible with the OpenAI APIs from within Emacs. I used Karthik Chikmagalur's simple LLM client for Emacs, gptel, for this evaluation.
The tool has come a long way since its initial release two years ago and has become a preferred LLM chat client for many, supporting multiple models and backends. The latest version introduced tool use, which gives the models a way to actually fetch data or take other actions, as described in OpenAI's documentation.
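As a point of reference, hooking tools into a chat is largely a matter of defining them and switching tool use on. The following is only a sketch of my understanding, assuming the gptel-use-tools and gptel-tools variables from recent gptel releases (names may differ in your version), with a made-up toy tool:

;; Sketch only; gptel-use-tools and gptel-tools are assumed from recent
;; gptel releases, so check your installed version's documentation.
(setq gptel-use-tools t)           ; let gptel offer registered tools to the model
(setq gptel-tools                  ; the tools the model is allowed to call
      (list (gptel-make-tool
             :name "current_time"  ; hypothetical toy tool for illustration
             :description "Return the current time as a string."
             :function (lambda () (current-time-string))
             :args nil
             :category "emacs")))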
I started out by providing three tools: writing to buffers, evaluating code, and reading buffers. Using these three tools, and knowing I have Emacs configured to interpret R code with a fairly comprehensive set of libraries installed, I set out to verify that the AI model could create a graphical analysis of the variance in petal size in the iris flower family. (A common example in R tutorials found online.)
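For completeness, the part of that configuration that matters here is simply that Org Babel can evaluate R blocks. A minimal sketch (my actual setup loads more languages and a larger R library collection):

;; Minimal sketch: allow Org Babel to evaluate R (and elisp) source blocks.
;; ESS needs to be installed for the R sessions themselves.
(org-babel-do-load-languages
 'org-babel-load-languages
 '((emacs-lisp . t)
   (R . t)))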
First, I provide gptel with a means to create a buffer and add text to it, all using elisp. Optionally, the created buffer can be org-mode enabled. I made this optional to verify that the model not only understands the purpose of a function based on its name, but also knows how to interpret the available arguments.
(gptel-make-tool
 :function (lambda (buffer text &optional orgmode)
             (with-current-buffer (get-buffer-create buffer)
               (save-excursion
                 (goto-char (point-max))
                 (insert text)
                 (if orgmode (org-mode))))
             (format "Appended text to buffer %s" buffer))
 :name "append_to_buffer"
 :description "Append text to an Emacs buffer. If the buffer does not exist, it will be created."
 :args (list '(:name "buffer"
               :type "string"
               :description "The name of the buffer to append text to.")
             '(:name "text"
               :type "string"
               :description "The text to append to the buffer.")
             '(:name "orgmode"
               :type "boolean"
               :description "(optional) enable (t) org-mode"))
 :category "emacs")
Now that the AI model has access to an org-mode enabled buffer, it should be given a tool to execute the Babel blocks therein.
(gptel-make-tool
 :name "execute_org_babel_buffer"   ; javascript-style snake_case name
 :function (lambda (buffer)         ; the function that will run
             (unless (buffer-live-p (get-buffer buffer))
               (error "error: buffer %s is not live." buffer))
             (with-current-buffer buffer
               (org-babel-execute-buffer)))
 :description "Execute org babel code in a buffer."
 :args (list '(:name "buffer"
               :type string         ; :type value must be a symbol
               :description "the name of the buffer whose org babel content should be evaluated"))
 :category "analysis")
Lastly, I need to give it a tool to read the buffer back. Without this ability, I found that the AI model assumes it knows what happened based on the return values of the tools, which in many cases won't reflect what was actually produced in an org file with Babel enabled. This can be corrected by pointing it out later in the chat, but that didn't seem fully agentic, so I gave it the ability to read as well.
(gptel-make-tool
 :function (lambda (buffer)
             (unless (buffer-live-p (get-buffer buffer))
               (error "Error: buffer %s is not live." buffer))
             (with-current-buffer buffer
               (buffer-substring-no-properties (point-min) (point-max))))
 :name "read_buffer"
 :description "Return the contents of an Emacs buffer"
 :args (list '(:name "buffer"
               :type "string"
               :description "The name of the buffer whose contents are to be retrieved"))
 :category "emacs")
This effectively gives the AI access to an interpreter in a language of our choosing (and enabled in the Lisp machine). When guiding the model, we can now count on its ability to enter input, evaluate the code, and read the results.
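To make that loop concrete, the sequence the model goes through with these three tools looks roughly like the following when performed by hand. This is an illustration only, with a made-up buffer name and a trivial R block:

;; Illustrative only: mirrors append_to_buffer, execute_org_babel_buffer
;; and read_buffer as one manual round trip.
(with-current-buffer (get-buffer-create "demo")
  (org-mode)
  (goto-char (point-max))
  (insert "#+begin_src R :results output\nsummary(iris$Petal.Length)\n#+end_src\n")
  (org-babel-execute-buffer)
  (buffer-substring-no-properties (point-min) (point-max)))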
Code blocks in org-mode are not a well-documented occurrence in the code forges of the internet. To beef up the AI's capability, I gave it access to my tempel code snippets. These include common building blocks of an org document with R Babel code blocks.
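My actual snippets are a bit more involved, but a hypothetical tempel entry for such a building block could look like the following (the rplot name and the header arguments are made up for illustration):

;; Hypothetical entry in a tempel templates file, active in org-mode buffers.
org-mode

(rplot "#+begin_src R :results output graphics file :file " p ".png" n> r> n "#+end_src")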
As a next step, I asked it the following question:
Using the tools you have available, create an 'analysis' buffer with org-mode enabled. Put one single R code block in there, which should be the only code block in the buffer and does all of the following. The code should create a ggplot2 graphic that is stored in a variable so it can be altered afterwards. The graphic contains a comparison between petal sizes of types of iris flowers. Make sure the graphic is added to the buffer. Next, evaluate the code in the buffer and confirm under what filename the graphic has been created. Do not evaluate the code more than once. Report back (in org format) what code you created and what the result was.
I had to tweak this prompt a few times, as the AI ran into two problems. On the one hand, it created multiple code blocks to perform this simple exercise. When I added an instruction to use only one code block, that was ignored, and I had to double down on that part of the instruction before it was followed. On the other hand, the initial R code did not actually store the output of the ggplot command in a variable, which means the code does not actually produce a graphic in org-mode. Interestingly, the AI reported, incorrectly, that this had happened anyway, and even told me in which file the resulting image was saved.
These hiccups notwithstanding, the model did deliver the correct code and ensured its evaluation happened, effectively performing a small analysis on my behalf. Below is the output of the model.
The analysis buffer has been created with the following R code block:
library(ggplot2)

# Load the iris dataset
iris_data <- iris

# Create a ggplot graphic comparing petal sizes by species
iris_plot <- ggplot(iris_data, aes(x = Species, y = Petal.Length, fill = Species)) +
  geom_boxplot() +
  labs(title = 'Comparison of Petal Lengths by Iris Species',
       x = 'Species',
       y = 'Petal Length') +
  theme_minimal()

iris_plot
The graphic has been saved as the file: iris_plot.png. You can find the output image in the specified filename.
I have org-babel-execute-buffer set to require confirmation, which was the only manual intervention needed. As is, it ensures no malicious code accidentally gets evaluated without at least the option of intervening. Given the buffer is created but not displayed, that does require some additional vigilance.
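For the record, that confirmation is most likely Org's own safety switch rather than anything gptel-specific; a sketch of the relevant setting (my reading of the setup, and in fact the variable's default value):

;; Ask before evaluating any Babel block, including those triggered by a tool
;; call; non-nil is the default, stating it explicitly just documents the intent.
(setq org-confirm-babel-evaluate t)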
I was very impressed. Another contributor has already noticed that we can use tools to have the model create further tools. This seems to open the way to building an extensive capability supporting a range of analytical functions.
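To give an idea of what that could look like, here is a heavily hedged sketch (not my actual setup, and evaluating model-supplied elisp deserves real scrutiny) of a tool that lets the model register further tools:

;; Hypothetical sketch: a tool that evaluates a model-supplied
;; gptel-make-tool form, thereby adding a new tool on the fly.
;; Evaluating model-written elisp is risky; review before allowing it.
(gptel-make-tool
 :name "define_tool"
 :description "Define a new gptel tool by evaluating a gptel-make-tool form."
 :function (lambda (elisp)
             (eval (read elisp))
             "Tool definition evaluated.")
 :args (list '(:name "elisp"
               :type "string"
               :description "A complete gptel-make-tool form as elisp source."))
 :category "emacs")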
Posted on Thursday 10 April 2025 at 23:39