docs: improved server/computer use docs
ErikBjare committed Nov 14, 2024
1 parent 5a228d4 commit 501008a
Showing 2 changed files with 26 additions and 8 deletions.
33 changes: 25 additions & 8 deletions docs/server.rst
@@ -1,11 +1,11 @@
Server
======

gptme has a minimal REST API with a very minimalistic web UI.

.. note::
    The server and web UI are still in development and do not have all the features of the CLI.
    It does not support streaming, doesn't ask for confirmation before executing, lacks the ability to interrupt responses and tool calls, etc.

gptme has a minimal REST API with a very minimalistic web UI.
In particular, it doesn't ever ask for confirmation before executing (yet).

It can be started by running the following command:

@@ -20,7 +20,7 @@ There are a few different interfaces available:
Web UI
------

A basic chat interface with minimal dependencies that is bundled with the server.
A tiny chat interface with minimal dependencies that is bundled with the server.

Simply start the server to access the interface at http://localhost:5000

@@ -40,17 +40,19 @@ Features:
- Mobile-friendly design
- Dark mode support
- Offline/exports support

- Computer use interface integration

Computer Use Interface
----------------------

.. include:: computer-use-warning.rst

The computer use interface provides a split view with a chat on the left and a desktop on the right.

Requires Docker.

.. include:: computer-use-warning.rst

To run the computer use interface in Docker, follow these steps:

.. code-block:: bash

    # Clone the repository
@@ -71,10 +73,25 @@ Features:

- Split view with chat on the left, desktop on the right
- Toggle for view-only/interactive desktop mode
- Fullscreen support
- Automatic screen scaling for optimal LLM vision

Requirements:

- Docker for running the server with X11 support
- Network ports 6080 (VNC) and 8080 (web UI) available
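Before starting the container, it can help to confirm that both ports are actually free. The following is a minimal sketch, not part of gptme: the helper names ``port_is_free`` and ``check_ports`` are hypothetical, and it assumes the ports are checked on the local loopback address.

```python
import socket


def port_is_free(port: int, host: str = "127.0.0.1") -> bool:
    """Return True if nothing is accepting connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(0.5)
        # connect_ex returns 0 when the connection succeeds,
        # which means something is already listening on the port.
        return sock.connect_ex((host, port)) != 0


def check_ports(ports=(6080, 8080)) -> dict[int, bool]:
    """Map each port to whether it appears to be free (hypothetical helper)."""
    return {port: port_is_free(port) for port in ports}


if __name__ == "__main__":
    for port, free in check_ports().items():
        print(f"port {port}: {'available' if free else 'in use'}")
```

A port reported as "in use" would need to be freed before the Docker setup can bind to it.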

.. rubric:: Using Computer Use Locally

You can use the ``computer`` tool (which enables computer use) locally on Linux without Docker or VNC, but it is not recommended due to security concerns.

Requirements:

- X11 server
- ``xdotool`` installed
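These prerequisites can be checked up front. The sketch below is an illustration only and is not how gptme itself performs the check: ``missing_prerequisites`` is a hypothetical helper, and it assumes that a non-empty ``DISPLAY`` environment variable is a good enough sign that an X11 server is available.

```python
import os
import shutil


def missing_prerequisites(env=None, which=shutil.which) -> list[str]:
    """Return a list of human-readable problems; an empty list means all good."""
    env = os.environ if env is None else env
    problems = []
    # Assumption: an X11 session exports DISPLAY (e.g. ":0").
    if not env.get("DISPLAY"):
        problems.append("DISPLAY is not set (no X11 session detected)")
    # shutil.which returns None when the executable is not on PATH.
    if which("xdotool") is None:
        problems.append("xdotool not found on PATH")
    return problems


if __name__ == "__main__":
    for problem in missing_prerequisites():
        print(f"missing: {problem}")
```

The ``env`` and ``which`` parameters exist only so the check can be exercised without touching the real environment.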

To enable the ``computer`` tool, specify a ``-t/--tools`` list that includes the computer tool (as it is disabled by default):

.. code-block:: bash

    gptme -t computer  # and whichever other tools you want

You will also want to set a screen resolution that is suitable for the vision model you are using.
1 change: 1 addition & 0 deletions gptme/tools/computer.py
@@ -166,6 +166,7 @@ def computer(
    path = output_dir / "screenshot.png"

    if shutil.which("gnome-screenshot"):
        # FIXME: incorrect call to xdotool
        run_xdotool(f"gnome-screenshot -f {path} -p", display)
    elif os.name == "posix":
        path = _screenshot(path)  # Use existing screenshot function