{"name":"napari-chatgpt","display_name":"chatgpt","visibility":"public","icon":"","categories":[],"schema_version":"0.2.0","on_activate":null,"on_deactivate":null,"contributions":{"commands":[{"id":"napari-chatgpt.make_qwidget","title":"Make example QWidget","python_name":"napari_chatgpt._widget:OmegaQWidget","short_title":null,"category":null,"icon":null,"enablement":null}],"readers":null,"writers":null,"widgets":[{"command":"napari-chatgpt.make_qwidget","display_name":"Omega -- a ChatGPT-enabled agent","autogenerate":false}],"sample_data":null,"themes":null,"menus":{},"submenus":null,"keybindings":null,"configuration":[]},"package_metadata":{"metadata_version":"2.1","name":"napari-chatgpt","version":"2023.8.8","dynamic":null,"platform":null,"supported_platform":null,"summary":"A napari plugin to process and analyse images with chatGPT.","description":"# napari-chatgpt\n\n## Home of\n_Omega_, a napari-aware autonomous LLM-based agent specialised in image processing and analysis.\n\n[![License BSD-3](https://img.shields.io/pypi/l/napari-chatgpt.svg?color=green)](https://github.com/royerlab/napari-chatgpt/raw/main/LICENSE)\n[![PyPI](https://img.shields.io/pypi/v/napari-chatgpt.svg?color=green)](https://pypi.org/project/napari-chatgpt)\n[![Python Version](https://img.shields.io/pypi/pyversions/napari-chatgpt.svg?color=green)](https://python.org)\n[![tests](https://github.com/royerlab/napari-chatgpt/workflows/tests/badge.svg)](https://github.com/royerlab/napari-chatgpt/actions)\n[![codecov](https://codecov.io/gh/royerlab/napari-chatgpt/branch/main/graph/badge.svg)](https://codecov.io/gh/royerlab/napari-chatgpt)\n[![napari hub](https://img.shields.io/endpoint?url=https://api.napari-hub.org/shields/napari-chatgpt)](https://napari-hub.org/plugins/napari-chatgpt)\n\nA [napari](napari.org) plugin that leverages OpenAI's Large Language Model\nChatGPT to implement _Omega_\na napari-aware agent capable of performing image processing and analysis tasks\nin a conversational 
manner.\n\nThis repository was created as a 'week-end project'\nby [Loic A. Royer](https://twitter.com/loicaroyer)\nwho leads a [research group](https://royerlab.org) at\nthe [Chan Zuckerberg Biohub](https://czbiohub.org/sf/). It\nleverages [OpenAI](https://openai.com)'s ChatGPT API via\nthe [LangChain](https://python.langchain.com/en/latest/index.html) Python\nlibrary, as well as [napari](https://napari.org), a fast, interactive,\nmulti-dimensional\nimage viewer for\nPython, [another](https://ilovesymposia.com/2019/10/24/introducing-napari-a-fast-n-dimensional-image-viewer-in-python/)\nof Loic's week-end projects.\n\n# What is Omega?\n\nOmega is an LLM-based and tool-armed autonomous agent that demonstrates the\npotential for Large Language Models (LLMs) to be applied to image processing,\nanalysis and visualisation.\nCan LLM-based agents write image processing code and napari widgets, correct their\ncoding mistakes, perform follow-up analysis, and control the napari viewer? \nThe answer appears to be yes.\n\n#### In this video I ask Omega to segment an image using the [SLIC](https://www.iro.umontreal.ca/~mignotte/IFT6150/Articles/SLIC_Superpixels.pdf) algorithm. It makes a first attempt using the implementation in scikit-image, but fails because of a nonexistent 'multichannel' parameter. Realising that, Omega tries again, and this time, succeeds:\n\nhttps://user-images.githubusercontent.com/1870994/235768559-ca8bfa84-21f5-47b6-b2bd-7fcc07cedd92.mp4\n\n#### After loading in napari a sample 3D image of cell nuclei, I ask Omega to segment the nuclei using the Otsu method. My first request was very vague, so it just segmented foreground versus background. I then ask it to segment the foreground into distinct segments for each connected component. Omega makes a rookie mistake by forgetting to 'import np'. 
No problem, it notices, tries again, and succeeds:\n\nhttps://user-images.githubusercontent.com/1870994/235769990-a281a118-1369-47aa-834a-b491f706bd48.mp4\n\nAs LLMs continue to improve, Omega will become even more adept at handling\ncomplex\nimage processing and analysis tasks. The current version of ChatGPT, 3.5,\nhas a knowledge cutoff date of 2021, which means that it lacks nearly two years of\nknowledge\non the napari API and usage, as well as the latest versions of popular libraries\nlike scikit-image, OpenCV, numpy, scipy, etc. Despite this, you can see in the\nvideos below\nthat it is quite capable. While ChatGPT 4.0 is a significant upgrade, it is not\nyet widely\navailable.\n\nOmega could eventually help non-experts process and analyse images, especially\nin the bioimage domain.\nIt is also potentially valuable for educational purposes as it could\nassist in teaching image processing and analysis, making it more accessible.\nAlthough ChatGPT, which powers Omega, may not yet be on par with an expert image\nanalyst or computer vision\nexpert, it is just a matter of time...\n\nOmega holds a conversation with the user and uses the following tools to\nanswer questions,\ndownload and operate on images, write widgets for napari, and more:\n\n### napari related tools:\n\n- napari viewer control:\n  Gives Omega the ability to control all aspects of the napari viewer.\n\n- napari query:\n  Gives Omega the ability to query information about the state of the viewer, of\n  its layers, and their contents.\n\n- napari widget maker:\n  Gives Omega the ability to make napari functional widgets that take layers as\n  input and return a new layer.\n\n### cell segmentation tools:\n\n- cell and nuclei segmentation:\n  This tool specialises in segmenting cells and nuclei in images using some\n  predefined segmentation algorithms. 
Right now only cellpose is implemented.\n\n### Generic python installation queries:\n\n- python function signature query:\n  Lets Omega query the signature of a function when it is unsure how to call the\n  function and what the names and types of the parameters are.\n\n### web search related tools:\n\n- web search:\n  Gives Omega access to the knowledge accessible through the web\n\n- web image search:\n  Streamlined path to search the web for images and open them in napari\n\n- wikipedia search:\n  Gives Omega access to the whole of Wikipedia\n\n----------------------------------\n\n## Installation from within napari:\n\nYou can install `napari-chatgpt` directly from within napari in the Plugins>\nInstall/Uninstall Plugins menu.\n(Please note that the Omega agent will happily install packages in the\ncorresponding environment).\n\nIMPORTANT NOTE: Make sure you have a recent version of napari! Ideally the\nlatest one!\n\n## Installation in a new conda environment (RECOMMENDED):\n\nMake sure you have a [miniconda](https://docs.conda.io/en/latest/miniconda.html) installation on your system.\nAsk [ChatGPT](https://chat.openai.com/auth/login) what that is all about if you are unsure ;-)\n\nCreate environment:\n\n    conda create -y -n napari-chatgpt -c conda-forge python=3.9\n\nActivate environment:\n\n    conda activate napari-chatgpt \n\nInstall [napari](https://napari.org) in the environment using conda-forge: (very important on Apple M1/M2)\n\n    conda install -c conda-forge napari\n\n**Or**, with pip:\n\n    pip install napari\n\nInstall napari-chatgpt in the environment:\n\n    pip install napari-chatgpt\n\n## Installation variations:\n\nTo install the latest development version (not recommended for end-users):\n\n    conda create -y -n napari-chatgpt -c conda-forge python=3.9\n    conda activate napari-chatgpt\n    pip install napari  \n    git clone https://github.com/royerlab/napari-chatgpt.git\n    cd napari-chatgpt\n    pip install -e .\n\nor:\n    \n    # same 
steps as above and then:\n    pip install git+https://github.com/royerlab/napari-chatgpt.git\n\n## System specific tweaks:\n\nOn Ubuntu systems, I recommend changing the UI timeout, \notherwise whenever Omega is thinking, the UI will freeze, and a popup will block\neverything, which is very annoying:\n\n    gsettings set org.gnome.mutter check-alive-timeout 60000\n    \n\n\n## Requirements:\n\nYou need an OpenAI key; there is no way around this. I have been experimenting with \nother models, but right now the best results by far are obtained with ChatGPT 4 (and to\na lesser extent 3.5). You can get your OpenAI key by signing up [here](https://openai.com/blog/openai-api).\nDeveloping Omega cost me $13.97, hardly a fortune. OpenAI pricing on ChatGPT 3.5\nis very reasonable at 0.002 dollars per 1K tokens, which means roughly $2 per 750,000 words (about one million tokens). A\nbargain. Now, ChatGPT 4.0 is about 10x more expensive... But that could eventually drop,\nhopefully.\n\nNote: you can limit the burn-rate to a certain amount of dollars per month, just\nin case you leave Omega thinking over the weekend and forget to stop it (don't worry, \nthis is actually **not** possible).\n\n## Usage:\n\nOnce all is installed, and if it is not already running, start napari:\n\n    napari\n\nYou can then open the Omega napari plugin via the plugins menu:\n\n<img width=\"498\" alt=\"image\" src=\"https://user-images.githubusercontent.com/1870994/235790134-1d87fd50-583f-4fd9-ade2-c64497b91331.png\">\n\n\nOnce you have opened the plugin, this widget will appear:\n\n<img width=\"267\" alt=\"image\" src=\"https://github.com/royerlab/napari-chatgpt/assets/1870994/fdbde938-548d-4104-9241-d87c46c76dcf\">\n\nI recommend that initially you stick to the default values, which work well.\nThe best memory is 'hybrid'.\nThe 'autofix' features only make sense if you choose a ChatGPT 4 model, otherwise \nChatGPT might get confused... 
\nIncreasing creativity also decreases 'attention to detail'; the models will make more\ncoding mistakes, but might try more original solutions...\n\nYou then need to actually start Omega:\n\n<img width=\"104\" alt=\"image\" src=\"https://user-images.githubusercontent.com/1870994/235811111-9e468785-9562-410a-8e9a-c63cb03fb765.png\">\n\n\nIf you have not set the 'OPENAI_API_KEY' environment variable as is typically\ndone, Omega will ask you for your OpenAI API key, and will store it _safely_ in an\n_encrypted_ way on your machine (~/.omega_api_keys/OpenAI.json):\n\n<img width=\"293\" alt=\"image\" src=\"https://user-images.githubusercontent.com/1870994/235793528-9e892c5e-d8ca-43e1-9020-f2dfab45b32d.png\">\n\n\nJust enter an encryption/decryption key and your OpenAI key, and\nevery time you start Omega it will just ask for the decryption key:\n\n<img width=\"300\" alt=\"image\" src=\"https://user-images.githubusercontent.com/1870994/235794262-4c0eff4d-1c81-47b0-a097-f34e3d5c93b8.png\">\n\n(The idea is that you might not be able to remember your OpenAI key by heart, obviously,\nbut you might be able to do so with your own password or passphrase)\n\nYou can then direct your browser\nto: [http://127.0.0.1:9000/](http://127.0.0.1:9000/)\nand start having a hopefully nice chat with Omega:\n\n<img width=\"631\" alt=\"image\" src=\"https://github.com/royerlab/napari-chatgpt/assets/1870994/a5cf6d4d-deea-4df8-be8a-601d1cc0424c\">\n\n\n## Example prompts:\n\nHere are example prompts/questions/requests to try:\n\n- What is your name?\n- What tools do you have available?\n- Make me a Gaussian blur widget with sigma parameter\n- Open this tiff file in\n  napari: https://people.math.sc.edu/Burkardt/data/tif/at3_1m4_03.tif\n- Make a widget that applies the transformation: y = x^alpha + y^beta with alpha\n  and beta two parameters.\n- Create a widget to multiply two images\n- Can you open in napari a photo of Albert Einstein?\n- Downscale by a factor 3x the image on layer named 'img'\n- 
Rename selected layer to 'downscaled_image'\n- Upscale image 'downscaled_image' by a factor 3 using some smart interpolation\n  scheme of your choice (not nearest-neighbor)\n- Caveat: makes a plugin instead of actually doing the job\n- How many channels has the image on layer 0\n- Make an image sharpening filter widget, expose relevant parameters\n- Can you open this file in\n  napari: https://uk1s3.embassy.ebi.ac.uk/idr/zarr/v0.4/idr0062A/6001240.zarr\n- Split the two channels of the first layer (first axis) into two separate\n  layers\n- Switch viewer to 3d\n- Create a napari widget for a function that takes two image layers and returns\n  a 3D image stack of n images where each 2D image corresponds to a linear\n  blending of the two layer images between 0 and 1.\n- Loaded the ‘cell’ sample image. There is one cell in the image on the first\n  layer, it is roughly circular and brighter than its surroundings, can you write\n  segmentation code that returns a labels layer for it?\n- Can you create a widget to blend two images?\n- Can you tell me more about the 'guided Canny edge filter' ?\n- Write a configurable RGB to grayscale widget, ensure weights sum to 1\n\n## Video Demos:\n\nNot everyone will want, or be able, to get an API key for the latest and best LLM\nmodels,\nso here are videos showcasing what's possible. You will notice that Omega\nsometimes\nfails on its first attempt, typically because of mistaken parameters for\nfunctions,\nor other syntax errors. But it also often recovers by having access to the error\nmessage,\nand reasoning its way to the right piece of code. 
The videos below were made with ChatGPT 3.5;\nversion 4 works much better. Imagine what will be possible with future, even more capable models...\n\n##\n\nIn this first video, I ask Omega to make a napari widget to convert images from\nRGB to grayscale:\n\nhttps://user-images.githubusercontent.com/1870994/235769895-23cfc7ed-622a-47f9-95aa-4be77efc0f78.mp4\n\n##\n\nOf course Omega is capable of holding a conversation; it sort of knows 'who it\nis' and can search the web\nand Wikipedia. Eventually I imagine it could leverage the ability to search to\nimprove its responses,\nand I have seen it do so a few times:\n\nhttps://user-images.githubusercontent.com/1870994/235769920-86b02d9d-1196-4339-a8d9-9a028bcd4607.mp4\n\n##\n\nFollowing up on the previous video, I ask Omega to create a new labels layer\ncontaining just the largest segment. The script that Omega writes has another\nrookie mistake: it confuses layers and images. The error message then confuses\nOmega into thinking that it got the name of the layer wrong, setting it off on a\nquest\nto find the name of the labels layer. It succeeds at writing code that searches\nfor the labels layer, and uses that name to write a script that then does\nextract the largest segment into its own layer. Not bad:\n\nhttps://user-images.githubusercontent.com/1870994/235770741-d8905afd-0a9b-4eb7-a075-481979ab7b01.mp4\n\n##\n\nIn this video, I ask Omega to write a 'segmentation widget'. Pretty unspecific.\nThe answer is a vanilla yet effective widget that uses the Otsu approach to\nthreshold the image and then finds the connected components.\nNote that when you ask Omega to make a widget, it won't know of any runtime\nissues with the code because\nit is not running the code itself, yet. It can tell if there is a syntax problem\nthough... Nevertheless, the widget ends up working just fine:\n\nhttps://user-images.githubusercontent.com/1870994/235770794-90091bfe-b546-4dd0-bd9c-3895bfc33a1d.mp4\n\n##\n\nNow it gets more interesting. 
Following up on the previous video, can we ask\nOmega to do some follow-up\nanalysis on the segments themselves? I ask Omega to list the 10 largest\nsegments and compute their\nareas and centroids. No problem:\n\nhttps://user-images.githubusercontent.com/1870994/235770828-0f829f76-1f3d-44b8-b8e8-89fcbcde6e11.mp4\n\nNote: You could even ask for it in markdown format, which would look better (not\nshown here).\n\n##\n\nNext I ask Omega to make a widget that lets me filter segments by area. And it\nworks beautifully.\nArguably it is not rocket science, but the thought-to-widget time ratio must be\nin the hundreds when comparing Omega to an average user trying to write their\nown widget:\n\nhttps://user-images.githubusercontent.com/1870994/235770860-4287e6a3-dae3-4c6d-a588-dea2bb1f69b7.mp4\n\n##\n\nThis is an example of a failed widget. I ask for a widget that can do dilations\nand erosions. The widget\nis created but is 'broken' because Omega made the mistake of using floats for\nthe number of dilations\nand erosions (in the next video I tell Omega to fix it):\n\nhttps://user-images.githubusercontent.com/1870994/235770896-819f394d-9785-46e8-a31a-a135b19316bf.mp4\n\n##\n\nFollowing up on the previous video, I explain that I want the two parameters (the\nnumber of erosions and dilations)\nto be integers. Notice that I exploit the conversational nature of the agent by\nassuming that it remembers\nwhat the widget is about:\n\nhttps://user-images.githubusercontent.com/1870994/235770914-90991ac4-337e-4dcd-a04c-dd44b5e8be3e.mp4\n\n##\n\nThis video demos a specialised 'cell and nuclei segmentation tool' which\nleverages [cellpose 2.0](https://www.cellpose.org/) to segment cell cytoplasms\nor nuclei. In general, we can't assume that\nLLMs know about every single image processing library, especially for specific\ndomains. So it can be\na good strategy to provide such specialised tools. After Omega successfully\nsegments the nuclei, I ask\nit to count the nuclei. Answer: 340. 
Notice that the generated code '\nsearches' for the layer named 'segmented' with a loop. Cute:\n\nhttps://user-images.githubusercontent.com/1870994/235770933-07f5cbe6-2224-4dcd-b378-e81cc4e66500.mov\n\n##\n\nEnough with cells. Apparently the 'memory' of ChatGPT is filled with unnecessary\ninformation: it knows the URL of Albert Einstein's photo on Wikipedia, and\ncombined with the 'napari file open' tool it can therefore open that photo in\nnapari:\n\nhttps://user-images.githubusercontent.com/1870994/235770959-406e8173-8416-4100-bcb6-7f0b617ce234.mp4\n\n##  \n\nYou can ask for rather incongruous widgets, widgets you would probably never\nwrite because you just need them once or something. Here I ask for a widget that\napplies a rather odd non-linear transformation to each\npixel. The result is predictably boring, but it works, and I don't think that\nthe answer was 'copy pasted'\nfrom somewhere else...\n\nhttps://user-images.githubusercontent.com/1870994/235770984-c88c8eac-d3b2-47d7-81b1-48fbe4429e90.mp4\n\n##\n\nIn this one, starting again from our beloved Albert, I ask to rename that layer\nto 'Einstein' which looks\nbetter than just 'array'. Then I ask Omega to apply a Canny edge filter.\nPredictably it uses scikit-image:\n\nhttps://user-images.githubusercontent.com/1870994/235771000-89dba0db-e710-4f76-b271-e9dcf65239b1.mp4\n\n##  \n\nThen I ask for a 'Canny edge detection widget'. It happily makes the widget and\noffers relevant parameters:\n\nhttps://user-images.githubusercontent.com/1870994/235771031-d978b652-2e28-4178-aa7e-dbdfd2e21c2d.mp4\n\n##\n\nFollowing up on the previous video, I play with dilations on the edge image.\nOmega has some trouble when I ask it to 'do it again'. 
Fine, sometimes you have to be a\nbit more explicit:\n\nhttps://user-images.githubusercontent.com/1870994/235771066-adc7f0bb-0b8e-415c-8e89-6107182cd5b1.mp4\n\n##\n\nYou can also experiment with more classic 'numpy' code by creating and\nmanipulating arrays and visualising\nthe output live:\n\nhttps://user-images.githubusercontent.com/1870994/235771093-85a751c8-cc5a-4685-b40a-acdf81f0e5c9.mp4\n\n##\n\nThis video demonstrates that Omega understands many aspects of the napari viewer\nAPI. It can switch viewing modes, translate layers, etc.:\n\nhttps://user-images.githubusercontent.com/1870994/235771129-db095c1f-56f7-4bb9-9bff-ef57ce66387b.mp4\n\n##\n\nI never thought this one would work: I ask Omega to open in napari an mp4 video\nfrom a URL and then use OpenCV to detect people. It does it. But the one thing\nthat Omega does not know is that creating a layer for each frame of the video is\nnot a practical approach. Not clear what happened to the colors though. Probably\nan RGB ordering or format issue:\n\nhttps://user-images.githubusercontent.com/1870994/235771146-ced45353-4886-42cb-b48f-3ce0859ed434.mp4\n\n## Disclaimer:\n\nDo not use this software lightly: it will download libraries of its own volition,\nwrite any code that it deems necessary, and might actually do what you ask, even\nif\nit is a bad idea. Also, beware that it might _misunderstand_ what you ask and\nthen do\nsomething bad. For example, it is unwise to use Omega to delete 'some' files\nfrom your system:\nit might end up deleting more than that if you are unclear in your request.  \nTo be 100% safe, we recommend that you use this software from within a sandboxed\nvirtual machine.\n\nTHE SOFTWARE IS PROVIDED “AS IS”, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR\nIMPLIED,\nINCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A\nPARTICULAR\nPURPOSE AND NONINFRINGEMENT. 
IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS\nBE LIABLE\nFOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT,\nTORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR\nTHE\nUSE OR OTHER DEALINGS IN THE SOFTWARE.\n\n## Contributing\n\nContributions are extremely welcome. Tests can be run with [tox], please ensure\nthe coverage at least stays the same before you submit a pull request.\n\n## License\n\nDistributed under the terms of the [BSD-3] license,\n\"napari-chatgpt\" is free and open source software\n\n## Issues\n\nIf you encounter any problems, please [file an issue] along with a detailed\ndescription.\n\n[napari]: https://github.com/napari/napari\n\n[Cookiecutter]: https://github.com/audreyr/cookiecutter\n\n[@napari]: https://github.com/napari\n\n[MIT]: http://opensource.org/licenses/MIT\n\n[BSD-3]: http://opensource.org/licenses/BSD-3-Clause\n\n[GNU GPL v3.0]: http://www.gnu.org/licenses/gpl-3.0.txt\n\n[GNU LGPL v3.0]: http://www.gnu.org/licenses/lgpl-3.0.txt\n\n[Apache Software License 2.0]: http://www.apache.org/licenses/LICENSE-2.0\n\n[Mozilla Public License 2.0]: https://www.mozilla.org/media/MPL/2.0/index.txt\n\n[cookiecutter-napari-plugin]: https://github.com/napari/cookiecutter-napari-plugin\n\n[file an issue]: https://github.com/royerlab/napari-chatgpt/issues\n\n[napari]: https://github.com/napari/napari\n\n[tox]: https://tox.readthedocs.io/en/latest/\n\n[pip]: https://pypi.org/project/pip/\n\n[PyPI]: https://pypi.org/\n","description_content_type":"text/markdown","keywords":null,"home_page":"https://github.com/royerlab/napari-chatgpt","download_url":null,"author":"Loic A. 
Royer","author_email":"royerloic@gmail.com","maintainer":null,"maintainer_email":null,"license":"BSD-3-Clause","classifier":["Development Status :: 2 - Pre-Alpha","Framework :: napari","Intended Audience :: Developers","License :: OSI Approved :: BSD License","Operating System :: OS Independent","Programming Language :: Python","Programming Language :: Python :: 3","Programming Language :: Python :: 3 :: Only","Programming Language :: Python :: 3.8","Programming Language :: Python :: 3.9","Programming Language :: Python :: 3.10","Topic :: Scientific/Engineering :: Image Processing"],"requires_dist":["numpy","magicgui","scikit-image","qtpy","langchain ==0.0.208","fastapi","uvicorn","websockets","openai","tiktoken","wikipedia","lxml","gTTS","playsound","matplotlib","xarray","arbol","playwright","duckduckgo-search","ome-zarr","scikit-video","pygpt4all ==1.1.0","GoogleBard ==1.0.2","transformers","cryptography","tabulate","numba","tox ; extra == 'testing'","pytest ; extra == 'testing'","pytest-cov ; extra == 'testing'","pytest-qt ; extra == 'testing'","napari ; extra == 'testing'","pyqt5 ; extra == 'testing'"],"requires_python":">=3.8","requires_external":null,"project_url":["Bug Tracker, https://github.com/royerlab/napari-chatgpt/issues","Documentation, https://github.com/royerlab/napari-chatgpt#README.md","Source Code, https://github.com/royerlab/napari-chatgpt","User Support, https://github.com/royerlab/napari-chatgpt/issues"],"provides_extra":["testing"],"provides_dist":null,"obsoletes_dist":null},"npe1_shim":false}