Trying the Weaviate vector database: from starting it with Docker to a simple test (using an NVIDIA GPU)

Jul 16, 2024, 1:21 AM
In the earlier post "ベクトルデータベース Weaviate を試す。Docker で起動〜簡単なテストまで" I installed Weaviate on an M1 Mac, but that was the CPU version. This post documents the steps for installing the GPU version of Weaviate on Ubuntu with an RTX 4090.

Update everything first

$ sudo apt update
$ sudo apt upgrade -y

Updating the NVIDIA driver

$ cd
$ cd Downloads
$ wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2204/x86_64/cuda-keyring_1.1-1_all.deb
$ sudo dpkg -i cuda-keyring_1.1-1_all.deb
$ sudo apt update
$ sudo apt-get upgrade -y

$ sudo apt-get remove --purge nvidia-kernel-common-535 nvidia-kernel-common-545 -y
$ sudo apt-get update
$ sudo apt-get upgrade
$ sudo apt autoremove -y
$ sudo apt-get install nvidia-driver-545 -y
*****************************************************************************
*** Reboot your computer and verify that the NVIDIA graphics driver can ***
*** be loaded. ***
*****************************************************************************
$ sudo reboot
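
After rebooting, confirm that the new driver actually loads before continuing:

$ nvidia-smi

If the driver is healthy, this prints the usual status table (a sample appears in the test section near the end of this article).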

Installing CUDA

$ sudo apt install cuda-toolkit -y
$ ls -la /usr/local/cuda/bin
$ ls -la /usr/local/cuda/lib64
$ nano ~/.bashrc
export PATH="/usr/local/cuda/bin${PATH:+:${PATH}}"
export LD_LIBRARY_PATH="/usr/local/cuda/lib64${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}"
$ source ~/.bashrc
$ nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2024 NVIDIA Corporation
Built on Thu_Jun__6_02:18:23_PDT_2024
Cuda compilation tools, release 12.5, V12.5.82
Build cuda_12.5.r12.5/compiler.34385749_0

Installing cuDNN

$ apt list libcudnn8 -a
$ sudo apt install libcudnn8* -y
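
To see which cuDNN packages actually got installed, a quick check:

$ dpkg -l | grep cudnn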


Installing Docker

https://docs.docker.com/engine/install/ubuntu/
$ sudo apt-get update
$ sudo apt-get install ca-certificates curl
$ sudo install -m 0755 -d /etc/apt/keyrings
$ sudo curl -fsSL https://download.docker.com/linux/ubuntu/gpg -o /etc/apt/keyrings/docker.asc
$ sudo chmod a+r /etc/apt/keyrings/docker.asc

$ echo \
"deb [arch=$(dpkg --print-architecture) signed-by=/etc/apt/keyrings/docker.asc] https://download.docker.com/linux/ubuntu \
$(. /etc/os-release && echo "$VERSION_CODENAME") stable" | \
sudo tee /etc/apt/sources.list.d/docker.list > /dev/null
$ sudo apt-get update

$ sudo apt-get install docker-ce docker-ce-cli containerd.io docker-buildx-plugin docker-compose-plugin
$ sudo groupadd docker
$ sudo usermod -aG docker $USER
$ newgrp docker
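
To confirm that Docker now works without sudo, the usual smoke test:

$ docker run hello-world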

Installing the NVIDIA Container Toolkit

https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/latest/install-guide.html
$ curl -fsSL https://nvidia.github.io/libnvidia-container/gpgkey | sudo gpg --dearmor -o /usr/share/keyrings/nvidia-container-toolkit-keyring.gpg \
&& curl -s -L https://nvidia.github.io/libnvidia-container/stable/deb/nvidia-container-toolkit.list | \
sed 's#deb https://#deb [signed-by=/usr/share/keyrings/nvidia-container-toolkit-keyring.gpg] https://#g' | \
sudo tee /etc/apt/sources.list.d/nvidia-container-toolkit.list

$ sudo apt-get update
$ sudo apt-get install -y nvidia-container-toolkit
$ sudo nvidia-ctk runtime configure --runtime=docker
$ cat /etc/docker/daemon.json
{
    "runtimes": {
        "nvidia": {
            "args": [],
            "path": "nvidia-container-runtime"
        }
    }
}
$ sudo systemctl restart docker
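
Before wiring this into Weaviate, it's worth checking that containers can actually see the GPU. A quick sketch, assuming the public nvidia/cuda base image (pick whatever recent tag matches your setup):

$ docker run --rm --gpus all nvidia/cuda:12.3.2-base-ubuntu22.04 nvidia-smi

If the nvidia-smi table shows up from inside the container, the Container Toolkit is configured correctly.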

Installing & starting Weaviate

$ cd
$ mkdir weaviate
$ cd weaviate
$ nano docker-compose.yaml

---
services:
  weaviate:
    command:
    - --host
    - 0.0.0.0
    - --port
    - '8080'
    - --scheme
    - http
    image: cr.weaviate.io/semitechnologies/weaviate:1.25.4
    ports:
    - 8080:8080
    - 50051:50051
    volumes:
    - weaviate_data:/var/lib/weaviate
    restart: on-failure:0
    environment:
      TRANSFORMERS_INFERENCE_API: 'http://t2v-transformers:8080'
      SUM_INFERENCE_API: 'http://sum-transformers:8080'
      QUERY_DEFAULTS_LIMIT: 25
      AUTHENTICATION_ANONYMOUS_ACCESS_ENABLED: 'true'
      PERSISTENCE_DATA_PATH: '/var/lib/weaviate'
      DEFAULT_VECTORIZER_MODULE: 'text2vec-transformers'
      ENABLE_MODULES: 'text2vec-transformers,sum-transformers'
      CLUSTER_HOSTNAME: 'node1'
  t2v-transformers:
    image: cr.weaviate.io/semitechnologies/transformers-inference:sentence-transformers-paraphrase-multilingual-mpnet-base-v2
    environment:
      ENABLE_CUDA: '1'
      NVIDIA_VISIBLE_DEVICES: 'all'
    deploy:
      resources:
        reservations:
          devices:
          - capabilities:
            - 'gpu'
  sum-transformers:
    image: cr.weaviate.io/semitechnologies/sum-transformers:facebook-bart-large-cnn-1.0.0
    environment:
      ENABLE_CUDA: '1'
      NVIDIA_VISIBLE_DEVICES: 'all'
    deploy:
      resources:
        reservations:
          devices:
          - capabilities:
            - 'gpu'
volumes:
  weaviate_data:
...
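
The deploy.resources.reservations.devices entry with capabilities: ['gpu'] is what passes the GPU through to the two inference containers (the Compose counterpart of docker run --gpus all), and ENABLE_CUDA: '1' tells the inference images to actually use it. Before starting, you can have Compose validate the YAML; this is just a sanity check:

$ docker compose config --quiet && echo "docker-compose.yaml is valid"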

$ docker compose up -d
[+] Running 10/13
⠹ sum-transformers [⣿⣿⣿⣿⣿⣿⣿⠀⣿⣿⠀⣿] 176.3MB / 3.651GB Pulling 23.3s
✔ 2238450926aa Pull complete 5.3s
✔ 15d04b3d1b9d Pull complete 5.4s
✔ 83de38ae3b75 Pull complete 5.5s
✔ f2ae8a19c88b Pull complete 5.5s
✔ ceddb41fe1a9 Pull complete 5.7s
✔ 9dee627939dc Pull complete 6.6s
✔ f161228a73f9 Pull complete 6.6s
⠸ cfd96b848d2d Downloading [==> ] 75.73MB/1.652GB 19.3s
✔ bc65e06e09d9 Download complete 6.6s
✔ c63473e7607d Download complete 7.8s
⠸ 3c1b74bdfc26 Downloading [=> ] 55.85MB/1.955GB 19.3s
✔ c5c560c3d7e7 Download complete 10.4s
The download (about 3.6 GB in total) begins, so sit back and wait.

Verifying after installation

$ docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
4a0e5e94d22a cr.weaviate.io/semitechnologies/transformers-inference:sentence-transformers-paraphrase-multilingual-mpnet-base-v2 "/bin/sh -c 'uvicorn…" 7 seconds ago Up 5 seconds weaviate-t2v-transformers-1
7050db439b4a cr.weaviate.io/semitechnologies/sum-transformers:facebook-bart-large-cnn-1.0.0 "/bin/sh -c 'uvicorn…" 7 seconds ago Up 5 seconds weaviate-sum-transformers-1
f757be53415f cr.weaviate.io/semitechnologies/weaviate:1.25.4 "/bin/weaviate --hos…" 7 seconds ago Up 5 seconds 0.0.0.0:8080->8080/tcp, :::8080->8080/tcp, 0.0.0.0:50051->50051/tcp, :::50051->50051/tcp weaviate-weaviate-1
All three services are running correctly.

$ docker logs weaviate-t2v-transformers-1
INFO: Started server process [7]
INFO: Waiting for application startup.
INFO: CUDA_PER_PROCESS_MEMORY_FRACTION set to 1.0
INFO: CUDA_CORE set to cuda:0
INFO: Application startup complete.
INFO: Uvicorn running on http://0.0.0.0:8080 (Press CTRL+C to quit)
INFO: 172.18.0.3:55456 - "GET /.well-known/ready HTTP/1.1" 204 No Content
t2v-transformers has started and is correctly recognizing CUDA.

$ docker logs weaviate-sum-transformers-1
INFO: Started server process [7]
INFO: Waiting for application startup.
INFO: CUDA_CORE set to cuda:0
/usr/local/lib/python3.9/site-packages/torch/cuda/__init__.py:146: UserWarning:
NVIDIA GeForce RTX 4090 with CUDA capability sm_89 is not compatible with the current PyTorch installation.
The current PyTorch install supports CUDA capabilities sm_37 sm_50 sm_60 sm_70.
If you want to use the NVIDIA GeForce RTX 4090 GPU with PyTorch, please check the instructions at https://pytorch.org/get-started/locally/

warnings.warn(incompatible_device_warn.format(device_name, capability, " ".join(arch_list), device_name))
INFO: Application startup complete.
INFO: Uvicorn running on http://0.0.0.0:8080 (Press CTRL+C to quit)
INFO: 172.18.0.3:54294 - "GET /.well-known/ready HTTP/1.1" 204 No Content
sum-transformers has started and appears to be recognizing CUDA.
※ Added Jul 29, 2024: I hadn't read this carefully. The log actually says the RTX 4090's CUDA capability (sm_89) is not compatible with the bundled PyTorch. A fix is appended at the end of this article.

$ docker logs weaviate-weaviate-1
{"action":"startup","default_vectorizer_module":"text2vec-transformers","level":"info","msg":"the default vectorizer modules is set to \"text2vec-transformers\", as a result all new schema classes without an explicit vectorizer setting, will use this vectorizer","time":"2024-07-16T04:11:12Z"}
{"action":"startup","auto_schema_enabled":true,"level":"info","msg":"auto schema enabled setting is set to \"true\"","time":"2024-07-16T04:11:12Z"}
{"level":"info","msg":"No resource limits set, weaviate will use all available memory and CPU. To limit resources, set LIMIT_RESOURCES=true","time":"2024-07-16T04:11:12Z"}
{"level":"info","msg":"open cluster service","servers":{"node1":8300},"time":"2024-07-16T04:11:12Z"}
{"address":"172.18.0.3:8301","level":"info","msg":"starting cloud rpc server ...","time":"2024-07-16T04:11:12Z"}
{"level":"info","msg":"starting raft sub-system ...","time":"2024-07-16T04:11:12Z"}
{"address":"172.18.0.3:8300","level":"info","msg":"tcp transport","tcpMaxPool":3,"tcpTimeout":10000000000,"time":"2024-07-16T04:11:12Z"}
{"level":"info","msg":"loading local db","time":"2024-07-16T04:11:12Z"}
{"level":"info","msg":"database has been successfully loaded","n":0,"time":"2024-07-16T04:11:12Z"}
{"level":"info","metadata_only_voters":false,"msg":"construct a new raft node","name":"node1","time":"2024-07-16T04:11:12Z"}
{"action":"raft","index":1,"level":"info","msg":"raft initial configuration","servers":"[[{Suffrage:Voter ID:node1 Address:172.18.0.2:8300}]]","time":"2024-07-16T04:11:12Z"}
{"action":"raft","follower":{},"leader-address":"","leader-id":"","level":"info","msg":"raft entering follower state","time":"2024-07-16T04:11:12Z"}
{"last_snapshot_index":0,"last_store_applied_index":0,"last_store_log_applied_index":0,"level":"info","msg":"raft node constructed","raft_applied_index":0,"raft_last_index":9,"time":"2024-07-16T04:11:12Z"}
{"action":"raft","last-leader-addr":"","last-leader-id":"","level":"warning","msg":"raft heartbeat timeout reached, starting election","time":"2024-07-16T04:11:13Z"}
{"action":"raft","level":"info","msg":"raft entering candidate state","node":{},"term":10,"time":"2024-07-16T04:11:13Z"}
{"action":"raft","level":"info","msg":"raft election won","tally":1,"term":10,"time":"2024-07-16T04:11:13Z"}
{"action":"raft","leader":{},"level":"info","msg":"raft entering leader state","time":"2024-07-16T04:11:13Z"}
{"action":"bootstrap","level":"info","msg":"node reporting ready, node has probably recovered cluster from raft config. Exiting bootstrap process","time":"2024-07-16T04:11:13Z"}
{"address":"172.18.0.3:8300","level":"info","msg":"current Leader","time":"2024-07-16T04:11:14Z"}
{"level":"info","msg":"starting migration from old schema","time":"2024-07-16T04:11:14Z"}
{"level":"info","msg":"legacy schema is empty, nothing to migrate","time":"2024-07-16T04:11:14Z"}
{"level":"info","msg":"migration from the old schema has been successfully completed","time":"2024-07-16T04:11:14Z"}
{"action":"grpc_startup","level":"info","msg":"grpc server listening at [::]:50051","time":"2024-07-16T04:11:16Z"}
{"action":"restapi_management","level":"info","msg":"Serving weaviate at http://[::]:8080","time":"2024-07-16T04:11:16Z"}
{"action":"telemetry_push","level":"info","msg":"telemetry started","payload":"\u0026{MachineID:4cd3e3ea-97d7-4b2c-a5a8-6ca7e22cfff0 Type:INIT Version:1.25.4 NumObjects:0 OS:linux Arch:amd64 UsedModules:[]}","time":"2024-07-16T04:11:16Z"}
Weaviate itself is also running correctly.

Open http://0.0.0.0:8080/v1 in a browser (adjust the URL to your environment).

It is running correctly.
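
If you prefer the terminal, the same check works with curl against Weaviate's standard endpoints:

$ curl -i http://localhost:8080/v1/.well-known/ready
$ curl http://localhost:8080/v1/meta

The first should return 204 No Content once the node is ready; the second returns JSON including the version (1.25.4) and the enabled modules.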

Test

curl -X POST http://localhost:8080/v1/objects \
  -H 'Content-Type: application/json' \
  -d '{
    "class": "Article",
    "properties": {
      "title": "Persistent Data Test",
      "content": "This data should persist across container restarts."
    }
  }'
While the above runs, nvidia-smi shows:
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 545.23.08 Driver Version: 545.23.08 CUDA Version: 12.3 |
|-----------------------------------------+----------------------+----------------------+
| GPU Name Persistence-M | Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap | Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|=========================================+======================+======================|
| 0 NVIDIA GeForce RTX 4090 On | 00000000:01:00.0 On | Off |
| 0% 42C P8 24W / 450W | 3688MiB / 24564MiB | 1% Default |
| | | N/A |
+-----------------------------------------+----------------------+----------------------+
If GPU-Util (the field showing 1% above) rises above 0%, the work is being done on the GPU.
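
To put a more visible load on the GPU, run a semantic search against the Article object we just inserted; each nearText query has to be vectorized by the t2v-transformers container. A sketch using Weaviate's GraphQL endpoint:

curl -X POST http://localhost:8080/v1/graphql \
  -H 'Content-Type: application/json' \
  -d '{"query": "{ Get { Article(nearText: {concepts: [\"persistence\"]}) { title content } } }"}'

Watching nvidia-smi in another terminal while this runs should show GPU-Util climbing above 0%.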

And with that, installing and testing Weaviate with GPU support is done 🍻

Added Jul 29, 2024: Installing a PyTorch build that supports the RTX 4090

$ docker logs weaviate-sum-transformers-1
INFO: Started server process [7]
INFO: Waiting for application startup.
INFO: CUDA_CORE set to cuda:0
/usr/local/lib/python3.9/site-packages/torch/cuda/__init__.py:146: UserWarning:
NVIDIA GeForce RTX 4090 with CUDA capability sm_89 is not compatible with the current PyTorch installation.
The current PyTorch install supports CUDA capabilities sm_37 sm_50 sm_60 sm_70.
If you want to use the NVIDIA GeForce RTX 4090 GPU with PyTorch, please check the instructions at https://pytorch.org/get-started/locally/

warnings.warn(incompatible_device_warn.format(device_name, capability, " ".join(arch_list), device_name))
INFO: Application startup complete.
INFO: Uvicorn running on http://0.0.0.0:8080 (Press CTRL+C to quit)
INFO: 172.18.0.3:54294 - "GET /.well-known/ready HTTP/1.1" 204 No Content
So the log plainly says the current PyTorch is not compatible with the RTX 4090's CUDA capability (sm_89), and I had missed it.
Let's install a newer PyTorch so the GPU can actually be used.

Log in to the container

$ docker exec -it weaviate-sum-transformers-1 /bin/sh

Install the RTX 4090-compatible libraries

# pip3 install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu124

Verify

# python3
Python 3.9.13 (main, Aug 23 2022, 09:29:17)
[GCC 8.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import torch
>>> print(torch.__version__)
2.4.0+cu124

Commit the current container as an image

$ docker commit weaviate-sum-transformers-1 sum-transformers:facebook-bart-large-cnn-1.0.0
$ docker images
REPOSITORY TAG IMAGE ID CREATED SIZE
sum-transformers facebook-bart-large-cnn-1.0.0 801adc1a4c4c 6 minutes ago 14.1GB
cr.weaviate.io/semitechnologies/weaviate 1.25.4 9274bac717b1 6 weeks ago 121MB
cr.weaviate.io/semitechnologies/transformers-inference sentence-transformers-paraphrase-multilingual-mpnet-base-v2 72b8ad6b2240 3 months ago 10.1GB
cr.weaviate.io/semitechnologies/sum-transformers facebook-bart-large-cnn-1.0.0 69fab2080269 23 months ago 6.03GB
The sum-transformers image has been created.

Updating Docker Compose

$ nano docker-compose.yaml

image: cr.weaviate.io/semitechnologies/sum-transformers:facebook-bart-large-cnn-1.0.0
↓
image: sum-transformers:facebook-bart-large-cnn-1.0.0

Change the image line to point at the image we just committed.

Restart

$ docker compose down
$ docker compose up -d

Verify with Python

Re-enter the container first, then start Python:

$ docker exec -it weaviate-sum-transformers-1 /bin/sh
# python3
Python 3.9.13 (main, Aug 23 2022, 09:29:17)
[GCC 8.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import torch
>>>
>>> if torch.cuda.is_available():
... device = torch.device("cuda:0")
... print("CUDA is available. Using GPU.")
... else:
... device = torch.device("cpu")
... print("CUDA is not available. Using CPU.")
...
CUDA is available. Using GPU.
The code above also confirms that CUDA is available and the GPU will be used.
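
The earlier warning was specifically about the sm_89 compute capability, so as an extra (optional) check it doesn't hurt to run an actual tensor operation on the device; with the old PyTorch this is where things would have failed. In the same session it should look roughly like this:

>>> torch.cuda.get_device_name(0)
'NVIDIA GeForce RTX 4090'
>>> (torch.ones(2, 2, device="cuda:0") @ torch.ones(2, 2, device="cuda:0")).sum()
tensor(8., device='cuda:0')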

Check the logs

$ docker logs weaviate-sum-transformers-1
INFO: Started server process [7]
INFO: Waiting for application startup.
INFO: CUDA_CORE set to cuda:0
/usr/local/lib/python3.9/site-packages/transformers/modeling_utils.py:461: FutureWarning: You are using `torch.load` with `weights_only=False` (the current default value), which uses the default pickle module implicitly. It is possible to construct malicious pickle data which will execute arbitrary code during unpickling (See https://github.com/pytorch/pytorch/blob/main/SECURITY.md#untrusted-models for more details). In a future release, the default value for `weights_only` will be flipped to `True`. This limits the functions that could be executed during unpickling. Arbitrary objects will no longer be allowed to be loaded via this mode unless they are explicitly allowlisted by the user via `torch.serialization.add_safe_globals`. We recommend you start setting `weights_only=True` for any use case where you don't have full control of the loaded file. Please open an issue on GitHub for any issues related to this experimental feature.
return torch.load(checkpoint_file, map_location="cpu")
INFO: Application startup complete.
INFO: Uvicorn running on http://0.0.0.0:8080 (Press CTRL+C to quit)
INFO: 172.18.0.3:36588 - "GET /.well-known/ready HTTP/1.1" 204 No Content
CUDA is now recognized. The warning itself turned out to be harmless, but the line return torch.load(checkpoint_file, map_location="cpu") really caught my eye.

Checking the source code

Let's read /usr/local/lib/python3.9/site-packages/transformers/modeling_utils.py.

def load_state_dict(checkpoint_file: Union[str, os.PathLike]):
    """
    Reads a PyTorch checkpoint file, returning properly formatted errors if they arise.
    """
    try:
        return torch.load(checkpoint_file, map_location="cpu")
    except Exception as e:
This is the return torch.load(checkpoint_file, map_location="cpu") in question.
There is no branch around it such as

if torch.cuda.is_available():
    ...
else:
    ...

so loading onto the CPU appears to be by design.
If you really wanted to force a fix, you could rewrite it along these lines:
def load_state_dict(checkpoint_file: Union[str, os.PathLike]):
    """
    Reads a PyTorch checkpoint file, returning properly formatted errors if they arise.
    """
    # Check whether a GPU is available
    if torch.cuda.is_available():
        map_location = "cuda:0"
    else:
        map_location = "cpu"

    try:
        return torch.load(checkpoint_file, map_location=map_location)
    except Exception as e:
That kind of change might be fine, but I'll leave it as is for now.
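
For context, loading checkpoints onto the CPU first is the normal pattern in the PyTorch ecosystem: the state dict is read into host memory, copied into the model, and the whole model is moved to the GPU afterwards. A minimal, self-contained sketch of that flow (the file name and model here are made up for illustration):

import torch
import torch.nn as nn

# hypothetical model and checkpoint, just to illustrate the flow
model = nn.Linear(4, 2)
torch.save(model.state_dict(), "checkpoint.pt")

state_dict = torch.load("checkpoint.pt", map_location="cpu")  # weights land in host RAM
model.load_state_dict(state_dict)                             # copied into the model (still on CPU)
if torch.cuda.is_available():
    model = model.to("cuda:0")                                # parameters move to the GPU here

So even with map_location="cpu", inference still runs on the GPU once the surrounding code moves the model there, which is why the warning is harmless.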