Aller directement au contenu
  • Catégories
  • Récent
  • Mots-clés
  • Populaire
  • Web
  • Utilisateurs
  • Groupes
Habillages
  • Clair
  • Brite
  • Cerulean
  • Cosmo
  • Flatly
  • Journal
  • Litera
  • Lumen
  • Lux
  • Materia
  • Minty
  • Morph
  • Pulse
  • Sandstone
  • Simplex
  • Sketchy
  • Spacelab
  • United
  • Yeti
  • Zephyr
  • Sombre
  • Cyborg
  • Darkly
  • Quartz
  • Slate
  • Solar
  • Superhero
  • Vapor

  • Défaut (Darkly)
  • Aucun habillage
Réduire

NodeBB

fariasF

farias

@farias
Se désabonner S'abonner
À propos
Messages
723
Sujets
307
Partages
0
Groupes
0
Abonnés
5
Abonnements
14

Messages

Récent Meilleur sujets Contesté

  • Blocage du jours
    fariasF farias

    Link Preview Image
    157.245.57.78 | Singapore, AS14061, VPN Not Detected

    Get Details for IP 157.245.57.78: Hosted by DigitalOcean, LLC, located in Singapore, AS14061. View ranges, ASN info, and related IPs.

    favicon

    (ipinfo.io)

    # zgrep "xmlrpc.php" /var/log/apache2/access.*.log | sed 's/:/ /g' | awk '{print $2}' | sort -n | uniq -c | sort -n | tail -10
        405 45.91.22.64
        408 45.91.22.77
        413 45.91.22.95
        418 45.91.22.62
        420 45.91.22.68
        420 45.91.22.97
        422 45.91.22.59
        426 45.91.22.75
        431 45.91.22.65
      15224 157.245.57.78
    
    
    Scan Attack

  • llama.cpp avec Vulkan
    fariasF farias

    https://huggingface.co/unsloth/Qwen3.5-2B-GGUF/resolve/main/Qwen3.5-2B-Q4_0.gguf?download=true

    Linux llama.cpp ubuntu

  • llama.cpp avec Vulkan
    fariasF farias

    Le meilleur modèle semble être https://huggingface.co/Qwen/Qwen3.5-2B pour mes cartes.

    Linux llama.cpp ubuntu

  • llama.cpp avec Vulkan
    fariasF farias

    Mon fichier service :

    # systemctl status llama-server
    ● llama-server.service - Llama Server
         Loaded: loaded (/etc/systemd/system/llama-server.service; disabled; preset: enabled)
         Active: active (running) since Fri 2026-06-19 17:27:42 UTC; 29s ago
       Main PID: 37413 (llama-server)
          Tasks: 41 (limit: 94224)
         Memory: 91.7M (peak: 91.7M)
            CPU: 3.103s
         CGroup: /system.slice/llama-server.service
                 └─37413 /usr/local/bin/llama-server --model /models/qwen2.5-1.5b-instruct-q4_k_m.gguf --host 0.0.0.0 --port 8080
    
    juin 19 17:27:42 jellyfin systemd[1]: Started llama-server.service - Llama Server.
    root@jellyfin:/home/arias/llama.cpp/build# cat /etc/systemd/system/llama-server.service
    [Unit]
    Description=Llama Server
    After=network.target
    
    [Service]
    Type=simple
    User=root
    WorkingDirectory=/home/XXXX/llama.cpp
    Environment="NVM_BIN=/root/.nvm/versions/node/v26.3.1/bin"
    Environment="LD_LIBRARY_PATH=:/usr/local/cuda/lib64:/usr/local/cuda/extras/CUPTI/lib64"
    Environment="VULKAN_VERSION=1.4.350.1"
    ExecStart=/usr/local/bin/llama-server \
      --model /models/qwen2.5-1.5b-instruct-q4_k_m.gguf \
      --host 0.0.0.0 --port 8080
    Restart=on-failure
    RestartSec=5s
    StandardOutput=file:/tmp/llama-server.stdout.log
    StandardError=file:/tmp/llama-server.stderr.log
    
    [Install]
    WantedBy=multi-user.target
    
    
    Linux llama.cpp ubuntu

  • llama.cpp avec Vulkan
    fariasF farias

    Test en ligne de commande :

    # llama-server -m /models/qwen2.5-1.5b-instruct-q4_k_m.gguf --host 0.0.0.0
    
    Linux llama.cpp ubuntu

  • llama.cpp avec Vulkan
    fariasF farias

    Arret de openwebui :

    # systemctl stop openwebui
    # systemctl disable openwebui
    Removed "/etc/systemd/system/multi-user.target.wants/openwebui.service".
    
    

    Arret de ollama :

    # systemctl stop ollama
    # systemctl disable ollama
    Removed "/etc/systemd/system/default.target.wants/ollama.service".
    
    
    Linux llama.cpp ubuntu

  • llama.cpp avec Vulkan
    fariasF farias

    Petit test :

    # make install
    # ldconfig -v
    #  llama-bench -m  /models/qwen2.5-1.5b-instruct-q4_k_m.gguf
    ggml_vulkan: Found 2 Vulkan devices:
    ggml_vulkan: 0 = Quadro M5000 (NVIDIA) | uma: 0 | fp16: 0 | bf16: 0 | warp size: 32 | shared memory: 49152 | int dot: 1 | matrix cores: none
    ggml_vulkan: 1 = Quadro M4000 (NVIDIA) | uma: 0 | fp16: 0 | bf16: 0 | warp size: 32 | shared memory: 49152 | int dot: 1 | matrix cores: none
    | model                          |       size |     params | backend    | ngl |            test |                  t/s |
    | ------------------------------ | ---------: | ---------: | ---------- | --: | --------------: | -------------------: |
    | qwen2 1.5B Q4_K - Medium       | 934.69 MiB |     1.54 B | Vulkan     |  -1 |           pp512 |         53.48 ± 0.42 |
    | qwen2 1.5B Q4_K - Medium       | 934.69 MiB |     1.54 B | Vulkan     |  -1 |           tg128 |         63.55 ± 0.73 |
    
    build: 5fd2dc2c4 (9721)
    Linux llama.cpp ubuntu

  • llama.cpp avec Vulkan
    fariasF farias

    La commande pour le build :

    # cmake --build build --config Release -j
    
    Linux llama.cpp ubuntu

  • llama.cpp avec Vulkan
    fariasF farias

    Nouveau build :

    #  cmake -B build -DGGML_VULKAN=ON -DCMAKE_BUILD_TYPE=Release
    CMAKE_BUILD_TYPE=Release
    -- Warning: ccache not found - consider installing it for faster compilation or disable this warning with GGML_CCACHE=OFF
    -- CMAKE_SYSTEM_PROCESSOR: x86_64
    -- GGML_SYSTEM_ARCH: x86
    -- Including CPU backend
    -- x86 detected
    -- Adding CPU backend variant ggml-cpu: -march=native 
    -- Found Vulkan: /usr/lib/x86_64-linux-gnu/libvulkan.so (found version "1.4.313") found components: glslc glslangValidator 
    -- Vulkan found
    -- GL_KHR_cooperative_matrix supported by glslc
    -- GL_NV_cooperative_matrix2 supported by glslc
    -- GL_NV_cooperative_matrix_decode_vector not supported by glslc
    -- GL_EXT_integer_dot_product supported by glslc
    -- GL_EXT_bfloat16 supported by glslc
    -- Including Vulkan backend
    -- ggml version: 0.15.2
    -- ggml commit:  5fd2dc2c4
    -- Found OpenSSL: /usr/lib/x86_64-linux-gnu/libcrypto.so (found version "3.0.13")  
    -- Performing Test OPENSSL_VERSION_SUPPORTED
    -- Performing Test OPENSSL_VERSION_SUPPORTED - Success
    -- OpenSSL found: 3.0.13
    -- Generating embedded license file for target: llama-app
    -- Configuring done (5.0s)
    -- Generating done (0.6s)
    
    
    Linux llama.cpp ubuntu

  • llama.cpp avec Vulkan
    fariasF farias

    La boulette j’ai pas pris la bonne version… on recommance :

    rm  /etc/apt/sources.list.d/lunarg-vulkan-jammy.list 
    wget -qO- https://packages.lunarg.com/lunarg-signing-key-pub.asc | sudo tee /etc/apt/trusted.gpg.d/lunarg.asc
    sudo wget -qO /etc/apt/sources.list.d/lunarg-vulkan-noble.list http://packages.lunarg.com/vulkan/lunarg-vulkan-noble.list
    sudo apt update
    sudo apt install vulkan-sdk
    
    Linux llama.cpp ubuntu

  • llama.cpp avec Vulkan
    fariasF farias

    Installation de SDK Vulkan :

    # wget -qO- https://packages.lunarg.com/lunarg-signing-key-pub.asc | sudo tee /etc/apt/trusted.gpg.d/lunarg.asc
    # sudo wget -qO /etc/apt/sources.list.d/lunarg-vulkan-jammy.list http://packages.lunarg.com/vulkan/lunarg-vulkan-jammy.list
    # sudo apt update
    # sudo apt install vulkan-sdk
    

    Mais erreur :

    # sudo apt install vulkan-sdk
    Lecture des listes de paquets... Fait
    Construction de l'arbre des dépendances... Fait
    Lecture des informations d'état... Fait      
    Certains paquets ne peuvent être installés. Ceci peut signifier
    que vous avez demandé l'impossible, ou bien, si vous utilisez
    la distribution unstable, que certains paquets n'ont pas encore
    été créés ou ne sont pas sortis d'Incoming.
    L'information suivante devrait vous aider à résoudre la situation : 
    
    Les paquets suivants contiennent des dépendances non satisfaites :
     crashdiagnosticlayer : Dépend: libyaml-cpp0.7 (>= 0.7.0) mais il n'est pas installable
    E: Impossible de corriger les problèmes, des paquets défectueux sont en mode « garder en l'état ».
    
    
    Linux llama.cpp ubuntu

  • llama.cpp avec Vulkan
    fariasF farias

    Tentative de build :

    # cmake -B build -DGGML_VULKAN=ON -DCMAKE_BUILD_TYPE=Release
    -- The C compiler identification is GNU 13.3.0
    -- The CXX compiler identification is GNU 13.3.0
    -- Detecting C compiler ABI info
    -- Detecting C compiler ABI info - done
    -- Check for working C compiler: /usr/bin/cc - skipped
    -- Detecting C compile features
    -- Detecting C compile features - done
    -- Detecting CXX compiler ABI info
    -- Detecting CXX compiler ABI info - done
    -- Check for working CXX compiler: /usr/bin/c++ - skipped
    -- Detecting CXX compile features
    -- Detecting CXX compile features - done
    CMAKE_BUILD_TYPE=Release
    -- Found Git: /usr/bin/git (found version "2.43.0") 
    -- The ASM compiler identification is GNU
    -- Found assembler: /usr/bin/cc
    -- Performing Test CMAKE_HAVE_LIBC_PTHREAD
    -- Performing Test CMAKE_HAVE_LIBC_PTHREAD - Success
    -- Found Threads: TRUE  
    -- Warning: ccache not found - consider installing it for faster compilation or disable this warning with GGML_CCACHE=OFF
    -- CMAKE_SYSTEM_PROCESSOR: x86_64
    -- GGML_SYSTEM_ARCH: x86
    -- Found OpenMP_C: -fopenmp (found version "4.5") 
    -- Found OpenMP_CXX: -fopenmp (found version "4.5") 
    -- Found OpenMP: TRUE (found version "4.5")  
    -- Including CPU backend
    -- x86 detected
    -- Adding CPU backend variant ggml-cpu: -march=native 
    CMake Error at /usr/share/cmake-3.28/Modules/FindPackageHandleStandardArgs.cmake:230 (message):
      Could NOT find Vulkan (missing: glslc) (found version "1.3.275")
    Call Stack (most recent call first):
      /usr/share/cmake-3.28/Modules/FindPackageHandleStandardArgs.cmake:600 (_FPHSA_FAILURE_MESSAGE)
      /usr/share/cmake-3.28/Modules/FindVulkan.cmake:600 (find_package_handle_standard_args)
      ggml/src/ggml-vulkan/CMakeLists.txt:9 (find_package)
    
    
    -- Configuring incomplete, errors occurred!
    
    
    
    Linux llama.cpp ubuntu

  • llama.cpp avec Vulkan
    fariasF farias

    Test :

    # vulkaninfo | grep -i "deviceName"
    'DISPLAY' environment variable not set... skipping surface info
    error: XDG_RUNTIME_DIR is invalid or not set in the environment.
    	deviceName        = Quadro M5000
    	deviceName        = Quadro M4000
    	deviceName        = llvmpipe (LLVM 20.1.2, 256 bits)
    
    Linux llama.cpp ubuntu

  • llama.cpp avec Vulkan
    fariasF farias

    Ajout pour la compilation :

    # sudo apt install libvulkan-dev vulkan-tools glslang-tools  cmake build-essential git
    
    Linux llama.cpp ubuntu

  • llama.cpp avec Vulkan
    fariasF farias

    Actuellement on est :

    Downloading Vulkan SDK version 1.4.350.1
    
    Linux llama.cpp ubuntu

  • llama.cpp avec Vulkan
    fariasF farias

    Lancement de l’installation :

    sudo apt update
    sudo apt install curl wget xz-utils
    
    export VULKAN_VERSION="$(curl -fsSL https://vulkan.lunarg.com/sdk/latest/linux.txt)"                                                                                      
    
    echo "Downloading Vulkan SDK version ${VULKAN_VERSION}"
    curl --progress-bar "https://sdk.lunarg.com/sdk/download/${VULKAN_VERSION}/linux/vulkan_sdk.tar.xz" -o "/opt/vulkan-sdk.tar.xz"
    
    echo "Installing Vulkan SDK to /opt/vulkan-sdk"
    rm -rf "/opt/vulkan-sdk" && mkdir -p "/opt/vulkan-sdk"
    tar -Jxf "/opt/vulkan-sdk.tar.xz" --strip-components=1 -C "/opt/vulkan-sdk"
    rm -f "/opt/vulkan-sdk.tar.xz"
    
    echo "Adding Vulkan SDK environment variables to shell profiles"
    ([ ! -f "$HOME/.bashrc" ] || grep -qxF "source /opt/vulkan-sdk/setup-env.sh" "$HOME/.bashrc") || (echo "source /opt/vulkan-sdk/setup-env.sh" >> "$HOME/.bashrc")
    ([ ! -f "$HOME/.zshrc" ] || grep -qxF "source /opt/vulkan-sdk/setup-env.sh" "$HOME/.zshrc") || (echo "source /opt/vulkan-sdk/setup-env.sh" >> "$HOME/.zshrc")
    source /opt/vulkan-sdk/setup-env.sh
    
    
    Linux llama.cpp ubuntu

  • llama.cpp avec Vulkan
    fariasF farias

    Test de la configuration :

    # npx --no node-llama-cpp inspect gpu
    OS: Ubuntu 24.04.4 LTS (x64)
    Node: 26.3.1 (x64)
    
    node-llama-cpp: 3.18.1
    Prebuilt binaries: b8390
    
    ggml_cuda_init: failed to initialize CUDA: CUDA driver version is insufficient for CUDA runtime version
    CUDA: available
    ggml_cuda_init: failed to initialize CUDA: CUDA driver version is insufficient for CUDA runtime version
    Vulkan: available
    
    CUDA devices: Quadro M5000, Quadro M4000
    CUDA used VRAM: 0.85% (138.88MB/15.86GB)
    CUDA free VRAM: 99.14% (15.73GB/15.86GB)
    
    Vulkan devices: Quadro M5000, Quadro M4000
    Vulkan used VRAM: 1.76% (298.13MB/16.48GB)
    Vulkan free VRAM: 98.23% (16.19GB/16.48GB)
    
    CPU model: Intel(R) Xeon(R) CPU E5-2450 v2 @ 2.50GHz
    Math cores: 32
    Used RAM: 16.49% (2.99GB/18.14GB)
    Free RAM: 83.5% (15.15GB/18.14GB)
    Used swap: 0% (0B/4GB)
    Max swap size: 4GB
    mmap: supported
    
    
    Linux llama.cpp ubuntu

  • llama.cpp : Installation
    fariasF farias

    Mise à jours de Ollama :

    # curl -fsSL https://ollama.com/install.sh | sh
    >>> Cleaning up old version at /usr/local/lib/ollama
    >>> Installing ollama to /usr/local
    >>> Downloading ollama-linux-amd64.tar.zst
    ######################################################################## 100.0%
    >>> Adding ollama user to render group...
    >>> Adding ollama user to video group...
    >>> Adding current user to ollama group...
    >>> Creating ollama systemd service...
    >>> Enabling and starting ollama service...
    >>> NVIDIA GPU installed.
    
    

    Visiblement même problème :

    ...
    juin 19 10:51:37  ollama[2964]: time=2026-06-19T10:51:37.153Z level=INFO source=model_list_cache.go:111 msg="model list cache hydration complete" models=16 failures=0 elapsed=654.370427ms
    juin 19 10:51:42  ollama[2964]: time=2026-06-19T10:51:42.591Z level=WARN source=cuda_compat.go:38 msg="NVIDIA driver too old" device="Quadro M5000" compute=5.2 driver=535 required_driver="570 or newer"
    juin 19 10:51:42  ollama[2964]: time=2026-06-19T10:51:42.591Z level=WARN source=cuda_compat.go:38 msg="NVIDIA driver too old" device="Quadro M4000" compute=5.2 driver=535 required_driver="570 or newer"
    juin 19 10:51:43  ollama[2964]: time=2026-06-19T10:51:43.181Z level=INFO source=types.go:32 msg="inference compute" id=1 filter_id=1 library=Vulkan compute=0.0 name=Vulkan1 description="Quadro M4000" libd>
    juin 19 10:51:43  ollama[2964]: time=2026-06-19T10:51:43.181Z level=INFO source=types.go:32 msg="inference compute" id=0 filter_id=0 library=Vulkan compute=0.0 name=Vulkan0 description="Quadro M5000" libd>
    ...
    
    Linux llama.cpp ubuntu

  • llama.cpp : Installation
    fariasF farias

    Le GPU est trop ancien : https://en.wikipedia.org/wiki/CUDA#Supported_GPUs

    Linux llama.cpp ubuntu

  • llama.cpp : Installation
    fariasF farias

    Je teste la compilation CUDA :

    # cd llama.cpp
    # export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/local/cuda/lib64:/usr/local/cuda/extras/CUPTI/lib64
    # export PATH=$PATH:$CUDA_HOME/bin
    # cmake -B build -DGGML_CUDA=ON  -DCMAKE_CUDA_COMPILER=`which nvcc`
    ...
    # cmake --build build --config Release -j 20
    [  8%] Building CUDA object ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/cross-entropy-loss.cu.o
    [ 10%] Built target ggml-cpu
    nvcc fatal   : Unsupported gpu architecture 'compute_52'
    
    
    Linux llama.cpp ubuntu
  • Se connecter

  • Vous n'avez pas de compte ? S'inscrire

  • Connectez-vous ou inscrivez-vous pour faire une recherche.
Powered by NodeBB Contributors
  • Premier message
    Dernier message
0
  • Catégories
  • Récent
  • Mots-clés
  • Populaire
  • Web
  • Utilisateurs
  • Groupes