AI Tools Directory

A directory of tools for running LLMs on your own hardware: launchers, inference engines, model formats, and GPU backends.

25 items
| Name | Description | Role | Type | Exec | Languages | Score | Cold Start | Memory |
|------|-------------|------|------|------|-----------|-------|------------|--------|
| GGUF | GPT-Generated Unified Format for efficient LLM storage | Format | format | aot | Any | | | |
| safetensors | Safe and fast tensor serialization format by Hugging Face | Format | format | aot | Any | | | |
| Metal | Apple's GPU framework for Apple Silicon acceleration | Backend | backend | aot | Swift, Objective-C, C++ | | | |
| CUDA Runtime | NVIDIA's parallel computing platform for GPU acceleration | Backend | backend | aot | C, C++, Python | | | |
| Vulkan | Cross-platform GPU API for compute and graphics | Backend | backend | aot | C, C++ | | | |
| llama.cpp | LLM inference in C/C++ with minimal dependencies | Engine | engine | aot | C, C++ | C+ | 100ms | 50MB |
| ROCm | AMD's open-source GPU computing platform | Backend | backend | aot | C, C++, Python | | | |
| llamafile | Distribute and run LLMs with a single file | Engine | engine | aot | C, C++ | C- | 500ms | 100MB |
| ONNX Runtime | Cross-platform, high-performance ML inference and training accelerator | Interop | engine | hybrid | Python, C++, C#, ... | C- | 500ms | 300MB |
| LLM (Python CLI) | Access large language models from the command line | Tool | tool | hybrid | Python | D | 500ms | 100MB |
| Ollama | Get up and running with large language models locally | Launcher | launcher | hybrid | Python, JavaScript, Go | D | 1000ms | 500MB |
| Candle | Minimalist ML framework for Rust with GPU support | Engine | engine | jit | Rust | D | 300ms | 200MB |
| ExLlamaV2 | Fast inference library for running LLMs locally on NVIDIA GPUs | Engine | engine | aot | Python, C++, CUDA | D | 1000ms | 300MB |
| MLX | Apple's array framework for machine learning on Apple Silicon | Engine | engine | jit | Python, C++, Swift | D | 500ms | 200MB |
| CTransformers | Python bindings for GGML models with GPU acceleration | Engine | engine | hybrid | Python, C++ | D | 800ms | 200MB |
| Open WebUI | User-friendly WebUI for LLMs with Ollama/OpenAI support | UI | launcher | hybrid | Python, TypeScript | F | 3000ms | 500MB |
| Text Generation Inference | Hugging Face's production-ready LLM serving solution | Serving | engine | hybrid | Rust, Python | F | 10000ms | 2000MB |
| KoboldCpp | Easy-to-use AI text generation with llama.cpp backend | UI | launcher | hybrid | C++, Python | F | 1500ms | 400MB |
| MLC LLM | Machine Learning Compilation for LLMs | Interop | engine | aot | Python, C++ | F | 2000ms | 500MB |
| LocalAI | Free, open-source OpenAI alternative with local inference | Launcher | launcher | hybrid | Go, Python | F | 3000ms | 800MB |
| vLLM | High-throughput LLM serving with PagedAttention | Serving | engine | jit | Python | F | 5000ms | 2000MB |
| GPT4All | Free-to-use, locally running, privacy-aware chatbot | UI | launcher | hybrid | C++, Python | F | 2000ms | 600MB |
| Jan | Open-source ChatGPT alternative that runs offline | UI | launcher | hybrid | TypeScript, Python | F | 2000ms | 600MB |
| LM Studio | Discover, download, and run local LLMs with a beautiful GUI | UI | launcher | hybrid | Python | F | 2000ms | 800MB |
| Text Generation WebUI | Gradio web UI for running Large Language Models | UI | launcher | hybrid | Python | F | 5000ms | 1000MB |
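The two model formats in the directory have simple binary layouts. As one concrete sketch: a GGUF file begins with a fixed little-endian header of a 4-byte magic (`GGUF`), a `uint32` format version, a `uint64` tensor count, and a `uint64` metadata key-value count. A minimal header check (the sample bytes below are synthetic, not taken from a real model file):

```python
import struct

# GGUF fixed header, little-endian: 4-byte magic, uint32 version,
# uint64 tensor count, uint64 metadata key-value count.
GGUF_HEADER = struct.Struct("<4sIQQ")

def read_gguf_header(data: bytes) -> dict:
    """Parse the fixed-size header at the start of a GGUF file."""
    magic, version, n_tensors, n_kv = GGUF_HEADER.unpack_from(data, 0)
    if magic != b"GGUF":
        raise ValueError("not a GGUF file")
    return {"version": version, "tensors": n_tensors, "metadata_kv": n_kv}

# Synthetic header standing in for the first 24 bytes of a model file.
sample = GGUF_HEADER.pack(b"GGUF", 3, 291, 24)
print(read_gguf_header(sample))  # {'version': 3, 'tensors': 291, 'metadata_kv': 24}
```

This kind of sniffing is useful before handing a file to an engine, since every llama.cpp-based tool above (llamafile, KoboldCpp, Ollama) expects GGUF input.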