Run any open-source LLM, such as Llama 2 or Mistral, as an OpenAI-compatible API endpoint in the cloud.
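A minimal sketch of what "OpenAI-compatible" buys you: the stock `openai` Python client pointed at a self-hosted server. The base URL, API key, and model name below are assumptions; substitute whatever your deployment exposes.

```python
# Sketch: call a self-hosted, OpenAI-compatible endpoint with the openai client.
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:3000/v1",  # hypothetical self-hosted endpoint
    api_key="not-needed-for-local",       # many local servers ignore the key
)

response = client.chat.completions.create(
    model="llama-2-7b-chat",  # hypothetical model identifier
    messages=[{"role": "user", "content": "Explain tensor parallelism in one sentence."}],
)
print(response.choices[0].message.content)
```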
A FastAPI service for semantic text search using precomputed embeddings and advanced similarity measures, with built-in support for various file types through textract.
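A minimal sketch of the pattern such a service implements: cosine similarity between a query embedding and precomputed, normalized document embeddings. The `embed` function and corpus here are stand-ins for illustration, not the repo's actual API.

```python
# Sketch: semantic search over precomputed embeddings in FastAPI.
import numpy as np
from fastapi import FastAPI

app = FastAPI()

# Precomputed (here: random stand-in) document embeddings, L2-normalized.
docs = ["first document", "second document"]
doc_embs = np.random.default_rng(0).normal(size=(len(docs), 384))
doc_embs /= np.linalg.norm(doc_embs, axis=1, keepdims=True)

def embed(text: str) -> np.ndarray:
    """Stand-in embedder; a real service would call a sentence-encoder model."""
    vec = np.random.default_rng(abs(hash(text)) % 2**32).normal(size=384)
    return vec / np.linalg.norm(vec)

@app.get("/search")
def search(q: str, k: int = 2):
    scores = doc_embs @ embed(q)        # cosine similarity via dot product
    top = np.argsort(scores)[::-1][:k]  # indices of the k best matches
    return [{"text": docs[i], "score": float(scores[i])} for i in top]
```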
🤖 A collection of practical AI repos, tools, websites, papers, and tutorials. A practical AI treasure chest 💎
Private chat with a local GPT over documents, images, video, and more. 100% private, Apache 2.0. Supports oLLaMa, Mixtral, llama.cpp, and more. Demo: https://gpt.h2o.ai/ https://codellama.h2o.ai/
Exploring Diverse Domains Of Wisdom Through Conversational Journeys. By Anshuman Gupta, Prince Choudhary, and Tisha Patel.
An Easy-to-use Knowledge Editing Framework for LLMs.
Dify is an open-source LLM app development platform. Dify's intuitive interface combines AI workflow, RAG pipeline, agent capabilities, model management, observability features and more, letting you quickly go from prototype to production.
🤯 Lobe Chat - an open-source, modern-design LLM/AI chat framework. Supports multiple AI providers (OpenAI / Claude 3 / Gemini / Ollama / Bedrock / Azure / Mistral / Perplexity), multi-modal capabilities (vision/TTS), and a plugin system. One-click FREE deployment of your private ChatGPT chat application.
Llama.cpp is a C++ library for the efficient implementation of large language models, such as Meta's LLaMA. Optimized to run on a wide range of platforms, including resource-constrained devices, it offers the performance, inference speed, and memory efficiency essential for running large models.
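A minimal sketch of local inference via the llama-cpp-python bindings to llama.cpp. The model path is an assumption; a real run needs a GGUF model file downloaded separately.

```python
# Sketch: run a quantized GGUF model locally through llama-cpp-python.
from llama_cpp import Llama

llm = Llama(model_path="./llama-2-7b-chat.Q4_K_M.gguf")  # hypothetical local file
out = llm("Q: What is llama.cpp? A:", max_tokens=64, stop=["Q:"])
print(out["choices"][0]["text"])
```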
Drop-in, local AI alternative to the OpenAI stack. Multi-engine (llama.cpp, TensorRT-LLM). Powers 👋 Jan
A lightweight, fast, parallel inference server for Llama
Build LLM-enabled FastAPI applications without build configuration.
Self-Augmented In-Context Learning for Unsupervised Word Translation (ACL 2024). Keywords: Bilingual Lexicon Induction, Word Translation, Large Language Models, LLMs.
Tensor parallelism is all you need. Run LLMs on weak devices or make powerful devices even more powerful by distributing the workload and dividing the RAM usage.
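A toy illustration of the idea behind tensor parallelism: a weight matrix is split column-wise across workers, each worker computes a slice of the output, and the slices are concatenated. Real systems shard across devices over a network; here the "workers" are just array slices on one machine.

```python
# Sketch: column-wise tensor-parallel matrix multiply, single machine.
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=(1, 512))     # activation
W = rng.normal(size=(512, 1024))  # full weight matrix

n_workers = 4
shards = np.split(W, n_workers, axis=1)     # each worker holds 1/4 of W (and its RAM)
partials = [x @ shard for shard in shards]  # each worker computes its own slice
y = np.concatenate(partials, axis=1)        # gather the full output

assert np.allclose(y, x @ W)  # identical to the unsplit computation
```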