Infinity
Introduction
Infinity provides a platform that integrates the generation capability of large language models with your real-world business data, covering domains such as finance, taxation, and customer service.
With Infinity, you can train models tailored to your downstream jobs and serve them accordingly.
Available Devices
- Intel CPU
- Nvidia GPU
- Ascend NPU
Supported Models
- aixuexi
- Baichuan2-7B-Chat
- Baichuan2-13B-Chat
- bce-embedding-base-v1
- chinese-ocr-db-crnn-server
- CodeFuse-DeepSeek-33B-4bits
- deepseek-coder-1.3b-instruct
- deepseek-coder-6.7b-instruct
- deepseek-coder-33B-instruct-GPTQ
- gemma-2b-it
- gemma-7b-it
- internlm2-chat-7b
- m3e-base
- m3e-small
- m3e-large
- nomic-embed-text-v1
- OpenCodeInterpreter-DS-1.3B
- OpenCodeInterpreter-DS-6.7B
- pyramidbox-face-detection
- Qwen-7B
- Qwen-14B
- stable-diffusion
- starcoder2-3b
- starcoder2-7b
- SUS-Chat-34B-GPTQ
Work Stations
debug mode: Schedule a daily task to obtain computing-power points.
train mode: Tune and save the large language model on your own business data.
infer mode: Serve online predictions with FastAPI and Gradio.
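The infer mode can be sketched as a small FastAPI app with a Gradio UI mounted on it. The `predict` placeholder, the `/predict` route, and the `/ui` path below are illustrative assumptions, not Infinity's actual API; the web stack is imported lazily so the sketch stays importable on its own:

```python
# Minimal sketch of infer mode: a placeholder predict() stands in for
# the tuned model, served through FastAPI with a Gradio UI on top.

def predict(prompt: str) -> str:
    """Placeholder for the tuned model's generation call."""
    return f"[model output for: {prompt}]"

def create_app():
    # Imported here so the sketch runs even without the web stack installed.
    from fastapi import FastAPI
    import gradio as gr

    app = FastAPI()

    @app.get("/predict")
    def predict_route(prompt: str):
        # Plain JSON endpoint for programmatic clients.
        return {"completion": predict(prompt)}

    # Gradio provides the browser UI at /ui on the same server.
    demo = gr.Interface(fn=predict, inputs="text", outputs="text")
    return gr.mount_gradio_app(app, demo, path="/ui")

if __name__ == "__main__":
    import uvicorn
    uvicorn.run(create_app(), host="0.0.0.0", port=8000)
```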
Experience
- fine-tune: python3 main.py
- inference (OpenAI): python3 api.py
- inference (GUI): python3 app.py
- build dataset: python3 utils/dataset.py
- monolithic: python3 mono/xxxx.py
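Assuming api.py exposes an OpenAI-compatible `/v1/chat/completions` endpoint (an assumption; check api.py for the actual routes), a client call could look like the following stdlib-only sketch. The payload helpers run offline; only `chat` needs a live server:

```python
import json
import urllib.request

def build_chat_request(model: str, prompt: str) -> dict:
    """Build an OpenAI-style chat-completion request body."""
    return {"model": model, "messages": [{"role": "user", "content": prompt}]}

def extract_reply(response: dict) -> str:
    """Pull the assistant message out of an OpenAI-style response."""
    return response["choices"][0]["message"]["content"]

def chat(prompt: str, base_url: str = "http://localhost:8000") -> str:
    """Send one chat turn to a running api.py instance (server required)."""
    body = json.dumps(build_chat_request("Qwen-7B", prompt)).encode()
    req = urllib.request.Request(
        f"{base_url}/v1/chat/completions",
        data=body,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return extract_reply(json.load(resp))
```

The base URL, port, and model name are placeholders to adapt to your deployment.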
We strongly advise you not to knowingly generate or spread harmful content, including rumors, hatred, violence, reactionary material, pornography, deception, etc.