Complete Manual for Local Deployment of DeepSeek R1


I. Introduction

DeepSeek R1 is a high-performance general-purpose large language model that supports complex reasoning, multimodal processing, and technical document generation. This manual provides a comprehensive guide to deploying DeepSeek R1 locally, covering hardware configuration, domestic (Chinese) chip adaptation, quantization schemes, cloud-based alternatives, and a complete walkthrough for deploying the 671B MoE model with Ollama.

Key Notes:

  • **Individual Users:** Deploying models of 32B or larger is not recommended due to high hardware costs and complex maintenance.
  • **Enterprise Users:** Professional team support is required, and ROI (return on investment) should be evaluated before deployment.

II. Core Configuration Requirements for Local Deployment

1. Model Parameters and Hardware Requirements

| Model Parameters | Windows Configuration | Mac Configuration | Use Case |
| --- | --- | --- | --- |
| 1.5B | RAM: 4GB; GPU: integrated graphics/modern CPU; Storage: 5GB | Memory: 8GB (M1/M2/M3); Storage: 5GB | Simple text generation, basic code completion |
| 7B | RAM: 8-10GB; GPU: GTX 1660 (4-bit quantization); Storage: 8GB | Memory: 16GB (M2 Pro/M3); Storage: 8GB | Medium-complexity Q&A, code debugging |
| 14B | RAM: 24GB; GPU: RTX 3090 (24GB VRAM); Storage: 20GB | Memory: 32GB (M3 Max); Storage: 20GB | Complex reasoning, technical document generation |
| 32B+ | Enterprise-level deployment (multi-GPU required) | Not supported | Scientific research, large-scale data processing |
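
For the 1.5B-14B tiers above, the quickest way to try a model on either OS is Ollama's distilled builds. A minimal sketch, assuming the `deepseek-r1` tags published in the public Ollama model library match the parameter sizes in the table (verify the exact tags on the library page before relying on them):

```bash
# Pull and chat with the 7B distilled model; swap the tag for
# deepseek-r1:1.5b or deepseek-r1:14b to match your hardware tier.
ollama run deepseek-r1:7b
```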

2. Computing Power Requirements

| Model | Parameter Size | Precision | Minimum VRAM | Minimum Computing Power |
| --- | --- | --- | --- | --- |
| DeepSeek-R1 (671B) | 671B | FP8 | ≥890GB | 2× XE9680 (16× H20 GPUs) |
| DeepSeek-R1-Distill-70B | 70B | BF16 | ≥180GB | 4× L20 or 2× H20 GPUs |
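
As a rough sanity check on these figures, weight memory alone is the parameter count times bytes per parameter; the headroom above that covers the KV cache, activations, and framework overhead:

$$
671 \times 10^9 \times 1\,\mathrm{byte\ (FP8)} \approx 671\,\mathrm{GB} \;\Rightarrow\; \geq 890\,\mathrm{GB\ with\ cache\ and\ overhead}
$$

$$
70 \times 10^9 \times 2\,\mathrm{bytes\ (BF16)} \approx 140\,\mathrm{GB} \;\Rightarrow\; \geq 180\,\mathrm{GB\ with\ cache\ and\ overhead}
$$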

III. Domestic Chip and Hardware Adaptation

1. Domestic Ecosystem Partner Updates

| Company | Adaptation Content | Performance Benchmark (vs NVIDIA) |
| --- | --- | --- |
| Huawei Ascend | Native support for the R1 series, end-to-end inference optimization | Equivalent to A100 (FP16) |
| Moore Threads | MXN series supports 70B model BF16 inference; VRAM utilization increased by 30% | Equivalent to RTX 3090 |
| Hygon DCU | Adapted for V3/R1 models; performance benchmarked against NVIDIA A100 | Equivalent to A100 (BF16) |

2. Recommended Domestic Hardware Configurations

| Model Parameters | Recommended Solution | Use Case |
| --- | --- | --- |
| 1.5B | Taichu T100 accelerator card | Prototype validation for individual developers |
| 14B | Kunlun K200 cluster | Enterprise-level complex task inference |
| 32B | Bichen computing platform + Ascend 910B cluster | Scientific research and multimodal processing |

IV. Cloud Deployment Alternatives

1. Recommended Domestic Cloud Service Providers

| Platform | Core Advantages | Use Case |
| --- | --- | --- |
| Silicon Flow | Official API, low latency, supports multimodal models | Enterprise-level high-concurrency inference |
| Tencent Cloud | One-click deployment + limited-time free trial; supports VPC privatization | Rapid deployment of small to medium-scale models |
| PPIO Cloud | Priced at 1/20 of OpenAI; 50 million free tokens upon registration | Low-cost testing and experimentation |
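
Most of these platforms expose OpenAI-compatible endpoints, so moving between local and cloud inference is typically just a base-URL and API-key change. A minimal sketch, assuming Silicon Flow's documented endpoint and model ID (both are assumptions here; confirm them against the provider's current docs):

```bash
# Call DeepSeek R1 through an OpenAI-compatible chat completions API.
# Endpoint and model ID are assumed from Silicon Flow's docs; substitute
# your provider's values and set SILICONFLOW_API_KEY in the environment.
curl https://api.siliconflow.cn/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $SILICONFLOW_API_KEY" \
  -d '{
        "model": "deepseek-ai/DeepSeek-R1",
        "messages": [{"role": "user", "content": "Summarize MoE routing in two sentences."}]
      }'
```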

2. International Access Channels (requires VPN or enterprise network environment)

  • **NVIDIA NIM:** Enterprise-level GPU cluster deployment (https://build.nvidia.com/deepseek-ai/deepseek-r1)
  • **Groq:** Ultra-low-latency inference (https://groq.com/)

V. Complete 671B MoE Model Deployment (Ollama + Unsloth)

1. Quantization Schemes and Model Selection

| Quantization Version | File Size | Minimum Memory + VRAM | Use Case |
| --- | --- | --- | --- |
| DeepSeek-R1-UD-IQ1_M | 158 GB | ≥200 GB | Consumer-grade hardware (e.g., Mac Studio) |
| DeepSeek-R1-Q4_K_M | 404 GB | ≥500 GB | High-performance servers/cloud GPUs |

**Download Links:**

  • HuggingFace Model Library
  • UnslothAI Official Documentation
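
The dynamic quants ship as multi-part GGUF shards, so the download step is a matter of pattern-matching the right files. A sketch using the HuggingFace CLI, assuming the shards live in Unsloth's `unsloth/DeepSeek-R1-GGUF` repository (the repo name and file pattern are assumptions; confirm them on the model page):

```bash
# Install the HuggingFace CLI, then fetch only the IQ1_M shards.
pip install -U "huggingface_hub[cli]"
huggingface-cli download unsloth/DeepSeek-R1-GGUF \
  --include "*UD-IQ1_M*" \
  --local-dir ./DeepSeek-R1-GGUF
```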

2. Hardware Configuration Recommendations

| Hardware Type | Recommended Configuration | Performance (Short Text Generation) |
| --- | --- | --- |
| Consumer-grade | Mac Studio (192GB unified memory) | 10+ tokens/sec |
| High-performance server | 4× RTX 4090 (96GB VRAM + 384GB RAM) | 7-8 tokens/sec (mixed inference) |

3. Deployment Steps (Linux Example)

  1. **Install Dependencies:**

     ```bash
     # llama.cpp provides the llama-gguf-split tool used in the next step
     brew install llama.cpp
     ```

  2. **Download and Merge Model Shards:**

     ```bash
     # Merge the multi-part GGUF into a single file; the output name must
     # match the FROM path in the Modelfile created below
     llama-gguf-split --merge DeepSeek-R1-UD-IQ1_M-00001-of-00004.gguf DeepSeek-R1-UD-IQ1_M.gguf
     ```

  3. **Install Ollama:**

     ```bash
     curl -fsSL https://ollama.com/install.sh | sh
     ```
  4. **Create a Modelfile** (saved here as `DeepSeek01_Modelfile`):

     ```
     FROM /path/to/DeepSeek-R1-UD-IQ1_M.gguf
     # Number of layers offloaded to the GPU; tune to available VRAM
     PARAMETER num_gpu 28
     # Context window size
     PARAMETER num_ctx 2048
     PARAMETER temperature 0.6
     TEMPLATE "<｜User｜>{{ .Prompt }}<｜Assistant｜>"
     ```

  5. **Create and Run the Model:**

     ```bash
     ollama create DeepSeek-R1-UD-IQ1_M -f DeepSeek01_Modelfile
     ollama run DeepSeek-R1-UD-IQ1_M --verbose
     ```
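
Once `ollama run` responds, it is worth confirming the model also answers over Ollama's local REST API, since most client tooling connects that way. Ollama serves this API on port 11434 by default; the model name must match the one created above:

```bash
# One-off, non-streaming generation request to the local Ollama server.
curl http://localhost:11434/api/generate -d '{
  "model": "DeepSeek-R1-UD-IQ1_M",
  "prompt": "Say hello in one sentence.",
  "stream": false
}'
```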

4. Performance Tuning and Testing

  • **Low GPU utilization:** Upgrade to high-bandwidth memory (e.g., DDR5-5600 or faster); with most layers running on the CPU, system memory bandwidth is the main throughput bottleneck.
  • **Expand swap space** (verify it is active with the check below):

    ```bash
    # Allocate a 100 GB swap file
    sudo fallocate -l 100G /swapfile
    # Restrict permissions to root, as required for swap files
    sudo chmod 600 /swapfile
    # Format the file as swap and enable it
    sudo mkswap /swapfile
    sudo swapon /swapfile
    ```
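
After enabling swap, confirm it is registered and watch how heavily it is used during inference; sustained swap traffic means the memory headroom from the quantization table is not being met and generation speed will suffer:

```bash
# Confirm the swap file is active
swapon --show
# One-time snapshot of RAM and swap usage
free -h
# Refresh memory statistics every second while the model runs
watch -n 1 free -h
```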

VI. Precautions and Risk Warnings

1. Cost Warnings:

  • **70B Model:** Requires three or more 80GB-VRAM GPUs (e.g., A100 80GB); not feasible for single-GPU users.
  • **671B Model:** Requires an 8× H100 cluster; realistically deployable only by supercomputing centers.

2. Alternatives:

  • **Individual Users:** Cloud APIs (e.g., Silicon Flow) are the recommended maintenance-free, compliant alternative.

3. Domestic Hardware Compatibility: Requires customized frameworks (e.g., Ascend CANN, Moore Threads MXMLLM).

VII. Appendix: Technical Support and Resources

  • **Huawei Ascend:** Ascend Cloud Services
  • **Moore Threads:** Free API Trial
  • **Li Xihan's Blog:** Complete Deployment Tutorial

Conclusion

Local deployment of DeepSeek R1 demands significant hardware investment and technical expertise. Individual users should proceed with caution, while enterprise users should thoroughly evaluate their needs and costs. Domestic chip adaptation and cloud services can substantially reduce risk and improve efficiency; plan rationally to balance capability against cost.

**Manual Updates and Feedback:** For additions or corrections, please contact the document author. For detailed access instructions, refer to the Silicon Flow community documentation.

Global Enterprise and Individual Channels

  1. **Meta Search:** https://metaso.cn
  2. **360 Nano AI Search:** https://www.n.cn/
  3. **Silicon Flow:** https://cloud.siliconflow.cn/i/OBklluwO
  4. **ByteDance Volcano Engine:** https://console.volcengine.com/ark/region:ark+cn-beijing/experience
  5. **Baidu Cloud Qianfan:** https://console.bce.baidu.com/qianfan/modelcenter/model/buildln/list
  6. **NVIDIA NIM:** https://build.nvidia.com/deepseek-ai/deepseek-r1
  7. **Groq:** https://groq.com/
  8. **Fireworks:** https://fireworks.ai/models/fireworks/deepseek-r1
  9. **Chutes:** https://chutes.ai/app/chute/
  10. **GitHub:** https://github.com/marketplace/models/azureml-deepseek/DeepSeek-R1/playground
  11. **POE:** https://poe.com/DeepSeek-R1
  12. **Cursor:** https://cursor.sh/
  13. **Monica:** https://monica.im/invitation?c=ACZ7WJJ9
  14. **Lambda:** https://lambdalabs.com/
  15. **Cerebras:** https://cerebras.ai
  16. **Perplexity:** https://www.perplexity.ai
  17. **Together AI:** https://api.together.ai/playground/chat/deepseek-ai/DeepSeek-R1

**Note:** Requires VPN or enterprise network environment.

Domestic AI Chip Company Support

| Date | Company | Announcement Title |
| --- | --- | --- |
| February 1 | Huawei | First Release! Silicon Flow x Huawei Cloud Jointly Launch DeepSeek R1 & V3 Inference Services Based on Ascend Cloud! |
| February 1 | Moore Threads | Gitee AI Jointly Launches Full Suite of DeepSeek R1 Distilled Models with Moore Threads, Free Trial Available! |
| February 4 | Hygon | Hygon DCU Successfully Adapts DeepSeek V3 and R1 Models, Officially Launched! |
| February 4 | Huawei | Ascend Native: Luchen Tech Launches DeepSeek R1 Series Inference API and Cloud Image Services Based on Ascend Computing Power |
| February 5 | Moore Threads | DeepSeek-V3 Full Version Launched on Domestic Moore Threads GPU for First Experience! |
| February 5 | Hygon | Hygon DCU Successfully Adapts DeepSeek-Janus-Pro Multimodal Large Model |
| February 5 | Bichen Tech | DeepSeek R1 Launched on Bichen Domestic AI Computing Platform, Empowering Developer Innovation with Full Series Models |
| February 5 | Taichu Yuanqi | DeepSeek-R1 Series Models Adapted on Taichu T100 Accelerator Card in 2 Hours, Free API Service Available! |
| February 5 | Yuntian Lifey | DeepEdge10 Completes Adaptation of DeepSeek R1 Series Models |
| February 6 | Suiyuan Tech | Suiyuan Tech Achieves Full Deployment of DeepSeek Inference Services Across National AI Computing Centers |
| February 6 | Kunlun Core | Domestic AI Card Fully Adapts DeepSeek Training and Inference Versions, Outstanding Performance, One-Click Deployment Available (Document Download Included) |

Cloud and AI Computing Company Support

| Date | Company | Announcement Title |
| --- | --- | --- |
| January 28 | WuWen XinQiong | WuWen XinQiong Infini-AI Heterogeneous Cloud Now Offers DeepSeek-R1-Distill, Perfect Combination of Domestic Models and Heterogeneous Cloud |
| January 28 | PPIO Cloud | Big News! DeepSeek-R1 Launched on PPIO Computing Cloud |
| January 28 | Silicon Flow | Silicon Cloud Launches DeepSeek Multimodal Model: Janus-Pro-7B is Here! |
| February 1 | Huawei Cloud | First Release! Silicon Flow x Huawei Cloud Jointly Launch DeepSeek R1 & V3 Inference Services Based on Ascend Cloud! |
| February 1 | Silicon Flow | First Release! Silicon Flow x Huawei Cloud Jointly Launch DeepSeek R1 & V3 Inference Services Based on Ascend Cloud! |
| February 1 | China Telecom Cloud | Mysterious "Eastern Power" Gathers! DeepSeek-R1 Model Launched on China Telecom Cloud! |
| February 2 | Tencent Cloud | One-Click Deployment, 3-Minute Call! DeepSeek-R1 Lands on Tencent Cloud |
| February 2 | ZStack | First Release! ZStack Smart Tower Supports DeepSeek V3/R1/Janus Pro, Multiple Domestic CPU/GPU Available for Private Deployment |
| February 2 | PPIO Cloud | PPIO Computing Cloud Integrates Full DeepSeek Models, Price Only 1/20 of OpenAI, 50 Million Tokens Free Upon Registration! |
| February 3 | Alibaba Cloud | 3 Steps, 0 Code! One-Click Deployment of DeepSeek-V3 and DeepSeek-R1 |
| February 3 | Baidu Smart Cloud | Baidu Smart Cloud Qianfan Fully Supports DeepSeek-R1/V3 Calls, Ultra-Low Price |
| February 3 | SCNet | Supercomputing Internet Launches DeepSeek Series Models, Provides Super Intelligent Fusion Computing Power Support |
| February 4 | Tencent Cloud | One-Click Deployment + Limited-Time Free Trial! Tencent Cloud Launches DeepSeek Series Models |
| February 4 | Silicon Flow | Full Package Arrives! Silicon Flow Launches Accelerated Version of DeepSeek-R1 Distilled Model |
| February 4 | Volcano Engine | Full-Size DeepSeek Models Land on Volcano Engine! |
| February 4 | QingCloud | Limited-Time Free, One-Click Deployment! Jishi Computing Officially Launches DeepSeek-R1 Series Models |
| February 4 | Computing Interconnect | Domestic GPU and DeepSeek Accelerated Adaptation, Computing Interconnect Collaborates with Hygon to Launch DeepSeek-R1 Model Services |
| February 4 | JD Cloud | One-Click Deployment! JD Cloud Fully Launches DeepSeek-R1/V3 |
| February 4 | SCNet | New Arrival! Try DeepSeek on Supercomputing Internet! |
| February 5 | China Unicom Cloud | "Nezha Stirs the Sea"! China Unicom Cloud Launches DeepSeek-R1 Series Models! |
| February 5 | PPIO Cloud | PPIO Holiday Report: 99.9% Availability! Overnight Support for Full Version of DeepSeek, Helping Customers Easily Handle Traffic Peaks |
| February 5 | Bingji Tech |  |
| February 5 | UCloud | UCloud Adapts Full DeepSeek Series Models Based on Domestic Chips |
| February 5 | China Mobile Cloud | Full Version, Full Size, Full Function! China Mobile Cloud Fully Launches DeepSeek |
| February 6 | QingCloud | Continuous Launch of DeepSeek! Jishi Computing Janus-Pro-7B Text-to-Image Model Arrives |
| February 6 | Digital China | 3-Minute Deployment of High-Performance AI Model DeepSeek, Digital China Helps Enterprises Transform with Intelligence |
| February 6 | China Telecom Cloud | New Breakthrough in Domestic AI Ecosystem! "Xirang" + DeepSeek Super Combination Arrives! |
| February 6 | Parallel Tech | Server Busy? Parallel Tech Helps You DeepSeek Freely! |
| February 6 | UCloud | UCloud Private Cloud Launches DeepSeek Series Models |
| February 7 | Inspur Cloud | Inspur Cloud First Releases 671B DeepSeek Large Model All-in-One Solution |
| February 7 | Beijing Supercomputing | Beijing Supercomputing x DeepSeek: Dual Engines Ignite, Driving Trillion-Level AI Innovation Storm |
