【STM32F769】Building an ollama forwarding server - EEPW Forum

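【Introduction】

After setting up a local deepseek server, you can send an HTTP POST to it and read back the reply. The server returns the reply piece by piece, as a stream of chunks.

When the STM32F769 processes this data, the reply is sometimes large enough to overflow the receive buffer. So a forwarding service is needed on the local machine as well: it extracts only the useful information and forwards that to the client.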
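To make the idea concrete, here is a minimal sketch of what the forwarding service has to do with the streamed reply: read it line by line, parse each line as JSON, and keep only the `response` field. The sample lines are made up for illustration (they are not real model output, and the `done` field is an assumption about ollama's streaming format), and `extract_text` is a hypothetical helper name.

```python
import json

# Illustrative sample only: each streamed line is assumed to be one JSON object.
SAMPLE_STREAM = [
    b'{"model": "deepseek-r1:1.5b", "response": "Hel", "done": false}\n',
    b'{"model": "deepseek-r1:1.5b", "response": "lo", "done": false}\n',
    b'{"model": "deepseek-r1:1.5b", "response": "", "done": true}\n',
]


def extract_text(lines):
    """Concatenate the 'response' field of every JSON line, skipping bad lines."""
    parts = []
    for raw in lines:
        try:
            obj = json.loads(raw)
        except json.JSONDecodeError:
            continue  # not a JSON line, ignore it
        parts.append(obj.get("response", ""))
    return "".join(parts)


print(extract_text(SAMPLE_STREAM))
```

Everything except the short text fragments is dropped, which is what keeps the reply small enough for the microcontroller.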

【Implementation steps】

1. First, import the required modules:

import http.server
import socketserver
import http.client
import json
import re

2. Define the host, the port (11434), and the model name:

OLLAMA_HOST = 'localhost'
OLLAMA_PORT = 11434
MODEL_NAME = "deepseek-r1:1.5b"

3. Write the forwarding service

In the POST handler, first read the post_data sent by the client, add the model to it, then re-serialize it and forward it to the ollama server:

# Handler class referenced by the server start-up code in step 8
class OllamaProxyHandler(http.server.BaseHTTPRequestHandler):
    def do_POST(self):
        # Treat a missing Content-Length as an empty body
        content_length = int(self.headers.get('Content-Length', 0))
        post_data = self.rfile.read(content_length)
        try:
            request_json = json.loads(post_data)
            request_json["model"] = MODEL_NAME
            modified_post_data = json.dumps(request_json).encode('utf-8')

            # Keep only the necessary request headers
            necessary_headers = {
                'Content-Type': 'application/json',
                'Content-Length': str(len(modified_post_data))
            }

            conn = http.client.HTTPConnection(OLLAMA_HOST, OLLAMA_PORT)
            print(f"Forwarding request: {self.path}, {necessary_headers}, {modified_post_data}")
            conn.request('POST', self.path, modified_post_data, necessary_headers)
            response = conn.getresponse()
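The body-rewriting part of this step can be tried on its own, without a running server. The sketch below follows the same logic as the handler; `build_upstream_request` is a hypothetical helper name used only for illustration.

```python
import json

MODEL_NAME = "deepseek-r1:1.5b"


def build_upstream_request(body: bytes, model: str = MODEL_NAME):
    """Return (payload, headers) for the upstream POST with the model forced."""
    request_json = json.loads(body)
    request_json["model"] = model  # overrides any model the client sent
    payload = json.dumps(request_json).encode("utf-8")
    headers = {
        "Content-Type": "application/json",
        # Length of the rewritten body, not of the client's original body
        "Content-Length": str(len(payload)),
    }
    return payload, headers


payload, headers = build_upstream_request('{"prompt": "你好"}'.encode("utf-8"))
print(payload)
print(headers)
```

Because the model name is injected here, the client only has to send the prompt, which keeps the request built on the microcontroller short.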

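4. Once ollama's status code and headers are available, pass the status code back to the requesting client:

            # Print the response status code and headers
            print(f"Response status: {response.status}")
            print(f"Response headers: {response.getheaders()}")

            self.send_response(response.status)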

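5. The server replies with chunked transfer encoding, so I drop that header, since the reply is re-sent as a single body:

            # Drop the chunked transfer-encoding header; also drop any upstream
            # Content-Length, because step 7 sends its own for the rebuilt body
            headers_to_send = [
                (header, value) for header, value in response.getheaders()
                if header.lower() not in ('transfer-encoding', 'content-length')
            ]
            for header, value in headers_to_send:
                self.send_header(header, value)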
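The header filtering can be pulled into a small function and checked by itself. This is a sketch rather than the code from the post: `filter_headers` is a hypothetical helper, the sample header values are invented, and dropping `Content-Length` in addition to `Transfer-Encoding` is an added assumption, made because the proxy later sends its own Content-Length for the rebuilt body.

```python
def filter_headers(headers, dropped=("transfer-encoding", "content-length")):
    """Keep every (name, value) pair whose name is not in `dropped` (case-insensitive)."""
    return [(name, value) for name, value in headers if name.lower() not in dropped]


# Invented example of what an upstream reply's headers might look like
upstream_headers = [
    ("Content-Type", "application/x-ndjson"),
    ("Transfer-Encoding", "chunked"),
    ("Date", "Thu, 01 Jan 2025 00:00:00 GMT"),
]

print(filter_headers(upstream_headers))
```

Comparing lowercased names matters because HTTP header names are case-insensitive, so `transfer-encoding` and `Transfer-Encoding` must be treated alike.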

6. Next, strip what is not needed, such as the think tags and the other fields, keeping only the content of response and joining it back together:

            # Collect the content of every response field
            response_content = ""
            while True:
                line = response.readline()
                if not line:
                    break
                try:
                    json_obj = json.loads(line)
                    print(f"Parsed JSON: {json_obj}")
                    # Remove the <think> and </think> tags
                    response_text = json_obj.get('response', "")
                    response_text = re.sub(r'<think>|</think>', '', response_text)
                    # Replace \n or \n\n with \r\n
                    response_text = re.sub(r'\n+', '\r\n', response_text)
                    response_content += response_text
                except json.JSONDecodeError:
                    print(f"Invalid JSON line: {line.decode('utf-8').strip()}")
                    continue  # Ignore non-JSON lines and keep processing the rest
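The two substitutions can be exercised outside the handler. In the sketch below, `clean_chunk` applies the same two regular expressions as the proxy. `strip_think_block` is a variant that is not in the original post: it removes the tags together with the reasoning text between them, and it would have to run on the fully assembled text, since the opening and closing tags normally arrive in different chunks. Both function names and the sample string are illustrative.

```python
import re


def clean_chunk(text):
    """Same substitutions as the proxy: drop think tags, turn newline runs into CRLF."""
    text = re.sub(r'<think>|</think>', '', text)
    return re.sub(r'\n+', '\r\n', text)


def strip_think_block(full_text):
    """Variant (assumption): remove the tags AND everything between them."""
    return re.sub(r'<think>.*?</think>', '', full_text, flags=re.DOTALL)


raw = "<think>\nreasoning\n</think>\n\nanswer"
print(repr(clean_chunk(raw)))                     # tags removed, reasoning kept
print(repr(clean_chunk(strip_think_block(raw))))  # reasoning removed as well
```

Note that removing only the tags leaves the model's reasoning text in the reply. If the goal is the smallest possible payload for the microcontroller, the variant may be worth considering.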

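7. Recompute the length of the returned content and send it back to the client:

            # Compute the content length
            response_content_bytes = response_content.encode('utf-8')
            content_length = len(response_content_bytes)
            self.send_header('Content-Length', str(content_length))
            self.end_headers()

            # Send the response content to the client in one go
            self.wfile.write(response_content_bytes)

            conn.close()
        except Exception as e:
            # Closes the try block opened in step 3: report failures
            # (bad request JSON, ollama unreachable) to the client
            self.send_error(500, f"Proxy error: {e}")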

8. Finally, set the port to 8000 and start the forwarding server:

PORT = 8000
with socketserver.TCPServer(("", PORT), OllamaProxyHandler) as httpd:
    print(f"Serving at port {PORT}")
    httpd.serve_forever()
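The start-up and shutdown of such a server can be checked without ollama. The sketch below is a self-contained demo under stated assumptions: `FixedReplyHandler`, `ReusableTCPServer`, and `run_once` are hypothetical names, the handler returns a fixed body instead of proxying, and port 0 (let the OS choose a free port) replaces the post's port 8000 so the demo cannot collide with a running proxy.

```python
import http.client
import http.server
import socketserver
import threading


class FixedReplyHandler(http.server.BaseHTTPRequestHandler):
    """Stand-in handler: replies with a fixed body and an explicit Content-Length."""

    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        self.rfile.read(length)  # consume the request body
        body = b"ok\r\n"
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the demo quiet


class ReusableTCPServer(socketserver.TCPServer):
    # Allows quick restarts without an "Address already in use" error
    allow_reuse_address = True


def run_once():
    """Start the server, send one POST, shut the server down, return (status, body)."""
    with ReusableTCPServer(("127.0.0.1", 0), FixedReplyHandler) as httpd:
        port = httpd.server_address[1]
        thread = threading.Thread(target=httpd.serve_forever, daemon=True)
        thread.start()
        conn = http.client.HTTPConnection("127.0.0.1", port, timeout=5)
        conn.request("POST", "/api/generate", b"{}",
                     {"Content-Type": "application/json"})
        resp = conn.getresponse()
        result = (resp.status, resp.read())
        conn.close()
        httpd.shutdown()  # stops serve_forever so the with-block can exit
        thread.join()
    return result


print(run_once())
```

Keeping `serve_forever()` inside the `with` block matters: once the block exits, the server socket is closed and nothing can be served.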

【Writing the test code】

import requests
import json
LOCAL_SERVER_HOST = 'localhost'
LOCAL_SERVER_PORT = 8000
url = f'http://{LOCAL_SERVER_HOST}:{LOCAL_SERVER_PORT}/api/generate'
data = {
    "prompt": "你好"
}
json_data = json.dumps(data).encode('utf-8')
headers = {
    'Content-Type': 'application/json'
}
try:
    response = requests.post(url, data=json_data, headers=headers)
    if response.status_code == 200:
        print("Response from server:")
        print(response.text)
    else:
        print(f"Request failed with status code {response.status_code}: {response.text}")
except requests.RequestException as e:
    print(f"An error occurred while making the request: {e}")

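After running it, the content we get back is much simpler.

【Summary】

Talking to the ollama server directly returns a lot of information, but I only care about the fields I need. After this processing, handling the reply on the microcontroller is much simpler.

Attached code: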

post_test.zip
