Introduction to the Spring AI Project

月伴飞鱼 2025-03-01 2025-08-16

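Spring AI is a Java application framework designed for AI engineering.

Spring AI extends the Spring ecosystem and makes it easier for developers to integrate AI capabilities and call AI model APIs.

Official site: https://spring.io/projects/spring-ai

Spring AI API reference: https://docs.spring.io/spring-ai/reference/api/chatclient.html

Spring AI has the following features:

API-level support for chat, text-to-image, and embedding models.

Both synchronous and streaming interaction with models.

Support for many different models.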

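Basic Usage

GPT-API-Free: https://gitcode.com/chatanywhere/GPT_API_free/overview

Use https://start.spring.io/ to generate a `Spring Boot` project.

Click ADD DEPENDENCIES and search for Ollama to add the dependency.

Open the generated project and look at pom.xml, where you will find the core dependency: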

<dependency>
	<groupId>org.springframework.ai</groupId>
	<artifactId>spring-ai-ollama-spring-boot-starter</artifactId>
</dependency>

Install Ollama:

curl -fsSL https://ollama.com/install.sh | sh

Official site: https://ollama.com/download/

Run deepseek-r1 (the tag must match the model name configured below):

ollama run deepseek-r1:1.5b

Configure the Ollama connection:

spring.ai.ollama.base-url=http://localhost:11434
spring.ai.ollama.chat.model=deepseek-r1:1.5b

spring.ai.ollama.base-url: the address of the Ollama API service.

spring.ai.ollama.chat.model: the name of the model to call.

Call the deepseek-r1 model served by Ollama:

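import org.junit.jupiter.api.Test;
import org.springframework.ai.ollama.OllamaChatModel;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.boot.test.context.SpringBootTest;

// @SpringBootTest starts the application context so the OllamaChatModel bean can be injected
@SpringBootTest
public class TestOllama {

    @Autowired
    private OllamaChatModel ollamaChatModel;

    @Test
    public void testChatModel() {
        // Instruction for the model, followed by the text to translate
        String prompt = "Translate the following English text into Chinese";
        String message = "test";
        String result = ollamaChatModel.call(prompt + ":" + message);
        System.out.println(result);
    }
}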

Integrating the DeepSeek Model

Example code for integrating DeepSeek with Spring AI: https://github.com/Fj-ivy/spring-ai-examples/tree/main/spring-ai-deepseek-examples

DeepSeek official documentation: https://api-docs.deepseek.com/zh-cn/

Add the dependency:

<dependency>
    <groupId>org.springframework.ai</groupId>
    <artifactId>spring-ai-openai-spring-boot-starter</artifactId>
</dependency>

Configuration:

spring:
  ai:
    openai:
      api-key: sk-xxx   # replace with your own API key
      base-url: https://api.deepseek.com
      chat:
        options:
          model: deepseek-chat

A simple example:

@RestController
public class ChatController {

    private final OpenAiChatModel chatModel;

    @Autowired
    public ChatController(OpenAiChatModel chatModel) {
        this.chatModel = chatModel;
    }

    /**
     * Accepts a prompt from the user and returns the model's answer.
     */
    @GetMapping("/ai/generate")
    public Map<String,String> generate(@RequestParam(value = "message") String message) {
        return Map.of("generation", this.chatModel.call(message));
    }
}

Role Presets

@Configuration
public class AIConfig {
    @Bean
    public ChatClient chatClient(ChatClient.Builder builder) {
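        // The default system prompt is applied to every request made through this ChatClient
        return builder.defaultSystem("You are a teacher who is proficient in Java development. Your name is Koala AI.").build();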
    }
}

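Streaming Responses

Differences between Call and Stream:

Non-streaming output (Call): waits until the model has generated the entire answer, then returns it to the user.

Streaming output (Stream): returns the answer incrementally, a few characters (tokens) at a time, as the model generates it.

Streaming matches how large models actually generate text, and when inference is slow it gives a much better user experience than waiting for the complete answer.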

@GetMapping(value = "/chat/stream",produces="text/html;charset=UTF-8")
public Flux<String> chatStream(@RequestParam(value = "msg") String message) {
    return chatClient.prompt().user(message).stream().content();
}
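
To see the difference without a running model, here is a minimal plain-Java sketch. It is not Spring AI code: `CallVsStreamSketch` and its fixed list of chunks are hypothetical stand-ins for a model's output, used only to contrast the two delivery styles.

```java
import java.util.List;
import java.util.function.Consumer;

// Hypothetical stand-in for a model: it "generates" a fixed answer chunk by chunk.
class CallVsStreamSketch {

    // Pretend model output, already split into chunks (tokens).
    static final List<String> CHUNKS = List.of("Spring ", "AI ", "supports ", "streaming.");

    // Call style: the caller receives nothing until every chunk has been produced.
    static String call() {
        StringBuilder answer = new StringBuilder();
        for (String chunk : CHUNKS) {
            answer.append(chunk);
        }
        return answer.toString();
    }

    // Stream style: each chunk is handed to the caller as soon as it exists.
    static void stream(Consumer<String> onChunk) {
        for (String chunk : CHUNKS) {
            onChunk.accept(chunk);
        }
    }

    public static void main(String[] args) {
        System.out.println(call());   // one complete answer
        stream(System.out::print);    // same text, delivered piece by piece
        System.out.println();
    }
}
```

Roughly speaking, the `Flux<String>` returned by the controller plays the role of the `Consumer` callback here: each element is pushed to the client as it becomes available.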

Image Model (Text-to-Image)

Property configuration reference: https://docs.spring.io/spring-ai/reference/api/image/openai-image.html

@RequestMapping("/image")
@RestController
@RequiredArgsConstructor
public class ImageModelController {

    private final OpenAiImageModel openaiImageModel;

    @GetMapping
    public String getImage(@RequestParam(value = "msg",defaultValue = "Generate a picture of a kitten")String msg) {
        ImageResponse response = openaiImageModel.call(
                new ImagePrompt(
                        msg,
                        OpenAiImageOptions.builder()
                                .withQuality("hd") // Image quality. "hd" produces finer detail and greater consistency across the image. Only dall-e-3 supports this parameter.
                                .withModel(OpenAiImageApi.DEFAULT_IMAGE_MODEL)
                                .withN(1) // Number of images to generate, between 1 and 10. dall-e-3 only supports n=1.
                                .withHeight(1024) // Height and width of the generated image. For dall-e-2 each must be one of 256, 512, or 1024.
                                .withWidth(1024).build())
        );
        return response.getResult().getOutput().getUrl();
    }

}

Text-to-Speech

@RequestMapping("/audio")
@RequiredArgsConstructor
@RestController
public class AudioModelController {

    private final OpenAiAudioSpeechModel openAiAudioSpeechModel;

    @GetMapping
    public void text2audio() throws IOException {
        OpenAiAudioSpeechOptions speechOptions = OpenAiAudioSpeechOptions.builder()
                .withModel("tts-1") // ID of the model to use, for example tts-1.
                .withVoice(OpenAiAudioApi.SpeechRequest.Voice.ALLOY) // Voice for the TTS output. Options include alloy, echo, fable, onyx, nova, and shimmer.
                .withResponseFormat(OpenAiAudioApi.SpeechRequest.AudioResponseFormat.MP3) // Audio output format. Supported formats are mp3, opus, aac, flac, wav, and pcm.
                .withSpeed(1.0f) // Speech speed. The OpenAI API accepts 0.25 (slowest) to 4.0 (fastest); 1.0 is the default.
                .build();

        // Text to convert to speech
        SpeechPrompt speechPrompt = new SpeechPrompt("Hello, this is a text-to-speech example.", speechOptions);
        SpeechResponse response = openAiAudioSpeechModel.call(speechPrompt);
        byte[] output = response.getResult().getOutput();
        // Write the audio file to the target directory
        writeByteArrayToMp3(output,"/Users/mac/Desktop/project/java");
    }

    public static void writeByteArrayToMp3(byte[] audioBytes, String outputFilePath) throws IOException {
        // try-with-resources closes the stream even if the write fails
        try (FileOutputStream fos = new FileOutputStream(outputFilePath + "/audio_demo.mp3")) {
            // Write the byte array to the file
            fos.write(audioBytes);
        }
    }
}

Speech-to-Text

@RequestMapping("/audio")
@RequiredArgsConstructor
@RestController
public class AudioModelController {

    private final OpenAiAudioTranscriptionModel openAiTranscriptionModel;

    @GetMapping("/audio2text")
    public String audio2text(){
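        // Format of the transcript output, one of: json, text, srt, verbose_json, or vtt.
        OpenAiAudioApi.TranscriptResponseFormat responseFormat = OpenAiAudioApi.TranscriptResponseFormat.TEXT;

        OpenAiAudioTranscriptionOptions transcriptionOptions = OpenAiAudioTranscriptionOptions.builder()
                .withLanguage("en") // Language of the input audio. Supplying it in ISO-639-1 format improves accuracy and latency.
                .withPrompt("Ask not this, but ask that") // Optional text to guide the model's style or continue a previous audio segment. The prompt should match the audio language.
                .withTemperature(0f) // Sampling temperature between 0 and 1. Higher values (e.g. 0.8) make the output more random; lower values (e.g. 0.2) make it more focused and deterministic. At 0 the model uses log probability to raise the temperature automatically until certain thresholds are hit.
                .withResponseFormat(responseFormat) // Output format
                .build();
        // Load the audio file from the classpath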
        ClassPathResource audioFile = new ClassPathResource("audio_demo.mp3");
        AudioTranscriptionPrompt transcriptionRequest = new AudioTranscriptionPrompt(audioFile, transcriptionOptions);
        AudioTranscriptionResponse response = openAiTranscriptionModel.call(transcriptionRequest);
        return response.getResult().getOutput();
    }
}

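Multimodality

Multimodality is a model's ability to understand and process information from several kinds of sources at once, including text, images, audio, and other data formats.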

/**
    * Multimodal chat: the user message carries an image alongside the text.
    * Only supported by GPT-4 class models (this example uses GPT-4o).
    * @param msg the text part of the user message
    * @return the model's reply
    */
   @GetMapping(value = "/openai/multimodal",produces="text/html;charset=UTF-8")
   public String multimodal(@RequestParam("msg")String msg) throws IOException {
       var userMessage = new UserMessage(msg,
               List.of(new Media(
                       MimeTypeUtils.IMAGE_PNG,
                       new URL("https://docs.spring.io/spring-ai/reference/1.0-SNAPSHOT/_images/multimodal.test.png")
               )));

       ChatResponse response = chatModel.call(new Prompt(List.of(userMessage),
               OpenAiChatOptions.builder().withModel(OpenAiApi.ChatModel.GPT_4_O.getValue()).build()));

       return response.getResult().getOutput().getContent();
   }

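Prompt Templates

When Spring AI interacts with a large model, prompts are handled in a way that resembles how Spring MVC manages the View.

You first create a template containing placeholders for dynamic content; those placeholders are then replaced based on the user request or other application code.

Another analogy is a SQL statement in JdbcTemplate, which contains placeholders that are replaced dynamically.

In a prompt template, each {placeholder} is replaced dynamically with a variable from a Map.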

@GetMapping("/prompt")
public String prompt(@RequestParam("name")String name,@RequestParam("voice")String voice){
    String userText= """
            Recommend at least three tourist attractions in Shanghai.
            """;
    UserMessage userMessage = new UserMessage(userText);
    String systemText= """
            You are a travel assistant who helps people look up travel information.
            Your name is {name}.
            Reply to the user's request using your name and in the style of {voice}.
            """;
    SystemPromptTemplate systemPromptTemplate = new SystemPromptTemplate(systemText);
    // Replace the placeholders
    Message systemMessage = systemPromptTemplate.createMessage(Map.of("name", name, "voice", voice));
    // Put the system message first so that it frames the user message
    Prompt prompt = new Prompt(List.of(systemMessage, userMessage));
    List<Generation> results = chatModel.call(prompt).getResults();
    return results.stream().map(x->x.getOutput().getContent()).collect(Collectors.joining(""));
}
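
The substitution step itself can be illustrated in plain Java. The sketch below is not Spring AI's implementation; `PromptTemplateSketch` and `render` are hypothetical names, and the only assumption is that placeholders look like {name} and are filled from a Map.

```java
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper illustrating {placeholder} substitution from a Map.
class PromptTemplateSketch {

    // Matches {word} style placeholders and captures the word.
    static final Pattern PLACEHOLDER = Pattern.compile("\\{(\\w+)\\}");

    // Replaces each {key} with vars.get(key) in a single pass.
    // Placeholders without a matching key are left untouched.
    static String render(String template, Map<String, ?> vars) {
        return PLACEHOLDER.matcher(template).replaceAll(match -> {
            Object value = vars.get(match.group(1));
            String text = (value == null) ? match.group() : String.valueOf(value);
            // quoteReplacement keeps '$' and '\' in values from being treated specially
            return Matcher.quoteReplacement(text);
        });
    }

    public static void main(String[] args) {
        String systemText = "Your name is {name}. Reply in the style of {voice}.";
        System.out.println(render(systemText, Map.of("name", "Koala", "voice", "a pirate")));
    }
}
```

Doing the replacement in one pass means a value that happens to contain braces is not substituted a second time.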

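Function Calling

We will build a chatbot that answers questions by calling our own function.

To support the chatbot's responses, we register our own function, which takes a location and returns the current weather there.

Create a Functions package and a LocationWeatherFunction class that implements the Function interface: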

import java.util.function.Function;

public class LocationWeatherFunction implements Function <LocationWeatherFunction.Request, LocationWeatherFunction.Response>{

    // Implement the apply method
    @Override
    public Response apply(Request request) {
        System.out.println(request);
        if(request==null){
            return new Response("request is null");
        }
        if(request.location()==null){
            return new Response("The location is empty");
        }
        return new Response("The weather alternates between rain and sunshine" );
    }
    }

    public record Request(String location){

    }
    public record Response(String msg) {
    }
}

import org.springframework.ai.chat.client.ChatClient;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.context.annotation.Description;

import java.util.function.Function;

@Configuration
public class AIConfig {
    @Bean
    @Description("Get the current weather for a given location")
    public Function<LocationWeatherFunction.Request, LocationWeatherFunction.Response> locationWeatherFunction(){
        return new LocationWeatherFunction();
    }
}
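
Conceptually, when the model decides that a function is needed, it replies with the function's name and arguments, and the application looks the function up and runs it. The plain-Java sketch below illustrates only that lookup step. It makes several assumptions: the registry, the `dispatch` method, and the class name are hypothetical, and the arguments arrive as an already-parsed `Request` instead of the JSON a real model would send. In an actual application Spring AI performs this routing itself.

```java
import java.util.Map;
import java.util.Optional;
import java.util.function.Function;

// Hypothetical illustration of routing a model's function request by name.
class FunctionDispatchSketch {

    record Request(String location) {}
    record Response(String msg) {}

    // Hypothetical registry: function name -> implementation.
    static final Map<String, Function<Request, Response>> REGISTRY = Map.of(
            "locationWeatherFunction",
            request -> new Response("Weather in " + request.location() + ": rain, then sunshine"));

    // Looks up the requested function and applies it; empty if the name is unknown.
    static Optional<Response> dispatch(String functionName, Request arguments) {
        return Optional.ofNullable(REGISTRY.get(functionName))
                .map(fn -> fn.apply(arguments));
    }

    public static void main(String[] args) {
        System.out.println(dispatch("locationWeatherFunction", new Request("Jinan")));
        System.out.println(dispatch("unknownFunction", new Request("Jinan")));
    }
}
```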

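Vector Databases

Queries in a vector database differ from those in a traditional relational database: instead of exact matching, they perform similarity search.

Given a vector as the query, the vector database returns the vectors most similar to it.

Vector databases are used to integrate private data with large models.

The first step is to load your data into the vector database.

Then, when a user query is about to be sent to the AI model, a set of similar documents is retrieved first.

These documents then serve as context for the user's question and are sent to the model together with the query.

This technique is called Retrieval-Augmented Generation (RAG).

Vectorization

Computers cannot read natural language; they only process numbers, so natural language must be converted to numeric form.

Vectorization is one way of mapping the words of natural language to numbers.

For something as rich as natural language, however, mapping words to numeric vectors, which can carry richer semantic information and abstract features, is clearly a better choice than mapping them to single numbers.

An embedding is a vector (list) of floating-point numbers. The distance between two vectors measures how related they are: a small distance indicates high relatedness, a large distance low relatedness.

Vectorization represents words or phrases as low-dimensional vectors that carry rich semantic information and capture the meaning and context of those words or phrases.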
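
One common way to turn "distance between two vectors" into a number is cosine distance. The sketch below assumes that metric (the text does not say which one is used) and uses made-up three-dimensional vectors; real embeddings have hundreds or thousands of dimensions.

```java
// Hypothetical helper showing how vector distance can express relatedness.
class EmbeddingDistanceSketch {

    // Cosine similarity: 1.0 for the same direction, 0.0 for orthogonal vectors.
    // Assumes both vectors are non-zero and have the same length.
    static double cosineSimilarity(double[] a, double[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("dimension mismatch");
        }
        double dot = 0, normA = 0, normB = 0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    // Cosine distance: small when vectors are related, large when they are not.
    static double cosineDistance(double[] a, double[] b) {
        return 1.0 - cosineSimilarity(a, b);
    }

    public static void main(String[] args) {
        double[] cat = {0.9, 0.1, 0.0};      // made-up "embeddings"
        double[] kitten = {0.8, 0.2, 0.0};
        double[] car = {0.0, 0.1, 0.9};
        System.out.println("cat vs kitten: " + cosineDistance(cat, kitten));
        System.out.println("cat vs car:    " + cosineDistance(cat, car));
    }
}
```

With these values the cat/kitten distance comes out far smaller than the cat/car distance, which is the "small distance means high relatedness" rule in action.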

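The Embedding Client is designed to expose a large model's vectorization capability directly to the application.

Its main job is to convert text into numeric vectors, commonly called embeddings.

Embeddings are essential for tasks such as semantic analysis and text classification.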

@RequestMapping("/embedding")
@RestController
@RequiredArgsConstructor
public class EmbeddingModelController {

    private final EmbeddingModel embeddingModel;

    @GetMapping()
    public Map<String, EmbeddingResponse> embed(@RequestParam(value = "message", defaultValue = "Tell me a joke") String message) {
        EmbeddingResponse embeddingResponse = this.embeddingModel.embedForResponse(List.of(message));
        return Map.of("embedding", embeddingResponse);
    }
}

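Writing to the Vector Store

Before text can be written to a vector database it must be vectorized by a model, so in Spring AI a vector store is always bound to an embedding method.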

@Bean
public VectorStore  createVectorStore(EmbeddingModel model){
    return new SimpleVectorStore(model);
}

Writing to the vector store (which covers both vectorizing and storing) and then searching it:

@GetMapping
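public void load(@RequestParam(value = "msg" ,defaultValue = "What is the weather like in Jinan") String msg) {
    // Write to the vector store
    List<Document> documents = new ArrayList<>();
    documents.add(new Document("The weather in Shenzhen is hot"));
    documents.add(new Document("The weather in Beijing is cold"));
    documents.add(new Document("The weather in Shanghai is humid"));
    documents.add(new Document("The weather in Jinan alternates between hot and cold"));
    vectorStore.add(documents);
    // Search the vector store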
    List<Document> result = vectorStore.similaritySearch(msg);
    List<String> collect = result.stream().map(Document::getContent).toList();
    System.out.println(collect);
}
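
What "add" and "similaritySearch" do internally can be sketched with a tiny in-memory store. This is an illustration under loud assumptions, not how SimpleVectorStore is implemented: `TinyVectorStoreSketch` is a hypothetical class, and its "embedding" is just a word-count vector standing in for a real embedding model.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical in-memory store: embed on write, rank by similarity on search.
class TinyVectorStoreSketch {

    private final List<String> documents = new ArrayList<>();
    private final List<Map<String, Integer>> vectors = new ArrayList<>();

    // Toy "embedding": sparse word counts. A real store calls an embedding model here.
    static Map<String, Integer> embed(String text) {
        Map<String, Integer> counts = new HashMap<>();
        for (String word : text.toLowerCase().split("\\W+")) {
            if (!word.isEmpty()) {
                counts.merge(word, 1, Integer::sum);
            }
        }
        return counts;
    }

    // Cosine similarity between two sparse vectors; 0 if either is empty.
    static double cosine(Map<String, Integer> a, Map<String, Integer> b) {
        double dot = 0, normA = 0, normB = 0;
        for (Map.Entry<String, Integer> e : a.entrySet()) {
            dot += e.getValue() * b.getOrDefault(e.getKey(), 0);
            normA += e.getValue() * e.getValue();
        }
        for (int v : b.values()) {
            normB += v * v;
        }
        return (normA == 0 || normB == 0) ? 0 : dot / Math.sqrt(normA * normB);
    }

    // Writing: embed the text, then keep text and vector side by side.
    void add(String document) {
        documents.add(document);
        vectors.add(embed(document));
    }

    // Searching: embed the query, then return the topK most similar documents.
    List<String> similaritySearch(String query, int topK) {
        Map<String, Integer> q = embed(query);
        List<Integer> order = new ArrayList<>();
        for (int i = 0; i < documents.size(); i++) {
            order.add(i);
        }
        order.sort(Comparator.comparingDouble((Integer i) -> -cosine(q, vectors.get(i))));
        return order.stream().limit(topK).map(documents::get).toList();
    }

    public static void main(String[] args) {
        TinyVectorStoreSketch store = new TinyVectorStoreSketch();
        store.add("Shenzhen weather is hot");
        store.add("Beijing weather is cold");
        store.add("Shanghai weather is humid");
        store.add("Jinan weather alternates between hot and cold");
        System.out.println(store.similaritySearch("Jinan weather", 1));
    }
}
```

Word counts only match identical words, whereas a real embedding model also places differently worded but related sentences close together.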

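Retrieval-Augmented Generation (RAG)

Retrieval-Augmented Generation (RAG) addresses the problem of bringing external data into the prompt so that the model can give accurate answers.

The principle behind the simplest RAG setup:

First, unstructured data is read from documents, transformed into structured data, vectorized, and written to a vector database.

At a high level this is an ETL (extract, transform, load) pipeline. The retrieval part of RAG uses the same vector database.

When loading unstructured data into a vector database, one of the most important transformations is splitting the original document into smaller parts.

A large model accepts a limited number of input tokens, so feeding it the full text is unrealistic; only the most relevant parts can be sent.

Splitting the original document into smaller parts involves two important steps:

Split the document into parts while preserving the semantic boundaries of the content.

For documents with paragraphs and tables, avoid splitting in the middle of a paragraph or table.

For code, avoid splitting in the middle of a method implementation.

Make sure each resulting chunk takes up only a small fraction of the model's input token limit.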
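
The two steps can be sketched as a paragraph-aware splitter. This is a minimal illustration, not Spring AI's splitter: `ParagraphSplitterSketch` is a hypothetical class, paragraphs are assumed to be separated by blank lines, and character count is used as a rough stand-in for token count.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical splitter: keeps paragraphs whole and caps the size of each chunk.
class ParagraphSplitterSketch {

    // Packs whole paragraphs into chunks of at most maxChars characters.
    // A single paragraph longer than maxChars becomes its own oversized chunk
    // instead of being cut in the middle.
    static List<String> split(String text, int maxChars) {
        List<String> chunks = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (String paragraph : text.split("\\n\\s*\\n")) {
            String p = paragraph.strip();
            if (p.isEmpty()) {
                continue;
            }
            // +2 accounts for the blank line that joins paragraphs inside a chunk.
            if (current.length() > 0 && current.length() + 2 + p.length() > maxChars) {
                chunks.add(current.toString());
                current.setLength(0);
            }
            if (current.length() > 0) {
                current.append("\n\n");
            }
            current.append(p);
        }
        if (current.length() > 0) {
            chunks.add(current.toString());
        }
        return chunks;
    }

    public static void main(String[] args) {
        String doc = "First paragraph.\n\nSecond paragraph.\n\nA third, somewhat longer paragraph.";
        split(doc, 40).forEach(chunk -> System.out.println("--- chunk ---\n" + chunk));
    }
}
```

A production splitter would count tokens with the model's tokenizer and also respect table and method boundaries; the sketch covers only paragraph boundaries.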

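The next stage of RAG is handling user input.

When the model needs to answer a user's question, the question and all similar chunks are placed into the prompt sent to the model.

This is why a vector database is used: it is very good at finding similar content.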
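
Putting the question and the retrieved chunks into one prompt can be sketched as follows. The class name, method name, and prompt wording are all hypothetical; the chunks are assumed to come from a similarity search that has already run.

```java
import java.util.List;

// Hypothetical helper that assembles a RAG prompt from a question and retrieved chunks.
class RagPromptSketch {

    // Lists the retrieved chunks as numbered context, then appends the question.
    static String buildPrompt(String question, List<String> retrievedChunks) {
        StringBuilder prompt = new StringBuilder();
        prompt.append("Answer the question using only the context below.\n\n");
        prompt.append("Context:\n");
        for (int i = 0; i < retrievedChunks.size(); i++) {
            prompt.append("[").append(i + 1).append("] ")
                  .append(retrievedChunks.get(i)).append("\n");
        }
        prompt.append("\nQuestion: ").append(question);
        return prompt.toString();
    }

    public static void main(String[] args) {
        List<String> chunks = List.of(
                "Jinan weather alternates between hot and cold",
                "Beijing weather is cold");
        System.out.println(buildPrompt("What is the weather like in Jinan?", chunks));
    }
}
```

The resulting string would then be sent to the chat model as the user message.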