Serializing and deserializing messages with the Azure OpenAI Java SDK

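I have a feature that stores a template in the database and fills user input into its placeholders. The template has to carry multi-turn messages plus the various request parameters, so a hand-rolled configuration format would be tedious. Instead I decided to serialize the whole ChatCompletionsOptions into the database, read it back, and then fill in the placeholders.
However, using Jackson directly fails for the user message type, ChatRequestUserMessage, on both serialization and deserialization.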

Serialization produces:

{
    "role": "user",
    "content": {
        "length": 22709,
        "replayable": true
    },
    "name": null
}

This is because of how the type is defined:

public final class ChatRequestUserMessage extends ChatRequestMessage {

    /*
     * The contents of the user message, with available input types varying by selected model.
     */
    @Generated
    @JsonProperty(value = "content")
    private BinaryData content;

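content is a BinaryData, so Jackson writes out its bean properties (length, replayable) instead of the actual payload.
Deserializing hand-written JSON in the OpenAI format fails in the same way.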

The fix is simple: the SDK already brings in Microsoft's own BinaryData, so use it for both serialization and deserialization.
Deserialization example:

String template = """
        {
            "messages": [
                {
                    "role": "system",
                    "content": "system message",
                    "name": null
                },
                {
                    "role": "user",
                    "content": [
                        {
                            "type": "text",
                            "text":"user message"
                        },
                        {
                            "type": "image_url",
                            "image_url": {"url": "%s"}
                        }
                    ]
                }
            ],
            "max_tokens": 2048,
            "temperature": null,
            "top_p": null,
            "logit_bias": null,
            "user": null,
            "n": null,
            "stop": null,
            "presence_penalty": null,
            "frequency_penalty": null,
            "stream": null,
            "model": "gpt-4o-2024-05-13",
            "functions": null,
            "function_call": null,
            "data_sources": null,
            "enhancements": null,
            "seed": null,
            "response_format": null,
            "tools": null,
            "tool_choice": null,
            "logprobs": null,
            "top_logprobs": null
        }
        """.formatted(imageBase64);

final ChatCompletionsOptions chatCompletionsOptions = BinaryData.fromString(template).toObject(ChatCompletionsOptions.class);
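One caveat with filling the placeholder through String.formatted as above: a Base64 image string is safe, but free-form user text containing quotes, backslashes, or newlines would produce invalid JSON. Below is a minimal sketch of escaping values before they go into the template. The class name TemplateFiller and the helpers escapeJson and fillTemplate are hypothetical names for illustration, not part of the SDK, and the sketch assumes every placeholder sits inside a JSON string literal.

```java
class TemplateFiller {

    /** Escapes a value so that it can be placed inside a JSON string literal. */
    static String escapeJson(String value) {
        StringBuilder sb = new StringBuilder(value.length() + 16);
        for (int i = 0; i < value.length(); i++) {
            char c = value.charAt(i);
            switch (c) {
                case '"' -> sb.append("\\\"");
                case '\\' -> sb.append("\\\\");
                case '\n' -> sb.append("\\n");
                case '\r' -> sb.append("\\r");
                case '\t' -> sb.append("\\t");
                default -> {
                    if (c < 0x20) {
                        // Remaining control characters must be written as \\uXXXX escapes.
                        sb.append(String.format("\\u%04x", (int) c));
                    } else {
                        sb.append(c);
                    }
                }
            }
        }
        return sb.toString();
    }

    /** Fills the %s placeholders of a JSON template with escaped user inputs. */
    static String fillTemplate(String template, String... userInputs) {
        Object[] escaped = new Object[userInputs.length];
        for (int i = 0; i < userInputs.length; i++) {
            escaped[i] = escapeJson(userInputs[i]);
        }
        return template.formatted(escaped);
    }
}
```

With this, the result of fillTemplate(template, userText) can be passed to BinaryData.fromString in place of the raw formatted string. Note that a literal percent sign in the stored template still has to be written as %% for String.formatted.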

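After some testing, though, there is still a problem: because content is a BinaryData, deserialization cannot tell whether it was originally a List or a String, so everything ends up as a String and the structure is lost.
I filed an issue: [BUG] The result is inconsistent after serializing and deserializing ChatRequestUserMessage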

I put together a not very elegant temporary workaround:

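// Rebuilds user messages whose content was flattened during deserialization.
// TypeReference here is com.azure.core.util.serializer.TypeReference, not Jackson's.
private void fixMessage(ChatCompletionsOptions options) {
    boolean changed = false;
    List<ChatRequestMessage> messages = new ArrayList<>();
    for (ChatRequestMessage message : options.getMessages()) {
        if (message instanceof ChatRequestUserMessage u) {
            final BinaryData content = u.getContent();
            try {
                // Try to read the content as a list of content items (text, image_url, ...).
                final List<ChatMessageContentItem> items = content.toObject(new TypeReference<>() {
                });
                messages.add(new ChatRequestUserMessage(items));
                changed = true;
                continue;
            } catch (Exception ignored) {
                // Content is not a list (e.g. a plain string): keep the original message.
            }
        }
        messages.add(message);
    }
    if (changed) {
        try {
            // No public setter is used here, so the private field is replaced via reflection.
            Field messagesField = ChatCompletionsOptions.class.getDeclaredField("messages");
            messagesField.setAccessible(true);
            messagesField.set(options, messages);
        } catch (Exception e) {
            log.error("fixMessage - set message error", e);
        }
    }
}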

That will do for now…

Reference: [openAI] Serialized Storage Context