Pay more and GPT-4o gets faster! New feature finishes a 23-second task in 7 seconds


OpenAI has shipped a new feature that sends GPT-4o's output speed through the roof!

The feature is called Predicted Outputs, and with it GPT-4o can run up to 5 times faster than before.

Coding tasks are where the difference is easiest to feel.

Why is it so fast? In one sentence:

It skips the content that is already known instead of regenerating everything from scratch.

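That makes Predicted Outputs a particularly good fit for tasks like these:

Updating a blog post in a document

Iterating on a previous response

Rewriting code in an existing file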

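FactoryAI, which worked with OpenAI on the feature, also shared its numbers on coding tasks:

In its experiments, GPT-4o with Predicted Outputs responded 2-4 times faster than before while maintaining high accuracy.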

The official announcement adds:

Coding tasks that used to take 70 seconds now take only 20 seconds.

Note that Predicted Outputs currently supports only two models, GPT-4o and GPT-4o mini, and only through the API.

For developers, that is good news.

Users put it to the test

As soon as the news dropped, plenty of users couldn't sit still and started testing it right away.

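For example, Firecrawl founder Eric Ciarla tried Predicted Outputs on turning blog posts into SEO (search engine optimization) content, and reported:

It's really, really fast.

It's as simple as adding a prediction parameter to your API call.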
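To make that concrete, here is a minimal sketch of what "adding a prediction parameter" could look like at the request level. The helper name `withPrediction`, the sample prompt, and the draft text are hypothetical; only the `prediction: { type: "content", content }` shape is taken from the OpenAI example quoted later in this article. The sketch builds the request object and makes no network call.

```javascript
// Hypothetical helper: returns a copy of a Chat Completions request
// with the existing text attached as the predicted output.
function withPrediction(params, existingText) {
  return {
    ...params,
    prediction: { type: "content", content: existingText },
  };
}

// Illustrative request; the draft text is made up for this sketch.
const draft = "Predicted Outputs cut latency when most of the output is already known.";
const baseRequest = {
  model: "gpt-4o",
  messages: [
    { role: "user", content: "Rewrite this post with an SEO-friendly title. Respond only with the post." },
    { role: "user", content: draft },
  ],
};

// The only difference from the base request is the added `prediction` field.
const fastRequest = withPrediction(baseRequest, draft);
console.log(Object.keys(fastRequest)); // keys: model, messages, prediction
```

Because the helper returns a copy, the same base request can be sent with and without the prediction to compare response times.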

Another user took an existing piece of code and fed it a one-line prompt:

change the details to be random pieces of text.

The point, once again, was the speed.

Other users posted measurements of their own as well.

In short: it is fast. Really fast.

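How does it work?

OpenAI's official documentation goes into some of the technical details behind Predicted Outputs.

OpenAI's view is that in some cases most of an LLM's output is known ahead of time.

If you ask the model to make only minor changes to some text or code, you can pass the existing content in as the prediction and cut latency noticeably.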

For example, suppose you want to refactor a piece of C# code, changing the Username property to Email:

/// <summary>
/// Represents a user with a first name, last name, and username.
/// </summary>
public class User
{
    /// <summary>
    /// Gets or sets the user's first name.
    /// </summary>
    public string FirstName { get; set; }

    /// <summary>
    /// Gets or sets the user's last name.
    /// </summary>
    public string LastName { get; set; }

    /// <summary>
    /// Gets or sets the user's username.
    /// </summary>
    public string Username { get; set; }
}

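You can reasonably assume that most of the file will not change (the class's doc comments, some of the existing properties, and so on).

By passing the existing class file in as the prediction text, you can regenerate the whole file faster.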

import OpenAI from "openai";

const code = `/// <summary>
/// Represents a user with a first name, last name, and username.
/// </summary>
public class User
{
    /// <summary>
    /// Gets or sets the user's first name.
    /// </summary>
    public string FirstName { get; set; }

    /// <summary>
    /// Gets or sets the user's last name.
    /// </summary>
    public string LastName { get; set; }

    /// <summary>
    /// Gets or sets the user's username.
    /// </summary>
    public string Username { get; set; }
}`;

const openai = new OpenAI();

const completion = await openai.chat.completions.create({
  model: "gpt-4o",
  messages: [
    {
      role: "user",
      content: "Replace the Username property with an Email property. Respond only with code, and with no markdown formatting."
    },
    {
      role: "user",
      content: code
    }
  ],
  prediction: {
    type: "content",
    content: code
  }
});

// Inspect returned data
console.log(completion);

Generating tokens with Predicted Outputs greatly reduces latency for these kinds of requests.
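One way to check whether a prediction actually helped is to look at the token accounting in the response. The sketch below assumes the response's `usage.completion_tokens_details` object carries `accepted_prediction_tokens` and `rejected_prediction_tokens` counts, as described in OpenAI's API reference. The helper name `summarizePrediction` and the sample numbers are made up for illustration and are not measurements.

```javascript
// Hypothetical helper: summarizes how much of a prediction the model reused.
function summarizePrediction(usage) {
  const details = usage.completion_tokens_details ?? {};
  const accepted = details.accepted_prediction_tokens ?? 0;
  const rejected = details.rejected_prediction_tokens ?? 0;
  const total = accepted + rejected;
  return {
    accepted,
    rejected,
    // Share of predicted tokens that appeared in the final output
    // (0 when no prediction tokens were reported).
    acceptanceRate: total === 0 ? 0 : accepted / total,
  };
}

// Made-up usage object standing in for `completion.usage`.
const sampleUsage = {
  prompt_tokens: 210,
  completion_tokens: 150,
  completion_tokens_details: {
    accepted_prediction_tokens: 90,
    rejected_prediction_tokens: 30,
  },
};

const summary = summarizePrediction(sampleUsage);
console.log(summary); // acceptanceRate is 0.75 for this sample
```

OpenAI's documentation notes that rejected prediction tokens are still billed at completion-token rates, which is presumably the "pay more" in the headline: a low acceptance rate means paying for a prediction that bought little speed.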

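OpenAI does, however, list several caveats for using Predicted Outputs.

First, as mentioned above, only the GPT-4o and GPT-4o mini series models are supported.

Second, the following API parameters are not supported when using Predicted Outputs: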

n values greater than 1

logprobs

presence_penalty greater than 0

frequency_penalty greater than 0

audio options

modalities other than text

max_completion_tokens
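The restrictions named in this list lend themselves to a simple pre-flight check before a request is sent. The following is a rough sketch, not part of OpenAI's SDK: the function name `findPredictionConflicts` is hypothetical, it covers only the parameters named above, and it assumes they appear in the request under the Chat Completions field names `n`, `logprobs`, `presence_penalty`, `frequency_penalty`, `audio`, `modalities`, and `max_completion_tokens`.

```javascript
// Hypothetical pre-flight check based on the unsupported-parameter list above.
// Returns the names of request fields that conflict with `prediction`.
function findPredictionConflicts(params) {
  const conflicts = [];
  if (params.prediction === undefined) return conflicts; // no prediction, nothing to check

  if ((params.n ?? 1) > 1) conflicts.push("n");
  if (params.logprobs) conflicts.push("logprobs");
  if ((params.presence_penalty ?? 0) > 0) conflicts.push("presence_penalty");
  if ((params.frequency_penalty ?? 0) > 0) conflicts.push("frequency_penalty");
  if (params.audio !== undefined) conflicts.push("audio");
  if (Array.isArray(params.modalities) && params.modalities.some((m) => m !== "text")) {
    conflicts.push("modalities");
  }
  if (params.max_completion_tokens !== undefined) conflicts.push("max_completion_tokens");

  return conflicts;
}

// Illustrative request that trips two of the checks.
const request = {
  model: "gpt-4o",
  n: 2,
  max_completion_tokens: 500,
  messages: [{ role: "user", content: "Rename Username to Email." }],
  prediction: { type: "content", content: "public string Username { get; set; }" },
};

console.log(findPredictionConflicts(request)); // reports "n" and "max_completion_tokens"
```

Returning the list of offending fields, rather than throwing, leaves it to the caller to decide whether to drop the prediction or drop the conflicting parameters.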