GPT-4o Gets Faster If You Pay More: New Feature Finishes a 23-Second Task in 7 Seconds


OpenAI has released a new feature that gives GPT-4o's output speed a dramatic boost.

The feature is called "Predicted Outputs". With it, GPT-4o can respond up to 5 times faster than before.


Why is it so fast? In one sentence:

It skips content that is already known instead of regenerating it from scratch.

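That makes Predicted Outputs especially well suited to tasks like these:

Updating a blog post in a document

Iterating on a previous response

Rewriting code in an existing file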

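FactoryAI, which worked with OpenAI on the feature, also shared its results on coding tasks.

In its experiments, GPT-4o with Predicted Outputs responded 2 to 4 times faster than before while maintaining high accuracy.

The official announcement adds:

A coding task that used to take 70 seconds now takes only 20 seconds.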

Note that Predicted Outputs currently supports only two models, GPT-4o and GPT-4o mini, and is available only through the API.

For developers, that is good news.

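Users put it to the test

As soon as the news broke, plenty of users jumped in to try it for themselves.

Eric Ciarla, founder of Firecrawl, used Predicted Outputs to turn blog posts into SEO (search engine optimization) content. His verdict:

It is really, really fast.

It is as simple as adding a prediction parameter to the API call.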

Another user took a piece of existing code and fed the model a single prompt:

change the details to be random pieces of text.

Other users posted their own benchmark numbers as well.

In short: it is fast, genuinely fast.

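How does it work?

OpenAI covers the technical details of Predicted Outputs in its official documentation.

OpenAI notes that in some cases, most of an LLM's output is known ahead of time.

If you ask the model to make only minor changes to some text or code, you can pass the existing content in as the prediction, which noticeably reduces latency.

For example, suppose you want to refactor a piece of C# code, changing the Username property to Email: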

/// <summary>
/// Represents a user with a first name, last name, and username.
/// </summary>
public class User
{
    /// <summary>
    /// Gets or sets the user's first name.
    /// </summary>
    public string FirstName { get; set; }

    /// <summary>
    /// Gets or sets the user's last name.
    /// </summary>
    public string LastName { get; set; }

    /// <summary>
    /// Gets or sets the user's username.
    /// </summary>
    public string Username { get; set; }
}

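You can reasonably assume that most of the file will not change (the class docstring, some of the existing properties, and so on).

By passing the existing class file in as the prediction text, you can regenerate the entire file much faster.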

import OpenAI from "openai";

// The existing file, used both as the input to edit and as the prediction
const code = `/// <summary>
/// Represents a user with a first name, last name, and username.
/// </summary>
public class User
{
    /// <summary>
    /// Gets or sets the user's first name.
    /// </summary>
    public string FirstName { get; set; }

    /// <summary>
    /// Gets or sets the user's last name.
    /// </summary>
    public string LastName { get; set; }

    /// <summary>
    /// Gets or sets the user's username.
    /// </summary>
    public string Username { get; set; }
}`;

const openai = new OpenAI();

const completion = await openai.chat.completions.create({
  model: "gpt-4o",
  messages: [
    {
      role: "user",
      content: "Replace the Username property with an Email property. Respond only with code, and with no markdown formatting."
    },
    {
      role: "user",
      content: code
    }
  ],
  // Pass the unchanged file as the predicted output
  prediction: {
    type: "content",
    content: code
  }
});

// Inspect returned data
console.log(completion);

Generating tokens with Predicted Outputs greatly reduces latency for these types of requests.
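To get a rough feel for how much of an edit is "known in advance", one option is to measure what share of the edited file's lines already appear in the original. The sketch below is only a heuristic for judging whether a task is a good candidate; it is not how OpenAI matches predictions internally, and the helper name `unchangedLineRatio` is made up for illustration.

```javascript
// Heuristic sketch (an assumption, not OpenAI's algorithm): the share of
// non-empty lines in the edited text that also appear in the original text.
function unchangedLineRatio(original, edited) {
  const toLines = (text) =>
    text
      .split("\n")
      .map((line) => line.trim())
      .filter((line) => line !== "");

  const originalLines = new Set(toLines(original));
  const editedLines = toLines(edited);
  if (editedLines.length === 0) return 0;

  const kept = editedLines.filter((line) => originalLines.has(line)).length;
  return kept / editedLines.length;
}

// A trimmed-down version of the User class from the example above.
const originalCode = [
  "public class User",
  "{",
  "    public string FirstName { get; set; }",
  "    public string LastName { get; set; }",
  "    public string Username { get; set; }",
  "}",
].join("\n");

// The refactor only touches one property.
const editedCode = originalCode.replace("Username", "Email");

console.log(unchangedLineRatio(originalCode, editedCode).toFixed(2)); // prints "0.83"
```

Five of the six lines carry over untouched, which is the kind of edit where passing the original file as the prediction should pay off.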

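That said, OpenAI also lists a few caveats for using Predicted Outputs.

First, as mentioned above, only the GPT-4o and GPT-4o mini models are supported.

Second, the following API parameters are not supported when using Predicted Outputs: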

n values greater than 1

logprobs

presence_penalty greater than 0

frequency_penalty greater than 0

audio options

modalities other than text

max_completion_tokens

tools - function calling is not supported
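The restrictions above can be checked on the client before a request is sent. The following is a minimal sketch that encodes only the list as quoted here; `findUnsupportedOptions` is a hypothetical helper, not part of the OpenAI SDK, and the API's own validation may differ in detail.

```javascript
// Hypothetical pre-flight check (not an SDK function): returns the names of
// request options that the list above says cannot be combined with `prediction`.
function findUnsupportedOptions(params) {
  // Without a prediction, none of the restrictions apply.
  if (!params.prediction) return [];

  const problems = [];
  if (params.n > 1) problems.push("n");
  if (params.logprobs) problems.push("logprobs");
  if (params.presence_penalty > 0) problems.push("presence_penalty");
  if (params.frequency_penalty > 0) problems.push("frequency_penalty");
  if (params.audio !== undefined) problems.push("audio");
  if (
    Array.isArray(params.modalities) &&
    params.modalities.some((modality) => modality !== "text")
  ) {
    problems.push("modalities");
  }
  if (params.max_completion_tokens !== undefined) {
    problems.push("max_completion_tokens");
  }
  if (params.tools !== undefined) problems.push("tools");
  return problems;
}

const request = {
  model: "gpt-4o",
  messages: [{ role: "user", content: "Rename a property." }],
  prediction: { type: "content", content: "public class User {}" },
  n: 2,
  tools: [],
};

console.log(JSON.stringify(findUnsupportedOptions(request))); // prints ["n","tools"]
```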

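Beyond Predicted Outputs, the same document also summarizes several other ways to reduce latency.

These include "process tokens faster", "generate fewer tokens", "use fewer input tokens", "make fewer requests", "parallelize", and more.

The link to the documentation is at the end of this article for anyone interested.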

One More Thing

Output may be faster, but one more caveat from OpenAI has sparked discussion among users:

When providing a prediction, any tokens provided that are not part of the final completion are charged at completion token rates.
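The billing rule can be made concrete with a little arithmetic. The sketch below assumes you already know how many tokens ended up in the final completion and how many predicted tokens were rejected (OpenAI's documentation describes a `rejected_prediction_tokens` count in the usage data). The function takes plain numbers rather than assuming an exact response shape, and the price used is a placeholder, not a real rate.

```javascript
// Sketch of the billing rule quoted above: predicted tokens that do not make it
// into the final completion are billed like completion tokens.
// `usdPerMillionOutputTokens` is a placeholder price, not an actual rate.
function estimateOutputCost({
  finalTokens,
  rejectedPredictionTokens,
  usdPerMillionOutputTokens,
}) {
  const billedTokens = finalTokens + rejectedPredictionTokens;
  const usd = (billedTokens * usdPerMillionOutputTokens) / 1_000_000;
  return { billedTokens, usd };
}

const estimate = estimateOutputCost({
  finalTokens: 200,
  rejectedPredictionTokens: 100,
  usdPerMillionOutputTokens: 10, // placeholder
});

console.log(estimate.billedTokens); // prints 300
```

In this made-up case, a 200-token answer is billed as 300 output tokens, so a prediction that diverges heavily from the final output raises cost even while it lowers latency.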

One user shared his own test results:

Without Predicted Outputs: 5.2 seconds, 0.1555 cents

With Predicted Outputs: 3.3 seconds, 0.2675 cents

So: faster, but also more expensive.
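To put that trade-off in numbers, here is a small calculation using only the two data points quoted above. `compareRuns` is an illustrative helper, and a single measurement says nothing about typical behavior.

```javascript
// Compare two runs of the same task: how much faster, and how much costlier.
function compareRuns(baseline, withPrediction) {
  return {
    speedup: baseline.seconds / withPrediction.seconds,
    costRatio: withPrediction.cents / baseline.cents,
  };
}

const comparison = compareRuns(
  { seconds: 5.2, cents: 0.1555 }, // without Predicted Outputs
  { seconds: 3.3, cents: 0.2675 } // with Predicted Outputs
);

console.log(comparison.speedup.toFixed(2), comparison.costRatio.toFixed(2)); // prints 1.58 1.72
```

In this one test, the request ran about 1.58 times as fast and cost about 1.72 times as much.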

OpenAI official documentation:

https://platform.openai.com/docs/guides/latency-optimization#use-predicted-outputs

References:

[1] https://x.com/OpenAIDevs/status/1853564730872607229

[2] https://x.com/romainhuet/status/1853586848641433834

[3] https://x.com/GregKamradt/status/1853620167655481411