【學英文看科技】Anthropic Unleashes Claude 4: The AI That Thinks Deeper!

📢【新聞標題】

Anthropic Launches New Claude 4 AI Models Capable of Reasoning Over Many Steps

Anthropic推出全新Claude 4 AI模型，具備多步驟推理能力

📰【摘要】

Anthropic launched two new AI models, Claude Opus 4 and Claude Sonnet 4, claiming they are among the industry's best on popular benchmarks. According to the company, the models can analyze large datasets, execute long-horizon tasks, and take complex actions, and both were tuned for programming tasks. Sonnet 4 will be available to both free and paying users, while Opus 4 will be exclusive to paying users. The company is reportedly aiming for $12 billion in earnings in 2027, up from a projected $2.2 billion this year.

Anthropic推出了兩個新的AI模型，Claude Opus 4和Claude Sonnet 4，聲稱它們在流行的基準測試中是業界最佳之一。據該公司稱，這些模型可以分析大型數據集、執行長期任務並採取複雜的行動，而且兩者都針對編程任務進行了調整。Sonnet 4將提供給免費和付費用戶，而Opus 4將專供付費用戶使用。據報導，該公司的目標是在2027年實現120億美元的收入，高於今年預計的22億美元。

🗝️【關鍵詞彙表】

📝 inaugural (adj.)

首次的、開幕的

例句: During its inaugural developer conference Thursday...

翻譯: 在週四的首次開發者大會上...

📝 benchmarks (n.)

基準、基準測試

例句: ...at least in terms of how they score on popular benchmarks.

翻譯: ...至少在它們在流行的基準測試中的得分方面。

📝 tokens (n.)

符記、代幣

例句: Opus 4 will be priced at $15/$75 per million tokens (input/output).

翻譯: Opus 4的價格為每百萬個符記15/75美元（輸入/輸出）。

📝 safeguard (n.)

安全措施、保護措施

例句: Still, Anthropic is releasing Opus 4 under stricter safeguards.

翻譯: 儘管如此，Anthropic正在更嚴格的安全措施下發布Opus 4。

📝 vulnerabilities (n.)

漏洞、弱點

例句: Code-generating AI tends to introduce security vulnerabilities and errors.

翻譯: 代碼生成的AI傾向於引入安全漏洞和錯誤。

📝 tacit knowledge (n.)

隱性知識

例句: ...building what Anthropic describes as “tacit knowledge” over time.

翻譯: ...隨著時間的推移，建立起Anthropic所描述的「隱性知識」。

📝 loophole (n.)

漏洞、法律漏洞

例句: Reward hacking, also known as specification gaming, is a behavior where models take shortcuts and loopholes to complete tasks.

翻譯: 獎勵駭客，也稱為規格遊戲，是一種模型採用捷徑和漏洞來完成任務的行為。

📝 agentic (adj.)

代理式的、能動的（能自主採取行動的）

例句: While Anthropic launched a new flagship AI model earlier this year, Claude Sonnet 3.7, alongside an agentic coding tool called Claude Code...

翻譯: 雖然Anthropic今年早些時候推出了一款新的旗艦AI模型，Claude Sonnet 3.7，以及一個名為Claude Code的能動編碼工具...

✍️【文法與句型】

📝 be tuned to

說明: To be adjusted or optimized for a specific purpose or function.

翻譯: 為了特定目的或功能而調整或優化。

例句: Both models were tuned to perform well on programming tasks.

翻譯: 這兩種模型都經過調整，以在編程任務中表現良好。

📝 owing to

說明: Because of; due to.

翻譯: 由於、因為。

例句: Code-generating AI tends to introduce security vulnerabilities and errors, owing to weaknesses in areas like the ability to understand programming logic.

翻譯: 由於在理解編程邏輯等方面的弱點，代碼生成的AI往往會引入安全漏洞和錯誤。

📝 play for keeps

說明: To compete or act in a serious and determined way, with the intention of winning or succeeding.

翻譯: 認真且堅定地競爭或行動，旨在獲勝或成功。

例句: Anthropic is playing for keeps with Claude 4.

翻譯: Anthropic這次推出Claude 4是來真的，志在必得。

📖【全文與翻譯】

During its inaugural developer conference Thursday, Anthropic launched two new AI models that the startup claims are among the industry’s best, at least in terms of how they score on popular benchmarks. 在其週四的首次開發者大會上，Anthropic推出了兩款新的AI模型，該初創公司聲稱它們是業界最佳之一，至少在它們在流行的基準測試中的得分方面。

Claude Opus 4 and Claude Sonnet 4, part of Anthropic’s new Claude 4 family of models, can analyze large datasets, execute long-horizon tasks, and take complex actions, according to the company. 根據該公司，Claude Opus 4和Claude Sonnet 4是Anthropic新的Claude 4系列模型的一部分，可以分析大型數據集，執行長期任務並採取複雜的動作。

Both models were tuned to perform well on programming tasks, Anthropic says, making them well-suited for writing and editing code. Anthropic表示，這兩種模型都經過調整，以在編程任務中表現良好，使其非常適合編寫和編輯代碼。

Both paying users and users of the company’s free chatbot apps will get access to Sonnet 4 but only paying users will get access to Opus 4. 付費用戶和該公司免費聊天機器人應用程序的用戶都可以訪問Sonnet 4，但只有付費用戶才能訪問Opus 4。

For Anthropic’s API, via Amazon’s Bedrock platform and Google’s Vertex AI, Opus 4 will be priced at $15/$75 per million tokens (input/output) and Sonnet 4 at $3/$15 per million tokens (input/output). 對於Anthropic的API，通過Amazon的Bedrock平台和Google的Vertex AI，Opus 4的價格為每百萬個符記15/75美元（輸入/輸出），Sonnet 4的價格為每百萬個符記3/15美元（輸入/輸出）。
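
To make the per-million-token pricing concrete, here is a minimal sketch that estimates the cost of a single request from the prices quoted above. The dictionary keys `opus-4` and `sonnet-4` are labels chosen for this example (not official API model IDs), and the request sizes are hypothetical.

```python
# Prices in USD per million tokens (input, output), as quoted in the article.
# The keys are illustrative labels, not official API model identifiers.
PRICES = {
    "opus-4": (15.0, 75.0),
    "sonnet-4": (3.0, 15.0),
}


def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the estimated USD cost of one request."""
    input_price, output_price = PRICES[model]
    total = input_tokens * input_price + output_tokens * output_price
    return total / 1_000_000


if __name__ == "__main__":
    # Hypothetical request: 200,000 input tokens and 50,000 output tokens.
    for model in PRICES:
        cost = estimate_cost(model, 200_000, 50_000)
        print(f"{model}: ${cost:.2f}")
```

Under these assumptions, the same request costs five times as much on Opus 4 as on Sonnet 4, because both the input and the output price differ by a factor of five.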

Tokens are the raw bits of data that AI models work with. A million tokens is equivalent to about 750,000 words — roughly 163,000 words longer than “War and Peace.” 符記是AI模型處理的原始數據片段。一百萬個符記約等於750,000個單詞——比《戰爭與和平》長約163,000個單詞。
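
The arithmetic behind this comparison can be sketched as follows. The 0.75 words-per-token ratio is derived from the article's own figures and is only a rough rule of thumb; real ratios vary by tokenizer and language. The function name `tokens_to_words` is made up for this example.

```python
# Ratio implied by the article: 1,000,000 tokens is about 750,000 words.
WORDS_PER_TOKEN = 750_000 / 1_000_000

# Length of "War and Peace" implied by the article's comparison.
WAR_AND_PEACE_WORDS = 750_000 - 163_000


def tokens_to_words(tokens: int) -> int:
    """Convert a token count to an approximate English word count."""
    return round(tokens * WORDS_PER_TOKEN)


if __name__ == "__main__":
    words = tokens_to_words(1_000_000)
    print(f"1,000,000 tokens is about {words:,} words")
    print(f"Implied length of War and Peace: {WAR_AND_PEACE_WORDS:,} words")
```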

Anthropic’s Claude 4 models arrive as the company looks to substantially grow revenue. Anthropic推出Claude 4模型之際，該公司正尋求大幅增加收入。

Reportedly, the outfit, founded by ex-OpenAI researchers, aims to notch $12 billion in earnings in 2027, up from a projected $2.2 billion this year. 據報導，這家由前OpenAI研究人員創立的公司，目標是在2027年實現120億美元的收入，高於今年預計的22億美元。

Anthropic recently closed a $2.5 billion credit facility and raised billions of dollars from Amazon and other investors in anticipation of the rising costs associated with developing frontier models. Anthropic最近完成了25億美元的信貸額度，並從亞馬遜和其他投資者那裡籌集了數十億美元，以應對開發前沿模型相關的成本上升。

Rivals haven’t made it easy to maintain pole position in the AI race. 競爭對手讓Anthropic難以在AI競賽中保持領先地位。

While Anthropic launched a new flagship AI model earlier this year, Claude Sonnet 3.7, alongside an agentic coding tool called Claude Code, competitors — including OpenAI and Google — have raced to outdo the company with powerful models and dev tooling of their own. 雖然Anthropic今年早些時候推出了一款新的旗艦AI模型，Claude Sonnet 3.7，以及一個名為Claude Code的能動編碼工具，但包括OpenAI和Google在內的競爭對手，都競相以自己強大的模型和開發工具超越該公司。

Anthropic is playing for keeps with Claude 4. Anthropic這次推出Claude 4是來真的，志在必得。

The more capable of the two models introduced today, Opus 4, can maintain “focused effort” across many steps in a workflow, Anthropic says. Anthropic表示，今天推出的兩款模型中，功能更強大的Opus 4可以在工作流程的許多步驟中保持“專注的努力”。

Meanwhile, Sonnet 4 — designed as a “drop-in replacement” for Sonnet 3.7 — improves in coding and math compared to Anthropic’s previous models and more precisely follows instructions, according to the company. 同時，根據該公司，Sonnet 4——被設計為Sonnet 3.7的“直接替代品”——與Anthropic之前的模型相比，在編碼和數學方面有所改進，並且更精確地遵循指令。

The Claude 4 family is also less likely than Sonnet 3.7 to engage in “reward hacking,” claims Anthropic. Anthropic聲稱，與Sonnet 3.7相比，Claude 4系列也較不容易出現“獎勵駭客”行為。

Reward hacking, also known as specification gaming, is a behavior where models take shortcuts and loopholes to complete tasks. 獎勵駭客，也稱為規格遊戲，是一種模型採用捷徑和漏洞來完成任務的行為。

To be clear, these improvements haven’t yielded the world’s best models by every benchmark. 需要明確的是，這些改進並未在每個基準測試中都產生世界上最好的模型。

For example, while Opus 4 beats Google’s Gemini 2.5 Pro and OpenAI’s o3 and GPT-4.1 on SWE-bench Verified, which is designed to evaluate a model’s coding abilities, it can’t surpass o3 on the multimodal evaluation MMMU or GPQA Diamond, a set of PhD-level biology-, physics-, and chemistry-related questions. 例如，雖然Opus 4在SWE-bench Verified上擊敗了Google的Gemini 2.5 Pro和OpenAI的o3和GPT-4.1，該基準旨在評估模型的編碼能力，但在多模態評估MMMU或GPQA Diamond（一組博士級生物學、物理學和化學相關問題）上，它無法超越o3。

Still, Anthropic is releasing Opus 4 under stricter safeguards, including beefed-up harmful content detectors and cybersecurity defenses. 儘管如此，Anthropic正在更嚴格的安全措施下發布Opus 4，包括加強有害內容檢測器和網路安全防禦。

The company claims its internal testing found that Opus 4 may “substantially increase” the ability of someone with a STEM background to obtain, produce, or deploy chemical, biological, or nuclear weapons, reaching Anthropic’s “ASL-3” model specification. 該公司聲稱，其內部測試發現，Opus 4可能會“大幅提高”具有STEM背景的人員獲取、生產或部署化學、生物或核武器的能力，達到Anthropic的“ASL-3”模型規範。

Both Opus 4 and Sonnet 4 are “hybrid” models, Anthropic says — capable of near-instant responses and extended thinking for deeper reasoning (to the extent AI can “reason” and “think” as humans understand these concepts). Anthropic表示，Opus 4和Sonnet 4都是“混合”模型——能夠實現近乎即時的響應和擴展的思維，以進行更深入的推理（在AI可以像人類理解這些概念那樣“推理”和“思考”的範圍內）。

With reasoning mode switched on, the models can take more time to consider possible solutions to a given problem before answering. 開啟推理模式後，模型可以花更多時間考慮給定問題的可能解決方案，然後再回答。

As the models reason, they’ll show a “user-friendly” summary of their thought process, Anthropic says. Anthropic表示，當模型進行推理時，它們將顯示一個“用戶友好”的思維過程摘要。

Why not show the whole thing? Partially to protect Anthropic’s “competitive advantages,” the company admits in a draft blog post provided to TechCrunch. 為什麼不顯示整個過程？該公司在提供給TechCrunch的部落格草稿文章中承認，部分原因是為了保護Anthropic的“競爭優勢”。

Opus 4 and Sonnet 4 can use multiple tools, like search engines, in parallel, and alternate between reasoning and tools to improve the quality of their answers. Opus 4和Sonnet 4可以並行使用多種工具，例如搜索引擎，並在推理和工具之間交替使用，以提高答案的質量。
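
As a rough illustration of what using tools "in parallel" means, the sketch below runs two tool calls concurrently with Python's `asyncio.gather` instead of one after the other. This is not how Anthropic implements tool use; the tools `search_web` and `search_docs` are hypothetical stand-ins that only simulate latency.

```python
import asyncio


async def search_web(query: str) -> str:
    """Hypothetical tool: pretend to query a web search engine."""
    await asyncio.sleep(0.01)  # simulated network latency
    return f"web results for {query!r}"


async def search_docs(query: str) -> str:
    """Hypothetical tool: pretend to query a documentation index."""
    await asyncio.sleep(0.01)  # simulated network latency
    return f"doc results for {query!r}"


async def run_tools_in_parallel(query: str) -> list[str]:
    """Start both tool calls at once and wait for all results.

    asyncio.gather returns results in the order the calls were passed in.
    """
    results = await asyncio.gather(search_web(query), search_docs(query))
    return list(results)


if __name__ == "__main__":
    print(asyncio.run(run_tools_in_parallel("Claude 4")))
```

Because both calls wait at the same time, the total wait is roughly that of the slowest tool rather than the sum of all of them.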

They can also extract and save facts in “memory” to handle tasks more reliably, building what Anthropic describes as “tacit knowledge” over time. 它們還可以提取事實並將其保存在“記憶”中，以更可靠地處理任務，從而隨著時間的推移建立起Anthropic所描述的“隱性知識”。

To make the models more programmer-friendly, Anthropic is rolling out upgrades to the aforementioned Claude Code. 為了使模型對程式設計師更友好，Anthropic正在對上述Claude Code進行升級。

Claude Code, which lets developers run specific tasks through Anthropic’s models directly from a terminal, now integrates with IDEs and offers an SDK that lets devs connect it with third-party applications. Claude Code允許開發人員直接從終端通過Anthropic的模型運行特定任務，現在它與IDE集成，並提供一個SDK，使開發人員可以將其與第三方應用程式連接。

The Claude Code SDK, announced earlier this week, enables running Claude Code as a subprocess on supported operating systems, providing a way to build AI-powered coding assistants and tools that leverage Claude models’ capabilities. 本週早些時候宣布的Claude Code SDK，可以在支援的作業系統上將Claude Code作為子進程運行，從而提供了一種構建AI驅動的編碼助手和工具的方法，這些助手和工具可以利用Claude模型的功能。

Anthropic has released Claude Code extensions and connectors for Microsoft’s VSCode, JetBrains, and GitHub. Anthropic發布了適用於Microsoft的VSCode、JetBrains和GitHub的Claude Code擴展和連接器。

The GitHub connector allows developers to tag Claude Code to respond to reviewer feedback, as well as to attempt to fix errors in — or otherwise modify — code. GitHub連接器允許開發人員標記Claude Code以響應審閱者的反饋，並嘗試修復錯誤或以其他方式修改代碼。

AI models still struggle to code quality software. AI模型仍然難以編寫高品質的軟體。

Code-generating AI tends to introduce security vulnerabilities and errors, owing to weaknesses in areas like the ability to understand programming logic. 由於在理解編程邏輯等方面的弱點，代碼生成的AI往往會引入安全漏洞和錯誤。

Yet their promise to boost coding productivity is pushing companies — and developers — to rapidly adopt them. 然而，它們提高編碼生產力的承諾正在推動公司和開發人員迅速採用它們。

Anthropic, acutely aware of this, is promising more frequent model updates. Anthropic敏銳地意識到這一點，並承諾提供更頻繁的模型更新。

“We’re … shifting to more frequent model updates, delivering a steady stream of improvements that bring breakthrough capabilities to customers faster,” wrote the startup in its draft post. “我們正在……轉向更頻繁的模型更新，提供穩定的改進流，從而更快地為客戶帶來突破性的功能，”該初創公司在其草稿文章中寫道。

“This approach keeps you at the cutting edge as we continuously refine and enhance our models.” “隨著我們不斷改進和增強我們的模型，這種方法使您始終處於最前沿。”

🔗【資料來源】

文章連結：https://techcrunch.com/2025/05/22/anthropics-new-claude-4-ai-models-can-reason-over-many-steps/