每日简报

2026年4月14日 (周二)

对最重要的AI,公共市场和密码进行实际的,与源相连的综述在过去的24小时内。

TL;DR

今天的AI饲料将治理风险和计量分开:一份报告说,官员们可能正在推动银行测试Anthropic模型,而新的论文和社区项目试图使LLM评价更加现实,从能感推断基准到模型能否在真正的代码库中找到真正的缺陷. 实用信息:将模型选择视为风险决定,并将基准视为不完整,直到可以在自己的环境中复制.

01 Deep Dive

报告:官员可能鼓励银行测试Anthropic的神话模式

What Happened

TechCrunch报道称,特朗普政府官员可能鼓励银行试行名为Mythos的Anthropic模型,尽管近期政府担心Anthropic是供应链风险.

Why It Matters

如果准确,它显示AI供应商的选择可以被政策信号塑造,而不仅仅是模型质量. 对受监管的公司来说,这增加了业务风险:飞行员可以在一夜之间变得具有政治敏感性,而供应商集中化的速度比内部控制能够跟上的速度要快。

Key Takeaways

01 Model adoption in regulated industries is becoming a governance exercise (security, compliance, regulators, and public scrutiny), not a simple product decision.
02 A ‘preferred vendor’ narrative can flip quickly, so portability (prompts, evals, and audit trails) matters as much as raw capability.
03 Treat early pilots as evidence-gathering, with clear exit criteria, so you can switch providers without restarting from zero.

Practical Points

Create a portable model-evaluation packet for every AI feature: your test prompts, success metrics, red-team cases, and privacy requirements. Re-run the same packet on every candidate model and keep the artifacts ready for audit.

Sources

Trump officials may be encouraging banks to test Anthropic’s Mythos model

The report is particularly surprising since the Department of Defense recently declared Anthropic a supply-chain risk.

techcrunch.com →

02 Deep Dive

Watt伯爵为LLM推论提出了一个能感基准

What Happened

一份新的arXiv论文介绍了Watt Counts,这是一套数据集和基准,重点是衡量不同GPU设置的LLM推论的能耗。

Why It Matters

推论成本不仅仅是每个象征性的美元,只有动力和冷却限制才能限制吞吐量。如果你在规模上运行模型, 能量感知剖面可以改变哪个模型,量化, 和硬件组合是实际可行的。

Key Takeaways

01 Energy, latency, and throughput trade off differently across GPUs, so ‘fastest’ is not necessarily ‘most efficient’ for your workload.
02 Benchmarks that include energy measurements help operators avoid surprises when scaling from a demo to production.
03 Sustainable inference is increasingly a competitive lever for providers and an internal constraint for teams running on-prem or at the edge.

Practical Points

Add power and cost-per-1K-tokens to your internal eval dashboard. If you cannot measure it directly, start by comparing GPU utilization, latency percentiles, and batch size sensitivity for your real traffic.

Sources

Watt Counts: Energy-Aware Benchmark for Sustainable LLM Inference on Heterogeneous GPU Architectures

Introduces an open-access dataset of energy consumption for LLM inference across GPUs.

arxiv.org →

03 Deep Dive

N- Day- Bench 询问 LLMS 在真实代码库中是否能找到真正的弱点

What Happened

一个名为N-Day-Bench的社区项目收集现实世界的脆弱性案例,并评价LLMS是否能够在原始代码库中识别它们.

Why It Matters

安全评价常常因为任务是合成的而失败. 实事求是的bug搜索测试帮助您理解一个代理是否对分解和审查有用,或者它是否主要产生自信的噪音.

Key Takeaways

01 Real-code evaluation surfaces failure modes that toy benchmarks hide: dependency context, build systems, and ambiguous intent.
02 Vulnerability-finding is high-risk because false positives waste time and false negatives create a dangerous sense of coverage.
03 The most valuable outcome may be process improvements (better checklists and review workflows), not just model scores.

Practical Points

If you use LLMs for security review, run them in a constrained workflow: require citations to specific files and lines, force a minimal reproducer or proof sketch, and gate any automated patching behind human review.

Sources

N-Day-Bench – Can LLMs find real vulnerabilities in real codebases?

Benchmark project page.

ndaybench.winfunc.com →

更多阅读

04.

对照LLMS的卡片:设定幽默比对标准

研究人员在“反人类卡”式设置上测试前沿模型,以衡量与人类基线相比的幽默偏好。

Cards Against LLMs: Benchmarking Humor Alignment in Large Language Models →

05.

Bench复制者:评估社会和行为科学中的可复制性

在数据提供不一致的情况下,一个衡量LLM代理是否能够支持复制工作的基准.

ReplicatorBench: Benchmarking LLM Agents for Replicability in Social and Behavioral Sciences →

06.

NVIDIA PhysicsNeMo 教程:达西流,FNOs,PINNs,代建模

Colab上的PhysicsNeMo逐步走过,为物理知情的ML和基准推论构建了工作流程.

A Step-by-Step Coding Tutorial on NVIDIA PhysicsNeMo: Darcy Flow, FNOs, PINNs, Surrogate Models, and Inference Benchmarking →

关键词

#Anthropic #model governance #benchmarking #energy-aware inference #security eval

股票

股票详情 →

TL;DR

随着S&P 500在收入季节的开始,股权重新追踪了最近的地缘政治损失,但背景仍然脆弱:利率预期、能源头条和政策人员流动都能够迅速重现风险。近期的重点是指导质量,而不仅仅是上季度的指纹。

01 Deep Dive

随着收入季节的开始,S&P 500消除了最近战争引起的损失

What Happened

Bloomberg报导说, 自伊朗战争开始以来,

Why It Matters

当市场迅速回升时,它可以隐藏对头条新闻的敏感定位。收入季节随后成为波动放大器,因为宏观风险和公司指导同时受到冲击.

Key Takeaways

01 A fast rebound can reflect short covering and positioning, not necessarily a durable change in fundamentals.
02 Earnings guidance will be read through the lens of macro uncertainty, so risk language and outlook ranges matter.
03 Correlation can spike during geopolitical weeks, reducing the benefit of diversification inside equities.

Practical Points

Go into earnings with a written decision rule: what would make you add, trim, or do nothing. If the stock gaps on headline risk rather than company-specific news, avoid impulsive trades and re-check your time horizon.

Sources

S&P 500 Erases Iran War-Driven Losses as Earnings Season Begins

The S&P 500 Index rallied to erase all of its losses since the start of the Iran war as US earnings season gets underway.

bloomberg.com →

02 Deep Dive

联邦主席提名的沃什为参议院听证会扫清了障碍

What Happened

CNBC的报导Kevin Warsh提交了要求的道德操守文件,为参议院确认意见听证会迈出了一步。

Why It Matters

货币政策对人事信号的预期可以转变,特别是在市场已经对通货膨胀和供资条件敏感的情况下。即使对预期政策路径的微小改变,也可能重新定价持续时间重的资产。

Key Takeaways

01 Policy credibility and communication can move markets as much as a single data print.
02 Uncertainty about the policy path raises equity risk premia and can widen credit spreads.
03 Rate-sensitive sectors (banks, real estate, high-multiple tech) will react first to changing Fed expectations.

Practical Points

If you are exposed to rate risk, map your portfolio by duration sensitivity (who benefits from lower yields, who gets hurt). Use that map to size positions before policy events rather than reacting after the move.

Sources

Fed nominee Warsh clears a hurdle to Senate hearing

Kevin Warsh submitted required ethics paperwork to the Senate Monday.

cnbc.com →

03 Deep Dive

美联储宣布在国库券购买中出现比信号更尖锐的退缩

What Happened

彭博社报道美联储表示,每月会购买约250亿美元的T-bills,比一些预期的更大规模的倒闭.

Why It Matters

流动性条件对风险资产很重要。更快的资产负债表变动可以收紧短期供资条件,并蔓延到更广泛的风险偏好。

Key Takeaways

01 Changes in Fed purchase pace can influence front-end rates and money-market conditions.
02 When liquidity is tightening, high-volatility assets typically re-price first.
03 Market narratives can shift quickly from ‘growth’ to ‘funding’ when policy mechanics move.

Practical Points

Watch short-term funding indicators (front-end yield moves, dollar liquidity proxies) alongside earnings. If liquidity tightens, reduce leverage and avoid forcing trades into low-volume sessions.

Sources

Fed Slashes T‑Bill Purchases in Sharper Than Signaled Pullback

The Federal Reserve said it will buy about $25 billion of Treasury bills each month, a greater wind down than anticipated.

bloomberg.com →

更多阅读

04.

星期二开放前的主要收入

快速的日历式清单显著的市场前收入报告。

Here are the major earnings before the open Tuesday →

关键词

#earnings season #S&P 500 #Fed policy #geopolitics #rates

加密货币

加密货币详情 →

TL;DR

Crypto流量和监管正在做大部分的谈话:据报道,资金在数月内看到了最佳的流入周,SEC发布了新的指南,将行业玩家解读为DeFi接口的友情,一个桥梁利用叙事提醒大家,基础设施风险有多快可以造成头条驱动的波动。

01 Deep Dive

在BTC和ETH需求的推动下,加密资金在1月以来的每周流入量最高

What Happened

在Bitcoin和Ethereum产品的领导下,解密报告显示,机构秘密基金自1月份以来的流入量达到了最好的一周。

Why It Matters

持续流入可以稳定价格行动,减少反向销售,但如果定位变得拥挤,也会提高对宏观风险和政策惊喜的敏感性。

Key Takeaways

01 Flows matter: ETF and fund demand can become a dominant driver of near-term price, especially during macro headline weeks.
02 Inflow-driven rallies can reverse quickly if risk sentiment flips, so risk controls matter even when the tape looks strong.
03 ETH and BTC leadership typically indicates broader market confidence more than meme-token spikes do.

Practical Points

If you manage a crypto book, track weekly flow data alongside funding rates and open interest. When all three rise together, tighten stop rules and reduce leverage because liquidation cascades become more likely.

Sources

Surging Bitcoin, Ethereum ETF Investments Drive Crypto Funds to Best Week Since January

Institutional crypto investors posted their strongest weekly inflows since January.

decrypt.co →

02 Deep Dive

Polkadot桥的开发突出跨链基础设施的尾端风险

What Happened

解密报告一名黑客利用了波尔卡多特桥,通过埃特鲁姆桥机制铸造了大量DOT,但只意识到了少量现金退出.

Why It Matters

即使已实现的损失有限,桥梁事件也可能通过破坏信任和引发跨生态系统的保护性风险而移动市场。

Key Takeaways

01 Bridges remain a high-frequency failure point because they aggregate complexity and large TVL into single contracts.
02 Headline severity and actual economic damage can diverge, but sentiment impact can still be large.
03 Operational playbooks (pauses, monitoring, communications) are part of protocol security, not an afterthought.

Practical Points

If you use bridges operationally, diversify routes and set per-bridge exposure limits. For treasury operations, prefer slower, safer settlement paths when urgency is low.

Sources

Crypto Hacker Mints $1.1 Billion in Polkadot via Ethereum Bridge, But Can Only Cash Out $237K

A hacker exploited a Polkadot bridge, minting $1.1 billion worth of DOT tokens before selling a small fraction.

decrypt.co →

03 Deep Dive

证监会工作人员指导认为,一些DeFi接口可能避免经纪人-交易商登记

What Happened

解密报告证监会发布 DeFi 界面更宽松的政策观点业界领袖欢迎.

Why It Matters

监管清晰度(即使是狭义的)可以改变建设者的设计前端如何结束以及机构如何思考合规风险. 但`指导 ' 不同于持久的规则制定。

Key Takeaways

01 Policy signals can shift quickly, so compliance strategy should be adaptable, not pinned to one interpretation.
02 Interfaces and control surfaces matter: the line between software and intermediary behavior is where enforcement risk concentrates.
03 Markets can overreact to regulatory headlines, so separate legal durability from short-term sentiment.

Practical Points

If you operate a DeFi product, document what you control (routing, custody, fees, and execution). Use that map to identify which changes reduce intermediary-like behavior and which changes increase it.

Sources

New Pro-DeFi Policies Show the SEC Isn't Waiting for Congress to Act on Crypto

The SEC released a new, permissive policy on DeFi interfaces Monday.

decrypt.co →

更多阅读

04.

CoinDesk:证监会说,一些加密钱包交易软件不被视为经纪人

CoinDesk覆盖了SEC相关的政策观点,即某些允许钱包交易的软件不会被当作经纪人处理.

U.S. SEC says software allowing crypto wallet transactions not considered broker →

关键词

#ETF flows #SEC guidance #DeFi interfaces #bridge exploit #Bitcoin

报告:官员可能鼓励银行测试Anthropic的神话模式

Trump officials may be encouraging banks to test Anthropic’s Mythos model

Watt伯爵为LLM推论提出了一个能感基准

Watt Counts: Energy-Aware Benchmark for Sustainable LLM Inference on Heterogeneous GPU Architectures

N- Day- Bench 询问 LLMS 在真实代码库中是否能找到真正的弱点

N-Day-Bench – Can LLMs find real vulnerabilities in real codebases?

对照LLMS的卡片:设定幽默比对标准

Bench复制者:评估社会和行为科学中的可复制性

NVIDIA PhysicsNeMo 教程:达西流,FNOs,PINNs,代建模

随着收入季节的开始,S&P 500消除了最近战争引起的损失

S&P 500 Erases Iran War-Driven Losses as Earnings Season Begins

联邦主席提名的沃什为参议院听证会扫清了障碍

Fed nominee Warsh clears a hurdle to Senate hearing

美联储宣布在国库券购买中 出现比信号更尖锐的退缩

Fed Slashes T‑Bill Purchases in Sharper Than Signaled Pullback

星期二开放前的主要收入

在BTC和ETH需求的推动下,加密资金在1月以来的每周流入量最高

Surging Bitcoin, Ethereum ETF Investments Drive Crypto Funds to Best Week Since January

Polkadot桥的开发突出跨链基础设施的尾端风险

Crypto Hacker Mints $1.1 Billion in Polkadot via Ethereum Bridge, But Can Only Cash Out $237K

证监会工作人员指导认为,一些DeFi接口可能避免经纪人-交易商登记

New Pro-DeFi Policies Show the SEC Isn't Waiting for Congress to Act on Crypto

CoinDesk:证监会说,一些加密钱包交易软件不被视为经纪人

美联储宣布在国库券购买中出现比信号更尖锐的退缩