OpenAI Agent launch quick notes: tools like Manus get outclassed

Posted by ebE3N on 2025-7-18 06:34:13
2025-07-18 | Shanghai | Cloudy
The long-awaited OpenAI Agent launch event has finally happened, and this time Sam presented it in person.

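OpenAI Agent (code name: Odyssey) is a super-agent that integrates capabilities and tools such as DeepResearch, Operator, Browser Use, and TUI into one system.
The good news is that those of us on the budget Plus plan can use it too, at 40 runs per month; Pro users get 400 runs per month.
[Image: w2.jpg]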

It can browse the web for you, call command-line tools, generate a first-draft PPT on its own, then review it and revise it into a polished deck. In the third image below, the slide on the right is a chart the Agent generated itself, comparing itself with other models.
[Image: w3.jpg]

[Image: w4.jpg]

[Image: w5.jpg]
It can be used directly on mobile.
[Image: w6.jpg]
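Below is a series of benchmark results. Nearly all of them are SOTA, and all show solid gains over the previous generation. Scale's HLE (Humanity's Last Exam) has already been pushed to 41.6%, so it looks likely to be saturated next year.
[Image: w7.jpg]
FrontierMath
[Image: w8.jpg]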

Data analysis and data modeling
[Image: w9.jpg]
Excel generation test
[Image: w10.jpg]
Investment banking analyst modeling evaluation
[Image: w11.jpg]

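On WebArena, the web-browsing benchmark, it is already close to human level.

The Agent's entire execution can be replayed, and its actions are monitored while it runs so that malicious websites cannot steal your information. Looking at how complete the whole thing is, I have to say OpenAI's product taste is on point. It also further confirms the idea that the model is the product. The results of generating PPT and Excel files directly are excellent, and a whole class of Agent companies, Manus included, can basically call it quits. Top-tier model companies do not take companies that build Agent applications seriously at all; on capability, they crush them completely.
Below is Sam's official product announcement after the livestream.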

Today we launched a new product called ChatGPT Agent.

Agent represents a new level of capability for AI systems and can accomplish some remarkable, complex tasks for you using its own computer. It combines the spirit of Deep Research and Operator, but is more powerful than that may sound: it can think for a long time, use some tools, think some more, take some actions, think some more, etc. For example, we showed a demo in our launch of preparing for a friend’s wedding: buying an outfit, booking travel, choosing a gift, etc. We also showed an example of analyzing data and creating a presentation for work.

Although the utility is significant, so are the potential risks.

We have built a lot of safeguards and warnings into it, and broader mitigations than we’ve ever developed before from robust training to system safeguards to user controls, but we can’t anticipate everything. In the spirit of iterative deployment, we are going to warn users heavily and give users freedom to take actions carefully if they want to.

I would explain this to my own family as cutting edge and experimental; a chance to try the future, but not something I’d yet use for high-stakes uses or with a lot of personal information until we have a chance to study and improve it in the wild.

We don’t know exactly what the impacts are going to be, but bad actors may try to “trick” users’ AI agents into giving private information they shouldn’t and take actions they shouldn’t, in ways we can’t predict. We recommend giving agents the minimum access required to complete a task to reduce privacy and security risks.

For example, I can give Agent access to my calendar to find a time that works for a group dinner. But I don’t need to give it any access if I’m just asking it to buy me some clothes.

There is more risk in tasks like “Look at my emails that came in overnight and do whatever you need to do to address them, don’t ask any follow up questions”. This could lead to untrusted content from a malicious email tricking the model into leaking your data.

We think it’s important to begin learning from contact with reality, and that people adopt these tools carefully and slowly as we better quantify and mitigate the potential risks involved. As with other new levels of capability, society, the technology, and the risk mitigation strategy will need to co-evolve.