Amazon Bedrock Introduces Programmatic Tool Calling and Custom Evaluators
In brief
- Amazon Bedrock has unveiled a new approach to how large language models (LLMs) interact with external tools.
- Programmatic tool calling (PTC) allows models to write code, typically in Python, that can invoke multiple tools programmatically within a sandboxed environment.
- This method reduces latency and token usage by keeping intermediate results outside the model's context window.
- For instance, tasks like analyzing thousands of expense records become more efficient as only the final result returns to the model.
- Additionally, Bedrock now offers custom code-based evaluators through AWS Lambda functions.
- These evaluators enable precise checks for structured data, such as JSON outputs from APIs or financial metrics.
- By using deterministic code, users can validate tool responses without relying on costly LLM evaluations.
- This feature is particularly useful in domains like finance, where accuracy and compliance are critical.
- These advancements aim to streamline development workflows and improve the reliability of agentic applications.
- As more tools become available, developers can expect even greater efficiency and control over their AI-driven systems.
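The code-based evaluator idea can be sketched as a small AWS Lambda handler in Python. This is a minimal illustration under stated assumptions, not Bedrock's actual evaluator contract: the event key `output`, the returned `passed`/`score`/`reason` fields, and the `REQUIRED_FIELDS` schema are all hypothetical. Only the `lambda_handler(event, context)` entry-point convention is standard Lambda.

```python
import json
from typing import Any

# Hypothetical schema for a financial-metrics payload (an assumption for this sketch).
REQUIRED_FIELDS = {"ticker": str, "revenue": (int, float), "currency": str}


def evaluate_output(raw_output: str) -> dict[str, Any]:
    """Deterministically check that a tool response is well-formed, well-typed JSON."""
    try:
        payload = json.loads(raw_output)
    except json.JSONDecodeError as exc:
        return {"passed": False, "score": 0.0, "reason": f"invalid JSON: {exc.msg}"}
    if not isinstance(payload, dict):
        return {"passed": False, "score": 0.0, "reason": "top-level value is not an object"}

    problems = []
    for field, expected_type in REQUIRED_FIELDS.items():
        if field not in payload:
            problems.append(f"missing field '{field}'")
        # bool is a subclass of int in Python, so reject it explicitly.
        elif isinstance(payload[field], bool) or not isinstance(payload[field], expected_type):
            problems.append(f"field '{field}' has the wrong type")

    # Only run the range check once the structure is known to be valid.
    if not problems and payload["revenue"] < 0:
        problems.append("revenue must be non-negative")

    return {
        "passed": not problems,
        "score": 0.0 if problems else 1.0,
        "reason": "; ".join(problems) or "ok",
    }


def lambda_handler(event, context):
    # Assumed event shape: {"output": "<JSON string produced by the tool or model>"}
    return evaluate_output(event.get("output", ""))
```

Because every check is plain code, the same input always yields the same verdict, and no second model call is needed to grade the response.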
Terms in this brief
- Programmatic Tool Calling
- A method in which a large language model writes code that invokes tools programmatically inside a sandbox, so intermediate results stay outside the model's context window. This reduces latency and token usage, making tasks like analyzing expense records more efficient.
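To make the definition concrete, here is a rough sketch of the kind of script a model might write for the expense-record case. The tool functions `list_employees` and `get_expenses` are hypothetical in-memory stubs, not a real Bedrock API; in practice the sandbox would expose whatever tools the developer registers. The point is that thousands of records are processed in code and only a small summary is returned.

```python
import random
from collections import defaultdict


# Hypothetical stand-ins for tools exposed inside the sandbox.
def list_employees() -> list[str]:
    return [f"emp-{i:03d}" for i in range(50)]


def get_expenses(employee_id: str) -> list[dict]:
    rng = random.Random(employee_id)  # seeded so the stub is reproducible
    return [
        {
            "employee": employee_id,
            "category": rng.choice(["travel", "meals", "software"]),
            "amount": round(rng.uniform(5, 500), 2),
        }
        for _ in range(100)
    ]


def summarize_expenses(limit: float) -> dict:
    """Call the tools in a loop and aggregate; raw records never leave this function."""
    totals: dict[str, float] = defaultdict(float)
    records_seen = 0
    for employee_id in list_employees():
        for record in get_expenses(employee_id):
            totals[employee_id] += record["amount"]
            records_seen += 1
    over_limit = sorted(emp for emp, total in totals.items() if total > limit)
    # Only this compact summary would be handed back to the model.
    return {"records_processed": records_seen, "over_limit": over_limit}


if __name__ == "__main__":
    print(summarize_expenses(limit=27_000.0))
```

With conventional tool calling, each of the 50 `get_expenses` responses would pass through the model's context; here the model sees one short dictionary.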
Read full story at AWS ML Blog →
More briefs
OpenAI Files for Public Stock Offering
OpenAI has filed to offer its stock on public markets, just a week after its rival Anthropic did the same. The move is notable because OpenAI is valued at $852 billion and its ChatGPT app has hundreds of millions of downloads. The company plans to use the public offering to raise more funds for its AI models and data centers.
NVIDIA Enhances AI Collaboration Through Federated Learning Breakthroughs
NVIDIA has introduced a new approach in federated learning, enabling AI models to collaborate more effectively without sharing sensitive data. This advancement allows institutions to work together on tasks like genomics research, improving patient outcomes while maintaining privacy. By using innovative aggregation techniques, the method addresses challenges that previously hindered progress in fields such as healthcare and finance. This development is significant because it balances the need for collaborative AI advancements with strict data security requirements. Federated learning typically struggles with varying participant resources and communication efficiency, but NVIDIA's solution streamlines these processes, making large-scale collaborations more feasible. For instance, researchers can now train models on decentralized data from multiple sources without compromising performance or privacy. Looking ahead, this breakthrough could pave the way for more secure and efficient AI research across industries. Developers and researchers should watch for further refinements in how these techniques are applied to real-world problems, potentially leading to faster discoveries and better decision-making tools.
AI Showcases Strong Potential for Automating Data Extraction from Dutch Neuroradiology Reports
AI has demonstrated an impressive ability to extract data from complex medical reports, a result that could change how radiologists handle their work. In a recent study, researchers tested the LLaMA 3.1 model on over 947 brain MRI reports in Dutch, focusing on variables like atrophy and microbleeds. The AI achieved high accuracy for categorical data (96% for medial temporal atrophy on the right and 87% for global cortical atrophy) while showing room for improvement with numerical data. The study highlights how few-shot prompting can enhance AI performance, boosting its ability to handle numbers by nearly 12 percentage points. This suggests that with the right strategies, AI could significantly reduce the time doctors spend on repetitive tasks like data extraction. However, challenges remain, particularly in accurately identifying specific lesion locations. Looking ahead, researchers will likely focus on refining these techniques to address the remaining gaps. The potential for AI to automate data extraction from medical reports is substantial, and the study offers a clearer picture of how these technologies can support healthcare professionals in the future.
OpenAI Prepares Biggest Update to ChatGPT
OpenAI is preparing a major update to ChatGPT. The update will transform the chatbot into a super app. It will combine coding tools and AI agents. The update is meant to push ChatGPT beyond conversation and into autonomous action. OpenAI's coding product Codex will get more resources. The company believes task-performing agents are more valuable than chatbots. The AI coding market is projected to reach $30 billion by 2031. OpenAI faces pressure to drive revenues higher and forge a path to profitability. The update will roll out in coming weeks. It will change ChatGPT's website and mobile apps to encourage users to try coding and other tools.
AI Agents Reject Traditional Sales Tools
AI agents prefer not to use traditional sales tools like Marketo, Outreach, and Salesloft. These tools are designed for humans to manage large volumes of emails and sales calls. AI agents can handle these tasks on their own and generate personalized emails and follow-ups in real time. By 2026, agents may handle 30% of these workflows, making traditional sales tools less necessary and changing the way businesses operate.