Multi-Agentic RAG with Hugging Face Code Agents | by Gabriele Sgroi, PhD | Dec, 2024


Using Qwen2.5-7B-Instruct powered code agents to create a local, open source, multi-agentic RAG system

Towards Data Science
Photo by Jaredd Craig on Unsplash

Large Language Models have shown impressive capabilities and they are still undergoing steady improvements with each new generation of models released. Applications such as chatbots and summarisation can directly exploit the language proficiency of LLMs as they are only required to produce textual outputs, which is their natural setting. Large Language Models have also shown impressive abilities to understand and solve complex tasks, but as long as their solution stays “on paper”, i.e. in pure textual form, they need an external user to act on their behalf and report back the results of the proposed actions. Agent systems solve this problem by letting the models act on their environment, usually via a set of tools that can perform specific operations. In this way, an LLM can find solutions iteratively by trial and error while interacting with the environment.

An interesting situation is when the tools that an LLM agent has access to are agents themselves: this is the core concept of multi-agentic systems. A multi-agentic system solves tasks by distributing and delegating duties to specialized models and putting their output together like puzzle pieces. A common way to implement such systems is by using a manager agent to orchestrate and coordinate other agents’ workflow.

Agentic systems, and in particular multi-agentic systems, require a powerful LLM as a backbone to perform properly, as the underlying model needs to be able to understand the purpose and applicability of the various tools as well as break up the original problem into sub-problems that can be tackled by each tool. For this reason, proprietary models like ChatGPT or Anthropic’s Claude are generally the default go-to solution for agentic systems. Fortunately, open-source LLMs have continued to see huge improvements in performance, so much so that some of them now rival proprietary models in some instances. Even more interestingly, modestly-sized open LLMs can now perform complex tasks that were unthinkable a couple of years ago.

In this blog post, I will show how a “small” LLM that can run on consumer hardware is capable enough to power a multi-agentic system with good results. In particular, I will give a tutorial on how you can use Qwen2.5-7B-Instruct to create a multi-agentic RAG system. You can find the code implementation in the following GitHub repo and an illustrative Colab notebook.

Before diving into the details of the system architecture, I will recall some basic notions regarding LLM agents that will be useful to better understand the framework.

ReAct, proposed in ReAct: Synergizing Reasoning and Acting in Language Models, is a popular framework for building LLM agents. The main idea of the method is to incorporate the effectiveness of Chain of Thought prompting into an agent framework. ReAct consists of interleaved reasoning and action steps: the Large Language Model is prompted to provide a thought sequence before emitting an action. In this way the model can create dynamic reasoning traces to steer actions and update the high-level plan while incorporating information coming from the interaction with the environment. This allows for an iterative and incremental approach to solving the given task. In practice, the workflow of a ReAct agent is made up of Thought, Action, and Observation sequences: the model produces reasoning for a general plan and specific tool usage in the Thought step, then invokes the relevant tool in the Action step, and finally receives feedback from the environment in the Observation step.

Below is an example of what the ReAct framework looks like.

Comparison between the ReAct, Chain-of-Thought, and Act-Only frameworks for a Question Answering task. Image from ReAct: Synergizing Reasoning and Acting in Language Models.
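To make the Thought, Action, Observation cycle more concrete, here is a minimal sketch of a ReAct-style loop. It is only an illustration: `scripted_model` is a stand-in for the LLM, and the `lookup` tool, its tiny knowledge base, and the `Action: tool[argument]` syntax are assumptions chosen for this example.

```python
import re

# Hypothetical knowledge base standing in for the environment.
FACTS = {"capital of Italy": "Rome", "foundation of Rome": "753 BC"}

def lookup(query: str) -> str:
    """Toy tool: return a stored fact or a not-found message."""
    return FACTS.get(query, "No result found.")

TOOLS = {"lookup": lookup}

def scripted_model(history: list) -> str:
    """Stand-in for an LLM: chooses the next step from the number of observations seen."""
    n_obs = sum(line.startswith("Observation:") for line in history)
    script = [
        "Thought: I first need the capital of Italy.\nAction: lookup[capital of Italy]",
        "Thought: The capital is Rome, now I need its foundation date.\nAction: lookup[foundation of Rome]",
        "Thought: I have everything I need.\nAction: finish[753 BC]",
    ]
    return script[n_obs]

def react_loop(task: str, model, max_steps: int = 5) -> str:
    """Alternate model steps (Thought + Action) with tool calls (Observation)."""
    history = [f"Task: {task}"]
    for _ in range(max_steps):
        step = model(history)
        history.append(step)
        match = re.search(r"Action: (\w+)\[(.*)\]", step)
        if match is None:
            history.append("Observation: could not parse the action.")
            continue
        name, argument = match.groups()
        if name == "finish":
            return argument
        tool = TOOLS.get(name)
        observation = tool(argument) if tool else f"Unknown tool {name}."
        history.append(f"Observation: {observation}")
    return "No answer found within the step budget."

answer = react_loop("When was the capital of Italy founded?", scripted_model)
print(answer)
```

Each observation is appended to the history, so the next reasoning step can build on it. This is what lets the agent revise its plan as information comes in.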

Code agents are a particular type of LLM agent that uses executable Python code to interact with the environment. They are based on the CodeAct framework proposed in the paper Executable Code Actions Elicit Better LLM Agents. CodeAct is very similar to the ReAct framework, with the difference that each action consists of arbitrary executable code that can perform multiple operations. Hand-crafted tools are provided to the agent as regular Python functions that it can call in the code.

Code agents come with a unique set of advantages over more traditional agents using JSON or other text formats to perform actions:

  • They can leverage existing software packages in combination with hand-crafted task-specific tools.
  • They can self-debug the generated code by using the error messages returned after an error is raised.
  • LLMs are familiar with writing code as it is generally widely present in their pre-training data, making it a more natural format to write their actions.
  • Code naturally allows for the storage of intermediate results and the composition of multiple operations in a single action, while JSON or other text formats may need multiple actions to accomplish the same.

For these reasons, Code Agents can offer improved performance and faster execution speed than agents using JSON or other text formats to execute actions.

Comparison between code agents and agents using JSON or text as actions. Image from Executable Code Actions Elicit Better LLM Agents.

Below is a concrete example from the original paper that showcases how code agents can require fewer actions to solve certain tasks.

Code agents vs agents using JSON/text action format. Code agents can execute multiple operations in one action. Image from Executable Code Actions Elicit Better LLM Agents.
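The difference in the number of actions can be illustrated with a small sketch. The `get_price` tool and the two action runners below are hypothetical and only meant to contrast the two formats: a JSON action encodes a single tool call, while a code action can loop over several calls and aggregate the result in one step.

```python
import contextlib
import io
import json

# Hypothetical tool and data, used only for illustration.
PRICES = {"apple": 1.5, "pear": 2.0, "kiwi": 0.5}

def get_price(item: str) -> float:
    return PRICES[item]

TOOLS = {"get_price": get_price}

def run_json_action(action: str):
    """A JSON action names exactly one tool call."""
    call = json.loads(action)
    return TOOLS[call["tool"]](**call["arguments"])

def run_code_action(code: str) -> str:
    """A code action may call tools many times; whatever it prints becomes the observation."""
    buffer = io.StringIO()
    with contextlib.redirect_stdout(buffer):
        # Unrestricted exec is acceptable for a demo but unsafe in general.
        exec(code, {"get_price": get_price})
    return buffer.getvalue().strip()

# JSON format: one action per tool call, and the sum needs yet another step.
json_actions = [
    json.dumps({"tool": "get_price", "arguments": {"item": name}}) for name in PRICES
]
json_total = sum(run_json_action(action) for action in json_actions)

# Code format: a single action loops over the items and aggregates the result.
code_action = """
total = sum(get_price(item) for item in ["apple", "pear", "kiwi"])
print(total)
"""
code_observation = run_code_action(code_action)

print("JSON actions needed:", len(json_actions), "total:", json_total)
print("Code actions needed: 1, observation:", code_observation)
```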

The Hugging Face transformers library provides useful modules to build agents and, in particular, code agents. The Hugging Face transformer agents framework focuses on clarity and modularity as core design principles. These are particularly important when building an agent system: the complexity of the workflow makes it essential to have control over all the interconnected parts of the architecture. These design choices make Hugging Face agents a great tool for building custom and flexible agent systems. When using open-source models to power the agent engine, the Hugging Face agents framework has the further advantage of allowing easy access to the models and utilities present in the Hugging Face ecosystem.

Hugging Face code agents also tackle the issue of insecure code execution. In fact, letting an LLM generate code unrestrained can pose serious risks as it could perform undesired actions. For example, a hallucination could cause the agent to erase important files. In order to mitigate this risk, the Hugging Face code agents implementation uses a ground-up approach to secure code execution: the code interpreter can only execute explicitly authorized operations. This is in contrast to the usual top-down paradigm that starts with a fully functional Python interpreter and then forbids actions that may be dangerous. The Hugging Face implementation includes a list of safe, authorized functions that can be executed and provides a list of safe modules that can be imported. Anything else is not executable unless it has been preemptively authorized by the user. You can read more about Hugging Face (code) agents in their blog posts.
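The ground-up idea can be sketched with Python's standard `ast` module. This is not the Hugging Face interpreter, only a toy evaluator written for this example under the same principle: the syntax tree is walked node by node, and anything that is not on an explicit allow-list is rejected. The names `safe_eval`, `ALLOWED_FUNCTIONS`, and `UnauthorizedOperation` are hypothetical.

```python
import ast
import operator

# Explicitly authorized operations: anything not listed here is rejected.
ALLOWED_BINOPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}
ALLOWED_FUNCTIONS = {"abs": abs, "max": max, "min": min, "round": round}

class UnauthorizedOperation(Exception):
    """Raised when the code uses a construct that was not authorized."""

def safe_eval(source: str, variables=None):
    """Evaluate an expression by walking its syntax tree, allowing only listed nodes."""
    variables = variables or {}

    def visit(node):
        if isinstance(node, ast.Expression):
            return visit(node.body)
        if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
            return node.value
        if isinstance(node, ast.Name) and node.id in variables:
            return variables[node.id]
        if isinstance(node, ast.BinOp) and type(node.op) in ALLOWED_BINOPS:
            return ALLOWED_BINOPS[type(node.op)](visit(node.left), visit(node.right))
        if (isinstance(node, ast.Call) and isinstance(node.func, ast.Name)
                and node.func.id in ALLOWED_FUNCTIONS and not node.keywords):
            return ALLOWED_FUNCTIONS[node.func.id](*[visit(arg) for arg in node.args])
        raise UnauthorizedOperation(f"{type(node).__name__} is not authorized")

    return visit(ast.parse(source, mode="eval"))

print(safe_eval("max(x, 2) * 3 + 1", {"x": 5}))
try:
    safe_eval("__import__('os').remove('important_file')")
except UnauthorizedOperation as error:
    print("Blocked:", error)
```

Because the evaluator starts from nothing and only adds what is listed, forgetting to authorize an operation makes the agent less capable, not less safe. A deny-list fails in the opposite direction.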

Retrieval Augmented Generation has become the de facto standard for information retrieval tasks involving Large Language Models. It can help keep the LLM information up to date, give access to specific information, and reduce hallucinations. It can also enhance human interpretability and supervision by returning the sources the model used to generate its answer. The usual RAG workflow, consisting of a retrieval process based on semantic similarity to a user’s query and a model’s context enhancement with the retrieved information, is not adequate to solve some specific tasks. Some situations that are not suited for traditional RAG include tasks that need interactions with the information sources, queries needing multiple pieces of information to be answered, and complex queries requiring non-trivial manipulation to be connected with the actual information contained in the sources.

A concrete challenging example for traditional RAG systems is multi-hop question answering (MHQA). It involves extracting and combining multiple pieces of information, possibly requiring several iterative reasoning processes over the extracted information and what is still missing. For instance, if the model has been asked the question “Does birch plywood float in ethanol?”, even if the sources used for RAG contained information about the density of both materials, the standard RAG framework could fail if these two pieces of information are not directly linked.
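The plywood example can be made concrete with a toy retriever. In the sketch below, the passages, the word-overlap retriever, and the density figures (approximate values used only for illustration) are all assumptions. A single retrieval for the original question returns only one of the two facts, while issuing one sub-query per missing fact recovers both and allows the comparison.

```python
import re

# Toy document store: each passage holds one isolated fact (approximate, illustrative values).
PASSAGES = [
    "Birch plywood has a density of about 680 kg/m3.",
    "Ethanol has a density of about 789 kg/m3 at 20 °C.",
    "Plywood is made of thin layers of wood veneer glued together.",
]

def retrieve(query: str) -> str:
    """Toy retriever: return the passage sharing the most words with the query."""
    query_words = set(query.lower().split())
    return max(PASSAGES, key=lambda passage: len(query_words & set(passage.lower().split())))

def extract_density(passage: str) -> float:
    """Read the number that precedes the unit 'kg/m3'."""
    return float(re.search(r"(\d+(?:\.\d+)?) kg/m3", passage).group(1))

# Single-hop retrieval: the top passage contains only one of the two required facts.
single_hop = retrieve("Does birch plywood float in ethanol?")

# Multi-hop retrieval: one sub-query per missing piece of information.
sub_queries = {"solid": "birch plywood density", "liquid": "ethanol density"}
densities = {role: extract_density(retrieve(query)) for role, query in sub_queries.items()}
floats = densities["solid"] < densities["liquid"]

print("Single-hop passage:", single_hop)
print("Densities:", densities, "-> floats:", floats)
```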

A popular way to enhance RAG to avoid the abovementioned shortcomings is to use agentic systems. An LLM agent can break down the original query into a series of sub-queries and then use semantic search as a tool to retrieve passages for these generated sub-queries, changing and adjusting its plan as more information is collected. It can autonomously decide whether it has collected enough information to answer each query or if it should continue the search. The agentic RAG framework can be further enhanced by extending it to a multi-agentic system in which each agent has its own defined tasks and duties. This allows, for example, the separation between the high-level task planning and the interaction with the document sources. In the next section, I will describe a practical implementation of such a system.

In this section, I will discuss the general architectural choices I used to implement a Multi-Agentic RAG system based on code agents following the ReAct framework. You can find the remaining details in the full code implementation in the following GitHub repo.

The goal of the multi-agentic system is to answer a question by searching for the necessary information on Wikipedia. It is made up of three agents:

  • A manager agent whose job is to break down the task into sub-tasks and use their output to provide a final answer.
  • A Wikipedia search agent that finds relevant pages on Wikipedia and combines the information extracted from them.
  • A page search agent to retrieve and summarize information relevant to a given query from the provided Wikipedia page.

These three agents are organized in a hierarchical fashion: each agent can use the agent immediately below in the hierarchy as a tool. In particular, the manager agent can call the Wikipedia search agent to find information about a query which, in turn, can use the page search agent to extract particular information from Wikipedia pages.

Below is the diagram of the architecture, specifying which hand-crafted tools (including tools wrapping other agents) each agent can call. Notice that since code agents act using code execution, these are not actually the only tools they can use as any native Python operation and function (as long as it is authorized) can be used as well.

Architecture diagram showing agents and hand-crafted tools. Image by the author.
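The hierarchy can be sketched in plain Python, without any agent library. In this illustration, `StubAgent` and the fixed policies are hypothetical stand-ins for LLM-driven agents. The point is only the wiring: each agent is exposed to the level above as an ordinary callable tool.

```python
from typing import Callable, Dict, List

trace: List[str] = []  # records which agent ran, in call order

class StubAgent:
    """Stand-in for an LLM agent: a name, a set of tools, and a fixed policy function."""

    def __init__(self, name: str, policy: Callable[[str, Dict], str], tools: Dict = None):
        self.name = name
        self.policy = policy
        self.tools = tools or {}

    def run(self, task: str) -> str:
        trace.append(self.name)
        return self.policy(task, self.tools)

def page_policy(task: str, tools: Dict) -> str:
    passages = tools["retrieve_passages"](task)
    return f"summary of {len(passages)} passage(s)"

def wiki_policy(task: str, tools: Dict) -> str:
    page = tools["search_wikipedia"](task)[0]
    return f"From page '{page}': " + tools["search_info"](f"{task} | page: {page}")

def manager_policy(task: str, tools: Dict) -> str:
    return "Answer based on -> " + tools["wikipedia_search_agent"](task)

# Bottom-up wiring: each agent's `run` method becomes a tool of the agent above it.
page_agent = StubAgent("page_search_agent", page_policy,
                       {"retrieve_passages": lambda query: [query]})
wiki_agent = StubAgent("wikipedia_search_agent", wiki_policy,
                       {"search_wikipedia": lambda query: ["Rome"],
                        "search_info": page_agent.run})
manager = StubAgent("manager", manager_policy,
                    {"wikipedia_search_agent": wiki_agent.run})

result = manager.run("Rome foundation date")
print(result)
print(" -> ".join(trace))
```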

Let’s dive into the details of the workings of the agents involved in the architecture.

Manager agent

This is the top-level agent: it receives the user’s question and is tasked with returning an answer. It can use the Wikipedia search agent as a tool by prompting it with a query and receiving the final results of the search. Its purpose is to collect the necessary pieces of information from Wikipedia by dividing the user question into a series of sub-queries and putting together the results of the searches.

Below is the system prompt used for this agent. It is built upon the default Hugging Face prompt template. Notice that the examples provided in the prompt follow the chat template of the model powering the agent, in this case, Qwen2.5-7B-Instruct.

You are an expert assistant who can find answers on the internet using code blobs and tools. To do so, you have been given access to a list of tools: these tools are basically Python functions which you can call with code.
You will be given the task of answering a user question and you should answer it by retrieving the necessary information from Wikipedia. Use and trust only the information you retrieved, don't make up false facts.
To help you, you have been given access to a search agent you can use as a tool. You can use the search agent to find information on Wikipedia. Break down the task into smaller sub-tasks and use the search agent to find the necessary information for each sub-task.
To solve the task, you must plan forward to proceed in a series of steps, in a cycle of 'Thought:', 'Code:', and 'Observation:' sequences.
At each step, in the 'Thought:' sequence, you should first explain your reasoning towards solving the task and the tools that you want to use.
Then in the 'Code:' sequence, you should write the code in simple Python. The code sequence must end with '<end_action>' sequence.
During each intermediate step, you can use 'print()' to save whatever important information you will then need. These print outputs will be provided back to you by the user in the 'Observation:' field, which will be available as input for the next steps. Always print the output of tools, don't process it or try to extract information before inspecting it.
If an error arises while executing the code, it will be shown in the 'Observation:' field. In that case, fix the code and try again.

In the end you have to return a final answer using the `final_answer` tool.

Here are a few notional examples:
---
<|im_start|>user
Task: When was the capital of Italy founded?<|im_end|>
<|im_start|>assistant
Thought: Let's break up the task: I first need to find the capital of Italy and then look at its foundation date. I will use the tool `wikipedia_search_agent` to get the capital of Italy.
Code:
```py
result = wikipedia_search_agent("Italy capital")
print("Capital of Italy:", result)
```<end_action><|im_end|>
<|im_start|>user
[OUTPUT OF STEP 0] -> Observation:
Capital of Italy: According to the information extracted from the Wikipedia page 'Rome', the capital of Italy is Rome.<|im_end|>
<|im_start|>assistant
Thought: Now that I know that the capital of Italy is Rome, I can use the `wikipedia_search_agent` tool to look for its foundation date.
Code:
```py
result = wikipedia_search_agent("Rome foundation date")
print("Rome foundation:", result)
```<end_action><|im_end|>
<|im_start|>user
[OUTPUT OF STEP 1] -> Observation:
Rome foundation: According to the information from the Wikipedia page 'Natale di Roma', the traditional foundation date of Rome is April 21, 753 BC.<|im_end|>
<|im_start|>assistant
Thought: Now that I have retrieved the relevant information, I can use the `final_answer` tool to return the answer.
Code:
```py
final_answer("According to the legend Rome was founded on 21 April 753 BCE, but archaeological evidence dates back its development during the Bronze Age.")
```<end_action><|im_end|>
---
<|im_start|>user
Task: "What's the difference in population between Shanghai and New York?"<|im_end|>
<|im_start|>assistant
Thought: I need to get the populations for both cities and compare them: I will use the tool `wikipedia_search_agent` to get the population of both cities.
Code:
```py
population_new_york_info = wikipedia_search_agent("New York City population")
population_shanghai_info = wikipedia_search_agent("Shanghai population")
print("Population New York:", population_new_york_info)
print("Population Shanghai:", population_shanghai_info)
```<end_action><|im_end|>
<|im_start|>user
[OUTPUT OF STEP 0] -> Observation:
Population New York: The population of New York City is approximately 8,258,035 as of 2023.
Population Shanghai: According to the information extracted from the Wikipedia page 'Shanghai', the population of the city proper is around 24.87 million inhabitants in 2023.<|im_end|>
<|im_start|>assistant
Thought: Now I know both the population of Shanghai (24.87 million) and of New York City (8.25 million), I will calculate the difference and return the result.
Code:
```py
population_difference = 24.87*1e6 - 8.25*1e6
answer = f"The difference in population between Shanghai and New York is {population_difference} inhabitants."
final_answer(answer)
```<end_action><|im_end|>
---

On top of performing computations in the Python code snippets that you create, you have access to those tools (and no other tool):

<<tool_descriptions>>

<<managed_agents_descriptions>>

You can use imports in your code, but exclusively from the following list of modules: <<authorized_imports>>. Do not try to import other modules or else you will get an error.
Now start and solve the task!
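The few-shot examples in the prompt above are written in the ChatML layout used by Qwen2.5 chat models, where each turn is wrapped in `<|im_start|>` and `<|im_end|>` markers. As a rough sketch of how such examples could be assembled programmatically (the helper names `format_step` and `to_chatml` are hypothetical and not part of any library):

```python
# The code fence is built programmatically so that this snippet stays easy to embed.
FENCE = "`" * 3

def format_step(thought: str, code: str) -> str:
    """Assistant turn in the Thought/Code layout used by the prompts."""
    return f"Thought: {thought}\nCode:\n{FENCE}py\n{code}\n{FENCE}<end_action>"

def to_chatml(turns: list) -> str:
    """Wrap (role, content) pairs in ChatML markers."""
    return "\n".join(f"<|im_start|>{role}\n{content}<|im_end|>" for role, content in turns)

example = to_chatml([
    ("user", "Task: When was the capital of Italy founded?"),
    ("assistant", format_step(
        "I first need to find the capital of Italy.",
        'result = wikipedia_search_agent("Italy capital")\nprint("Capital of Italy:", result)',
    )),
    ("user", "[OUTPUT OF STEP 0] -> Observation:\nCapital of Italy: Rome"),
])
print(example)
```

Writing the examples in the same layout the model sees during a run keeps the demonstrations and the actual interaction as similar as possible.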

Wikipedia search agent

This agent reports to the manager agent: it receives a query from it and is tasked with returning the information it has retrieved from Wikipedia. It can access two tools:

  • A Wikipedia search tool, using the built-in search function from the wikipedia package. It receives a query as input and returns a list of Wikipedia pages and their summaries.
  • A page search agent that retrieves information about a query from a specific Wikipedia page.

This agent collects the information to answer the query, dividing it into further sub-queries, and combining information from multiple pages if needed. This is accomplished by using the search tool of the wikipedia package to identify potential pages that can contain the necessary information to answer the query: the agent can either use the reported page summaries or call the page search agent to extract more information from a specific page. After enough data has been collected, it returns an answer to the manager agent.
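A search tool of this kind can be sketched as a small formatting function. The version below is an assumption about how such a tool might look, not the repository's code. The search and summary backends are passed in as arguments so that the example runs offline with stubs; with the wikipedia package installed, `wikipedia.search` and `wikipedia.summary` could be passed instead.

```python
from typing import Callable, List

def search_wikipedia(
    query: str,
    search_fn: Callable[[str], List[str]],
    summary_fn: Callable[[str], str],
    max_results: int = 5,
    max_chars: int = 300,
) -> str:
    """Return page titles and truncated summaries for a query as a single text block."""
    lines = [f"Pages found for query '{query}':"]
    for title in search_fn(query)[:max_results]:
        try:
            summary = summary_fn(title)
        except Exception:
            # Pages that cannot be loaded (e.g. ambiguous or missing titles) are skipped.
            continue
        lines.append(f"Page: {title}")
        lines.append(f"Summary: {summary[:max_chars]}")
    return "\n".join(lines)

# Offline stubs standing in for the real search backend.
STUB_PAGES = {
    "Seneca the Younger": "Lucius Annaeus Seneca the Younger was a Stoic philosopher of Ancient Rome.",
    "Seneca the Elder": "Lucius Annaeus Seneca the Elder was a Roman writer.",
}

def stub_search(query: str) -> List[str]:
    return list(STUB_PAGES) + ["Missing page"]

def stub_summary(title: str) -> str:
    return STUB_PAGES[title]  # raises KeyError for unknown titles

output = search_wikipedia("Seneca philosopher birth", stub_search, stub_summary, max_chars=40)
print(output)
```

Truncating the summaries keeps the observation short, which matters because every observation is fed back into the model's context.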

The system prompt is again a slight modification of the Hugging Face default prompt with some specific examples following the model’s chat template.

You are an expert assistant that retrieves information from Wikipedia using code blobs and tools. To do so, you have been given access to a list of tools: these tools are basically Python functions which you can call with code.
You will be given a general query: your task is to retrieve and summarise information that is relevant to the query from multiple passages retrieved from the given Wikipedia page. Use and trust only the information you retrieved, don't make up false facts. Try to summarize the information in a few sentences.
To solve the task, you must plan forward to proceed in a series of steps, in a cycle of 'Thought:', 'Code:', and 'Observation:' sequences.
At each step, in the 'Thought:' sequence, you should first explain your reasoning towards solving the task and the tools that you want to use.
Then in the 'Code:' sequence, you should write the code in simple Python. The code sequence must end with '<end_action>' sequence.
During each intermediate step, you can use 'print()' to save whatever important information you will then need. These print outputs will be provided back to you by the user in the 'Observation:' field, which will be available as input for the next steps. Always print the output of tools, don't process it or try to extract information before inspecting it.
If an error arises while executing the code, it will be shown in the 'Observation:' field. In that case, fix the code and try again.

In the end you have to return a final answer using the `final_answer` tool.

Here are a few notional examples:
---
<|im_start|>user
Task: Retrieve information about the query:"What's the capital of France?" from the Wikipedia page "France".<|im_end|>
<|im_start|>assistant
Thought: I need to find the capital of France. I will use the tool `retrieve_passages` to get the capital of France from the Wikipedia page.
Code:
```py
result = retrieve_passages("France capital")
print("Capital of France:", result)
```<end_action><|im_end|>
<|im_start|>user
[OUTPUT OF STEP 0] -> Observation:
Retrieved passages for query "France capital":
Passage 0: ... population of nearly 68.4 million as of January 2024. France is a semi-presidential republic with its capital in Paris, the ...
Passage 1: ... France, officially the French Republic, is a country located primarily in Western Europe. Its overseas regions and territories ...
Passage 2: ... The vast majority of France's territory and population is situated in Western Europe and is called Metropolitan France. It is ...
Passage 3: ... France is a highly urbanised country, with its largest cities (in terms of metropolitan area population in 2021) being Paris ...
Passage 4: ... === Government ===\nFrance.fr – official French tourism site (in English)...<|im_end|>
<|im_start|>assistant
Thought: Now that I know that the capital of France is Paris, I can use the `final_answer` tool to return the answer.
Code:
```py
final_answer("The capital of France is Paris.")
```<end_action><|im_end|>
---
<|im_start|>user
Task: Retrieve information about the query:"Tallest mountain in the World" from the Wikipedia page "List of highest mountains on Earth"<|im_end|>
<|im_start|>assistant
Thought: I need to find the tallest mountain in the world. I will use the tool `retrieve_passages` to look for data on the Wikipedia page.
Code:
```py
result = retrieve_passages("highest mountain")
print(result)
```<end_action><|im_end|>
<|im_start|>user
[OUTPUT OF STEP 0] -> Observation:
Retrieved passages for query "highest mountain":
Passage 0: ... above sea level) is the world's tallest mountain and volcano, rising about 10,203 m (33,474 ft) from the Pacific Ocean floor. ...
Passage 1: ... As of December 2018, the highest peaks on four of the mountains—Gangkhar Puensum, Labuche Kang III, Karjiang, and Tongshanjiabu, all located in Bhutan or China—have not been ascended. ...
Passage 2: ... The highest mountains above sea level are generally not the highest above the surrounding terrain. ...
Passage 3: ... The highest mountain outside of Asia is Aconcagua (6,961 m or 22,838 ft), the 189th highest in the world. ...
Passage 4: ... the southern summit of Peru's tallest mountain, Huascarán, is another contender. Both have elevations above sea level more than 2 km (1.2 mi) less than that of Everest....
<|im_end|>
<|im_start|>assistant
Thought: The results don't clearly identify the world's tallest mountain, so I will use the tool `retrieve_passages` again with a different query.
Code:
```py
result = retrieve_passages("world's tallest mountain")
print(result)
```<end_action><|im_end|>
<|im_start|>user
Passages retrieved from page List of highest mountains on Earth:
Passage 0: ... The highest mountain outside of Asia is Aconcagua (6,961 m or 22,838 ft), the 189th highest in the world....
Passage 1: ... above sea level) is the world's tallest mountain and volcano, rising about 10,203 m (33,474 ft) from the Pacific Ocean floor. ...
Passage 2: ... The bases of mountain islands are below sea level, and given this consideration Mauna Kea (4,207 m (13,802 ft) above sea level) is the world's tallest mountain and volcano, rising about 10,203 m (33,474 ft) from the Pacific Ocean floor. ...
Passage 3: ... the southern summit of Peru's tallest mountain, Huascarán, is another contender. Both have elevations above sea level more than 2 km (1.2 mi) less than that of Everest. ...
Passage 4: ... The highest mountains are also not generally the most voluminous. Mauna Loa (4,169 m or 13,678 ft) is the largest mountain on Earth in terms of base area (about 5,200 km2 or 2,000 sq mi) and volume (about 42,000 km3 or 10,000 cu mi)...<|im_end|>
<|im_start|>assistant
Thought: I have found that Mauna Kea is the world's tallest mountain rising about 10,203 m (33,474 ft) from the Pacific Ocean floor. I can use the `final_answer` tool to return the relevant information.
Code:
```py
final_answer("Mauna Kea is the world's tallest mountain, rising about 10,203 m (33,474 ft) from the Pacific Ocean floor.")
```<end_action><|im_end|>
---
On top of performing computations in the Python code snippets that you create, you have access to those tools (and no other tool):

<<tool_descriptions>>

<<managed_agents_descriptions>>

You can use imports in your code, but only from the following list of modules: <<authorized_imports>>. Do not try to import other modules or else you will get an error.
Now start and solve the task!

Page search agent

This agent reports to the Wikipedia search agent, which provides it with a query and the title of a Wikipedia page, and it is tasked with retrieving the relevant information to answer the query from that page. This is, in essence, a single-agent RAG system. To perform the task, this agent generates custom queries and uses the semantic search tool to retrieve the passages that are most similar to them. The semantic search tool follows a simple implementation that splits the page contents into chunks, embeds them, and stores the embeddings in the FAISS vector database provided by LangChain.

Below is the system prompt, still built upon the one provided by default by Hugging Face.
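The chunk, embed, and retrieve pipeline can be sketched without external dependencies. In the example below, a bag-of-words vector with cosine similarity replaces the neural embeddings and the FAISS index, so it only illustrates the overall flow and not the actual implementation. The function names, chunk sizes, and sample text are assumptions.

```python
import math
import re
from collections import Counter

def split_into_chunks(text: str, chunk_size: int = 12, overlap: int = 4) -> list:
    """Split text into overlapping windows of `chunk_size` words."""
    words = text.split()
    step = chunk_size - overlap
    return [" ".join(words[i:i + chunk_size])
            for i in range(0, max(len(words) - overlap, 1), step)]

def embed(text: str) -> Counter:
    """Stand-in embedding: bag-of-words counts (a real system would use a neural encoder)."""
    return Counter(re.findall(r"\w+", text.lower()))

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[token] * b[token] for token in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def retrieve_passages(query: str, page_text: str, k: int = 2) -> list:
    """Return the k chunks of the page that are most similar to the query."""
    index = [(chunk, embed(chunk)) for chunk in split_into_chunks(page_text)]
    query_vector = embed(query)
    ranked = sorted(index, key=lambda item: cosine(query_vector, item[1]), reverse=True)
    return [chunk for chunk, _ in ranked[:k]]

page = (
    "France is a country located primarily in Western Europe. "
    "It is a semi-presidential republic with its capital in Paris. "
    "France is a highly urbanised country with many large cities. "
    "The French economy is among the largest in the world."
)
top = retrieve_passages("France capital", page)
for rank, passage in enumerate(top):
    print(f"Passage {rank}: ... {passage} ...")
```

The overlap between consecutive chunks reduces the chance that a relevant sentence is cut in half at a chunk boundary.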

You are an expert assistant that finds answers to questions by consulting Wikipedia, using code blobs and tools. To do so, you have been given access to a list of tools: these tools are basically Python functions which you can call with code.
You will be given a general query: your task is to find an answer to the query using the information you retrieve from Wikipedia. Use and trust only the information you retrieved, don't make up false facts. Cite the page where you found the information.
You can search for pages and their summaries from Wikipedia using the `search_wikipedia` tool and look for specific passages from a page using the `search_info` tool. You should decide how to use these tools to find an appropriate answer: some queries can be answered by looking at one page summary, others can require looking at specific passages from multiple pages.
To solve the task, you must plan forward to proceed in a series of steps, in a cycle of 'Thought:', 'Code:', and 'Observation:' sequences.
At each step, in the 'Thought:' sequence, you should first explain your reasoning towards solving the task and the tools that you want to use.
Then in the 'Code:' sequence, you should write the code in simple Python. The code sequence must end with '<end_action>' sequence.
During each intermediate step, you can use 'print()' to save whatever important information you will then need. These print outputs will be provided back to you by the user in the 'Observation:' field, which will be available as input for the next steps. Always print the output of tools, don't process it or try to extract information before inspecting it.
If an error arises while executing the code, it will be shown in the 'Observation:' field. In that case, fix the code and try again.

In the end you have to return a final answer using the `final_answer` tool.

Here are a few notional examples:
---
<|im_start|>user
Task: When was the ancient philosopher Seneca born?<|im_end|>
<|im_start|>assistant
Thought: I will use the tool `search_wikipedia` to search for Seneca's birth on Wikipedia. I will specify I am looking for the philosopher for disambiguation.
Code:
```py
result = search_wikipedia("Seneca philosopher birth")
print(result)
```<end_action><|im_end|>
<|im_start|>user
[OUTPUT OF STEP 0] -> Observation:
Pages found for query 'Seneca philosopher birth':
Page: Seneca the Younger
Summary: Lucius Annaeus Seneca the Younger ( SEN-ik-ə; c.4 BC – AD 65), usually known mononymously as Seneca, was a Stoic philosopher of Ancient Rome, a statesman, dramatist, and in one work, satirist, from the post-Augustan age of Latin literature.
Seneca was born in Colonia Patricia Corduba in Hispania, a
Page: Phaedra (Seneca)
Summary: Phaedra is a Roman tragedy written by philosopher and dramatist Lucius Annaeus Seneca before 54 A.D. Its 1,280 lines of verse tell the story of Phaedra, wife of King Theseus of Athens and her consuming lust for her stepson Hippolytus. Based on Greek mythology and the tragedy Hippolytus by Euripides,
Page: Seneca the Elder
Summary: Lucius Annaeus Seneca the Elder ( SEN-ik-ə; c.54 BC – c. AD 39), also known as Seneca the Rhetorician, was a Roman writer, born of a wealthy equestrian family of Corduba, Hispania. He wrote a collection of reminiscences about the Roman schools of rhetoric, six books of which are extant in a more or
Page: AD 1
Summary: AD 1 (I) or 1 CE was a common year starting on Saturday or Sunday, a common year starting on Saturday by the proleptic Julian calendar, and a common year starting on Monday by the proleptic Gregorian calendar. It is the epoch year for the Anno Domini (AD) Christian calendar era, and the 1st year of
Page: Seneca Falls Convention
Summary: The Seneca Falls Convention was the first women's rights convention. It advertised itself as "a convention to discuss the social, civil, and religious condition and rights of woman". Held in the Wesleyan Chapel of the town of Seneca Falls, New York, it spanned two days over July 19–20, 1848. Attrac
<|im_end|>
<|im_start|>assistant
Thought: From the summary of the page "Seneca the Younger", I can see that Seneca was born in 4 BC. I can use the `final_answer` tool to return the answer.
Code:
```py
final_answer("According to the Wikipedia page 'Seneca the Younger', Seneca was born in 4 BC.")
```<end_action><|im_end|>
---
<|im_start|>user
Task: Who was Charlemagne's predecessor?<|im_end|>
<|im_start|>assistant
Thought: I will use the tool `search_wikipedia` to search for Charlemagne's predecessor.
Code:
```py
result = search_wikipedia("Charlemagne predecessor")
print(result)
```<end_action><|im_end|>
<|im_start|>user
[OUTPUT OF STEP 0] -> Observation:
Pages found for query 'Charlemagne predecessor':
Page: Charlemagne
Summary: Charlemagne ( SHAR-lə-mayn; 2 April 748 – 28 January 814) was King of the Franks from 768, King of the Lombards from 774, and Emperor of what is now known as the Carolingian Empire from 800, holding these titles until his death in 814. He united most of Western and Central Europe, and was the first
Page: Pope Leo III
Summary: Pope Leo III (Latin: Leo III; died 12 June 816) was bishop of Rome and ruler of the Papal States from 26 December 795 to his death. Protected by Charlemagne from the supporters of his predecessor, Adrian I, Leo subsequently strengthened Charlemagne's position by crowning him emperor. The coronation
Page: Throne of Charlemagne
Summary: The Throne of Charlemagne (German: Karlsthron or Aachener Königsthron, "Royal Throne of Aachen") is a throne erected in the 790s by Charlemagne, as one of the fittings of his palatine chapel in Aachen (today's Aachen Cathedral) and placed in the Octagon of the church. Until 1531, it served as the co
Page: Louis the Pious
Summary: Louis the Pious (Latin: Hludowicus Pius; French: Louis le Pieux; German: Ludwig der Fromme; 16 April 778 – 20 June 840), also called the Fair and the Debonaire, was King of the Franks and co-emperor with his father, Charlemagne, from 813. He was also King of Aquitaine from 781. As the only surviving
Page: Holy Roman Emperor
Summary: The Holy Roman Emperor, originally and officially the Emperor of the Romans (Latin: Imperator Romanorum; German: Kaiser der Römer) during the Middle Ages, and also known as the Romano-German Emperor since the early modern period (Latin: Imperator Germanorum; German: Römisch-deutscher Kaiser, lit. 'R
<|im_end|>
<|im_start|>assistant
Thought: The results don't contain explicit information about Charlemagne's predecessor, so I will search for more information on the page 'Charlemagne' using the `search_info` tool.
Code:
```py
result = search_info("Charlemagne predecessor", "Charlemagne")
print(result)
```<end_action><|im_end|>
<|im_start|>user
[OUTPUT OF STEP 1] -> Observation:
Information retrieved from the page 'Charlemagne' for the query 'Charlemagne predecessor':
Charlemagne's predecessor was Pepin the Short.
<|im_end|>
<|im_start|>assistant
Thought: I have found that, according to the Wikipedia page 'Charlemagne', Pepin the Short was Charlemagne predecessor. I will return the results using the `final_answer` tool.
Code:
```py
final_answer("According to the information extracted from the Wikipedia page 'Charlemagne', his predecessor was Pepin the Short.")
```<end_action><|im_end|>
___
On top of performing computations in the Python code snippets that you create, you have access to those tools (and no other tool):

<<tool_descriptions>>

<<managed_agents_descriptions>>

You can use imports in your code, but only from the following list of modules: <<authorized_imports>>. Do not try to import other modules or else you will get an error.
Now start and solve the task!

Implementation choices

In this subsection, I will outline the main points that differ from what could be a straightforward implementation of the architecture using Hugging Face agents. These are the results of limited trial and error before obtaining a solution that works reasonably well. I haven’t performed extensive testing and ablations so they may not be the optimal choices.

  • Prompting: as explained in the previous sections, each agent has its own specialized system prompt that differs from the default one provided by Hugging Face Code Agents. I observed that, perhaps due to the limited size of the model used, the general standard system prompt was not giving good results. The model seems to work best with a system prompt that closely reflects the tasks it is asked to perform, including tailored examples of significant use cases. Since I used a chat model with the aim of improving instruction-following behavior, the provided examples follow the model’s chat template to be as close as possible to the format encountered during a run.
  • Summarizing history: long execution histories have detrimental effects on both execution speed and task performance. The latter could be due to the limited ability of the model to retrieve the necessary information from a long context. Moreover, extremely long execution histories could exceed the maximum context length of the engine model. To mitigate these problems and speed up execution, I chose not to show all the details of the previous thought-action-observation steps, but to keep only the previous observations. More specifically, at each step the model only receives the following chat history: the system message, the first message containing the task, its last action, and the full history of previous observations. Furthermore, execution errors appear in the observation history only if they happened in the last step; earlier errors that have already been solved are discarded.
  • Tools vs managed agents: the Hugging Face agents implementation has native support for managed agents, but wrapping them as tools allows for better control of the prompts and a more streamlined implementation. In particular, the Hugging Face implementation adds specific prompts to both the managed agents and their managers. While I haven’t seen substantial differences in the ability to solve the given task, I preferred wrapping the agents as tools, as it is more flexible for the presented architecture and allows for easier control over the agents’ behavior. This also helps reduce the prompt length, which is useful for speeding up computations.
  • Limit the maximum number of trials for the page search agent: sometimes the page search agent keeps looking for information on a given page that doesn’t contain it. Reducing the maximum number of trials mitigated this issue: after reaching that number the agent execution is stopped and the tool returns the last observation from code execution.
  • Changing tool response to user message: this is more of a technical, implementation-specific point. Since the only supported roles for the chat template of Qwen2.5–7B-Instruct are system, user, and assistant, observations are returned as user messages.
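
The history summarization, the step limit, and the user-role observations described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the code used in the notebook: the `Step` dataclass, `build_chat_history`, and `run_with_step_limit` are hypothetical names introduced here, and the exact ordering of the messages is an assumption.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Tuple


@dataclass
class Step:
    """One thought-action-observation step (hypothetical structure)."""
    action: str                  # assistant message: thought + code
    observation: str             # outputs produced by executing the code
    error: Optional[str] = None  # execution error raised in this step, if any


def build_chat_history(system_prompt: str, task: str, steps: list) -> list:
    """Build the condensed chat history passed to the model at each step."""
    messages = [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": task},
    ]
    if not steps:
        return messages

    # Only the last action is shown; earlier thoughts and code are dropped.
    messages.append({"role": "assistant", "content": steps[-1].action})

    # All observations are kept, but errors only if raised in the last step.
    parts = [
        f"[OUTPUT OF STEP {i}] -> Observation:\n{step.observation}"
        for i, step in enumerate(steps)
    ]
    if steps[-1].error is not None:
        parts.append(f"Error in the last step:\n{steps[-1].error}")

    # Observations are sent with the 'user' role, since the chat template
    # only supports the system, user, and assistant roles.
    messages.append({"role": "user", "content": "\n".join(parts)})
    return messages


def run_with_step_limit(
    step_fn: Callable[[], Tuple[str, Optional[str]]], max_steps: int
) -> str:
    """Call step_fn until it yields a final answer or max_steps is reached.

    step_fn is a hypothetical callable returning (observation, final_answer),
    with final_answer set to None while the agent is still working.
    """
    last_observation = ""
    for _ in range(max_steps):
        last_observation, answer = step_fn()
        if answer is not None:
            return answer
    # Step budget exhausted: fall back to the last observation.
    return last_observation
```

With this layout, the prompt grows only with the length of the observations rather than with the full thought-action-observation trace, which is the trade-off described in the list above.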

Examples

In this section, I will collect some examples produced with the described architecture. I will also provide the detailed execution of all the agents involved in the task to showcase how they interact in some real cases. You can explore more examples using the illustrative Colab notebook. All the extracted passages are taken from the Wikipedia page specified in the execution.

Let’s start exploring a simple example that requires combining information from multiple sources and some basic arithmetic understanding.

Task: How many years passed between the construction of the Eiffel Tower and the construction of the Empire State Building?

Below are the logs of the execution of the manager agent. It correctly breaks up the task into finding the construction periods for both buildings and computing the year difference.

Agent:  Thought: First, I need to find out when the Eiffel Tower and the Empire State Building were constructed. I will use the `wikipedia_search_agent` tool to find this information.
Code:
```py
wikipedia_search_agent(query="Eiffel Tower construction year")
```<end_action>
Observation: Print outputs:
Last output from code snippet:
According to the information extracted from the Wikipedia page 'Eiffel Tower', the construction of the Eiffel Tower took place from 1887 to 1889.
Agent: Thought: Now I have the construction period of the Eiffel Tower. Next, I need to find the construction period of the Empire State Building. I will use the `wikipedia_search_agent` tool again.
Code:
```py
wikipedia_search_agent(query="Empire State Building construction year")
```<end_action>
Observation: Print outputs:
Last output from code snippet:
According to the information extracted from the Wikipedia page 'Empire State Building', the construction of the Empire State Building started in 1930 and ended in 1931.
Agent: Thought: Now I have the construction periods for both the Eiffel Tower and the Empire State Building. I can now calculate the number of years between their constructions.
Code:
```py
construction_eiffel_tower_start = 1887
construction_eiffel_tower_end = 1889
construction_empire_state_building_start = 1930
construction_empire_state_building_end = 1931

years_passed = construction_empire_state_building_start - construction_eiffel_tower_end
answer = f"The number of years passed between the construction of the Eiffel Tower and the Empire State Building is {years_passed} years."
final_answer(answer)
```<end_action>
Observation: Print outputs:
Last output from code snippet:
The number of years passed between the construction of the Eiffel Tower and the Empire State Building is 41 years.

Let’s now visualize the execution of the Wikipedia search agent. It correctly searches and reports the construction periods found in the summary of the pages, without needing to inspect the individual pages. It is interesting to note that despite being asked generically about the “construction year”, it reports the entire construction period as it is not clear if the year refers to the start or the end of the construction works.

TASK:  Eiffel Tower construction year
AGENT: Thought: I will use the `search_wikipedia` tool to find information about the Eiffel Tower construction year.
Code:
```py
search_wikipedia('Eiffel Tower construction year')
```<end_action>
OBSERVATIONS: Print outputs:
Last output from code snippet:
Pages found for query 'Eiffel Tower construction year':
Page: Eiffel Tower
Summary: The Eiffel Tower ( EYE-fəl; French: Tour Eiffel [tuʁ ɛfɛl] ) is a wrought-iron lattice tower on the Champ de Mars in Paris, France. It is named after the engineer Gustave Eiffel, whose company designed and built the tower from 1887 to 1889.
Locally nicknamed "La dame de fer" (French for "Iron Lady"), it was constructed as the centerpiece of the 1889 World's Fair, and to crown the centennial anniversary of the French Revolution. Although initially criticised by some of France's leading artists and intellectuals for its design, it has since become a global cultural icon of France and one of the most recognisable structures in the world. The tower received 5,889,000 visitors in 2022. The Eiffel Tower is the most visited monument with an entrance fee in the world: 6.91 million people ascended it in 2015. It was designated a monument historique in 1964, and was named part of a UNESCO World Heritage Site ("Paris, Banks of the Seine") in 1991.
The tower is 330 metres (1,083 ft) tall, about t
Page: Eiffel Tower (Paris, Texas)
Summary: Texas's Eiffel Tower is a landmark in the city of Paris, Texas. The tower was constructed in 1993. It is a scale model of the Eiffel Tower in Paris, France; at 65 feet in height, it is roughly one-sixteenth of the height of the original.

Page: Gustave Eiffel
Summary: Alexandre Gustave Eiffel ( EYE-fəl, French: [alɛksɑ̃dʁ ɡystav ɛfɛl]; né Bonickhausen dit Eiffel; 15 December 1832 – 27 December 1923) was a French civil engineer. A graduate of École Centrale des Arts et Manufactures, he made his name with various bridges for the French railway network, most famously the Garabit Viaduct. He is best known for the world-famous Eiffel Tower, designed by his company and built for the 1889 Universal Exposition in Paris, and his contribution to building the Statue of Liberty in New York. After his retirement from engineering, Eiffel focused on research into meteorology and aerodynamics, making significant contributions in both fields.
Page: Watkin's Tower
Summary: Watkin's Tower was a partially completed iron lattice tower in Wembley Park, London, England. Its construction was an ambitious project to create a 358-metre (1,175 ft)-high visitor attraction in Wembley Park to the north of the city, led by the railway entrepreneur Sir Edward Watkin. Marketed as the "Great Tower of London", it was designed to surpass the height of the Eiffel Tower in Paris, and it was part of Wembley Park's emergence as a recreational place. The tower was never completed and it was demolished in 1907. The site of the tower is now occupied by the English national football ground, Wembley Stadium.
Page: Eiffel Tower (Paris, Tennessee)
Summary: The Eiffel Tower is a landmark in the city of Paris, Tennessee. It is a 1:20 scale replica of the original located in Paris, France.

AGENT: Thought: From the summary of the page 'Eiffel Tower', I can see that the construction of the Eiffel Tower took place from 1887 to 1889. I will use the `final_answer` tool to return the answer.
Code:
```py
final_answer("According to the information extracted from the Wikipedia page 'Eiffel Tower', the construction of the Eiffel Tower took place from 1887 to 1889.")
```<end_action>
OBSERVATIONS: Print outputs:
Last output from code snippet:
According to the information extracted from the Wikipedia page 'Eiffel Tower', the construction of the Eiffel Tower took place from 1887 to 1889.
------------------------------
TASK: Empire State Building construction year
AGENT: Thought: I will use the `search_wikipedia` tool to find information about the construction year of the Empire State Building.
Code:
```py
search_wikipedia('Empire State Building construction year')
```<end_action>
OBSERVATIONS: Print outputs:
Last output from code snippet:
Pages found for query 'Empire State Building construction year':
Page: Empire State Building
Summary: The Empire State Building is a 102-story Art Deco skyscraper in the Midtown South neighborhood of Manhattan in New York City. The building was designed by Shreve, Lamb & Harmon and built from 1930 to 1931. Its name is derived from "Empire State", the nickname of the state of New York. The building has a roof height of 1,250 feet (380 m) and stands a total of 1,454 feet (443.2 m) tall, including its antenna. The Empire State Building was the world's tallest building until the first tower of the World Trade Center was topped out in 1970; following the September 11 attacks in 2001, the Empire State Building was New York City's tallest building until it was surpassed in 2012 by One World Trade Center. As of 2024, the building is the seventh-tallest building in New York City, the ninth-tallest completed skyscraper in the United States, and the 57th-tallest completed skyscraper in the world.
The site of the Empire State Building, on the west side of Fifth Avenue between West 33rd and 34th St
Page: British Empire Building
Summary: The British Empire Building, also known by its address 620 Fifth Avenue, is a commercial building at Rockefeller Center in the Midtown Manhattan neighborhood of New York City. Completed in 1933, the six-story structure was designed in the Art Deco style by Raymond Hood, Rockefeller Center's lead architect. The British Empire Building, along with the nearly identical La Maison Francaise to the south and the high-rise International Building to the north, comprise a group of retail-and-office structures known as the International Complex. La Maison Francaise and the British Empire Building are separated by Channel Gardens, a planted pedestrian esplanade running west to the complex's Lower Plaza.
The facade is made of limestone, with a main entrance along Fifth Avenue and secondary entrances on 50th Street and Channel Gardens. The top of the British Empire Building contains setbacks, a rooftop garden, and a partial seventh-story penthouse. The building's entrances contain ornate decoration
Page: 2012 Empire State Building shooting
Summary: On August 24, 2012, a gunman shot and killed a former co-worker outside the Empire State Building in New York City. Following the initial shooting, the gunman, 58-year-old Jeffrey T. Johnson, was fatally shot by police officers after raising his weapon at them. Nine bystanders were wounded by stray bullets fired by the officers and ricocheting debris, but none suffered life-threatening injuries.
Page: Empire State Building in popular culture
Summary: The Empire State Building is a 102-story skyscraper on Fifth Avenue between West 33rd and 34th Streets in Midtown, Manhattan, New York City. It has a roof height of 1,250 feet (381 m), and with its antenna included, it stands a total of 1,454 feet (443.2 m) tall. The Empire State Building stood as the world's tallest building for nearly 40 years, from its completion in early 1931 until the topping out of the original World Trade Center's North Tower in late 1970. It has been featured in numerous films, TV shows, songs, video games, books, photographs, and artwork.
Page: Empire State Plaza
Summary: The Governor Nelson A. Rockefeller Empire State Plaza (known commonly as the Empire State Plaza, and also as the South Mall) is a complex of several state government buildings in downtown Albany, New York.
The complex was built between 1965 and 1976 at an estimated total cost of $2 billion. It houses several departments of the New York State administration and is integrated with the New York State Capitol, completed in 1899, which houses the state legislature. Among the offices at the plaza are the Department of Health and the Biggs Laboratory of the Wadsworth Center. The Empire State Art Collection, a major public collection of 1960s and 1970s monumental abstract artworks, is on permanent display throughout the site. The New York State Office of General Services maintains the plaza. The Nelson A. Rockefeller Empire State Plaza Performing Arts Center Corporation is a New York state public-benefit corporation that was created in 1979 to manage the performing arts facility in the plaza.

AGENT: Thought: From the summary of the page "Empire State Building", I can see that the building was designed and built from 1930 to 1931. I will use the `final_answer` tool to return the answer.
Code:
```py
final_answer("According to the information extracted from the Wikipedia page 'Empire State Building', the construction of the Empire State Building started in 1930 and ended in 1931.")
```<end_action>
OBSERVATIONS: Print outputs:
Last output from code snippet:
According to the information extracted from the Wikipedia page 'Empire State Building', the construction of the Empire State Building started in 1930 and ended in 1931.

The passage retrieval agent was not used as all the information was already found in the summaries of the pages.

Let’s now try a more complicated example asking if birch plywood floats in alcohol. This question requires basic physics understanding and arithmetic abilities, combined with non-trivial retrieval tasks.

Task: Does birch plywood float in ethanol?

Let’s first visualize the logs of the various agents.

Manager agent logs

AGENT:  Thought: To determine if birch plywood floats in ethanol, we need to know the density of both birch plywood and ethanol. If the density of birch plywood is less than the density of ethanol, it will float. I will use the `wikipedia_search_agent` tool to find the densities of birch plywood and ethanol.
Code:
```py
birch_plywood_density_info = wikipedia_search_agent("density of birch plywood")
ethanol_density_info = wikipedia_search_agent("density of ethanol")
print(birch_plywood_density_info)
print(ethanol_density_info)
```<end_action>
OBSERVATION: Print outputs:
The density of birch plywood is approximately 680 kg/m³.
The density of ethanol at 20°C is approximately 0.7894 g/cm³.

AGENT: Thought: Now that I have the densities, I can compare them. Birch plywood has a density of 680 kg/m³ and ethanol has a density of 0.7894 g/cm³. Since 1 g/cm³ is equal to 1000 kg/m³, the density of ethanol is 789.4 kg/m³. Since the density of birch plywood (680 kg/m³) is less than the density of ethanol (789.4 kg/m³), birch plywood will float in ethanol. I will now provide this conclusion using the `final_answer` function.
Code:
```py
final_answer("Birch plywood will float in ethanol because its density (680 kg/m³) is less than the density of ethanol (789.4 kg/m³).")
```<end_action>
OBSERVATION: Print outputs:
Last output from code snippet:
Birch plywood will float in ethanol because its density (680 kg/m³) is less than the density of ethanol (789.4 kg/m³).

Wikipedia search agent logs

TASK:  density of birch plywood
AGENT: Thought: I will use the `search_wikipedia` tool to find information about the density of birch plywood.
Code:
```py
search_wikipedia('birch plywood')
```<end_action>
OBSERVATION: Print outputs:
Last output from code snippet:
Pages found for query 'birch plywood':
Page: Plywood
Summary: Plywood is a composite material manufactured from thin layers, or "plies", of wood veneer that have been stacked and glued together. It is an engineered wood from the family of manufactured boards, which include plywood, medium-density fibreboard (MDF), oriented strand board (OSB), and particle board (or chipboard).
All plywoods bind resin and wood fibre sheets (cellulose cells are long, strong and thin) to form a composite material. The sheets of wood are stacked such that each layer has its grain set typically (see below) perpendicular to its adjacent layers. This alternation of the grain is called cross-graining and has several important benefits: it reduces the tendency of wood to split when nailed at the edges; it reduces thickness swelling and shrinkage, providing improved dimensional stability; and it makes the strength of the panel consistent across all directions. There is usually an odd number of plies, so that the sheet is balanced, that is, the surface layers have their gr
Page: Birch
Summary: A birch is a thin-leaved deciduous hardwood tree of the genus Betula (), in the family Betulaceae, which also includes alders, hazels, and hornbeams. It is closely related to the beech-oak family Fagaceae. The genus Betula contains 30 to 60 known taxa of which 11 are on the IUCN 2011 Red List of Threatened Species. They are typically short-lived pioneer species and are widespread in the Northern Hemisphere, particularly in northern areas of temperate climates and in boreal climates. Birch wood is used for a wide range of purposes.
Page: Birch wood
Summary: Birch wood is a type of wood of the birch. Birch wood is pale yellow-brown wood having a close, straight grain and uniform texture that finishes to a smooth surface. Sometimes it is dyed to imitate mahogany. This type of wood is used for among others firewood, turnery, furniture, cabinetry, tools handles, hoops, plywood, flooring and shoe heels.
Page: Falco F8L
Summary: The Falco F8L is an Italian-designed lightweight 2-seater aerobatic aircraft.
Page: Crusader (speedboat)
Summary: Crusader was a jet-powered speed boat piloted by John Cobb.
The combination of an aerodynamically stable hull form and turbojet propulsion was proposed by Reid Railton, Cobb's adviser. A rocket-powered scale model was tested at Haslar. The full size design was by Peter du Cane and built by Vospers of Portsmouth. Technical assistance came from Saunders-Roe and Vickers-Supermarine. It cost £15,000 in 1949.
It was silver and scarlet in colour and 10 m long.
The engine was a de Havilland Ghost Mk 48 centrifugal turbojet provided as a loan by the Ministry of Supply at the request of Major Frank Halford, the engine designer. The engine was rated at 5,000 lb thrust fed by two scoop inlets forward of the cockpit.
The hull was of trimaran form, a main hull with a planing step, and two smaller rear-mounted outriggers. Construction was of birch plywood frames and stringers. The hull was skinned in birch ply covered in doped fabric with metal skin reinforcement for planing surfaces. Aircraft-sty

AGENT: Thought: The pages returned from the search did not directly mention the density of birch plywood. However, the page on Birch Wood mentions that birch wood is used for plywood. I will use the `search_info` tool to find more specific information about the density of birch plywood from the page 'Plywood'.
Code:
```py
search_info('density of birch plywood', 'Plywood')
```<end_action>
OBSERVATION: Print outputs:
Last output from code snippet:
Information retrieved from the page 'Plywood' for the query 'density of birch plywood':
The density of birch plywood is approximately 680 kg/m³.
AGENT: Thought: I have found the density of birch plywood from the Wikipedia page 'Plywood'. The density is approximately 680 kg/m³.
Code:
```py
final_answer("The density of birch plywood is approximately 680 kg/m³.")
```<end_action>
OBSERVATION: Print outputs:
Last output from code snippet:
The density of birch plywood is approximately 680 kg/m³.
------------------------------
TASK: density of ethanol
AGENT: Thought: I will use the `search_wikipedia` tool to find the density of ethanol from Wikipedia.
Code:
```py
search_wikipedia('density of ethanol')
```<end_action>
OBSERVATION: Print outputs:
Last output from code snippet:
Pages found for query 'density of ethanol':
Page: Ethanol (data page)
Summary: This page provides supplementary chemical data on ethanol.

Page: Ethanol
Summary: Ethanol (also called ethyl alcohol, grain alcohol, drinking alcohol, or simply alcohol) is an organic compound with the chemical formula CH3CH2OH. It is an alcohol, with its formula also written as C2H5OH, C2H6O or EtOH, where Et stands for ethyl. Ethanol is a volatile, flammable, colorless liquid with a characteristic wine-like odor and pungent taste. In nature, grape-sugar breaks up by the action of fermentation into alcohol or carbonic acid, without anything being added. As a psychoactive depressant, it is the active ingredient in alcoholic beverages, and the second most consumed drug globally behind caffeine.
Ethanol is naturally produced by the fermentation process of sugars by yeasts or via petrochemical processes such as ethylene hydration. Historically it was used as a general anesthetic, and has modern medical applications as an antiseptic, disinfectant, solvent for some medications, and antidote for methanol poisoning and ethylene glycol poisoning. It is used as a chemical so
Page: Alcohol by volume
Summary: Alcohol by volume (abbreviated as alc/vol or ABV) is a standard measure of the volume of alcohol contained in a given volume of an alcoholic beverage, expressed as a volume percent. It is defined as the number of millilitres (mL) of pure ethanol present in 100 mL (3.5 imp fl oz; 3.4 US fl oz) of solution at 20 °C (68 °F). The number of millilitres of pure ethanol is the mass of the ethanol divided by its density at 20 °C (68 °F), which is 0.78945 g/mL (0.82353 oz/US fl oz; 0.79122 oz/imp fl oz; 0.45633 oz/cu in). The alc/vol standard is used worldwide. The International Organization of Legal Metrology has tables of density of water–ethanol mixtures at different concentrations and temperatures.
In some countries, e.g. France, alcohol by volume is often referred to as degrees Gay-Lussac (after the French chemist Joseph Louis Gay-Lussac), although there is a slight difference since the Gay-Lussac convention uses the International Standard Atmosphere value for temperature, 15 °C (59 °F).

Page: Alcohol fuel
Summary: Various alcohols are used as fuel for internal combustion engines. The first four aliphatic alcohols (methanol, ethanol, propanol, and butanol)
are of interest as fuels because they can be synthesized chemically or biologically, and they have characteristics which allow them to be used in internal combustion engines. The general chemical formula for alcohol fuel is CnH2n+1OH.
Most methanol is produced from natural gas, although it can be produced from biomass using very similar chemical processes. Ethanol is commonly produced from biological material through fermentation processes. Biobutanol has the advantage in combustion engines in that its energy density is closer to gasoline than the simpler alcohols (while still retaining over 25% higher octane rating); however, biobutanol is currently more difficult to produce than ethanol or methanol. When obtained from biological materials and/or biological processes, they are known as bioalcohols (e.g. "bioethanol"). There is no chemical dif
Page: Cellulosic ethanol
Summary: Cellulosic ethanol is ethanol (ethyl alcohol) produced from cellulose (the stringy fiber of a plant) rather than from the plant's seeds or fruit. It can be produced from grasses, wood, algae, or other plants. It is generally discussed for use as a biofuel. The carbon dioxide that plants absorb as they grow offsets some of the carbon dioxide emitted when ethanol made from them is burned, so cellulosic ethanol fuel has the potential to have a lower carbon footprint than fossil fuels.
Interest in cellulosic ethanol is driven by its potential to replace ethanol made from corn or sugarcane. Since these plants are also used for food products, diverting them for ethanol production can cause food prices to rise; cellulose-based sources, on the other hand, generally do not compete with food, since the fibrous parts of plants are mostly inedible to humans. Another potential advantage is the high diversity and abundance of cellulose sources; grasses, trees and algae are found in almost every envi

AGENT: Thought: The summary of the page 'Ethanol' mentions that ethanol is an alcohol with a characteristic wine-like odor and pungent taste. However, it does not provide the density of ethanol. I will use the `search_info` tool to find the density of ethanol from the 'Ethanol (data page)'.
Code:
```py
search_info('density of ethanol', 'Ethanol (data page)')
```<end_action>
OBSERVATION: Print outputs:
Last output from code snippet:
Information retrieved from the page 'Ethanol (data page)' for the query 'density of ethanol':
The density of ethanol at 20°C is approximately 0.7894 g/cm³.
AGENT: Thought: I have found the density of ethanol from the Wikipedia page 'Ethanol (data page)'. I will return the result using the `final_answer` tool.
Code:
```py
final_answer("The density of ethanol at 20°C is approximately 0.7894 g/cm³.")
```<end_action>
OBSERVATION: Print outputs:
Last output from code snippet:
The density of ethanol at 20°C is approximately 0.7894 g/cm³.

Page search agent logs

TASK:  Retrieve information about the query:"density of birch plywood" from the Wikipedia page "Plywood".
AGENT: Thought: I need to find the density of birch plywood. I will use the tool `retrieve_passages` to look for data on the Wikipedia page.
Code:
```py
result = retrieve_passages("density of birch plywood")
print(result)
```<end_action>
OBSERVATION: Print outputs:
Retrieved passages for query "density of birch plywood":
Passage 0: ... . Shear stress, by contrast, is higher in the center of the panel, and at the outer fibres. Within Europe basic plywood can be divided into three main categories: birch plywood (density approx. 680 kg/m3), mixed plywood (density approx. 620 kg/m3) and conifer plywoods (density 460–520 kg/m3). ...
Passage 1: ... . Aircraft grade plywood is made from three or more plies of birch, as thin as 0.40 millimetres (1⁄64 in) thick in total, and is extremely strong and light. ...
Passage 2: ... Baltic Birch plywood is a product of an area around the Baltic Sea. Originally manufactured for European cabinet makers but now popular in the United States as well. It is very stable composed of an inner void-free core of cross-banded birch plys with an exterior grade adhesive. The face veneers are thicker than traditional cabinet grade ...
Passage 3: ... . As a result, thicker panels can span greater distances under the same loads. In bending, the maximum stress occurs in the outermost layers, one in tension, the other in compression. Bending stress decreases from the maximum at the face layers to nearly zero at the central layer. Shear stress, by contrast, is higher in the center of the panel, and at the outer fibres. Within Europe basic plywood can be divided into three main categories: birch plywood (density approx ...
Passage 4: ... Wind turbine blades
Insulation boxes for liquefied natural gas (LNG) carriers
Smooth surface and accurate thickness combined with the durability of the material makes birch plywood a favorable material for many special end uses e.g.:

High-end loud speakers
Die-cutting boards
Supporting structure for parquet
Playground equipment
Furniture
Signs and fences for demanding outdoor advertising
Musical instruments
Sports equipment...

AGENT: Thought: From the passages, I can see that the density of birch plywood is approximately 680 kg/m³. I can use the `final_answer` tool to provide the answer.
Code:
```py
final_answer("The density of birch plywood is approximately 680 kg/m³.")
```<end_action>
OBSERVATION: Print outputs:
Last output from code snippet:
The density of birch plywood is approximately 680 kg/m³.
------------------------------
TASK: Retrieve information about the query:"density of ethanol" from the Wikipedia page "Ethanol (data page)".
AGENT: Thought: I need to find the density of ethanol. I will use the tool `retrieve_passages` to get the information from the Wikipedia page.
Code:
```py
result = retrieve_passages("density of ethanol")
print(result)
```<end_action>
OBSERVATION: Print outputs:
Retrieved passages for query "density of ethanol":
Passage 0: ... == Density of ethanol at various temperatures ==
Data obtained from Lange 1967

These data correlate as ρ [g/cm3] = −8.461834×10−4 T [°C] + 0.8063372 with an R2 = 0.99999.

== Properties of aqueous ethanol solutions ==
Data obtained from Lange 1967

== Boiling points of aqueous solutions ==
Data obtained from CRC Handbook of Chemistry (Page 2117): 2391

‡Azeotropic mixture

== Charts ==

== References == ...
Passage 1: ... This page provides supplementary chemical data on ethanol.

== Material Safety Data Sheet ==
External MSDS

== Structure and properties ==

== Thermodynamic properties ==

== Spectral data ==

== Vapor pressure of liquid ==

== Density of ethanol at various temperatures ==
Data obtained from Lange 1967

These data correlate as ρ [g/cm3] = −8.461834×10−4 T [°C] + 0.8063372 with an R2 = 0.99999.

== Properties of aqueous ethanol solutions ==
Data obtained from Lange 1967...

AGENT: Thought: From the passages, I can see that the density of ethanol varies with temperature. I can use the provided linear equation to calculate the density at a specific temperature. I will use the `math` module to perform the calculation.
Code:
```py
# Extract the coefficients from the equation
a = -8.461834e-4
b = 0.8063372

# Calculate the density at a specific temperature, let's take 20°C as an example
temperature = 20
density = a * temperature + b
print(density)
```<end_action>
OBSERVATION: Print outputs:
0.789413532

AGENT: Thought: The passage provides a linear equation to calculate the density of ethanol at different temperatures. I can now use this equation to provide the density of ethanol at a specific temperature. I will use the `final_answer` tool to return the relevant information.
Code:
```py
final_answer(f"The density of ethanol at 20°C is approximately {round(0.789413532, 4)} g/cm³.")
```<end_action>
OBSERVATION: Print outputs:
Last output from code snippet:
The density of ethanol at 20°C is approximately 0.7894 g/cm³.

The model correctly identifies the difference in density as the cause of floating or sinking, breaks the task into finding the density of each substance, and draws the correct conclusion from the retrieved data even though the two values are reported in different units of measurement.
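To make the unit reconciliation explicit, here is a minimal sketch of the comparison, using the two values reported in the traces above. The helper names are my own and are not part of the agent code, and the check assumes the question is whether the plywood floats in ethanol.

```python
# Densities as reported in the agent traces above.
BIRCH_PLYWOOD_KG_PER_M3 = 680.0   # returned by the birch plywood search
ETHANOL_G_PER_CM3 = 0.789413532   # computed at 20 °C from the linear fit


def g_per_cm3_to_kg_per_m3(value: float) -> float:
    """Convert a density from g/cm³ to kg/m³ (1 g/cm³ = 1000 kg/m³)."""
    return value * 1000.0


def floats(object_density: float, liquid_density: float) -> bool:
    """An object floats when it is less dense than the liquid (same units)."""
    return object_density < liquid_density


# Bring both values to kg/m³ before comparing them.
ethanol_kg_per_m3 = g_per_cm3_to_kg_per_m3(ETHANOL_G_PER_CM3)
plywood_floats = floats(BIRCH_PLYWOOD_KG_PER_M3, ethanol_kg_per_m3)

print(f"Ethanol: {ethanol_kg_per_m3:.1f} kg/m³, plywood floats: {plywood_floats}")
```

Once both densities are expressed in kg/m³, the comparison is a single inequality.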

The discussed system has impressive capabilities, especially considering the limited size of the engine model used, but of course it has its limitations. The biggest one seems to be related to the power of the engine model itself: while it seems able to understand and follow long, complex instructions and to use the provided tools correctly, it sometimes fails to comply with simple guidelines and tends to repeat unfruitful actions without changing them. This is particularly prominent when the information the agent is looking for is not present in the provided pages (or is not easily extractable with the semantic search tool).
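One possible mitigation for repeated unfruitful actions, which I did not implement in this project, is to wrap each tool so that an identical call returns a hint instead of the same result again. The sketch below uses plain Python callables rather than the Hugging Face tool classes; `guard_repeats` and `fake_search_info` are hypothetical names used only for illustration.

```python
from collections import Counter
from typing import Any, Callable


def guard_repeats(tool: Callable[..., Any], max_repeats: int = 1) -> Callable[..., Any]:
    """Wrap a tool so identical calls beyond `max_repeats` return a hint instead."""
    seen: Counter = Counter()

    def wrapped(*args: Any, **kwargs: Any) -> Any:
        # Arguments must be hashable (strings are, which covers the tools here).
        key = (args, tuple(sorted(kwargs.items())))
        seen[key] += 1
        if seen[key] > max_repeats:
            return (
                "This exact call was already made and returned the same result. "
                "Try a different query or give your final answer."
            )
        return tool(*args, **kwargs)

    return wrapped


# Hypothetical stand-in for the page search tool.
def fake_search_info(query: str, page: str) -> str:
    return f"Information retrieved from the page '{page}' for the query '{query}'"


search_info = guard_repeats(fake_search_info)
first = search_info("1453", "Ming dynasty")   # runs the underlying tool
second = search_info("1453", "Ming dynasty")  # returns the hint instead
print(second)
```

Whether a 7B model would act on such a hint is an open question; the guard only guarantees that the repeated call does no extra work.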

Another drawback, intrinsic to agentic systems and even more prominent in multi-agentic ones, is the computation time needed to run the various agents. In simple cases where the standard RAG workflow would have been enough to answer the question, this produces unnecessary computational overhead.

Finally, having multiple agents interact with each other can amplify the risk of hallucinations: a hallucination made by one agent can propagate along the execution chain, which also makes it very difficult to trace.

Let’s look at an example in which the system, while still providing the correct answer, made excessive use of tool calls (with some repeated actions), unnecessarily extending the execution time.

Task: Which dynasty was ruling China at the time of the fall of Constantinople?

The task requires a good level of abstraction since the two events are not related and the exact year of the fall of Constantinople may not be reported on the page of the corresponding dynasty.

The manager agent execution is still lean and on point, solving the task correctly by first retrieving the date of the fall of Constantinople and then looking for Chinese dynasties in that year. However, we can already see that the search results, while still containing the required information, seem a bit off: they focus on details that were not asked for, such as the Ottoman leader and the absence of specific events in 1453.

Agent:  Thought: To answer this question, I need to find out when Constantinople fell and which Chinese dynasty was ruling at that time. I'll start by finding the fall date of Constantinople.
Code:
```py
wikipedia_search_agent(query="Fall of Constantinople")
```<end_action>
Observation: Print outputs:
Last output from code snippet:
The leader of the Ottoman Empire in 1453, during the fall of Constantinople, was Mehmed II, also known as Mehmed the Conqueror.
Agent: Thought: From the previous search, we know that Constantinople fell in 1453, and it was conquered by Mehmed II. Now I need to find out which Chinese dynasty was ruling in 1453. I will search for the Chinese dynasties around that time.
Code:
```py
wikipedia_search_agent(query="Chinese dynasties 1453")
```<end_action>
Observation: Print outputs:
Last output from code snippet:
In 1453, the Ming dynasty was in full control of China. No specific events related to 1453 are mentioned in the information provided.
Agent: Thought: Based on the information provided, the Ming dynasty was in full control of China in 1453. Therefore, the dynasty ruling China when Constantinople fell would be the Ming dynasty.
Code:
```py
final_answer("The dynasty ruling China when Constantinople fell in 1453 was the Ming dynasty.")
```<end_action>
Observation: Print outputs:
Last output from code snippet:
The dynasty ruling China when Constantinople fell in 1453 was the Ming dynasty.

Let’s now inspect the corresponding Wikipedia search agent tasks and their execution. We can see that, possibly due to the broad task “Fall of Constantinople”, the agent goes on to look for other specific information after it has already found the correct date. The same happens when retrieving information about the Chinese dynasties, with the further problem that some tool calls were repeated identically.

TASK:  Fall of Constantinople
AGENT: Thought: I will use the `search_wikipedia` tool to find information about the fall of Constantinople. This will likely provide me with a summary of the event and possibly links to related pages.
Code:
```py
search_wikipedia("Fall of Constantinople")
```<end_action>
OBSERVATIONS: Print outputs:
Last output from code snippet:
Pages found for query 'Fall of Constantinople':
Page: Fall of Constantinople
Summary: The fall of Constantinople, also known as the conquest of Constantinople, was the capture of the capital of the Byzantine Empire by the Ottoman Empire. The city was captured on 29 May 1453 as part of the culmination of a 55-day siege which had begun on 6 April.
The attacking Ottoman Army, which significantly outnumbered Constantinople's defenders, was commanded by the 21-year-old Sultan Mehmed II (later nicknamed "the Conqueror"), while the Byzantine army was led by Emperor Constantine XI Palaiologos. After conquering the city, Mehmed II made Constantinople the new Ottoman capital, replacing Adrianople.
The fall of Constantinople and of the Byzantine Empire was a watershed of the Late Middle Ages, marking the effective end of the Roman Empire, a state which began in roughly 27 BC and had lasted nearly 1500 years. For many modern historians, the fall of Constantinople marks the end of the medieval period and the beginning of the early modern period. The city's fall also stood as a turni
Page: Sack of Constantinople
Summary: The sack of Constantinople occurred in April 1204 and marked the culmination of the Fourth Crusade. Crusaders sacked and destroyed most of Constantinople, the capital of the Byzantine Empire. After the capture of the city, the Latin Empire (known to the Byzantines as the Frankokratia, or the Latin occupation) was established and Baldwin of Flanders crowned as Emperor Baldwin I of Constantinople in Hagia Sophia.
After the city's sacking, most of the Byzantine Empire's territories were divided up among the Crusaders. Byzantine aristocrats also established a number of small independent splinter states—one of them being the Empire of Nicaea, which would eventually recapture Constantinople in 1261 and proclaim the reinstatement of the Empire. However, the restored Empire never managed to reclaim all its former territory or attain its earlier economic strength, and it gradually succumbed to the rising Ottoman Empire over the following two centuries.
The Byzantine Empire was left poorer, smal
Page: Constantinople
Summary: Constantinople (see other names) became the capital of the Roman Empire during the reign of Constantine the Great in 330. Following the collapse of the Western Roman Empire in the late 5th century, Constantinople remained the capital of the Eastern Roman Empire (also known as the Byzantine Empire; 330–1204 and 1261–1453), the Latin Empire (1204–1261), and the Ottoman Empire (1453–1922). Following the Turkish War of Independence, the Turkish capital then moved to Ankara. Officially renamed Istanbul in 1930, the city is today the largest city in Europe, straddling the Bosporus strait and lying in both Europe and Asia, and the financial center of Turkey.
In 324, following the reunification of the Eastern and Western Roman Empires, the ancient city of Byzantium was selected to serve as the new capital of the Roman Empire, and the city was renamed Nova Roma, or 'New Rome', by Emperor Constantine the Great. On 11 May 330, it was renamed Constantinople and dedicated to Constantine. Constantin
Page: Moscow, third Rome
Summary: Moscow, third Rome (Russian: Москва — третий Рим; Moskva, tretiĭ Rim) is a theological and political concept asserting Moscow as the successor to ancient Rome, with the Russian world carrying forward the legacy of the Roman Empire. The term "third Rome" refers to a historical topic of debate in European culture: the question of the successor city to the "first Rome" (Rome, within the Western Roman Empire) and the "second Rome" (Constantinople, within the Eastern Roman Empire).
Page: Byzantine Empire
Summary: The Byzantine Empire, also referred to as the Eastern Roman Empire, was the continuation of the Roman Empire centred in Constantinople during late antiquity and the Middle Ages. The eastern half of the Empire survived the conditions that caused the fall of the West in the 5th century AD, and continued to exist until the fall of Constantinople to the Ottoman Empire in 1453. During most of its existence, the empire remained the most powerful economic, cultural, and military force in the Mediterranean world. The term "Byzantine Empire" was only coined following the empire's demise; its citizens referred to the polity as the "Roman Empire" and to themselves as "Romans". Due to the imperial seat's move from Rome to Byzantium, the adoption of state Christianity, and the predominance of Greek instead of Latin, modern historians continue to make a distinction between the earlier Roman Empire and the later Byzantine Empire.
During the earlier Pax Romana period, the western parts of the empire b

AGENT: Thought: From the summary of the page "Fall of Constantinople", I can see that the event took place in 1453. I will use the `search_info` tool to find more details about the event, specifically who was the leader of the Ottoman Empire at that time.
Code:
```py
search_info("leader of the Ottoman Empire in 1453", "Fall of Constantinople")
```<end_action>
OBSERVATIONS: Print outputs:
Last output from code snippet:
Information retrieved from the page 'Fall of Constantinople' for the query 'leader of the Ottoman Empire in 1453':
The leader of the Ottoman Empire in 1453 was Mehmed II, also known as Mehmed the Conqueror.
AGENT: Thought: I have found that the leader of the Ottoman Empire in 1453 was Mehmed II, also known as Mehmed the Conqueror. I will use the `final_answer` tool to provide the answer.
Code:
```py
final_answer("The leader of the Ottoman Empire in 1453, during the fall of Constantinople, was Mehmed II, also known as Mehmed the Conqueror.")
```<end_action>
OBSERVATIONS: Print outputs:
Last output from code snippet:
The leader of the Ottoman Empire in 1453, during the fall of Constantinople, was Mehmed II, also known as Mehmed the Conqueror.
------------------------------
TASK: Chinese dynasties 1453
AGENT: Thought: I will use the `search_wikipedia` tool to find information about Chinese dynasties around the year 1453. I will provide a general query related to Chinese dynasties in that time period.
Code:
```py
search_wikipedia('Chinese dynasties 1453')
```<end_action>
OBSERVATIONS: Print outputs:
Last output from code snippet:
Pages found for query 'Chinese dynasties 1453':
Page: Chinese Empire
Summary: Chinese Empire (or Empire of China) is a term referring to the realm ruled by the Emperor of China during the era known as Imperial China. It was coined by western scholars and used to describe the Ming and Qing dynasties (or imperial Chinese dynasties in general). Another term was "Celestial Empire", in reference to the status of the emperor as the Son of Heaven. In 221 BC, China was unified under an emperor for the first time, and various dynasties or empires founded by hereditary monarchs ruled China for a total of two millennia since then, including the Qin, Han, Jin, Sui, Tang, Song, Yuan, Ming, and Qing.

Page: Ming dynasty
Summary: The Ming dynasty, officially the Great Ming, was an imperial dynasty of China, ruling from 1368 to 1644 following the collapse of the Mongol-led Yuan dynasty. The Ming dynasty was the last imperial dynasty of China ruled by the Han people, the majority ethnic group in China. Although the primary capital of Beijing fell in 1644 to a rebellion led by Li Zicheng (who established the short-lived Shun dynasty), numerous rump regimes ruled by remnants of the Ming imperial family—collectively called the Southern Ming—survived until 1662.
The Ming dynasty's founder, the Hongwu Emperor (r. 1368–1398), attempted to create a society of self-sufficient rural communities ordered in a rigid, immobile system that would guarantee and support a permanent class of soldiers for his dynasty: the empire's standing army exceeded one million troops and the navy's dockyards in Nanjing were the largest in the world. He also took great care breaking the power of the court eunuchs and unrelated magnates, enfeoff
Page: List of time periods
Summary: The categorisation of the past into discrete, quantified named blocks of time is called periodization. This is a list of such named time periods as defined in various fields of study.
These can be divided broadly into prehistorical periods and historical periods
(when written records began to be kept).
In archaeology and anthropology, prehistory is subdivided into the three-age system, this list includes the use of the three-age system as well as a number of various designation used in reference to sub-ages within the traditional three.
The dates for each age can vary by region. On the geologic time scale, the Holocene epoch starts at the end of the last glacial period of the current ice age (c. 10,000 BC) and continues to the present. The beginning of the Mesolithic is usually considered to correspond to the beginning of the Holocene epoch.

Page: Celestial Empire
Summary: Celestial Empire (Chinese: 天朝; pinyin: Tiāncháo; lit. 'heavenly dynasty') is an old name used to refer to China or the Chinese Empire, from a literary and poetic translation of the Chinese term, one of many names for China. The name was used in reference to the status of the Emperor of China as the Son of Heaven in the Sinosphere.
Accordingly, in the 19th century, the name "Celestial" was used to refer to Chinese people. Both terms were widely used in the English-language popular mass media of the day, but fell into disuse later on. Its usage has become popular again in the present day (2015), particularly among Chinese Internet users. It is used to refer to the current Communist regime, to imply either disapproval for its political suppression and arrogance or national pride in the country's emergence as a superpower in the 21st century, depending on the context.
In modern times it has mostly fallen into disuse, but the name "Celestial Empire" still appears in various media such as ar
Page: List of dynasties
Summary: This is a list of monarchical dynasties and dynastic regimes organized by geographic region. Extant dynasties are rendered in bold and highlighted.

AGENT: Thought: From the summaries, I can see that the Ming dynasty ruled from 1368 to 1644, which covers the year 1453. I will use the `search_info` tool to find more detailed information about the Ming dynasty during the year 1453.
Code:
```py
search_info('1453', 'Ming dynasty')
```<end_action>
OBSERVATIONS: Print outputs:
Last output from code snippet:
Information retrieved from the page 'Ming dynasty' for the query '1453':
The year 1453 falls within the period of the Ming dynasty, which ruled from 1368 to 1644. During this time, the Ming dynasty was in full control of China and experienced significant developments such as the construction of the Forbidden City and the expansion of the Great Wall of China.
AGENT: Thought: Now that I know the Ming dynasty ruled during 1453, I can provide a more specific answer about the events that happened in China during that year. However, the information provided does not give us specific details about the events in 1453. I will use the `search_info` tool to find more detailed information about the Ming dynasty during the year 1453.
Code:
```py
search_info('1453', 'Ming dynasty')
```<end_action>
OBSERVATIONS: Print outputs:
Last output from code snippet:
Information retrieved from the page 'Ming dynasty' for the query '1453':
The year 1453 falls within the period of the Ming dynasty, which ruled from 1368 to 1644. During this time, the Ming dynasty was in full control of China and experienced significant developments such as the construction of the Forbidden City and the expansion of the Great Wall of China.
AGENT: Thought: The information provided does not specifically mention the events of 1453. However, we can infer that during the Ming dynasty, significant developments occurred. I will use the `search_info` tool to find more specific information about the events of 1453 in the Ming dynasty.
Code:
```py
search_info('1453', 'Ming dynasty')
```<end_action>
OBSERVATIONS: Print outputs:
Last output from code snippet:
Information retrieved from the page 'Ming dynasty' for the query '1453':
The year 1453 falls within the period of the Ming dynasty, which ruled from 1368 to 1644. During this time, the Ming dynasty was in full control of China and experienced significant developments such as the construction of the Forbidden City and the expansion of the Great Wall of China.
AGENT: Thought: The information provided indicates that the Ming dynasty was in full control of China during the year 1453. However, there is no specific mention of events that happened in 1453. I will use the `search_info` tool to find more specific information about the events of 1453 in the Ming dynasty.
Code:
```py
search_info('1453', 'Ming dynasty')
```<end_action>
OBSERVATIONS: Print outputs:
Last output from code snippet:
Information retrieved from the page 'Ming dynasty' for the query '1453':
The year 1453 falls within the period of the Ming dynasty, which ruled from 1368 to 1644. During this time, the Ming dynasty was in full control of China and experienced significant developments such as the construction of the Forbidden City and the expansion of the Great Wall of China.
AGENT: Thought: The information provided does not specifically mention any events that occurred in 1453. However, we know that the Ming dynasty was in full control of China during this time. I will use the `final_answer` tool to provide the answer based on the information we have.
Code:
```py
final_answer("In 1453, the Ming dynasty was in full control of China. No specific events related to 1453 are mentioned in the information provided.")
```<end_action>
OBSERVATIONS: Print outputs:
Last output from code snippet:
In 1453, the Ming dynasty was in full control of China. No specific events related to 1453 are mentioned in the information provided.

Diving deeper into the page search agent execution, we see that it also repeated multiple tool calls with the same parameters.

TASK:  Retrieve information about the query:"leader of the Ottoman Empire in 1453" from the Wikipedia page "Fall of Constantinople".
AGENT: Thought: I need to find out who was the leader of the Ottoman Empire in 1453. I will use the `retrieve_passages` tool to search for this information on the Wikipedia page "Fall of Constantinople".
Code:
```py
result = retrieve_passages("leader of the Ottoman Empire in 1453")
print(result)
```<end_action>
OBSERVATIONS: Print outputs:
Retrieved passages for query "leader of the Ottoman Empire in 1453":
Passage 0: ... When Mehmed II succeeded his father in 1451, he was 19 years old. Many European courts assumed that the young Ottoman ruler would not seriously challenge Christian hegemony in the Balkans and the Aegean. In fact, Europe celebrated Mehmed coming to the throne and hoped his inexperience would lead the Ottomans astray. This calculation was boosted by Mehmed's friendly overtures to the European envoys at his new court. But Mehmed's mild words were not matched by his actions ...
Passage 1: ... The attacking Ottoman Army, which significantly outnumbered Constantinople's defenders, was commanded by the 21-year-old Sultan Mehmed II (later nicknamed "the Conqueror"), while the Byzantine army was led by Emperor Constantine XI Palaiologos. After conquering the city, Mehmed II made Constantinople the new Ottoman capital, replacing Adrianople. ...
Passage 2: ... . The oldest boy, renamed Murad, became a personal favourite of Mehmed and served as Beylerbey (Governor-General) of Rumeli (the Balkans). The younger son, renamed Mesih Pasha, became Admiral of the Ottoman fleet and Sancak Beg (Governor) of the province of Gallipoli. He eventually served twice as Grand Vizier under Mehmed's son, Bayezid II. ...
Passage 3: ... == External links ==

Media related to Fall of Constantinople (1453) at Wikimedia Commons
The Siege of Constantinople As The Islamic World Sees it Archived 22 April 2021 at the Wayback Machine
World History Encyclopedia – 1453: The Fall of Constantinople
Constantinople Siege & Fall, BBC Radio 4 discussion with Roger Crowley, Judith Herrin & Colin Imber (In Our Time, 28 December 2006). ...
Passage 4: ... In the summer of 1452, when Rumeli Hisarı was completed and the threat of the Ottomans had become imminent, Constantine wrote to the Pope, promising to implement the union, which was declared valid by a half-hearted imperial court on 12 December 1452. Although he was eager for an advantage, Pope Nicholas V did not have the influence the Byzantines thought he had over the Western kings and princes, some of whom were wary of increasing papal control...

AGENT: Thought: From the passages, it seems that the leader of the Ottoman Empire in 1453 was Mehmed II, also known as Mehmed the Conqueror. I will use the `final_answer` tool to provide the answer.
Code:
```py
final_answer("The leader of the Ottoman Empire in 1453 was Mehmed II, also known as Mehmed the Conqueror.")
```<end_action>
OBSERVATIONS: Print outputs:
Last output from code snippet:
The leader of the Ottoman Empire in 1453 was Mehmed II, also known as Mehmed the Conqueror.
------------------------------
TASK: Retrieve information about the query:"1453" from the Wikipedia page "Ming dynasty".
AGENT: Thought: I need to find information related to the year 1453 from the Wikipedia page "Ming dynasty". I will use the `retrieve_passages` tool to get relevant passages.
Code:
```py
result = retrieve_passages("1453 Ming dynasty")
print(result)
```<end_action>
OBSERVATIONS: Print outputs:
Retrieved passages for query "1453 Ming dynasty":
Passage 0: ... The Ming dynasty, officially the Great Ming, was an imperial dynasty of China, ruling from 1368 to 1644 following the collapse of the Mongol-led Yuan dynasty. The Ming dynasty was the last imperial dynasty of China ruled by the Han people, the majority ethnic group in China ...
Passage 1: ... === Reference works and primary sources ===
Farmer, Edward L. ed. Ming History: An Introductory Guide to Research (1994).
Goodrich, Luther Carrington (1976). Dictionary of Ming Biography, 1368–1644. New York: Columbia University Press. ISBN 978-0-231-03833-1.
The Ming History English Translation Project, A collaborative project that makes available translations (from Chinese to English) of portions of the 明史 Mingshi (Official History of the Ming Dynasty). ...
Passage 2: ... ==== Reign of the Hongwu Emperor ==== ...
Passage 3: ... . The Ming dynasty was the last imperial dynasty of China ruled by the Han people, the majority ethnic group in China. Although the primary capital of Beijing fell in 1644 to a rebellion led by Li Zicheng (who established the short-lived Shun dynasty), numerous rump regimes ruled by remnants of the Ming imperial family—collectively called the Southern Ming—survived until 1662. ...
Passage 4: ... . Zhu Di assumed the throne as the Yongle Emperor (r. 1402–24); his reign is universally viewed by scholars as a "second founding" of the Ming dynasty since he reversed many of his father's policies....

AGENT: Thought: The passages do not directly mention the year 1453. However, I can infer that the Ming dynasty lasted from 1368 to 1644, and 1453 falls within this period. I will use the `final_answer` tool to provide a summary based on the information available.
Code:
```py
final_answer("The year 1453 falls within the period of the Ming dynasty, which ruled from 1368 to 1644. During this time, the Ming dynasty was in full control of China and experienced significant developments such as the construction of the Forbidden City and the expansion of the Great Wall of China.")
```<end_action>
OBSERVATIONS: Print outputs:
Last output from code snippet:
The year 1453 falls within the period of the Ming dynasty, which ruled from 1368 to 1644. During this time, the Ming dynasty was in full control of China and experienced significant developments such as the construction of the Forbidden City and the expansion of the Great Wall of China.
------------------------------
TASK: Retrieve information about the query:"1453" from the Wikipedia page "Ming dynasty".
AGENT: Thought: I need to find information related to the year 1453 from the Wikipedia page "Ming dynasty". I will use the `retrieve_passages` tool to get relevant passages.
Code:
```py
result = retrieve_passages("1453 Ming dynasty")
print(result)
```<end_action>
OBSERVATIONS: Print outputs:
Retrieved passages for query "1453 Ming dynasty":
Passage 0: ... The Ming dynasty, officially the Great Ming, was an imperial dynasty of China, ruling from 1368 to 1644 following the collapse of the Mongol-led Yuan dynasty. The Ming dynasty was the last imperial dynasty of China ruled by the Han people, the majority ethnic group in China ...
Passage 1: ... === Reference works and primary sources ===
Farmer, Edward L. ed. Ming History: An Introductory Guide to Research (1994).
Goodrich, Luther Carrington (1976). Dictionary of Ming Biography, 1368–1644. New York: Columbia University Press. ISBN 978-0-231-03833-1.
The Ming History English Translation Project, A collaborative project that makes available translations (from Chinese to English) of portions of the 明史 Mingshi (Official History of the Ming Dynasty). ...
Passage 2: ... ==== Reign of the Hongwu Emperor ==== ...
Passage 3: ... . The Ming dynasty was the last imperial dynasty of China ruled by the Han people, the majority ethnic group in China. Although the primary capital of Beijing fell in 1644 to a rebellion led by Li Zicheng (who established the short-lived Shun dynasty), numerous rump regimes ruled by remnants of the Ming imperial family—collectively called the Southern Ming—survived until 1662. ...
Passage 4: ... . Zhu Di assumed the throne as the Yongle Emperor (r. 1402–24); his reign is universally viewed by scholars as a "second founding" of the Ming dynasty since he reversed many of his father's policies....

AGENT: Thought: The passages do not directly mention the year 1453. However, I can infer that the Ming dynasty lasted from 1368 to 1644, and 1453 falls within this period. I will use the `final_answer` tool to provide a summary based on the information available.
Code:
```py
final_answer("The year 1453 falls within the period of the Ming dynasty, which ruled from 1368 to 1644. During this time, the Ming dynasty was in full control of China and experienced significant developments such as the construction of the Forbidden City and the expansion of the Great Wall of China.")
```<end_action>
OBSERVATIONS: Print outputs:
Last output from code snippet:
The year 1453 falls within the period of the Ming dynasty, which ruled from 1368 to 1644. During this time, the Ming dynasty was in full control of China and experienced significant developments such as the construction of the Forbidden City and the expansion of the Great Wall of China.
------------------------------
TASK: Retrieve information about the query:"1453" from the Wikipedia page "Ming dynasty".
AGENT: Thought: I need to find information related to the year 1453 from the Wikipedia page "Ming dynasty". I will use the `retrieve_passages` tool to get relevant passages.
Code:
```py
result = retrieve_passages("1453 Ming dynasty")
print(result)
```<end_action>
OBSERVATIONS: Print outputs:
Retrieved passages for query "1453 Ming dynasty":
Passage 0: ... The Ming dynasty, officially the Great Ming, was an imperial dynasty of China, ruling from 1368 to 1644 following the collapse of the Mongol-led Yuan dynasty. The Ming dynasty was the last imperial dynasty of China ruled by the Han people, the majority ethnic group in China ...
Passage 1: ... === Reference works and primary sources ===
Farmer, Edward L. ed. Ming History: An Introductory Guide to Research (1994).
Goodrich, Luther Carrington (1976). Dictionary of Ming Biography, 1368–1644. New York: Columbia University Press. ISBN 978-0-231-03833-1.
The Ming History English Translation Project, A collaborative project that makes available translations (from Chinese to English) of portions of the 明史 Mingshi (Official History of the Ming Dynasty). ...
Passage 2: ... ==== Reign of the Hongwu Emperor ==== ...
Passage 3: ... . The Ming dynasty was the last imperial dynasty of China ruled by the Han people, the majority ethnic group in China. Although the primary capital of Beijing fell in 1644 to a rebellion led by Li Zicheng (who established the short-lived Shun dynasty), numerous rump regimes ruled by remnants of the Ming imperial family—collectively called the Southern Ming—survived until 1662. ...
Passage 4: ... . Zhu Di assumed the throne as the Yongle Emperor (r. 1402–24); his reign is universally viewed by scholars as a "second founding" of the Ming dynasty since he reversed many of his father's policies....

AGENT: Thought: The passages do not directly mention the year 1453. However, I can infer that the Ming dynasty lasted from 1368 to 1644, and 1453 falls within this period. I will use the `final_answer` tool to provide a summary based on the information available.
Code:
```py
final_answer("The year 1453 falls within the period of the Ming dynasty, which ruled from 1368 to 1644. During this time, the Ming dynasty was in full control of China and experienced significant developments such as the construction of the Forbidden City and the expansion of the Great Wall of China.")
```<end_action>
OBSERVATIONS: Print outputs:
Last output from code snippet:
The year 1453 falls within the period of the Ming dynasty, which ruled from 1368 to 1644. During this time, the Ming dynasty was in full control of China and experienced significant developments such as the construction of the Forbidden City and the expansion of the Great Wall of China.
------------------------------
TASK: Retrieve information about the query:"1453" from the Wikipedia page "Ming dynasty".
AGENT: Thought: I need to find information related to the year 1453 from the Wikipedia page "Ming dynasty". I will use the `retrieve_passages` tool to get relevant passages.
Code:
```py
result = retrieve_passages("1453 Ming dynasty")
print(result)
```<end_action>
OBSERVATIONS: Print outputs:
Retrieved passages for query "1453 Ming dynasty":
Passage 0: ... The Ming dynasty, officially the Great Ming, was an imperial dynasty of China, ruling from 1368 to 1644 following the collapse of the Mongol-led Yuan dynasty. The Ming dynasty was the last imperial dynasty of China ruled by the Han people, the majority ethnic group in China ...
Passage 1: ... === Reference works and primary sources ===
Farmer, Edward L. ed. Ming History: An Introductory Guide to Research (1994).
Goodrich, Luther Carrington (1976). Dictionary of Ming Biography, 1368–1644. New York: Columbia University Press. ISBN 978-0-231-03833-1.
The Ming History English Translation Project, A collaborative project that makes available translations (from Chinese to English) of portions of the 明史 Mingshi (Official History of the Ming Dynasty). ...
Passage 2: ... ==== Reign of the Hongwu Emperor ==== ...
Passage 3: ... . The Ming dynasty was the last imperial dynasty of China ruled by the Han people, the majority ethnic group in China. Although the primary capital of Beijing fell in 1644 to a rebellion led by Li Zicheng (who established the short-lived Shun dynasty), numerous rump regimes ruled by remnants of the Ming imperial family—collectively called the Southern Ming—survived until 1662. ...
Passage 4: ... . Zhu Di assumed the throne as the Yongle Emperor (r. 1402–24); his reign is universally viewed by scholars as a "second founding" of the Ming dynasty since he reversed many of his father's policies....

AGENT: Thought: The passages do not directly mention the year 1453. However, I can infer that the Ming dynasty lasted from 1368 to 1644, and 1453 falls within this period. I will use the `final_answer` tool to provide a summary based on the information available.
Code:
```py
final_answer("The year 1453 falls within the period of the Ming dynasty, which ruled from 1368 to 1644. During this time, the Ming dynasty was in full control of China and experienced significant developments such as the construction of the Forbidden City and the expansion of the Great Wall of China.")
```<end_action>
OBSERVATIONS: Print outputs:
Last output from code snippet:
The year 1453 falls within the period of the Ming dynasty, which ruled from 1368 to 1644. During this time, the Ming dynasty was in full control of China and experienced significant developments such as the construction of the Forbidden City and the expansion of the Great Wall of China.

In this blog post, I explained how to create a multi-agentic RAG system using code agents and a “small” open-source LLM like Qwen2.5–7B-Instruct. I discussed the main architectural features and the specific changes I made to the Hugging Face code agent implementation to improve the results. The full code details are available in the GitHub repo.

The multi-agentic system described, despite being powered by a small model running on consumer-grade hardware, can solve multi-hop question-answering tasks related to complex queries. In particular:

  • It can break down the query into manageable sub-tasks;
  • It can identify the Wikipedia pages containing the necessary information;
  • It can combine information coming from multiple pages;
  • It can search for detailed information on a Wikipedia page;
  • It can determine whether it needs more information and try to find it;
  • It can successfully fix small bugs in the code it produces and handle tool errors (like Wikipedia disambiguation errors).

I have also outlined some limitations of the system, such as increased computation time, repetitive actions, and the potential propagation of hallucinations. The latter could be mitigated by including in the system a “proofreader” agent that checks that the reported information is in agreement with the retrieved sources.
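As a rough illustration of the “proofreader” idea, the sketch below flags sentences of a final answer that none of the retrieved passages appear to support. It is not part of the system described in this post: the function names `proofread` and `token_overlap_support` are hypothetical, and the word-overlap check is only a naive stand-in for what would realistically be an LLM-based judge, which can be passed in through the `is_supported` argument.

```python
import re
from typing import Callable, List


def token_overlap_support(claim: str, passages: List[str], threshold: float = 0.6) -> bool:
    """Naive stand-in for an LLM judge (an assumption, not the article's method).

    A claim counts as supported if at least `threshold` of its longer words
    appear in a single retrieved passage.
    """
    words = {w for w in re.findall(r"[a-z0-9]+", claim.lower()) if len(w) > 3}
    if not words:
        return True  # nothing substantive to check
    for passage in passages:
        passage_words = set(re.findall(r"[a-z0-9]+", passage.lower()))
        if len(words & passage_words) / len(words) >= threshold:
            return True
    return False


def proofread(
    answer: str,
    passages: List[str],
    is_supported: Callable[[str, List[str]], bool] = token_overlap_support,
) -> List[str]:
    """Return the sentences of `answer` that no passage seems to support."""
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", answer) if s.strip()]
    return [s for s in sentences if not is_supported(s, passages)]


passages = [
    "The Ming dynasty, officially the Great Ming, was an imperial dynasty of China, "
    "ruling from 1368 to 1644 following the collapse of the Mongol-led Yuan dynasty."
]
answer = (
    "The Ming dynasty ruled from 1368 to 1644. "
    "The Ming dynasty built the Forbidden City and expanded the Great Wall."
)
unsupported = proofread(answer, passages)
print(unsupported)
```

In the log above, the page agent's answer mentions the Forbidden City and the Great Wall even though no retrieved passage does; a check of this kind is meant to catch exactly that sort of unsupported statement before it reaches the manager agent.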

It is also worth noting that, since the agentic system has a standard RAG approach at its core, all the usual techniques used to improve the efficiency and accuracy of the latter can be implemented in the framework.

Another possible improvement is to increase test-time computation, giving the model more “time to think”, similar to OpenAI o1/o3 models. It is important to note, however, that this modification will further increase execution time.
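One simple way to spend more computation at test time is to run the system several times on the same task and keep the most frequent answer (often called self-consistency or majority voting). This is only one of many possible techniques and is not implemented in the system described here. In the minimal sketch below, `run_agent` is a hypothetical callable standing in for a full run of the multi-agent system with sampling enabled.

```python
from collections import Counter
from typing import Callable


def self_consistent_answer(run_agent: Callable[[str], str], task: str, n_samples: int = 5) -> str:
    """Run the (hypothetical) agent `n_samples` times and return the majority answer.

    Answers are compared after trimming whitespace and lowercasing, which is
    a crude normalisation that only suits short factual answers.
    """
    answers = [run_agent(task).strip().lower() for _ in range(n_samples)]
    return Counter(answers).most_common(1)[0][0]


# Toy stand-in for a stochastic agent: it returns a fixed sequence of answers.
_fake_outputs = iter(["1644", "1662", "1644"])
majority = self_consistent_answer(lambda task: next(_fake_outputs), "When did Beijing fall?", n_samples=3)
print(majority)
```

The cost grows linearly with `n_samples`, so this trades execution time directly for robustness.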

Finally, since the multi-agentic system is made up of agents specialized in a single task, using a different model engine for each of them could improve the performance. In particular, it is possible to fine-tune a different model for each task in the system for further performance gains. This could be particularly beneficial for small models. It is worth mentioning that fine-tuning data can be collected by running the system on a set of predetermined tasks and saving the agents’ output when the system produces the correct answer, thus eliminating the need for expensive manual data annotation.
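The data collection idea can be sketched as follows. This is a minimal illustration under stated assumptions, not code from the repo: `run_system` is a hypothetical callable that returns the final answer together with a trace of per-agent steps, the trace format (dicts with `agent`, `prompt` and `output` keys) is invented for the example, and correctness is checked with a simple exact match.

```python
from typing import Callable, Dict, Iterable, List, Tuple

Trace = List[Dict[str, str]]  # one dict per agent step (hypothetical format)


def exact_match(predicted: str, expected: str) -> bool:
    """Crude correctness check; a real setup may need a more lenient comparison."""
    return predicted.strip().lower() == expected.strip().lower()


def collect_finetuning_data(
    run_system: Callable[[str], Tuple[str, Trace]],
    tasks: Iterable[Tuple[str, str]],
    is_correct: Callable[[str, str], bool] = exact_match,
) -> List[Dict[str, str]]:
    """Keep the agent steps only from runs whose final answer is correct."""
    kept: List[Dict[str, str]] = []
    for question, expected in tasks:
        answer, trace = run_system(question)
        if is_correct(answer, expected):
            kept.extend(trace)  # every step of a successful run becomes a training example
    return kept


# Toy stand-in for the multi-agent system: it only answers one question correctly.
def fake_system(question: str) -> Tuple[str, Trace]:
    answer = "1644" if "fall" in question else "wrong"
    return answer, [{"agent": "manager", "prompt": question, "output": answer}]


tasks = [("When did Beijing fall?", "1644"), ("When was the Ming founded?", "1368")]
data = collect_finetuning_data(fake_system, tasks)
print(data)
```

Grouping the kept steps by their `agent` field would then give a separate fine-tuning set for each specialized agent.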

I hope you found this tutorial useful. You can find the full code implementation in the GitHub repo and try it yourself in the Colab notebook.
