<?xml version="1.0" encoding="UTF-8"?><xml><records><record><source-app name="Biblio" version="7.x">Drupal-Biblio</source-app><ref-type>17</ref-type><contributors><authors><author><style face="normal" font="default" size="100%">Lehis, Saida</style></author><author><style face="normal" font="default" size="100%">Siam, Abderrahim</style></author><author><style face="normal" font="default" size="100%">Hamouma Moumen</style></author><author><style face="normal" font="default" size="100%">Chergui, Wahid</style></author><author><style face="normal" font="default" size="100%">Souidi, Mohammed-El Habib</style></author><author><style face="normal" font="default" size="100%">Bekhouche, Abdelaali</style></author></authors></contributors><titles><title><style face="normal" font="default" size="100%">Multi-Head DDPG for Pursuit-Evasion with Interpretable Behavioral Decomposition</style></title><secondary-title><style face="normal" font="default" size="100%">Ingénierie des Systèmes d’Information</style></secondary-title></titles><dates><year><style face="normal" font="default" size="100%">2025</style></year></dates><urls><web-urls><url><style face="normal" font="default" size="100%">https://doi.org/10.18280/isi.301204</style></url></web-urls></urls><volume><style face="normal" font="default" size="100%">30</style></volume><pages><style face="normal" font="default" size="100%">3117-3130</style></pages><language><style face="normal" font="default" size="100%">eng</style></language><abstract><style face="normal" font="default" size="100%">&lt;p style=&quot;text-align: justify;&quot;&gt;
	Designing scalable and interpretable control strategies for decentralized multi-agent systems remains a challenge in reinforcement learning (RL). This challenge is particularly evident in pursuit–evasion tasks, which require coordination under partial observability, without explicit communication or centralized guidance. Although deep RL methods achieve strong performance, they typically operate as black boxes, limiting trust and deployment in safety-critical domains. We propose a Multi-Head DDPG architecture that decomposes control into three interpretable force components (pursuit, cohesion, and separation), which are weighted adaptively to generate context-aware actions. This design enables emergent role differentiation and interpretable self-organization among the agents. In grid-based pursuit–evasion benchmarks, our method outperforms DQN, PPO, and standard DDPG in success rate, convergence speed, and generalization, while also yielding transparent collective behaviors. Overall, the results show that weighted force-based behavioral decomposition provides a principled pathway toward multi-agent control that is both high-performing and explainable.
&lt;/p&gt;
</style></abstract><issue><style face="normal" font="default" size="100%">12</style></issue></record></records></xml>