Publications

2026
Chenna A, Boubiche D-E, Benyahia A, Homero T-C, Martínez-Peláez R, Velarde-Alvarado P. A Mobility-Aware Zone-Based Key Management Scheme with Dynamic Key Refinement for Large-Scale Mobile Wireless Sensor Networks. Future Internet [Internet]. 2026;18(3):175.

Mobile Wireless Sensor Networks (MWSNs) enhance traditional wireless sensor networks by allowing sensor nodes to move, resulting in continuously changing network topologies. Although this mobility enables advanced applications such as disaster response, intelligent transportation systems, and mission-critical monitoring, it poses major challenges for secure and scalable key management in large-scale deployments. Most existing key management and key pre-distribution schemes are tailored to static or lightly mobile networks and therefore suffer from limited scalability, excessive memory consumption, inefficient key utilization, and increased vulnerability to node capture when applied to highly mobile environments. This paper proposes a mobility-aware, zone-based key management scheme that integrates an enhanced composite key distribution mechanism with dynamic key refinement. The network is partitioned into logical zones, each maintaining an independent key pool to confine security breaches and improve scalability. To adapt to mobility-induced topology changes, sensor nodes continuously refine their key rings by preserving only the cryptographic keys associated with persistent neighbor relationships. This selective retention strategy significantly reduces storage overhead while strengthening resilience against key compromise and unauthorized access. Comprehensive analytical modeling and performance evaluations demonstrate that the proposed scheme achieves higher secure connectivity, stronger resistance to node capture attacks, and improved scalability compared to existing approaches, particularly in dense and highly mobile MWSN scenarios.

Achouri Y, Djellab R, Hamouid K. New Multiparty Quantum Key Agreement with enhanced efficiency. Computers and Electrical Engineering [Internet]. 2026;130.

Quantum Key Agreement (QKA) is a cornerstone of quantum cryptography, facilitating secure key distribution among multiple participants. Existing QKA protocols often suffer from scalability issues and increased computational complexity as the number of participants grows. This paper proposes an efficient Circle Multiparty Quantum Key Agreement (CMQKA) protocol based on the BB84 protocol. This protocol enhances quantum resource efficiency and ensures equal participation in a circular topology. The key feature lies in the optimized use of quantum resources, minimizing the qubit overhead while ensuring high security standards. By achieving a qubit efficiency of 1/2n, it significantly improves multiparty quantum communication. A thorough security analysis is conducted to demonstrate the protocol’s resilience against common quantum threats.

Merghem M, Haoues M, SENOUSSI A, Dahane M, Mouss N-K. Integrated production and maintenance planning in imperfect hybrid manufacturing–remanufacturing systems with outsourcing and carbon emissions. International Journal of Production Economics [Internet]. 2026;291.

This study investigates the integrated planning of production, maintenance, and quality control in a hybrid manufacturing-remanufacturing system, accounting for deterioration, variability in the quality of returned products, carbon emissions, and outsourcing opportunities. The network consists of a manufacturer collaborating with an outsourcing remanufacturing provider. The manufacturer operates a single failure-prone machine to produce new products and to remanufacture returned ones. Recovered products that the manufacturer cannot process are sent to the outsourcing provider for remanufacturing. The system generates harmful emissions, potentially leading to environmental taxes and sanctions. We formulate a mixed-integer nonlinear programming model to determine the optimal integrated manufacturing, remanufacturing, outsourcing, and preventive maintenance plan. Eventually, the proposed strategy minimizes total economic costs and defects and ultimately reduces carbon emissions. We use a global solver for solving small instances, while a genetic algorithm metaheuristic is developed for larger ones. Extensive computational experiments reveal that the developed genetic algorithm is highly efficient, achieving gaps of less than 0.95% within shorter execution times for small instances and significantly outperforming the solver in larger ones. The results show that the integrated outsourcing strategy, combined with accounting for carbon emissions from both new and remanufactured products, significantly reduces the reliance on new products, leading to notable cost savings and environmental benefits. These savings become more pronounced as the number of returns increases.

2025
Bensaadallah M, Ghoggali N, Saidi L. Real-Time Neuro-Fuzzy Control with Nonlinear Compensation for a Rotary Inverted Pendulum: Experimental Validation and Comparison with State-Feedback. International Journal of Computational Methods and Experimental Measurements [Internet]. 2025;13(3):622–640.

This paper presents simulation and experimental validation of a Nonlinear Compensation-based Neuro Fuzzy (NCNF) controller designed to balance the rotary inverted pendulum (RIP). Traditional linear controllers, such as Proportional-Integral-Derivative (PID) and state-feedback with pole placement, usually achieve satisfactory results in simulations on linearized models. However, their performance decreases in hardware implementation because of disturbances and unmodeled nonlinear effects such as Coulomb friction and mechanical backlash. To overcome these challenges, a feedforward compensation function was developed to cancel these undesired effects, which is combined with an Adaptive Neuro-Fuzzy Inference System (ANFIS) controller that updates PID gains to improve the rotary arm tracking for a square-wave reference and stabilize the pendulum at the upright position. The proposed NCNF controller is validated through hardware-in-the-loop (HIL) experiments and compared with a baseline state-feedback controller. Results show that the arm angle (θ) overshoot decreased from 40.6% to 0.8% (lower step) and from 17.2% to 2.5% (upper), total steady-state θ-error from 5.75° to 0.296°, and the fitness index dropped from 41.12 to 25.23. The nonlinear compensation reduced the gap between simulation and real-time performance, while the ANFIS further improved the defined control metrics. Overall, the NCNF controller achieves more stable and precise tracking than the state-feedback control.

HAFID AICHA, Hocine R, Guezouli L, Moumen H. Federated Reinforcement Learning and Deep Q-Network: Improving Fault Tolerance and Energy Consumption in Swarm Robotics for Mine Prospection Missions. IEEE Access [Internet]. 2025;13.

This article focuses on improving fault tolerance and optimizing energy consumption in the context of a mining prospection mission conducted by a swarm of autonomous robots. Two major contributions are proposed. The first aims to reduce communication between robots in order to increase the system’s robustness in the presence of failures. The second focuses on minimizing the trajectory of a deminer robot to reduce overall energy consumption. To address these goals, two reinforcement-learning-based algorithms are proposed: Deep Q-Network (DQN) and Federated Reinforcement Learning (FRL), both derived from the Q-learning algorithm. Simulation results examining the impact of the exploration rate α on the number of detected mines show that, with 10 autonomous robots of the same architecture and 30 randomly placed mines over 30 experiments, the FRL algorithm provides better fault tolerance and ensures that the main prospection mission is accomplished even in the presence of some robotic failures or errors. Furthermore, a second series of 60 experiments involving the integration of the deminer robot, focused on optimizing energy consumption, demonstrates that the DQN algorithm is more effective in reducing energy usage, owing to a better reduction of unnecessary deminer movements, while successfully resolving deadlock situations that the deminer may encounter. These findings open the door to the development of a hybrid algorithm combining the strengths of DQN and FRL to ensure both system robustness and minimal energy consumption.

Lehis S, Siam A, Moumen H, Chergui W, Souidi M-EH, Bekhouche A. Multi-Head DDPG for Pursuit-Evasion with Interpretable Behavioral Decomposition. Ingénierie des Systèmes d’Information [Internet]. 2025;30(12):3117–3130.

Designing scalable and interpretable control strategies for decentralized multi-agent systems remains a challenge in reinforcement learning (RL). This challenge is particularly evident in pursuit–evasion tasks, which require coordination under partial observability, without explicit communication or centralized guidance. Although deep RL methods achieve strong performance, they typically operate as black boxes, limiting trust and deployment in safety-critical domains. We propose a Multi-Head DDPG architecture that decomposes control into three interpretable force components - pursuit, cohesion, and separation - weighted adaptively to generate context-aware actions. This design enables emergent role differentiation and interpretable self-organization in the model. In grid-based pursuit–evasion benchmarks, our method outperforms DQN, PPO, and standard DDPG in terms of success rate, convergence speed, and generalization, while also yielding transparent collective behaviors. Overall, the results show that weighted force-based behavioral decomposition provides a principled pathway toward achieving both high-performance and explainable multi-agent control.
