Relevant experience learning: A deep reinforcement learning method for UAV autonomous motion planning

Source: Chinese Journal of Aeronautics (English Edition)
Unmanned Aerial Vehicles (UAVs) play a vital role in military warfare. In a variety of battlefield mission scenarios, UAVs are required to safely fly to designated locations without human intervention. Therefore, finding a suitable method to solve the UAV Autonomous Motion Planning (AMP) problem can improve the success rate of UAV missions to a certain extent. In recent years, many studies have used Deep Reinforcement Learning (DRL) methods to address the AMP problem and have achieved good results. From the perspective of sampling, this paper designs a sampling method with double-screening, combines it with the Deep Deterministic Policy Gradient (DDPG) algorithm, and proposes the Relevant Experience Learning-DDPG (REL-DDPG) algorithm. The REL-DDPG algorithm uses a Prioritized Experience Replay (PER) mechanism to break the correlation of continuous experiences in the experience pool, finds the experiences most similar to the current state to learn according to the theory in human education, and expands the influence of the learning process on action selection at the current state. All experiments are applied in a complex unknown simulation environment constructed based on the parameters of a real UAV. The training experiments show that REL-DDPG improves the convergence speed and the convergence result compared to the state-of-the-art DDPG algorithm, while the testing experiments show the applicability of the algorithm and investigate the performance under different parameter conditions.
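The abstract describes the double-screening sampling only at a high level, so the following is a minimal sketch of one possible reading, not the authors' implementation. It assumes the first screening draws candidates in proportion to a PER-style priority and the second keeps the candidates whose stored state is closest to the current state. The function name `double_screen_sample`, its parameters, the tuple layout of a transition, and the use of Euclidean distance as the similarity measure are all hypothetical choices made for illustration.

```python
import math
import random


def double_screen_sample(buffer, priorities, current_state,
                         n_candidates, batch_size, rng=random):
    """Two-stage sampling sketch (hypothetical, not the paper's exact method).

    buffer:      list of (state, action, reward, next_state) tuples
    priorities:  list of non-negative floats, one per transition
    """
    if not buffer:
        raise ValueError("buffer is empty")
    n_candidates = min(n_candidates, len(buffer))
    batch_size = min(batch_size, n_candidates)

    # Stage 1 (assumed PER-style): draw candidate indices with probability
    # proportional to priority. Sampling is with replacement, so an index
    # may appear more than once.
    idx = rng.choices(range(len(buffer)), weights=priorities, k=n_candidates)

    # Stage 2 (assumed similarity measure): order candidates by Euclidean
    # distance between their stored state and the current state, and keep
    # the closest ones.
    idx.sort(key=lambda i: math.dist(buffer[i][0], current_state))
    return [buffer[i] for i in idx[:batch_size]]


# Toy usage: ten transitions whose states lie along the x-axis.
buffer = [((float(i), 0.0), 0, 0.0, (float(i + 1), 0.0)) for i in range(10)]
priorities = [1.0] * 10
batch = double_screen_sample(buffer, priorities, (3.0, 0.0),
                             n_candidates=6, batch_size=2,
                             rng=random.Random(0))
```

In a DDPG training loop, a batch selected this way would replace the uniformly sampled minibatch used for the critic and actor updates. How the paper sets the candidate count, the similarity measure, and the priorities is not stated in the abstract.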