Object detection in aerial image datasets is challenging because of the dynamic nature of such data. This work proposes a method for object detection in land-use/land-cover mapping of high-spatial-resolution aerial images, designed to handle the distinctive properties of aerial photographs, such as variability in frequency-domain content. In particular, patch detection and description are devised to partition and describe the diverse sub-regions of objects that are made up of many homogeneous components. We use VGG16, whose output is fed to Faster R-CNN; this combination is the novel contribution of the proposed model. Furthermore, the proposed bag-of-features representation is built from occurrence statistics learned from the training dataset. An analysis of several patch descriptors shows that a mixture of spectral and textural characteristics is a good choice. In addition, a threshold-based technique is used to limit the impact of outliers on the categorization of test data. Experiments on aerial image data are simulated in MATLAB 2021R, and the results are compared with those of methods currently in use. The proposed method outperforms the existing work.
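To make the bag-of-features and threshold steps concrete, the following is a minimal sketch in Python. It is not the implementation used in this work. The abstract does not specify the patch descriptor, the codebook, the distance measure, or how the threshold is applied, so the sketch assumes all of them: descriptors are plain numeric tuples, the codebook is a list of codewords taken to have been learned from training data, distance is Euclidean, and a patch counts as an outlier when it lies farther than `max_dist` from every codeword. The names `nearest_codeword`, `bag_of_features`, and `max_dist` are hypothetical.

```python
import math
from typing import List, Sequence, Tuple

Vector = Sequence[float]


def nearest_codeword(descriptor: Vector, codebook: Sequence[Vector]) -> Tuple[int, float]:
    """Return (index, Euclidean distance) of the codeword closest to the descriptor."""
    best_idx, best_dist = -1, math.inf
    for idx, word in enumerate(codebook):
        dist = math.dist(descriptor, word)
        if dist < best_dist:
            best_idx, best_dist = idx, dist
    return best_idx, best_dist


def bag_of_features(descriptors: Sequence[Vector],
                    codebook: Sequence[Vector],
                    max_dist: float) -> List[float]:
    """Build a normalized histogram of codeword occurrences.

    Assumption: a patch whose nearest codeword is farther away than
    max_dist is treated as an outlier and does not contribute.
    """
    counts = [0] * len(codebook)
    for descriptor in descriptors:
        idx, dist = nearest_codeword(descriptor, codebook)
        if dist <= max_dist:
            counts[idx] += 1
    total = sum(counts)
    if total == 0:
        # Every patch was rejected; return an all-zero histogram.
        return [0.0] * len(codebook)
    return [c / total for c in counts]


if __name__ == "__main__":
    # Toy data: two codewords and four patch descriptors, one far from both.
    codebook = [(0.0, 0.0), (1.0, 1.0)]
    patches = [(0.1, 0.0), (0.9, 1.0), (1.0, 0.9), (5.0, 5.0)]
    print(bag_of_features(patches, codebook, max_dist=0.5))
```

In this toy run the patch at (5.0, 5.0) is rejected, so the histogram is computed from the remaining three patches only. In practice the descriptors would combine spectral and textural measurements, and the threshold would be tuned on held-out data.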