Large-diameter trees (taking DBH > 30 cm to define large trees) dominate the dynamics, function and structure of a forest ecosystem. The aim here was to employ sparse airborne laser scanning (ALS) data with a mean point density of 0.8 m(-2) and the non-parametric k-most similar neighbour (k-MSN) to predict tree diameter at breast height (DBH) distributions in a subtropical forest in southern Nepal. The specific objectives were: (1) to evaluate the accuracy of the large-tree fraction of the diameter distribution; and (2) to assess the effect of the number of training areas (sample size, n) on the accuracy of the predicted tree diameter distribution. Comparison of the predicted distributions with empirical ones indicated that the large tree diameter distribution can be derived in a mixed species forest with a RMSE% of 66% and a bias% of - 1.33%. It was also feasible to downsize the sample size without losing the interpretability capacity of the model. For large-diameter trees, even a reduction of half of the training plots (n = 250), giving a marginal increase in the RMSE% (1.12-1.97%) was reported compared with the original training plots (n = 500). To be consistent with these outcomes, the sample areas should capture the entire range of spatial and feature variability in order to reduce the occurrence of error. (C) 2017 International Society for Photogrammetry and Remote Sensing, Inc. (ISPRS). Published by Elsevier B.V. All rights reserved.