Over the years, modelling approaches have played an increasingly important role in designing agricultural and environmental policies and in formulating measures to reduce nutrient emissions in The Netherlands. In recent years, increased emphasis has been placed on validating the models used for that purpose. Nitrogen (N) cycling and leaching in sandy soils in The Netherlands have been studied intensively in a number of plots at the experimental dairy farm 'De Marke'. These plots differed in crop rotation, fertiliser application and hydrology. The three crop rotations were: permanent grassland; 3 years of grass followed by 1 year of beets (Beta vulgaris L.) and 2 years of maize (Zea mays L.); and 3 years of grassland followed by 1 year of beets and 4 years of maize. The experimental results have been used to validate two nutrient emission models: the integrated modelling system STONE, for regional- and national-scale analyses, and the ANIMO model, for site-scale analyses. Comparison of the measured and simulated N fluxes and balances for the different experimental plots showed that both models simulated mineral N in the topsoil, and hence the main N inputs into the soil system, well. Nitrate leaching to groundwater was simulated moderately well by ANIMO and moderately well to poorly by STONE. Nitrate leaching simulated by STONE was often too high, mainly because crop N uptake was underestimated. ANIMO calculated N uptake more precisely, but its N uptake approach needs calibration at the site scale and cannot be applied at larger scales. This study showed, first, that testing a large-scale model like STONE against measured data from field experiments can hardly be expected to give satisfactory results and, second, that calibrating a large-scale model on well-managed experiments may be inappropriate for practical applications.
This study also showed that in regional- or national-scale nutrient emission studies with a model like STONE, model initialisation and parameterisation can only be done in a regionally schematised way. Hence, the results are generally less precise than those from modelling at the site scale.