Air mass factor (AMF) calculation is the largest source of uncertainty in NO2 and HCHO satellite retrievals in situations with enhanced trace gas concentrations in the lower troposphere. Structural uncertainty arises when different retrieval methodologies are applied within the scientific community to the same satellite observations. Here, we address the issue of AMF structural uncertainty via a detailed comparison of the structurally different AMF calculation methods used by seven retrieval groups for measurements from the Ozone Monitoring Instrument (OMI). We estimate the escalation of structural uncertainty in every sub-step of the AMF calculation process. This goes beyond the algorithm uncertainty estimates provided in state-of-the-art retrievals, which address the theoretical propagation of uncertainties for one particular retrieval algorithm only. We find that top-of-atmosphere reflectances simulated by four radiative transfer models (RTMs) (DAK, McArtim, SCIATRAN and VLIDORT) agree within 1.5 %. We find that different retrieval groups agree well in the calculations of altitude-resolved AMFs from different RTMs (to within 3 %), and in the tropospheric AMFs (to within 6 %), as long as identical ancillary data (surface albedo, terrain height, cloud parameters and trace gas profile) and identical cloud and aerosol correction procedures are used. Structural uncertainty increases sharply when retrieval groups use their own preferred ancillary data and cloud and aerosol corrections. On average, we estimate the AMF structural uncertainty to be 42 % over polluted regions and 31 % over unpolluted regions, mostly driven by substantial differences in the a priori trace gas profiles, surface albedo and cloud parameters.
Sensitivity studies for one particular algorithm indicate that different cloud correction approaches result in substantial AMF differences in polluted conditions (5 to 40 % depending on cloud fraction and cloud pressure, and 11 % on average), even for low cloud fractions (<0.2). They also indicate that the choice of aerosol correction introduces an average uncertainty of 50 % for situations with high pollution and high aerosol loading. Our work shows that structural uncertainty in AMF calculations is significant and that it is mainly caused by the assumptions and choices made to represent the state of the atmosphere. In order to decide which approach and which ancillary data are best for AMF calculations, we call for well-designed validation exercises focusing on polluted conditions, in which AMF structural uncertainty has the highest impact on NO2 and HCHO retrievals.
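The sensitivity of the tropospheric AMF to the a priori trace gas profile can be illustrated with a minimal sketch. It assumes the commonly used formulation in which the tropospheric AMF is the average of the altitude-resolved AMFs weighted by the a priori partial columns, and it omits any temperature correction. The function name `tropospheric_amf` and all numerical values are hypothetical and chosen for illustration only; they are not taken from this study or from any retrieval group's algorithm.

```python
def tropospheric_amf(box_amfs, partial_columns):
    """Profile-weighted mean of altitude-resolved AMFs.

    box_amfs:        altitude-resolved AMF per layer (unitless)
    partial_columns: a priori partial column per layer (any consistent unit)
    Assumption: no temperature correction is applied.
    """
    if len(box_amfs) != len(partial_columns):
        raise ValueError("box_amfs and partial_columns must have the same length")
    total = sum(partial_columns)
    if total <= 0:
        raise ValueError("a priori profile must have a positive total column")
    weighted = sum(m * x for m, x in zip(box_amfs, partial_columns))
    return weighted / total


# Hypothetical four-layer troposphere, surface layer first.
# Sensitivity is assumed to be lowest near the surface.
box_amfs = [0.5, 1.0, 1.5, 2.0]

# Two hypothetical a priori profiles for the same scene.
polluted_profile = [6.0, 2.0, 1.0, 1.0]  # most of the column near the surface
clean_profile = [1.0, 1.0, 1.0, 1.0]     # column spread evenly with altitude

amf_polluted = tropospheric_amf(box_amfs, polluted_profile)
amf_clean = tropospheric_amf(box_amfs, clean_profile)

# Relative AMF difference caused only by the choice of a priori profile.
relative_difference = (amf_clean - amf_polluted) / amf_polluted

print(f"AMF (polluted profile): {amf_polluted:.2f}")
print(f"AMF (clean profile):    {amf_clean:.2f}")
print(f"Relative difference:    {relative_difference:.0%}")
```

In this toy case the altitude-resolved AMFs are identical for both calculations, so the entire difference in the tropospheric AMF comes from the assumed profile shape. This is the kind of ancillary-data choice that the comparison identifies as a driver of structural uncertainty.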