BACKGROUND: Studies evaluating the safety of different birth settings for low-risk deliveries are often difficult to interpret because of substantial methodological problems. OBJECTIVE: To assess potential bias in comparisons of mortality between maternity institutions of different size and level of care, particularly when using various definitions of low-risk delivery and when studying stillbirth rates. DESIGN: Population-based study. POPULATION: The 1.74 million births in Norway from 1967 to 1996 recorded in the Medical Birth Registry of Norway. METHODS: First, we explored the problems of properly identifying low-risk deliveries from population-based data and calculated adjusted perinatal mortality rates in sub-populations by excluding different risk factors. Then we measured the difference in apparently low-risk deliveries between institutions of different size and level of care. Finally, we explored the bias introduced by using stillbirths as an outcome and discussed the loss of statistical power from studying only live births. RESULTS: The occurrence of a wide range of risk factors differed between small and large institutions, even after adjustment for birthweight. Although the majority of births were low-risk deliveries, only one tenth of all perinatal deaths occurred in this group after admission to a maternity unit. There was a systematic difference between types of institutions in the reporting of time of death for stillbirths: the rate of stillbirths occurring during delivery was higher in small institutions, while large institutions were more often uncertain in classifying the time of death. CONCLUSIONS: Comparisons of safety between delivery units of different sizes for low-risk pregnancies require adjustment for a large number of risk factors, large sample sizes, and caution in including stillbirth as an outcome measure.