1 Introduction

Most countries of the world are developing policies to improve society’s quality of life (Meira et al. 2020). Industrialization and urbanization result from these efforts, and they drive energy consumption and waste generation that aggravate severe environmental problems, especially the deterioration of air quality. Developing countries in particular face high concentrations of air pollutants because of the unplanned, rapid growth of industrial activities in the absence of rigid environmental rules and regulations (Fotourehchi 2016). Many researchers (Kumar et al. 2022; Gautam et al. 2020, 2022; Atash 2007; Sathaye et al. 1994; Faiz 1993) have attributed this to factors such as the increased number of private vehicles, old and poorly maintained vehicles, poor fuel quality, unpaved roads, and inadequate transportation infrastructure in high-density areas such as Calcutta, Beijing, and Tehran. Air pollution control is therefore complex where environmental regulations are not rigid. Air pollution now reaches into every area, including the food chain, climate change, and ecosystems, and makes its presence felt through various effects. Sun et al. (2017) found that air pollution directly affects plant growth and indirectly affects food security. A new trend in air pollution and epidemiological research is the relationship between air pollutants and the risk of disease (e.g., air pollution and type 2 diabetes, air pollution and Parkinson’s disease, air pollution and incident chronic obstructive pulmonary disease, and air pollutants and respiratory issues) (Di et al. 2019; Hendryx et al. 2019; Liu et al. 2019; Kasdagli et al. 2019). The published literature indicates that the impact of air pollution on developing countries is much higher than on developed nations, which affects the socio-industrial condition of developing countries.

Similarly, measures against air pollution exposure remain scattered and vague in developing countries. For targeted policymaking to be feasible, the trend of organized research on air pollution must be clarified so that research costs are controlled and repetitive work is avoided. It is therefore essential to investigate recent atmospheric problems and the active researchers in the field of air pollution exposure and health effects, and to give scientists, researchers, and policymakers a platform for solving problems related to exposure to air pollutants. A scientific review of articles published in recent years should first be undertaken to provide a theoretical basis and determine practical directions for controlling air pollution or reducing exposure in developing countries. Although many studies have been reported on air pollution, sampling and monitoring strategies, different sources, and related health impacts, no scientometric articles have been found on air pollution exposure and health. Several authors (Feng et al. 2020; Idrees and Zheng 2020; Yang et al. 2020; Wang et al. 2019) have suggested scientometric review as a reliable and valuable method for visualizing scientific research patterns and obtaining scientific outputs. The techniques presented here highlight research insights by using publication outputs (i.e., authors, countries, keywords, citations and co-citations, and collaborations) to enhance research on scientific platforms. Many researchers (Li et al. 2018; Sweilesh et al., 2018; Song et al. 2020) are now using scientometric studies to explore areas of air pollution (e.g., haze particles, air pollution, respiratory health, air pollution and human health, and gaseous pollutants).
The present work aims to provide a bibliographic and scientometric analysis of research articles in the field of air pollution exposure and health, giving researchers from institutes and industry a platform for their own and related research fields. A bibliographic or scientometric analysis, carried out with software such as VOSviewer, provides an accurate sketch of the research articles needed in a chosen area, whether that is exposure to air pollution or sampling techniques, and it can be applied not only to air pollution but also to other fields of interest. Bibliometric analysis has been employed in commendable and extensive studies (Ranjbari et al. 2022a, 2022b, 2022c, 2022d), which have raised the quality of review papers by using these techniques to add a new dimension to systematic reviews. To the best of the authors’ knowledge, the present work is the first to apply scientometric methods to air pollution exposure and health. The study reviews progress on air pollution sources and their possible effects on human health. Moreover, it investigates the critical environmental conditions that affect sources and concentration levels, and the techniques for source apportionment. The study and its outcomes offer a future perspective on air pollution exposure and health based on the current status of the field.

2 Methodology

This analysis incorporates bibliometrics and transforms texts into structured formats to find new insights (Ranjbari et al. 2021, 2022a). To map and assess research on air pollution in relation to health and exposure, the study is divided into bibliometric analysis, which includes bibliographic coupling, and text mining (Ranjbari et al., 2022; Esfandabadi et al., 2022). In bibliographic coupling, articles are grouped into clusters according to the similarity of the references they cite. Text mining is used to uncover the diverse themes in the field of air pollution concerning health and exposure. The extracted datasets are mined by extracting the noun phrases used by the authors. The terms in the titles and abstracts of the articles are then clustered by their co-occurrence links to form different themes (Waltman et al. 2010). Finally, the outputs highlight where research on air pollution needs to focus and provide insights into the health complications caused by exposure. Figure 1 represents the research structure for this study.

Fig. 1 Framework of the research

A systematic review requires a well-structured collection of the relevant literature (Chaudhary et al. 2021). The search strings were therefore built from pools of the keywords “air pollution”, “chemical characterization”, “health”, and “exposure”, which form the skeleton of the present review. The data were extracted from the ISI Web of Science, also known as Web of Knowledge, which contains a directory of bibliographic citations for a wide range of fields (medical, humanities, scientific, and social sciences). The search strings used were “Air Pollution - Chemical Characterization - Health - Exposure” or “Air Pollution - Chemical Characterization” or “Air Pollution - Health” or “Air Pollution - Exposure”, since the research objective is to consider work related to air pollution, health, and exposure. For the period 2018–2022, 26,859 documents were retrieved and stored in 105 batches in plain-text format. The collected data were screened for reliability: peer-reviewed articles were retained, while conference proceedings, editorial notes, book chapters, and non-English materials were excluded. Table 1 summarizes the steps carried out to obtain the final sample.

Table 1 Steps of the data collection process

3 Scientometric analysis

Bibliometric analysis statistically evaluates scientific output and maps the structure of research. It supports various interpretations by linking articles, journals, authors, and keywords through co-citation and co-occurrence networks (Feng et al. 2017), which can indicate future trends and directions. The analysis was performed with VOSviewer (version 1.6.16), developed by Van Eck and Waltman (2010). It covers the geographical distribution of publications, the productivity and influence of authors working in the field of air pollution, health, and exposure, a keyword analysis to discover research hotspots, and the main journals publishing articles in this field. Data cleaning is performed first to remove redundant keywords: plural and singular forms are merged, as are abbreviations and their full forms, so that the occurrences of the keywords used by authors are counted accurately. The titles and abstracts of the manuscripts were extracted and text mined, and the co-occurrence of terms serves as the indicator for the clustering algorithm. Data cleaning is also carried out before the text mining analysis. These indicators yield the clusters and patterns that define the themes of air pollution research concerning health and exposure.
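The keyword cleaning and co-occurrence counting described above can be illustrated with a minimal Python sketch. It is not the procedure implemented in VOSviewer: the merge table `SYNONYMS`, the helper names, the toy records, and the toy threshold of two occurrences are all assumptions made for illustration only.

```python
from collections import Counter
from itertools import combinations

# Hypothetical merge table: variant spellings mapped to one canonical keyword.
SYNONYMS = {
    "pm 2.5": "pm2.5",
    "particulate matters": "particulate matter",
}

def normalize(keyword):
    """Lower-case, collapse whitespace, and map a keyword to its canonical form."""
    key = " ".join(keyword.lower().split())
    return SYNONYMS.get(key, key)

def keyword_stats(records):
    """Count keyword occurrences and pairwise co-occurrences.

    records: one list of author keywords per article.
    Each keyword and each pair is counted at most once per article.
    """
    occurrences = Counter()
    pairs = Counter()
    for keywords in records:
        unique = sorted({normalize(k) for k in keywords})
        occurrences.update(unique)
        pairs.update(combinations(unique, 2))
    return occurrences, pairs

# Toy records, invented for illustration.
records = [
    ["Air pollution", "PM 2.5", "Asthma"],
    ["air pollution", "PM2.5"],
    ["Air  Pollution", "Air quality", "Particulate matters"],
]
occurrences, pairs = keyword_stats(records)

# Keep only keywords that reach a minimum number of occurrences (toy value: 2).
frequent = {k for k, n in occurrences.items() if n >= 2}
```

Counting each pair once per article corresponds to a simple full-counting scheme; other counting schemes would weight the pairs differently.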

4 Results and discussion

This section discusses the results of both analyses in relation to the research questions.

5 Geographical distribution of publications

The geographical distribution of the sample shows the leading countries contributing to the field of air pollution, health, and exposure. Of the 169 countries that have published articles in this field, 124 are part of the co-authorship network. Figure 2 traces the co-authorship network among countries: node size reflects the number of articles published by a country, and link thickness reflects the strength of co-authorship between a pair of countries. Tables 2 and 3 present the top 10 countries in terms of collaborating countries, published articles, co-authored articles, and citations. The rankings follow from the number of links and the total link strength of each node.

Fig. 2 Co-authorship Networking on Air pollution, Exposure and Health
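The quantities behind such a country co-authorship network (documents per country, link strength per country pair, number of links, and total link strength) can be sketched in a few lines of Python. This is only an illustration under the assumption of full counting, where every co-authored article adds 1 to the link between each pair of its countries; the function name and the toy data are hypothetical and are not taken from the dataset.

```python
from collections import Counter
from itertools import combinations

def coauthorship_network(papers):
    """Build country-level co-authorship counts.

    papers: one list of author countries per article.
    Returns (documents, link_strength, total_link_strength).
    """
    documents = Counter()
    link_strength = Counter()
    for countries in papers:
        unique = sorted(set(countries))
        documents.update(unique)                        # articles per country
        link_strength.update(combinations(unique, 2))   # one count per country pair
    total_link_strength = Counter()
    for (a, b), weight in link_strength.items():
        total_link_strength[a] += weight
        total_link_strength[b] += weight
    return documents, link_strength, total_link_strength

# Toy data, invented for illustration.
papers = [
    ["China", "USA"],
    ["China", "USA", "England"],
    ["USA"],
    ["England", "USA"],
]
documents, links, total = coauthorship_network(papers)

# Number of distinct co-author countries (links) of one country.
usa_links = sum(1 for pair in links if "USA" in pair)
```

In this toy example the node size would follow `documents`, the link thickness would follow `links`, and the ranking by collaborations would follow `total`.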

Table 2 Top 10 countries in terms of overall published articles and number of co-author countries
Table 3 Top 10 countries in terms of overall co-authorship and citations number

As shown in Tables 2 and 3, China has the highest number of published articles and citations, with 8239 articles and 95,351 citations. The USA has the most co-author countries (116) and collaborations (7571) and ranks second in published articles, with 7444 articles in the area of air pollution, health, and exposure. England ranks third in collaborations, citations, and published articles (4546 collaborations, 31,197 citations, and 2022 articles) and second in co-author countries. Italy, France, Germany, India, Canada, Spain, and Australia rank after China, the USA, and England among the top 10 countries in all four categories.

6 Authors’ productivity and influence

6.1 Most productive author

In an academic field, authors contribute to the development and assessment of research (Guo et al. 2021). A total of 90,343 authors have contributed to research on air pollution in relation to health and exposure, of whom 1000 contributed at least five articles. Table 4 lists the most productive authors and Table 5 lists the most influential authors.

Fig. 3 Network Visualization of most influential authors

The authors who have contributed the highest number of articles are considered the most productive authors in our dataset. Similarly, the authors whose articles have been cited the most, together with their co-authors, are considered the most influential. Yuming Guo is the most productive and influential of the 90,343 authors in our dataset. Figure 3 shows the network visualization of the top 1000 authors and their distribution based on citations and co-authorships.

6.2 Most influential authors

Table 4 The most productive authors in air pollution health and exposure
Table 5 The most influential author in air pollution health and exposure

6.3 Core journals

Our dataset contains 26,859 articles on air pollution, health, and exposure, and 651 journals have each published at least five of them. Figures 4 and 5 show, as line graphs, the top 10 journals by number of published articles and by number of citations. The International Journal of Environmental Research and Public Health has the highest number of published articles (1347), making it the most productive journal, while Science of The Total Environment has the highest number of citations (23,182), making it the most influential journal. There is no significant gap between these two journals in terms of productivity. The other productive journals, Environmental Science and Pollution Research and Environmental Research, have published 1146 and 1108 articles, respectively. Based on citations, the second most influential journal is Environmental Pollution with 15,945 citations, followed by Environment International with 14,198 citations.

Fig. 4 Top productive journals in terms of published articles

7 Articles

This section presents the analysis of article citations and bibliographic coupling.

7.1 Influential articles

An article that is frequently cited within its research field is considered influential (Merigó et al. 2015). Table 6 presents the top 10 highly cited articles in the field of air pollution, health, and exposure. The table lists two papers with 235 citations each, in the 8th and 9th positions. Landrigan et al. (2018), published in The Lancet, examined how chemical and biological contamination of air and water affects health, including chronic diseases such as cancers, diabetes, and heart diseases, endocrine disruption and neuroendocrine disorders, premature deaths of infants, and complications during pregnancy. With 1419 citations, it is the most cited article in our dataset and the most influential of the 400 articles selected out of 3679.

Fig. 5 Top influential journal in terms of the number of citations in the article

Burnett et al. (2018) focus on the global health burden of exposure to outdoor fine particulate matter, studying mortality risk with the Global Exposure Mortality Model (GEMM) to predict and compare risk functions under exposure to higher PM2.5 concentrations. This article has the second-highest number of citations (685) and was published in the peer-reviewed journal Proceedings of the National Academy of Sciences (PNAS). Rajagopalan et al. (2018), with 322 citations, emphasize air pollution as a major environmental risk factor for disability and global cardiovascular mortality through both long-term and short-term exposure to fine particulate matter. This article was published in the Journal of the American College of Cardiology. Lelieveld et al. (2019) also use the GEMM to argue for a reevaluation of the disease burden based on the reduction in life expectancy caused by air pollution; with 318 citations, it is the fourth most influential article in our dataset. Ferronato & Torretta (2019) describe the crucial influence of waste mismanagement on major health issues: pollutant emissions contaminate water, soil, and air, which become media for spreading disease when waste is not properly disposed of, treated, and managed. This article was published in the International Journal of Environmental Research and Public Health and has 309 citations. Wu et al. (2020), published in Science Advances with 235 citations, studied whether long-term exposure to air pollution increases health complications and mortality due to COVID-19. Coccia (2020), also with 235 citations, focuses on industrial pollution that accelerated the human transmission of COVID-19 in areas with long-term exposure to PM10.
Overall, these articles discuss air pollution and how exposure to it affects health. The highlighted influential articles have been published in significant journals with high impact factors. They continue to be cited as studies that raise awareness and open new areas of research that can reduce and control these effects on the environment and human health.

Table 6 Top 10 highly cited articles in the field of air pollution, chemical characterization, health, and exposure

8 Bibliographic coupling of articles

Bibliographic coupling of articles is performed to find relevant categories and prime themes in air pollution associated with health and exposure, based on the references the articles share. Of the 26,859 research articles, 400 that share at least one reference with another article were selected. Figure 6 shows three main clusters.

Fig. 6 Bibliographic coupling of the articles within the field of air pollution, chemical characterization, health, and exposure
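Bibliographic coupling links two articles when their reference lists overlap, and the coupling strength is the number of references they share. The following Python sketch illustrates this idea on invented data; the function name, the article identifiers, and the reference identifiers are hypothetical, and the clustering step that produces the coloured groups is not shown.

```python
from itertools import combinations

def coupling_strength(references):
    """Compute bibliographic coupling between articles.

    references: dict mapping an article id to the set of reference ids it cites.
    Returns a dict mapping each coupled article pair to its number of shared references.
    """
    strength = {}
    for a, b in combinations(sorted(references), 2):
        shared = len(references[a] & references[b])
        if shared > 0:  # pairs without a common reference are not coupled
            strength[(a, b)] = shared
    return strength

# Toy data, invented for illustration.
references = {
    "A": {"r1", "r2", "r3"},
    "B": {"r2", "r3", "r4"},
    "C": {"r5"},
}
strength = coupling_strength(references)

# Articles that share at least one reference with another article.
coupled_articles = {article for pair in strength for article in pair}
```

Here article C cites nothing in common with A or B, so it would be left out of the coupling network, in the same way that only articles with at least one common reference are retained in the analysis.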

The blue cluster represents case studies of different countries affected by pollution, its impact on human health and climate, and pandemic-related studies. The red cluster represents the health hazards directly influenced by air pollutants. The green cluster covers articles on chemical composition and on management practices to control the effects of pollution; safety policies, pollution prevention, and pollution mitigation are highlighted. The intersecting outlines indicate where two major themes overlap. The size of the links denotes the number of citations, and the size of the bubbles indicates the co-occurrences between the pairs. Tables 7, 8, and 9 list the most highly cited articles of each theme. A few studies (Pye & Nenes, 2020; Fumian, 2020; Robinson et al., 2021) from the blue cluster (Cluster A) have been the major papers in this research category on health and environment during the pandemic. Pye & Nenes (2020) examined the acidity of atmospheric particles and clouds and its impacts on human health, providing new model calculations. Fumian (2020) discussed spatial tracking of COVID-19 through wastewater-based epidemiology, using whole-genome sequencing and geo-processing tools to build and update heat maps of SARS-CoV-2 from sewage samples. Robinson et al. (2021) reviewed how the pandemic, by changing socio-economic conditions, brought a rapid advance in digitalization. In addition, Ravina et al. (2020) applied a Lagrangian dispersion model to urban air quality affected by traffic pollution during the lockdown. Overall, the research and reviews in the blue cluster concern COVID-19, its aftermath, and case studies of how the lockdown improved air quality through the sudden decrease in emissions of NO2, PM2.5, and other pollutants from mobile sources.

Table 7 Top 5 articles based on Case studies and Pandemic related articles

The most cited article in the red cluster (Cluster B) is Landrigan et al. (2018); it highlights chronic diseases and premature deaths caused by pollutants and stresses that pollution-related diseases are increasing severely. Based on the global burden of disease, nearly 4.2 million premature deaths were caused by outdoor pollution in 2015; it is also stated that air pollution contributes to 7 million premature deaths each year, and the research emphasizes the need for measures against the adverse effects of air pollution on human health. Fine particulate matter is an environmental risk factor for global cardiovascular mortality and disability, and exposure to these fine pollutants increases cardio-metabolic conditions. Several studies include reduction strategies to mitigate the health risks (Rajagopalan et al. 2018). Other studies focus on SARS and new viruses associated with coronavirus (Wu et al. 2020): because air is the most basic medium of transmission, these respiratory viruses can travel over multiple routes, and exposure to droplets and aerosols promotes human-to-human transmission, accelerating infection rates and causing deaths (Morawska and Cao 2020; Shiu et al. 2019). There are few studies on the health impacts of peak air pollution concentrations under long-term and short-term exposure. Huang (2018) and Zhao (2018) have also contributed articles related to air pollution, health, and exposure.

Table 8 Top 5 articles based on Health Hazards influenced by air pollutants

The green cluster (Cluster C) mainly focuses on waste management practices and on the global problems that waste causes, such as water, air, and soil contamination leading to health issues (Ferronato & Torretta, 2019). These studies address multiphase chemistry, the aging of aerosols in the ambient atmosphere, and emissions related to the nutrient deposition of particulate matter, which can indicate human-induced ecosystem limitations due to anthropogenic emissions. Long-term exposure, management strategies, and preventive measures are also studied in this category.

Table 9 Top three articles on chemical composition and management practices of aerosols and particulate matters

Keyword analysis provides a framework for describing the research domain and the structure of the collected articles. Our dataset contains 58,843 keywords overall, of which 6882 meet the threshold of at least five occurrences. The dataset contains 40,567 author keywords; after data cleaning, 3144 keywords remained, and the heat map in Fig. 7 displays 400 author keywords. The main hotspots seen in Fig. 7 are air pollution, PM2.5, particulate matter, COVID-19, and air quality. The most commonly used keywords in the collected articles are shown in Table 10. The keyword air pollution has the highest frequency with 5313 occurrences, followed by particulate matter with 2197 occurrences. Table 11 shows the keyword air pollution linked to other keywords such as particulate matter, PM2.5, air quality, asthma, and others, in the author keyword lists of 763, 269, 229, 191, and 176 articles, respectively. To give a clearer view of the most frequent keyword pairs without the air pollution keyword, Table 12 lists the keywords most frequently linked with chemical composition, health, and exposure. Chemical composition occurs most often with PM2.5, source apportionment, particulate matter, and PM10 (19, 14, 11, and 12 occurrences, respectively). For health and exposure, the most frequent pair is with “air pollution”, with 77 and 63 occurrences, respectively. These keywords are in line with the bibliographic coupling analysis.

Table 10 The most frequent author keywords in the domain of air pollution, chemical characterization, health, and exposure
Fig. 7 The main hotspots of the keywords

Table 11 The most frequent pairs of author keywords consider air pollution as the pivotal keyword
Table 12 The most frequent pairs of author keywords excluding Air pollution

9 Text mining results

Text mining was performed on the 26,859 articles collected from the Web of Science to identify the research themes. Before data cleaning, 359,922 noun phrases were found; because the dataset is large, the minimum number of occurrences of a term was set to 35, and 2395 terms met the threshold. A relevance score was calculated for each of the 2395 terms, and the 60% most relevant terms were considered. During data cleaning, unrelated terms such as repetitive abbreviations were manually unchecked from the list, and 800 noun phrases were selected to build the clusters, based on the co-occurrence of terms, that reveal the research themes. The four major themes identified are:

  1. Associations of air pollutants and their risk factors on health and the environment,
  2. Chemical Composition and Characterization of aerosol and fine particulate matter,
  3. Human Health effects due to air pollution and exposure, and
  4. Modeling and Analytical approaches to reduce and eradicate the effects of air pollution.
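The term-selection step that precedes these themes (a minimum-occurrence threshold followed by keeping the most relevant share of terms) can be sketched as follows. This is an illustrative assumption, not the VOSviewer implementation: the relevance scores are taken as given inputs rather than computed, and the function name, the toy counts, and the toy scores are hypothetical.

```python
def select_terms(term_counts, relevance, min_occurrences=35, keep_fraction=0.6):
    """Select terms by occurrence threshold, then by relevance rank.

    term_counts: dict mapping a term to its number of occurrences.
    relevance: dict mapping a term to a precomputed relevance score (assumed given).
    """
    # Step 1: keep terms that occur at least min_occurrences times.
    frequent = [t for t, n in term_counts.items() if n >= min_occurrences]
    # Step 2: rank the remaining terms by relevance, highest first.
    ranked = sorted(frequent, key=lambda t: relevance[t], reverse=True)
    # Step 3: keep the top share of the ranked terms.
    keep = round(len(ranked) * keep_fraction)
    return ranked[:keep]

# Toy data, invented for illustration.
term_counts = {
    "exposure": 120, "lung function": 40, "aerosol": 60,
    "study": 500, "biomarker": 36, "rare term": 3,
}
relevance = {
    "exposure": 0.9, "lung function": 1.4, "aerosol": 1.2,
    "study": 0.1, "biomarker": 1.1, "rare term": 2.0,
}
selected = select_terms(term_counts, relevance)
```

With the thresholds reported in the text, the same two steps would reduce 2395 terms to the 60% most relevant ones before the manual cleaning; generic terms with low relevance, such as "study" in the toy data, drop out at the ranking step.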

Fig. 8 The network display of prime studies carried out inside air pollution, health and exposure categories

Figure 8 shows the thematic diagram for air pollution, health, and exposure. As in Fig. 6 (bibliographic coupling), larger circles and thicker links between pairs indicate higher occurrences and co-occurrences, respectively. The green cluster is built around the terms association, risk factor, lung function, and blood pressure. The blue cluster has inflammation, disorder, biomarker, and toxicity as its main terms. The yellow cluster has short-term effect, relative risk, pneumonia, and additive model as its major terms. The red cluster has contamination, characterization, and aerosol as its main terms.

These research themes concern the associations of air pollution; the chemical composition of aerosol and fine particulate matter; human health effects due to air pollution and exposure; and the validation of modeling and analytical approaches to reduce, control, and measure the effects that long-term or short-term exposure to air pollution has on human health and the environment. Direct and indirect contact with contaminated air leads to a high risk of health degradation (Weichenthal et al. 2014). Some studies have been published on the respiratory and cardiovascular effects of air pollution (Strak et al., 2010). Health risks arise from increased air pollution exposure irrespective of its source. PM2.5 exposure leads to neutrophilic pulmonary inflammation and oxidative stress in adults. O3, PM2.5, NOx, and black carbon are some residence-specific air pollutants; their higher concentrations have been modeled to monitor their effects since the late 1990s. They mainly worsen lung health and obstruct airflow, followed by an increase in mortality ratios. The immune system is also at risk at the molecular level due to environmental pollution.

Research in this area reports a positive association between PM10 and autoimmune diseases, with a 7% increase in risk for every 10 µg/m3 increase in concentration, and links increased PM2.5 exposure to the risk of rheumatoid arthritis, inflammatory bowel diseases, and connective tissue diseases (Adami et al., 2022). Air pollution exposure leads to respiratory and chronic diseases and to psychiatric disorders such as schizophrenia and depression. There is evidence that residential air pollution causes mood disorders (Newbury et al., 2021). Cardiac function is affected by changes in the bloodstream caused by long-term exposure to ultrafine particles. Coronary arteriosclerosis has also been reported as a result of pollutants from traffic emissions (Hoffmann 2007). Exposure to NO2 has been associated with ventricular hypertrophy in humans, while short-term exposure leads to stroke and myocardial infarction. Depending on the chemical and biological composition of organic aerosols, their carcinogenic properties contribute to the deterioration of health in all age groups, especially infants and pregnant women. Biomarkers have been introduced to understand the complexity of these pollutants; they help unravel the complex relationship between air pollutant exposure and respiratory health issues. Nonetheless, individual air pollutants are complex and distinctive in their inorganic and organic nature, so their effects on health are difficult and laborious to determine. Weather conditions and traffic density sometimes cause variability in pollutants that complicates scrutiny by biomarkers (Suhaimi and Jalaludin 2015). Global warming and climate shifts also aggravate the situation. Additive models, such as generalized additive models, are used to measure and analyze the long-term and short-term effects of air pollutants.
Therefore, the leading themes in air pollution, health, and exposure are research areas that study the exposure levels of resident pollutants and the effects of those pollutants on human beings. The cited articles provide remarkable approaches and reviews for measuring, analyzing, and controlling the effects of pollutants on human health and the environment. Looking ahead, air pollution and health is an emerging research area, and some relevant studies have not been clear about the severity of the problem, so these domains need further exploration. Although researchers have done commendable work over the past years, the latest sampling techniques, physical and chemical characterization and its impacts on health and the environment, control and measurement, and management strategies are categories that need more exploration and investment. Our study attempted to find these research gaps by identifying the main research areas, the authors working in them, the journals that publish articles in the field, the hotspots of the five-year period through keyword analysis, and the three domains in the field of air pollution related to health and exposure, so that scientists and researchers have a clear study for reference.

10 Conclusion

This study carried out a systematic evaluation and scientometric analysis of air pollution, health, and exposure based on 26,859 research articles from the Web of Science. The results were mapped using scientometric analysis, which includes (i) the geographical distribution of publications; (ii) authors’ influence and productivity; (iii) influential articles; (iv) bibliographic coupling of articles; (v) keyword analysis; and (vi) text mining. The bibliographic coupling of articles identified three clusters: the first covers case studies of different countries affected by pollution, its impact on human health and climate, and pandemic-related studies; the second covers health hazards; and the third covers chemical composition and management practices. The research themes mostly concern the associations and risk factors of resident pollutants, the effects of pollutants on human beings, and the modeling and analysis of long-term and short-term effects of air pollution. Few research articles address environmental degradation that affects health through air pollution. The overall aim of this research was to learn which areas of air pollution related to health and exposure have been covered, so that the need for further research in this domain can be understood. Some themes are emerging in this field but lack sufficient studies, which underlines the need to work in this area. Our study revealed four such themes: associations of air pollutants and their risk factors on health and the environment; chemical composition and characterization of aerosol and fine particulate matter; human health effects due to air pollution and exposure; and modeling and analytical approaches to reduce and eradicate the effects of air pollution.
Few studies outline the direct influence of exposure to pollutants, even though modern lifestyles and industrialization are among the major contributors to daily deaths. This has an indirect effect on the economy and welfare of a country. Specific studies cannot provide definite conclusions because data are unavailable owing to negligent behavior by the management.

Moreover, essential preventive measures are not fully applicable in low-income countries, and providing health care equally is nearly impossible. Our research has some limitations. First, we considered only the Web of Science database; incorporating Scopus data would enhance the reliability of the research and extend its findings. Second, as our search was limited to five years of data, it did not capture earlier work, which may include outstanding research. Lastly, non-English documents were removed, which probably caused us to miss some specific research and practices in air pollution, health, and exposure.