Methods for network dependent data
Lead Research Organisation:
University of Essex
Department Name: Economics
Abstract
Networks are a pervasive feature of modern economies; examples include financial networks such as inter-bank networks, social networks such as friendship or social media networks, and international networks formed by trading links. Economists have become increasingly interested in understanding the formation of networks and their implications for the economy. For example, the 2007-09 financial crisis starkly illustrated that the implications of both intra- and inter-economy networks can be profound, as contagion swept through the US banking market and the global economy as a whole. The crisis clearly demonstrated the perils of treating cross-sectional economic agents, such as banks, as independent of each other. The aim of this project is to develop tools and methods that will help practitioners and academics to conduct valid inference in settings where the data are dependent across individual units and flexibility is permitted in the exact form of the economic relationship assumed.
Analysis of economic data with network dependence is different from that of dependent 'time series' data. Time series are observed at chronological points and thus can be ordered in a natural fashion. A natural basis for modelling is therefore to assume forms of dependence that allow observations to be determined partially by past values. However, networks are formed by 'cross-sectional' units, which cannot generally be ordered in this way. Information on the locations of observations may also be unavailable. Thus, specialised models and approaches are required that can handle these problems.
While there is an active econometric literature dealing with network dependent data, the toolkit available to practitioners facing such data sets is no match for the one available for independent data. This lacuna is even more acute when one considers the problem of inference in a nonparametric framework, in which one allows the data to determine the form of the relationship between economic variables, rather than imposing a particular form. The cost of the generality of nonparametric methods is that they require a larger volume of data than parametric methods to achieve similar levels of performance. However, very large, even huge, data sets are a feature of modern applied econometrics and increase the appeal of nonparametric methods. Furthermore, in the age of 'big data' practitioners must remain open-minded about the choice of model as they gain access to larger data sets, and thus even the traditional parametric approach should be relaxed to permit models to become more complex as sample size increases, as the project proposes.
The project will recognize and develop a link between the modern econometric literature on partial identification and nonparametric network analysis. Specifically, partial identification occurs when the object of interest can only be distinguished within a certain set of 'correct' objects, and not as a single unique object. Separately, a nonparametric object called a 'graphon' has been devised to parsimoniously capture the properties of a network. Graphons, however, are only partially identified, and the project proposes to develop a link between them and the econometric partial identification literature. For example, testing whether two graphons are equal can be viewed as a test for the 'equality' of the underlying networks.
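As a hedged illustration of why graphons are only partially identified, the standard textbook formulation can be sketched as follows. The notation ($W$ for the graphon, $U_i$ for latent unit positions, $A_{ij}$ for link indicators, $\varphi$ for a relabelling map) is assumed here for exposition and is not taken from the project description.

```latex
% Illustrative notation only: W, U_i, A_{ij} and \varphi are assumptions
% introduced for this sketch (requires amsmath for \text and \overset).
% A graphon is a symmetric measurable function W : [0,1]^2 -> [0,1].
\[
U_1,\dots,U_n \overset{\text{iid}}{\sim} \mathrm{Uniform}[0,1], \qquad
\Pr\left(A_{ij}=1 \mid U_i, U_j\right) = W(U_i, U_j), \quad i<j .
\]
% For any measure-preserving map \varphi : [0,1] -> [0,1], define
\[
W^{\varphi}(u,v) = W\bigl(\varphi(u),\varphi(v)\bigr).
\]
% Since \varphi(U_i) is again Uniform[0,1], the network (A_{ij})_{i<j}
% has the same distribution under W and under W^{\varphi}.
```

Under these assumptions, the observed network cannot distinguish $W$ from any $W^{\varphi}$, so at best the data pin down a set of graphons rather than a single one, which is the sense of partial identification described above.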
Rigorous theory for the methods proposed to deal with the questions summarised above will be developed. Once the methods are theoretically justified, their properties will be examined in simulation studies and illustrated with applications to real data. Computer packages will be made available to help practitioners further in using the methods proposed in the most efficient way possible. In addition to academics, these practitioners would include professional economists in international organisations as well as researchers in companies with access to large, connected data sets.
Planned Impact
Networks of economic agents, for instance financial institutions connected by common exposures to risks, are a distinctive feature of modern economies. This project aims to develop tools and methods that are valid when such agents are connected through networks and apply these to data sets to gain insight into the effect of networks on the real economy. The research produced during the project is anticipated to have impact beyond the university-based researchers discussed in the academic beneficiaries. We now outline these beneficiaries together with how they will benefit.
Researchers in central banks are interested in the implications of networks for their work, as are those based in the research wings of treasuries and the US Federal Reserve boards. The applicant has had a personal meeting with Raghuram Rajan, at the time the Governor of the Reserve Bank of India (India's central bank), who expressed the desire to see such advances percolate to the systemic risk and financial stability analysis team at the bank. This is no doubt applicable to central banks globally. Researchers at international organizations such as the International Monetary Fund (IMF), the World Bank and the Organisation for Economic Cooperation and Development are also expected to have a strong interest in the results of the project.
Little is known empirically about the effect of financial networks on the real economy, and one reason for this is a lack of appropriate tools. This project will address this missing link. Central banks will be able to use the research produced by this project to conduct more accurate inferences on the real economy, as well as gain insight on spillovers via financial networks, e.g. in credit markets. Both aspects are important from a policy perspective and will generate impacts. The applicant is a fellow of the Essex Centre for Macroeconometrics and Financial Econometrics, which has strong links with central banks via the Bank of England (Profs Simon Price and Martin Weale), the Bank of Italy (Dr Fabio Busetti) and the Bank of Portugal (Dr Paulo Rodrigues). The applicant also has contacts at the IMF, and various central banks (Bundesbank, Chile, Thailand) which will help to maximize impact.
Interest in the work on graphons (scalar nonparametric objects that characterise network structure) will also come from industry. The initial development of graphons is in part due to research teams based in Microsoft, while many online retailers (e.g. Amazon) either employ specialist network analysis teams or enlist the services of companies that provide such skills (e.g. Qubit). This stakeholder group will be engaged through workshops with industry participants, e.g. those conducted by the Institute for Analytics and Data Science at Essex, UCL Big Data Institute and the Alan Turing Institute. Publishers are interested in more sophisticated network analysis, given their interest in citation networks. The PI has links with Mendeley that will be used to generate impacts. Aviation companies, airports and aviation consultants also have access to network data sets relating to passenger flows that they wish to analyse better. Better analysis can lead to substantial cost savings through more efficient routes, e.g. by determining the impact of a change in a given city on passenger flows to and from that city and other cities. Such companies frequently employ PhD researchers for whom the research conducted is accessible. The PI has contacts with links to Airbus, while the university has contacts at Southend Airport through a commercial liaison wing. Both will be followed up to create impact.
Long-term impact will be ensured via the one-day workshop at the end of the grant period, which will bring together a range of academic and industrial delegates and seeks to create an 'impact legacy', as well as via open access software packages and routines to ensure that beneficiaries are able to use the methods developed easily, thereby increasing impact.
Organisations
People | ORCID iD |
Abhimanyu Gupta (Principal Investigator) | |
Publications
Gupta A
(2020)
Household sorting in an ancient setting
in Journal of Urban Economics
Gupta A
(2019)
Order Selection and Inference with Long Memory Dependent Data
in Journal of Time Series Analysis
Gupta A
(2023)
Household sorting in an ancient setting
in Journal of Urban Economics
Gupta A
(2022)
Consistent specification testing under spatial dependence
in Econometric Theory
Gupta A
(2023)
Efficient closed-form estimation of large spatial autoregressions
in Journal of Econometrics
Gupta A
(2022)
Nonparametric prediction with spatial data
in Econometric Theory
Gupta A
(2020)
Networks and information in credit markets
Description | The project has made significant advances in our understanding of how economic entities (e.g. people, organizations, governments) make decisions under situations where they are connected by a network. Networks can be geographic (via shared borders for instance) or via some economic or social connection such as belonging to similar social strata or taking decisions that have to also account for others' decisions. In the empirical line of research, the project has successfully defined a clear notion of a financial network in a specific type of financial market, namely the market for large, syndicated loans in the US. These loans are so big that a number of banks form a syndicate to share the burden amongst them. It is noted that banks often specialise in lending to specific sectors of the economy. We find that this sectoral specialisation has important network implications: banks with similar lending portfolios tend to correlate their lending decisions in good times, but this correlation disappears in bad times. This is consistent with profound theories of information, which predict that banks rely more on their own private information when the economy is marked by negativity and lack of trust, as is the case in a recession. The project has also taken a unique approach to understanding how households arrange themselves around some central amenity in a settlement, e.g. a marketplace. The project studies modern economic theories of such types of sorting using archaeological data from ancient Greece, specifically the island of Antikythera in the Mediterranean. We find evidence of sorting in two eras where clearly defined central locations existed and none in a purely agrarian era. This is consistent with theory, and the paper is now forthcoming in the top field journal. This is novel evidence of 'urban' sorting from the ancient world. The project has also made major theoretical advances. 
In a paper forthcoming in a leading field journal, a self-contained mathematical framework has been established for testing the correctness of a network model used by a practitioner. This covers a vast number of commonly employed models in one 'big tent' setup, lending an elegant generality to the theory. We apply the methods to empirical studies as diverse as the fighting effort exerted by factions in the Second Congo War (where networks are formed via alliances and enmities) and cross-country growth spillovers due to proximity. Another theoretical paper forthcoming in a leading field journal develops a new method and associated mathematical theory for prediction across space. This is applied to the prediction of house prices in Los Angeles. In yet another theoretical paper, published in the top field journal, a mathematical theory is developed that draws a deep connection between commonly used network models and a 'big data' framework. The paper shows that excellent estimation results are possible while avoiding many notorious pitfalls of complicated, or 'big', network models. The method is applied to the problem of co-location of venture capital firms and biotechnology firms and finds improvements over existing techniques in estimation accuracy. |
Exploitation Route | The outcomes can be taken forward in a variety of ways. For both practitioners and academics, the project delivers empirical and theoretical insights. For instance, it provides empirical confirmation that banks with similar types of exposure make correlated decisions in good times. This is important for policymakers because it suggests that a booming economy exhibits herd-like behaviour. A regulator, depending on context and circumstances, may wish to encourage or curtail such behaviour. The evidence on sorting in the ancient world is of particular interest to academics working in the field of urban economics, where this finding can lead to greater interest in ancient cities and their layout. The theoretical approaches developed in the project have profound implications for academics working in the field of econometric theory. For instance, the project develops an overarching framework that encompasses many network models and proposes a mathematically novel way of testing the correctness of such models. This is applicable in many other settings, and the connections with big data make the findings particularly 'modern'. |
Sectors | Environment Financial Services and Management Consultancy Government Democracy and Justice Culture Heritage Museums and Collections |
URL | https://sites.google.com/site/abhimanyugupta85/esrc-new-investigator-grant |
Description | Workshop |
Form Of Engagement Activity | Participation in an activity, workshop or similar |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Postgraduate students |
Results and Impact | An international workshop on 'Methods for Network Dependent Data' was held at the University of Essex on June 1st 2022. Attendees included leading researchers from the UK, US and Europe and PhD students. The programme can be found at the URL below. |
Year(s) Of Engagement Activity | 2022 |
URL | https://drive.google.com/file/d/1OFTu1xh9ZgkAFxsfaBqkwNya53iLWoI1/view?usp=sharing |