From Wayback Machine to Yesternet: New Opportunities for Social Science
William Arms, Dan Huttenlocher, Jon Kleinberg, Michael Macy, and David Strang
William Arms, Dan Huttenlocher, Jon Kleinberg
Department of Computer Science, Cornell University, USA
Michael Macy, David Strang
Department of Sociology, Cornell University, USA
Corresponding author: Michael Macy, mwm14@cornell.edu
Social scientists have stores of data on individuals and groups but relatively little on social interactions, the basis of all social life. That is likely to change with the spread of computer-mediated interactions that leave a digital record. The flood of information available online, from corporate web pages to newsgroups, wikis, and blogs, has the potential to open new frontiers in social science research on the diffusion of innovations and beliefs, the self-organization of online communities, and the collective behavior of individuals. The Cornell Yesternet project will create a laboratory for social science research based on the Internet Archive's 40-billion-page Web collection, which consists of snapshots of the Web captured and archived every two months for nearly ten years. The project will copy large portions of this massive collection and reconfigure them as a relational database that can be used for research on social and information networks. The Cornell team, composed of social, computer, and information scientists, will develop, test, and refine the necessary tools through a series of testbed research applications that track the diffusion of innovation on the Web.
