** Welcome
*** { @unit = “”, @title = “Meet Your Instructor”, @lecture, @foldout }
Hi there! My name is Jacob Young and I will be your instructor for this course. For the last two decades I have researched and taught about social networks in a variety of contexts: adolescent friendship groups, incarcerated men and women, police officers, and academics. I am passionate about social network analysis and I hope to pass that passion on to you in this course.
*** { @unit = “”, @title = “Social Network Analysis and the R Toolkit”, @reading, @lecture, @foldout }
Network science is an approach to science that views the world as being composed of systems of actors connected through relational ties (i.e., a network). Network science takes these relational structures as the primary domain of interest. In so doing, research questions take the following forms: How does the network matter? What affects the network? Network analysis is the set of tools used to study relational variables: a set of methods for systematically understanding and identifying connections among actors. This course will introduce you to these tools and their application to problems in the field of criminology and criminal justice.
In this course you will learn how to use R and R Studio to import, analyze, and report on social networks.
R is a 30-year-old statistical language created by statisticians Robert Gentleman and Ross Ihaka as a free alternative to proprietary software for their students at the University of Auckland in New Zealand. Its lineage runs through the S language developed at Bell Labs, an institution that traces its own origins to inventor and scientist Alexander Graham Bell.
In this course we cover the foundations of social network analysis and show how to implement these topics with the R language. In order to create robust and dynamic analysis we need to use a couple of tools that were built to leverage the power of R and create compelling narratives.
RStudio helps you manage projects by organizing files, scripts, packages and output. Markdown is a simple formatting convention that allows you to create publication-quality documents. R Markdown is a specific version of Markdown that allows you to combine text and code to create data-driven documents.
The following resources will help you get a better understanding of these tools.
Chapter 1: Core R: Learning the basics of R
Chapter 2: RStudio: RStudio’s functionality and features
Data-Driven Docs: How R Markdown is used for interactive and dynamic reports
A Guide to Markdown: How to use Markdown - the easy-to-learn formatting syntax
You will get plenty of practice with these tools and submit your labs as knitted R Markdown (.Rmd) files.
View R Markdown in action in the below image:
*** { @unit = “”, @title = “Videos”, @lecture, @foldout }
RStudio is a graphical user interface (GUI) and integrated development environment (IDE) that makes it much easier to use R for writing code, importing data, installing packages, and more.
The following video provides a tour of the RStudio interface and key components for getting started.
Visit the video to navigate using timestamps in the description or bookmarks in the progress bar.
Markdown is a “lightweight”, easy-to-learn syntax that allows you to format text with boldface, italicization, bullet points, and more, even when there’s no “rich content editor” menu available.
Websites and applications that support Markdown may surprise you, including:
The following video provides a brief introduction to Markdown fundamentals.
Visit the video to navigate using timestamps in the description or bookmarks in the progress bar.
GitHub Issues allow you to quickly troubleshoot issues with instructors and peers by sharing code, reproducing errors, and thoroughly explaining complications as you learn R.
The following video provides a tutorial for using GitHub Issues.
Visit the video to navigate using timestamps in the description or bookmarks in the progress bar.
R Markdown is one of the most powerful tools you’ll learn. It allows the synthesis of human language and code to perform processing and analysis tasks while explaining them to broad audiences.
The following video provides a tutorial and demonstration of R Markdown.
Visit the video to navigate using timestamps in the description or bookmarks in the progress bar.
*** { @unit = “”, @title = “Getting Help”, @reading, @foldout }
Social network analysis is a very social endeavor and real-world analytics projects are almost always collaborative.
This course is designed to be interactive, and a lot of learning occurs by practicing the technical jargon from the field and learning how to talk about data and models.
Learning how to seek help and use discussion boards will accelerate learning and facilitate collaboration. Social coding tools like GitHub use these features extensively.
This course is going to throw a lot at you, but also provide a lot of support. Over these first couple of weeks feel free to reach out for anything you might need!
If you find something confusing, let us know (likely others will find it confusing as well).
As a general rule of thumb, if you are stuck, need clarification about what the question is asking, want to make sure you understand a formula, or are having similar issues then the help discussion page is the easiest and quickest way to get help. If you are confused about concepts or having a hard time even formulating your question, then virtual office hours are your best option.
Note that the discussion board is hosted through the GitHub Issues feature. It is a great forum because:
Please preview your responses before posting to ensure proper formatting. Note that you format code by placing fences around the code:
```
# your code here
lm( y ~ x1 + x2 )
```
The fences are made of three back-ticks. These look like quotation marks, but are actually the character at the top left of your keyboard (if you have a US or European keyboard) and shared with the tilde (~).
```
y = b0 + b1•X1 + b2•X2 + e
b1 = cov(x,y) / var(x)
```
If all of this looks foreign to you, that is perfectly fine! By the end of the course you will know exactly what all of this means.
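For the curious, the slope formula in the fenced example above can be checked against `lm()` directly. The sketch below uses simulated data; the variable names and the true coefficients (2 and 3) are invented purely for illustration and are not part of any lab.

```r
# Simulate a simple linear relationship (values are arbitrary assumptions)
set.seed(1)
x <- rnorm(50)
y <- 2 + 3 * x + rnorm(50)

# Slope estimated by ordinary least squares
fit <- lm(y ~ x)
slope_lm <- unname(coef(fit)["x"])

# Slope computed by hand: b1 = cov(x, y) / var(x)
slope_formula <- cov(x, y) / var(x)

# The two should agree up to floating-point tolerance
all.equal(slope_lm, slope_formula)
```

Note that the hand formula applies to a regression with a single predictor; with two predictors, as in the first fenced line, the coefficients are no longer simple covariance ratios.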
*** { @unit = “”, @title = “Checklist”, @assignment, @foldout }
The following checklist will help you organize and prepare for success in this course.
** Week 1 - Introduction to Social Network Analysis
*** { @unit = “”, @title = “Unit Overview”, @reading, @foldout }
This unit introduces the fundamentals of social network analysis. This unit also familiarizes you with working with R and RStudio.
Once you have completed this section you will be able to answer the following questions:
This lab will give you the opportunity to start thinking about networks and get your “feet wet” in social network analysis. For the first lab, you will construct several networks of your own that you will work with throughout the course.
*** { @unit = “”, @title = “Readings”, @reading, @foldout }
Required reading for this unit includes:
*** { @unit = “”, @title = “Checklist”, @assignment, @foldout }
The following checklist will help you stay organized in your first week.
*** { @unit = “FRI January 12”, @title = “Discussion Topic”, @assignment, @foldout }
We will be using a discussion board called Yellowdig for this course. For your first discussion post, I would like you to introduce yourself to the class by telling us:
Please post your reflection as a new pin on Yellowdig.
Note: You get points on Yellowdig by interacting with content. That means creating new posts and participating in posts that your classmates create. Your Yellowdig posts are due on Friday, but you gain points throughout the week by participating in discussions.
You can earn up to 20 points a week, and points reset on Fridays. You need to earn 100 points throughout the seven-week term, which means averaging about 15 points a week (100 / 7 ≈ 14.3).
*** { @unit = “TUES January 16”, @title = “Lab 1”, @assignment, @foldout }
This lab will give you the opportunity to start thinking about networks and get your “feet wet” in social network analysis. For the first lab, you will construct several networks of your own that you will work with throughout the course.
** Week 2 - Introduction to R and Social Network Data in R
*** { @unit = “”, @title = “Unit Overview”, @reading, @foldout }
This section will focus on how we represent network data as matrices. We will also focus on creating networks in R as well as how we visualize networks.
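To make the matrix representation concrete, here is a minimal base R sketch of an adjacency matrix for an undirected network. The five actors and their ties are hypothetical, and the course labs may build networks with dedicated packages rather than by hand like this.

```r
# Hypothetical five-person friendship network (names invented for illustration)
actors <- c("Ana", "Ben", "Cal", "Dee", "Eli")

# Start with an empty 5 x 5 matrix: rows are senders, columns are receivers
adj <- matrix(0, nrow = 5, ncol = 5, dimnames = list(actors, actors))

# Each row of `ties` is one undirected tie between two actors
ties <- rbind(c("Ana", "Ben"),
              c("Ana", "Cal"),
              c("Ben", "Cal"),
              c("Cal", "Dee"))

# An undirected tie fills both cell [i, j] and cell [j, i]
for (k in seq_len(nrow(ties))) {
  adj[ties[k, 1], ties[k, 2]] <- 1
  adj[ties[k, 2], ties[k, 1]] <- 1
}

adj                       # print the matrix
isSymmetric(adj)          # undirected networks give symmetric matrices
n_edges <- sum(adj) / 2   # each tie is counted twice in the matrix
```

In this example Eli has no ties, so Eli's row and column contain only zeros: an isolate still occupies a row and a column in the matrix.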
Once you have completed this section you will be able to:
Required:
Lab 2 will build on your work in Lab 1 by having you reconstruct your network in R and create a visualization of that network.
*** { @unit = “”, @title = “Readings”, @reading, @foldout }
Required:
*** { @unit = “”, @title = “Checklist”, @assignment, @foldout }
The following checklist will help you stay organized in your second week.
*** { @unit = “FRI January 19”, @title = “Discussion Topic”, @assignment, @foldout }
As you work through the materials this week, I want you to keep in mind some of the difficulties that arise when working with criminal justice records to construct network data. A great review of some of these issues appears in the article Using social network analysis to study crime: Navigating the challenges of criminal justice records by David Bright, Russell Brewer, and Carlo Morselli. As they state in the paper, “Much like archeologists who deal with incomplete data, criminal network researchers must ‘dig’ to access relevant data, prepare the artefacts for analysis in the knowledge that such artefacts are but a sample, and engage in analysis and interpretation of such artefacts giving due consideration to the limits inherent in the artefacts under study.”
For your discussion this week, think about your own experience with data and describe the extent to which you have encountered such issues as those described in the article. If you have not encountered any, think about how the topics mentioned in the article might apply to future work you do in your field.
Please post your reflection as a new pin on Yellowdig.
*** { @unit = “TUES January 23”, @title = “Lab 2”, @assignment, @foldout }
The purpose of this lab is to familiarize yourself with how networks are created and visualized in R.
This lab has a template. Click to download the lab template. Modify the template using the instructions and submit your assignment.
** Week 3 - Centrality
*** { @unit = “”, @title = “Unit Overview”, @reading, @foldout }
How do we know whether a node is important in a network? How can we compare the structure of different networks? These are key questions in network analysis and this week we will start to think about how we describe networks. One of the most popular concepts in network analysis is centrality. That is, important nodes are those who are central. Also, we can compare networks by examining how they differ (or are similar) based on the distribution of centrality scores. This section introduces the concept of centrality, focusing specifically on degree centrality. Next week we will shift to two alternative measures of centrality, closeness and betweenness.
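As a rough illustration of degree centrality, the base R sketch below computes degree from an adjacency matrix. Both example networks are hypothetical, and the centralization formula shown is Freeman's degree centralization for an undirected network; the packages used in the lab may offer other normalizations.

```r
# Hypothetical undirected network stored as a symmetric adjacency matrix
actors <- c("Ana", "Ben", "Cal", "Dee", "Eli")
adj <- matrix(c(0, 1, 1, 0, 0,
                1, 0, 1, 0, 0,
                1, 1, 0, 1, 0,
                0, 0, 1, 0, 0,
                0, 0, 0, 0, 0),
              nrow = 5, byrow = TRUE, dimnames = list(actors, actors))

# Degree centrality: the number of ties each node has
deg <- rowSums(adj)

# Standardized degree: share of the n - 1 possible ties that are present
n <- nrow(adj)
std_deg <- deg / (n - 1)

# Freeman degree centralization (undirected): how far the network is
# from the most centralized case, a star
centralization <- sum(max(deg) - deg) / ((n - 1) * (n - 2))

# Hypothetical directed network: rows send ties, columns receive them
nodes <- c("A", "B", "C")
dir_adj <- matrix(c(0, 1, 1,
                    0, 0, 1,
                    0, 0, 0),
                  nrow = 3, byrow = TRUE, dimnames = list(nodes, nodes))
outdeg <- rowSums(dir_adj)  # ties sent
indeg  <- colSums(dir_adj)  # ties received
```

In the undirected example Cal has the highest degree (3) and Eli, an isolate, has degree 0. In the directed example the row sums and column sums differ, which is why indegree and outdegree are reported separately.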
Once you have completed this section you will be able to:
Required:
Lab 3 provides an opportunity to familiarize yourself with calculating degree centrality and degree centralization scores for undirected and directed networks in R.
You will use data from two sources:
*** { @unit = “”, @title = “Readings”, @reading, @foldout }
Required:
*** { @unit = “”, @title = “Checklist”, @assignment, @foldout }
The following checklist will help you stay organized in your third week.
*** { @unit = “FRI January 26”, @title = “Discussion Topic”, @assignment, @foldout }
This week we focused on degree centrality as a tool for describing networks. In the article Vertical organizations, flat networks: Centrality and criminal collaboration in the Italian-American Mafia by Andrew Krajewski, Daniel DellaPosta, and Diane Felmlee, they use degree centrality to measure social status.
For your discussion this week, describe how well you think this measure captures the concept they are interested in examining. Also, think back to the discussion for Week 2. What are some of the limitations of these data that may have influenced the findings?
*** { @unit = “TUES January 30”, @title = “Lab 3”, @assignment, @foldout }
Lab 3 provides an opportunity to familiarize yourself with calculating degree centrality and degree centralization scores for undirected and directed networks in R.
You will use data from two sources:
Click to download the lab template. Modify and submit using the instructions.
** Week 4 - Closeness and Betweenness Centrality
*** { @unit = “”, @title = “Unit Overview”, @reading, @foldout }
In Week 3, you were introduced to the concept of centrality and asked to think about the following questions: “how do we know whether a node is important in a network?” and “how can we compare the structure of different networks?” We focused on degree as an indicator of whether a node is central. This week, we will continue to think about how we describe nodes and networks through the lens of centrality. However, we will now focus on two different ways of conceptualizing what it means to be central in a network. We will examine closeness and betweenness centrality and contrast them with degree centrality.
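The contrast among the three measures can be sketched in base R on a small hypothetical network. This is an illustration only: it assumes a connected, undirected, unweighted graph, counts shortest paths with matrix powers rather than the faster algorithms used by network packages, and standardizes closeness as (n - 1) divided by the sum of distances, which is one common convention among several.

```r
# Hypothetical connected, undirected network
actors <- c("Ana", "Ben", "Cal", "Dee", "Eli")
n <- length(actors)
adj <- matrix(0, n, n, dimnames = list(actors, actors))
ties <- rbind(c("Ana", "Ben"), c("Ana", "Cal"), c("Ben", "Cal"),
              c("Cal", "Dee"), c("Dee", "Eli"))
adj[ties] <- 1           # fill cell [i, j] for every tie
adj[ties[, 2:1]] <- 1    # and cell [j, i], because ties are undirected

# dist[i, j]  = length of the shortest path between i and j
# sigma[i, j] = number of distinct shortest paths between i and j
dist  <- matrix(Inf, n, n, dimnames = list(actors, actors))
sigma <- matrix(0,   n, n, dimnames = list(actors, actors))
diag(dist)  <- 0
diag(sigma) <- 1
walks <- diag(n)
for (k in seq_len(n - 1)) {
  walks <- walks %*% adj                  # counts walks of length k
  newly <- is.infinite(dist) & walks > 0  # pairs first reached at step k
  dist[newly]  <- k
  sigma[newly] <- walks[newly]
}

# Closeness: inverse of total distance to all others, scaled by n - 1
closeness <- (n - 1) / rowSums(dist)

# Betweenness: share of shortest paths between other pairs that pass through v
betw <- setNames(numeric(n), actors)
for (v in seq_len(n)) {
  for (i in seq_len(n - 1)) {
    for (j in (i + 1):n) {
      if (i == v || j == v) next
      if (dist[i, v] + dist[v, j] == dist[i, j]) {
        betw[v] <- betw[v] + sigma[i, v] * sigma[v, j] / sigma[i, j]
      }
    }
  }
}

degree <- rowSums(adj)
```

In this example Cal ranks first on all three measures, but Dee shows why the measures can disagree: Dee has the same degree as Ana and Ben (2) yet a betweenness of 3 versus their 0, because Dee is the only route to Eli.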
Once you have completed this section you will be able to:
Lab 4 focuses on familiarizing you with calculating closeness centrality and betweenness centrality scores, as well as centralization scores, for undirected and directed networks in R. We will revisit the networks we used in Lab 3 - Degree Centrality and Centralization to assess how different centrality measures tell us a different (or the same) story about what it means to be “central” in a network.
*** { @unit = “”, @title = “Readings”, @reading, @foldout }
Required:
*** { @unit = “”, @title = “Checklist”, @assignment, @foldout }
The following checklist will help you stay organized in your fourth week.
*** { @unit = “FRI February 2”, @title = “Discussion Topic”, @assignment, @foldout }
In criminology and criminal justice, much attention is focused on disrupting networks. That is, trying to disconnect a network so that it is less functional. For discussion this week, I would like you to read the paper Disrupting resilient criminal networks through data analysis: The case of Sicilian Mafia by Lucia Cavallaro and colleagues.
How does betweenness centrality operate as an intervention procedure in their study? What might the interventions look like if one were to use degree centrality or closeness centrality instead? Would they differ from the betweenness centrality intervention?
*** { @unit = “TUES February 6”, @title = “Lab 4”, @assignment, @foldout }
Lab 4 focuses on familiarizing you with calculating closeness centrality and betweenness centrality scores, as well as centralization scores, for undirected and directed networks in R. We will revisit the networks we used in Lab 3 - Degree Centrality and Centralization to assess how different centrality measures tell us a different (or the same) story about what it means to be “central” in a network.
Click to download the lab template.
** Week 5 - Bipartite Graphs and Two-Mode Networks
*** { @unit = “”, @title = “Unit Overview”, @reading, @foldout }
So far, we have worked with networks that have one set of nodes and one set of edges. But, not all of the networks we want to examine have a single node set. More complex relational structures have multiple partitions of node sets. Bipartite graphs allow us to represent networks that have two partitions of nodes. This section of the course will introduce bipartite graphs and get you started working with two-mode networks.
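A two-mode network is commonly stored as an incidence matrix, with one node set on the rows and the other on the columns. The base R sketch below uses invented people and organizations (not the lab data) to show the idea.

```r
# Hypothetical two-mode network: people (rows) by organizations (columns)
people <- c("Ana", "Ben", "Cal", "Dee")
groups <- c("Club", "Lodge", "Committee")
inc <- matrix(c(1, 0, 0,
                1, 1, 0,
                0, 1, 1,
                0, 0, 1),
              nrow = 4, byrow = TRUE, dimnames = list(people, groups))

dim(inc)                      # 4 actors by 3 organizations: not square

# Degree is computed separately for each node set
memberships <- rowSums(inc)   # how many organizations each person belongs to
group_size  <- colSums(inc)   # how many members each organization has

# Density: observed ties divided by the rows x columns possible ties
density <- sum(inc) / (nrow(inc) * ncol(inc))
```

Unlike an adjacency matrix, the incidence matrix is rectangular, and ties only run between the two node sets: there are no person-to-person or organization-to-organization cells.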
By the end of this unit you will be able to:
Lab 5 will provide the opportunity to continue to familiarize yourself with bipartite graphs and two-mode networks in R. You will work with two networks.
First, you will use data collected from Paul Revere’s Ride, by David Fischer. In the book, Fischer documents Revere’s connections through various affiliations in locations and how these influenced history. The Paul Revere conspiracy dataset concerns relationships between 254 people and their affiliations with seven different organizations in Boston. The network is two-mode, with 254 actors and 7 organizations (“events”).
Second, you will use the Philippine Kidnappings Data, a collection of relationships on the Abu Sayyaf Group (ASG), a violent non-state actor operating in the Southern Philippines. In particular, these data relate to the Salafist movement founded in 1991 by Abdurajak Janjalani, a native terrorist of the Southern Philippines. ASG is active in kidnapping and other terrorist attacks. The network is two-mode, with 246 actors (i.e., terrorist kidnappers) and 105 terrorist events the actors attended.
*** { @unit = “”, @title = “Readings”, @reading, @foldout }
Required:
*** { @unit = “”, @title = “Checklist”, @assignment, @foldout }
The following checklist will help you stay organized in your fifth week.
*** { @unit = “FRI February 9”, @title = “Discussion Topic”, @assignment, @foldout }
One of the most discussed topics in the study of illicit or covert networks is the “efficiency/security trade-off”. As discussed in the article The efficiency/security trade-off in criminal networks, these organizations have to decide whether to emphasize efficiency or security, both of which influence network structure.
For your discussion this week, think about a network that you are interested in studying (or are currently studying) and discuss the “efficiency/security trade-off”. Is your network more efficiency focused or security focused? Or does it depend?
*** { @unit = “TUES February 13”, @title = “Lab 5”, @assignment, @foldout }
Lab 5 will provide the opportunity to continue to familiarize yourself with bipartite graphs and two-mode networks in R. You will work with two networks.
First, you will use data collected from Paul Revere’s Ride, by David Fischer. In the book, Fischer documents Revere’s connections through various affiliations in locations and how these influenced history. The Paul Revere conspiracy dataset concerns relationships between 254 people and their affiliations with seven different organizations in Boston. The network is two-mode, with 254 actors and 7 organizations (“events”).
Second, you will use the Philippine Kidnappings Data, a collection of relationships on the Abu Sayyaf Group (ASG), a violent non-state actor operating in the Southern Philippines. In particular, these data relate to the Salafist movement founded in 1991 by Abdurajak Janjalani, a native terrorist of the Southern Philippines. ASG is active in kidnapping and other terrorist attacks. The network is two-mode, with 246 actors (i.e., terrorist kidnappers) and 105 terrorist events the actors attended.
Click to download the lab template. Modify and submit using the instructions.
** Week 6 - Network Projection
*** { @unit = “”, @title = “Unit Overview”, @reading, @foldout }
As we saw in the last section, networks with complex node sets can be represented using bipartite graphs. A common approach in research is to reduce a bipartite graph to a unipartite graph so as to use the tools developed for networks with a single set of nodes. Projection is the process by which we map the connectivity between modes to a single mode. This week will focus on network projection.
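Projection can be sketched with matrix multiplication: multiplying an incidence matrix by its transpose yields a one-mode matrix of shared affiliations. The example below is a minimal base R illustration on invented data; the lab may use package functions for the same task.

```r
# Hypothetical two-mode network: people (rows) by organizations (columns)
people <- c("Ana", "Ben", "Cal", "Dee")
groups <- c("Club", "Lodge", "Committee")
inc <- matrix(c(1, 0, 0,
                1, 1, 0,
                0, 1, 1,
                0, 0, 1),
              nrow = 4, byrow = TRUE, dimnames = list(people, groups))

# Person-by-person projection: cell [i, j] counts the organizations
# that persons i and j have in common
person_proj <- inc %*% t(inc)

# Organization-by-organization projection: cell [i, j] counts the
# members that organizations i and j share
group_proj <- t(inc) %*% inc

# The diagonal holds each node's own count (memberships or members),
# so it is usually set to zero before treating the result as a network
person_net <- person_proj
diag(person_net) <- 0

# Optionally dichotomize: a tie exists if at least one affiliation is shared
person_bin <- (person_net > 0) * 1
```

Here Ben and Cal are tied in the person projection because both belong to the Lodge, while Ana and Dee share no organization and remain unconnected. Dichotomizing discards how many affiliations a pair shares, which is one of the trade-offs to keep in mind when projecting.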
By the end of this unit you will be able to:
Lab 6 will familiarize you with projecting bipartite graphs to unipartite graphs in R. For this lab, we will revisit the networks we used in Lab 5 - Bipartite Graphs and Two-Mode Networks.
*** { @unit = “”, @title = “Readings”, @reading, @foldout }
*** { @unit = “”, @title = “Checklist”, @assignment, @foldout }
The following checklist will help you stay organized in your sixth week.
*** { @unit = “FRI February 16”, @title = “Discussion Topic”, @assignment, @foldout }
What good is a network analysis if we can’t put it into action? A common network intervention is the “group-based violence intervention” where the goal is to use the network to disseminate credible threats to actors in a network.
An example of this approach is discussed in the article Choosing Representatives to Deliver the Message in a Group Violence Intervention by Andrew Wheeler, Sarah McLean, Kelly Becker, and Robert Worden.
For your discussion this week, review the article above and think about how such an intervention might be used in a network you are interested in studying (or currently studying). For this article, don’t get lost in the details. Think big picture in terms of what the “group-based violence intervention” model does and how analysts go about locating the individuals who should disseminate the message.
*** { @unit = “TUES February 20”, @title = “Lab 6”, @assignment, @foldout }
Lab 6 will familiarize you with projecting bipartite graphs to unipartite graphs in R. For this lab, we will revisit the networks we used in Lab 5 - Bipartite Graphs and Two-Mode Networks.
Click to download the lab template. Modify and submit using the instructions.
** FINAL PROJECT
*** { @unit = “”, @title = “Checklist”, @assignment, @foldout }
The following checklist will help you stay organized for your final week.
*** { @unit = “TUESDAY February 27”, @title = “Final Project”, @assignment, @foldout }
The final project will use all of the information you have learned in this course to create a report on a network. For the final project, you will use data from the City of Phoenix Open Data Portal. Specifically, you will use co-arrest data. These data represent incidents where individuals were arrested together.
For the final project, imagine that you work for a police department and your supervisor has asked you to create a report on co-offending networks.
Download the recommended template for your final project using the link below.