Data Mining Email to Discover Organizational Networks and Emergent Communities in Work Flows
The social graph above shows the email flows amongst a large project team. It is an x-ray of how the project actually works! Each person on the team is represented by a node. Each node is colored according to the person's department -- red, blue, or green. Yellow nodes are consultants and other specialists hired to work on this project. Grey nodes are not formal team members but are external experts consulted during the project.
The client's I/T department gathered the email data and provided a snapshot every month of the project. Only information in an email's To: and From: fields was gathered. The Subject: line and the actual content of the email were ignored. Only emails addressed to individuals were used. Emails addressed to large distribution lists were disregarded. A grey link is drawn between two nodes if the two people sent email to each other at least weekly.
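The link rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the tooling used on the project: the function name build_links, the (sender, recipients) record layout, and the list_addresses set are hypothetical, and "weekly or higher frequency" is read here as each person averaging at least one message per week to the other over the snapshot period.

```python
from collections import Counter

def build_links(messages, weeks, list_addresses=frozenset(), min_per_week=1.0):
    """Return undirected links between people who email each other regularly.

    messages: iterable of (sender, recipients) pairs taken from the From:
              and To: fields only (hypothetical record layout).
    weeks:    length of the snapshot period in weeks.
    """
    sent = Counter()
    for sender, recipients in messages:
        for recipient in recipients:
            # Assumption: distribution lists are known in advance and skipped.
            if recipient in list_addresses or recipient == sender:
                continue
            sent[(sender, recipient)] += 1

    threshold = min_per_week * weeks
    links = set()
    for (a, b), count in sent.items():
        # Assumption: a link needs traffic in BOTH directions at the threshold.
        if count >= threshold and sent.get((b, a), 0) >= threshold:
            links.add(frozenset((a, b)))
    return links

# Made-up four-week snapshot: ann and bob write to each other weekly,
# cat writes to ann often but rarely hears back.
example = (
    [("ann", ["bob"])] * 4
    + [("bob", ["ann"])] * 5
    + [("cat", ["ann"])] * 4
    + [("ann", ["cat", "all-staff"])]
)
example_links = build_links(example, weeks=4, list_addresses={"all-staff"})
```

With this made-up data only the ann-bob pair qualifies, because cat's messages to ann are not reciprocated at the weekly rate.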
In addition to the network visualizations, network metrics were generated to see how well the various departments and groups were interacting. We used the E/I Ratio to compare the external flows between formal groups with the internal flows within them. We also applied cluster analysis to see the emergent informal groups that self-organized as the project progressed. We took several "SNApshots" over time to view the emerging changes.
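The article does not give the exact formula behind the E/I Ratio. One common formulation is the Krackhardt and Stern E-I index, (E - I) / (E + I), where E counts links that cross group boundaries and I counts links inside a group. The sketch below assumes that formulation; the function names and the sample data are hypothetical.

```python
def ei_index(links, group_of):
    """E-I index for the whole network: -1 is all internal, +1 is all external."""
    external = internal = 0
    for a, b in links:
        if group_of[a] == group_of[b]:
            internal += 1
        else:
            external += 1
    total = external + internal
    return (external - internal) / total if total else 0.0

def ei_index_by_group(links, group_of):
    """E-I index per formal group, counting each group's own links."""
    counts = {}  # group -> [external, internal]
    for a, b in links:
        ga, gb = group_of[a], group_of[b]
        if ga == gb:
            counts.setdefault(ga, [0, 0])[1] += 1
        else:
            # A cross-group link is external for both groups it touches.
            counts.setdefault(ga, [0, 0])[0] += 1
            counts.setdefault(gb, [0, 0])[0] += 1
    return {g: (e - i) / (e + i) for g, (e, i) in counts.items()}

# Made-up example: two departments joined by a single bridging link.
group_of = {"a1": "blue", "a2": "blue", "b1": "green", "b2": "green"}
links = [("a1", "a2"), ("b1", "b2"), ("a2", "b1")]

overall = ei_index(links, group_of)            # (1 - 2) / 3
per_group = ei_index_by_group(links, group_of)
```

Under this reading, a department whose score stays near -1 from one snapshot to the next is talking mostly to itself, and a score that rises over time means more cross-department flow.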
The project x-rays began after a key milestone was missed in the 4th month of the project. The x-rays continued for the next 11 months. The project leadership reviewed the network maps and metrics each month to monitor the health of the project. No further milestones or deadlines were missed.
The above diagram shows the project network soon after the missed deadline. Notice the clustering around formal departments -- blues interacting with blues, greens interacting with greens. Several of the hubs in this network were under-performing and often came across as bottlenecks. Project managers saw the need for more direct integration between the departments. One of the solutions was very simple, yet effective -- co-location of more project team members. A surprising solution in the age of the Internet! This intervention, along with others, improved the information flow, and reduced the communication load on the hubs, whose performance improved later in the project.
One of the interesting "side effects" of this project was the discovery of the great connectedness of the yellow nodes -- the outside contractors -- in this project. They were more integrated in the knowledge flows of this project than any other group -- they reached more people, over shorter paths, than any other group. Of course, the bad news is that these contractors will all leave at the end of the project, and the company will no longer have access to their knowledge. The company did not want to lose key knowledge from, and about, the project. They set up regular knowledge-sharing sessions where key network nodes would share their wisdom, experience and learning about the project. This allowed the knowledge to flow from the well-connected contractors back into the organization.
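The comparison of reach and path length across groups can be illustrated with a breadth-first search from each person. This is a minimal sketch, not the metric actually used in the project: the function names, the two-step reach horizon, and the sample network are assumptions made for illustration.

```python
from collections import defaultdict, deque

def bfs_distances(adj, start):
    """Shortest path length (in links) from start to every reachable node."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbor in adj[node]:
            if neighbor not in dist:
                dist[neighbor] = dist[node] + 1
                queue.append(neighbor)
    return dist

def reach_by_group(links, group_of, max_steps=2):
    """Per group: (mean number of people reached within max_steps,
    mean shortest path length to the people each member can reach)."""
    adj = defaultdict(set)
    for a, b in links:
        adj[a].add(b)
        adj[b].add(a)

    per_group = defaultdict(list)
    for person, group in group_of.items():
        dist = bfs_distances(adj, person)
        others = [d for node, d in dist.items() if node != person]
        reached = sum(1 for d in others if d <= max_steps)
        mean_path = sum(others) / len(others) if others else float("inf")
        per_group[group].append((reached, mean_path))

    return {
        group: (
            sum(r for r, _ in rows) / len(rows),
            sum(p for _, p in rows) / len(rows),
        )
        for group, rows in per_group.items()
    }

# Made-up network: one contractor (yellow) bridges two departments.
group_of = {"y1": "yellow", "b1": "blue", "b2": "blue", "g1": "green", "g2": "green"}
links = [("y1", "b1"), ("y1", "b2"), ("y1", "g1"), ("b1", "b2"), ("g1", "g2")]
result = reach_by_group(links, group_of)
```

In this toy network the yellow node reaches all four colleagues within two steps at an average distance of 1.25, while the blue and green members reach three on average over longer paths. That is the pattern the project maps showed for the contractors.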