Who shaped computing?

Who Shaped Modern Computing — Part 2?

In part one, I asked the question “Who is big in computing?” and probed the answer by constructing a social network gleaned from the references listed below. As expected, the network is scale-free, meaning it contains a handful of highly connected nodes—people, places, and things—and a majority of sparsely connected nodes. Furthermore, the most-connected nodes in the social network are languages, John Backus, and Edsger Dijkstra.

The question really has two dimensions: How to decide what metric to use to judge a person’s influence on a field, and how to then apply the metric to determine who or what ranked highest in terms of the metric? Part one reported the ranking of nodes according to betweenness and found language, John Backus, John McCarthy, and Edsger Dijkstra at the top. But, is betweenness a suitable measure? Ostensibly, high betweenness indicates high influence, but how can we be sure? Might there be other metrics that do a better job of measuring influence? Is influence the same as impact?

Spectral Analysis

Google’s PageRank has become one of the most-used methods of determining importance of World Wide Web documents. It is based on spectral analysis; nodes are ranked according to their eigenvalue, which is obtained by computing the eigenvector of the network’s connection matrix. If the importance of a document in a bibliographic network of documents can be determined by URL links, then perhaps the same idea applies to importance of nodes in a social network. Higher eigenvalue means more importance. Applying this metric ranks language, John McCarthy, and Edsger Dijkstra as the top three nodes.

Social Influence

Intuitively, if node A is connected to node B then we can assume A influences B and vice versa. Actually, this is the rationale behind the wiring diagram of the social network under study. Links indicate influence, or potential influence, by the spreading of ideas, techniques, and products. Transitively, if A influences B and B influences C, then A influences C. Applying this rule recursively, we conclude every node influences every other node at some level. But how much influence transmits across many intermediary nodes?

Suppose we pick two nodes at random, R and B, and permanently color R red and B blue. Further suppose we repeat the simple rule: for every node X in the network (except R and B) count the number of adjacent nodes that are red and the number that are blue, and paint X red with probability #red/(#red+#blue). For example, if three adjacent nodes are red and two are blue, then X is painted red with probability 3/5. Count the number of red nodes after each sweep or pass over every node in the social network to determine its color. Next, iterate multiple (1000) passes over the network, and measure the percentage of passes where there are more red nodes than blue across the entire network. This percentage yields node R’s social influence.

I claim that if node R prevails and the number of nodes painted red exceeds the number painted blue, on average, then R has more influence on the entire network of nodes than does B. Repeating this process for all node pairs (R, B) yields an influence score for every node in the network. When this is done by computer simulation, language, John Backus, and John McCarthy come out on top. This suggests that the social influence of language, Backus, and McCarthy extends across all nodes—not just adjacent nodes.

Ideas Spread Like an Epidemic

Two metrics related to the red-blue influence metric described above are influence spread by cascading and spreading by random walks. Cascading starts by painting a seed node red, and then measuring the number of times links propagate red paint from node-to-node with a given probability. Nodes that are painted red more often than not are considered more influential due to their “infectiousness.” We can rank nodes according to these counts.

Random walk is even simpler: It releases a sample of tokens into the network at random nodes, allows them to randomly jump from node to node, and records the number of times each node and link is traversed by the random walking tokens. The number of times each node is visited is normalized to obtain a measure of “influence,” because some nodes are visited more often than others.

The cascading spread of influence produces John McCarthy, language, and Edsger Dijkstra as the top three influencers. The random walk simulation produces language, John Backus, and John McCarthy as the top three. Can you detect an emerging pattern, here? Language, Backus, Dijkstra, and McCarthy keep coming out on top.

Blocking Nodes

Another measure of importance similar to betweenness is the blocking node value, which is one if removal of node X separates the network into isolated components, and zero, otherwise. For example, if node John McCarthy is removed from the network, it becomes disconnected—the network splits into at least two islands or components, such that it is impossible for an influence in one component to spread to the other component(s). There are 55 blocking nodes in the network of 224 nodes. Removal of any one of these 55 nodes separates the network.

It is reasonable to assume a blocking node is more important than a non-blocking node, simply because a blocking node serves as a kind of gatekeeper. Remove the blocking node and influence is bottled up in one isolated part of the network. Language, John Backus, John McCarthy, and Edsger Dijkstra are among the blocking nodes. Deleting John Backus leaves the network fragmented into three components of size 203, 9, and 3, plus 9 single or isolated nodes.

All Together Now

Figure 1 shows the results for all of the measures applied in this analysis. Non-blocking nodes are filtered out, leaving the 55 most important people, places, and things to be sorted out according to one or more additional measures. Figure 1 ranks the network nodes according to the sum over all measures described here and in Part one. This produces a familiar top three list: language, Edsger Dijkstra, and John Backus.

Figure 1. Scores from all metrics described here: connectivity as measured by degree, betweenness as measured by flow, eigenvalues obtained from spectral analysis, simulated contagion-like cascades, simulated sentiment-like influence, simulated random walks, and normalized sum across all metrics versus blocking node. They produce similar results: Language, John Backus, Edsger Dijkstra, and John McCarthy wielded the biggest influence on the social network developed in part one.
Figure 1. Scores from all metrics described here: connectivity as measured by degree, betweenness as measured by flow, eigenvalues obtained from spectral analysis, simulated contagion-like cascades, simulated sentiment-like influence, simulated random walks, and normalized sum across all metrics versus blocking node. They produce similar results: Language, John Backus, Edsger Dijkstra, and John McCarthy wielded the biggest influence on the social network developed in part one.

Clearly, computer languages have played a critical role in the development of computer science. Without a language for expressing algorithms, hardly anything else matters. Languages made it possible for many more people to program computers, and they acted as mechanical levers—expediting software production. It may be unfair to lump them altogether, because there are so many languages and alternative paradigms embraced by them.

Edsger Dijkstra made many key contributions in his long career. Starting with Dijkstra’s shortest path algorithm written in 1960 to test the first ALGOL compiler, through structured programming and the “GO TO Considered Harmful” controversy in 1968, which were followed by the Dining Philosopher’s Problem, the invention of PV semaphores, and criticism of ADA and the U.S. Department of Defense’s “Star Wars” strategic defense initiative. Dijkstra vowed to remain bearded until finishing the ALGOL compiler in 1960, but never shaved it off after completing it in an amazing six weeks.

Similarly, John Backus had a dramatic impact on computing, not just for his leadership in developing FORTRAN, but also for his theoretical work on meta-languages (BNF), functional programming languages, and influence on McCarthy’s LISP. In the 1950s most people believed it impossible to produce a compiler capable of generating machine code as efficient as a human programmer, but FORTRAN remains to this day, a living proof that automatic translation can be both fast and efficient. While other languages come and go, FORTRAN is still widely used, today 58 years after it was released in 1957.

While languages, Dijkstra, and Backus take top spots, John McCarthy is number one according to several metrics and frequently among the top four, overall. McCarthy was the inventor of LISP, of course, but he was also the inventor of the linking loader for Compatible Time-Sharing System (CTSS) at MIT and was one of the earliest to promote time-sharing to manipulate lambda expressions in real time; LISP was the lambda expression language, and time sharing the provider of interactivity.

Sources

Daylight, Edgar G., The Dawn of Software Engineering from Turing to Dijkstra, Lonely Scholar bvba, Belgium, 2012, 239pp.

Denning, Peter, and Craig H. Martel, Great principles of computing, MIT Press, 2015, 302pp.

Lammers, Susan, Programmers at Work: Interviews with 19 programmers who shaped the computer industry, Microsoft Press, 1986, 1989, 394pp.

Lohr, Steve, GO TO: The story of math majors, bridge players, engineers, chess wizards, maverick scientists and iconoclasts – the programmers who created the software revolution, Basic Books, 2001, 250pp.

Shasha, David, and Cathy Lazere, Out of their minds: The lives and discoveries of 15 great computer scientists, Copernicus-Springer-Verlag, New York, NY, 1998, 291pp.