Room for Doubt: 2012

Friday, December 28, 2012

Turing's Revolution

In my post Alan Turing Year, I discussed how it's important for people to be aware of Turing's intellectual contributions to math and computer science, as well as his war efforts and tragic death. It's time to live up to my own call and do my part. In this post I'll talk about Turing's impact, from his best known contributions to his lesser known, but important ones.

I should note that the content of this post is largely stolen from answers to a question I asked on the StackExchange (SE) cstheory site, where I sought to gather a compendium of Turing's less known results. The answers made a collection of impressive contributions, some of which I wasn't familiar with. Some of his contributions turned out to be surprisingly modern. They also introduced us to viewing the world through an algorithmic lens.

Without further ado, here they are:

1. Defining Turing Machines and Showing the Existence of UTMs (1936): A Turing machine is a universal computing device that forms the basis of how we think about computing. A Universal Turing machine is a fixed machine that can simulate any other Turing machine -- most relevant to our current computers. (SE link)

a Lego Turing Machine, photo from wikipedia

2. The Church-Turing Thesis (1939): The idea that the Turing Machine (and Lambda calculus) captures all of computation, and is therefore worth studying. This hypothesis cannot be proven, but everyone believes it nonetheless. (wiki link)

3. Solving the Entscheidungsproblem (1936): Turing solved Hilbert's "decision problem" by showing that deciding whether a program will halt is not possible in general. (wiki link)

4. The Turing Test and Introducing AI (1950): Turing gave an objective test that decides whether a computer exhibits intelligence. The idea is to say that a computer is intelligent if it can fool humans into thinking it is also human. Accepting this idea is a way to avoid many philosophical arguments about the nature of intelligence. (wiki link)

5. Introducing Oracles and Relativization into Computability Theory (1939): Oracles are incredibly useful for all sorts of analysis -- for example the famous Baker-Gill-Solovay result that says P vs NP cannot be separated any technique that relativizes (holds for relative to any oracle), eg diagnolization. It was part of the paper "Computing machinery and intelligence" that launched AI. (SE link)

6. The Bombe (1950): Also called the Turing-Welchman Bombe, for breaking the German enigma machine in WW2 and introducing tools to cryptanalysis. (SE link)

the Bombe, from http://www.uh.edu/engines/epi2666.htm

7. Randomized Algorithms (1950): Turing observed the potential advantage an algorithm can gain from having access to randomness. (SE link)

8. The First Artificial Neural Network (1948): This idea is an important one in AI and machine learning. Turing's design predated Rosenblatt's preceptron. (SE link)

9. The Chemical Basis of Morphogenisis (1952): In his most cited paper, he began to tackle the question of how a symmetric embryo develops into an assymetric organism using symmetry perserving chemical reactions. This started to introduce algorithmic thinking into biology. (SE link)

10. Good-Turing Estimator (1953): For estimating an unseen fraction on a population when given past data. (SE link)

11. One of the First Chess Algorithms (1952): Before we had computers, we had paper chess algorithms. (SE link)

Deep Blue, photo form wikipedia

12. Checking a Large Routine (1949): The idea of using invariants to prove properties of programs and "well-foundedness" to prove termination. (SE link)

13. LU-Decomposition (1947): A method for factorizing a matrix as the product of a lower triangular matrix and an upper triangular matrix. Now commonly studied in Linear Algebra. (SE link)

14. Turing's Method for Computational Analysis of a Function (1943): Invented for studying the of the Riemann zeta-function but is more widely applied. (SE link)

I'm sure there are more. Any one of these is an impressive achievement for a scientist -- the first couple are revolutionary. That Turing did all of these and more shows his remarkable contributions to our field.

Update (12/31/12): Alan Richmond points out another important work of Turing's, which I added as # 14. Also, my wording for the Turing test proved a bit controversial, so I've changed it to better reflect reality.

Friday, December 07, 2012

UIC's MCS Program Seeking Graduate Students

As I'd posted before, UIC has two departments interested in computer science. We have a computer science department in the engineering school. And the math department in the school of liberal arts and sciences, where I have an appointment, has a smaller program in "mathematical computer science" (MCS).

The MCS program is more focused on the mathematical aspects of computing and less on systems, databases, etc. It is similar to the ACO programs at CMU and Georgia Tech.

We recently updated our website, where you can learn more about the program and the affiliated faculty. More importantly, if you're a student looking to get a Ph.D. in a mathematical area of computing, consider applying to our program -- we're looking for strong applicants.

a screenshot of the new MCS website

Tuesday, August 07, 2012

Diamond Forever

Recently, the academic community started to fight back against traditional academic publishers, who earn large profits without any longer providing many clear services. The internet has been a great tool for taking care of such inefficiencies, and it's quickly disrupting the publishing market as well.

Having thought a bit about these issues myself, I wanted to summarize my views of some of the various publishing models (as described by Tim Gowers) under consideration, as well as the pros and cons of each. It's possible, even likely, that I am missing something, so feel free to point it out in the comments.

The Systems

closed-access

screenshot, after I tried to access a paper from nature.com

system: Researchers send manuscripts to publishers. Publishers have volunteer editorial boards consisting of researchers who decide whether to accept/reject manuscripts. Accepted manuscripts are published but cannot be accessed without a paid subscription to the given journal. Authors lose most (or all) rights to their own work.

who wins: publishers, by charging for access

who loses: university libraries and therefore scientists who pay for it with overhead; the public, who doesn't get access to the research their taxes likely funded; authors, who cannot easily have their research read and used.

opinion: This system made some sense before the internet era, where there were few other good ways to disseminate research. The publishers used to have to work hard for their profits by doing serious formatting/printing/publishing work. With the internet, all of this is now done by the authors, and the closed-access system no longer makes sense.

green-access

from Springer's "self archiving policy"

system: Same as closed-access, except that authors are allowed to put their accepted manuscripts on their websites and onto online repositories such as the arXiv.

who wins: publishers, by charging for access; researchers who put their papers online

who loses: basically same as closed-access, except for the authors who put their papers online

opinion: This is the system most journals are now on, or at least they usually don't go after scientists who keep their papers online. This is strictly better than closed-access, but doesn't go far enough. Too many scientists do not put their papers on their websites or arXiv. There is still no guarantee that science will be public, as it should be.

gold-access

a screenshot of the new gold-access math journal

system: Researchers send manuscripts to publishers. Publishers have volunteer editorial boards consisting of researchers who decide whether to accept/reject manuscripts. Authors of accepted manuscripts pay to have their work published but published manuscripts are freely accessible to everyone online.

who wins: the public, who can access research for free. the publishers, who still get paid (but not as much)

who loses: the authors, who have to pay (usually large sums) to have their work published. universities, which might need to subsidize the fees. quality (see below).

opinion: I think adopting this system would be a mistake. First, scientists are the ones doing the work, the editing, and the main contributions -- it seems unfair to make them pay as well. Researchers without grants or university support would have to use their personal funds. Moreover, I think this would seriously impact paper quality. In this system, journals would make money by accepting more papers as opposed to getting more readers, and some authors may have incentives to pad their CVs. I cannot see how this wouldn't go wrong.

diamond-access

the journal of machine learning research, image from their website

system: Researchers send manuscripts to publishers. Publishers have volunteer editorial boards consisting of researchers who decide whether to accept/reject manuscripts. Authors of accepted manuscripts have their work published for free and published manuscripts are freely accessible to everyone online.

who wins: scientists, who can publish for free and have their papers read. the public, who can access research freely. universities, who do not have to pay for journal access. research quality.

who loses: publishers, who cannot any longer charge for anything

opinion: This is the best of all worlds. Everyone seems to win, and research quality especially wins because nobody makes a profit from publishing bad results. The only problem is that the publishers lose, and why would anyone run a journal without making a profit?

A Solution

Diamond-access is the best system for everyone, except for the publishers. However, publishers need incentives to run a journal, which they wouldn't have in a diamond access system. Now, becoming a publisher would involve operating at a loss. So who should take the hit?

One obvious solution is for universities to publish (or pay for publishing) journals. They would make a sacrifice in maintaining the journals for free, but what would they win? Quite a lot actually. Were we do switch to this system, university libraries would no longer have to pay subscription fees to journals, and the university professors wouldn't have to pay to have their work published. And the quality of publications wouldn't suffer. Some of this cost could even be handled by overhead.

Yale's library (from wiki), where I not once went to do research during my Ph.D.

But what is the cost of running a diamond access publication? Well, it turns out to be very small. With pdfs online, there's really no need to publish paper proceedings. I don't think I've ever gone to a library to find an article on paper -- pretty much nobody does these days. So, all you need to do is register a domain and pay for some maintenance and hosting. The rest of the work is done by volunteers. Looking at the financial statement (of which I learned from Google+) of the Journal of Computational Geometry, it could take as little as $10 per year. Heck, we don't really need universities to pay for it.

Yes, it really costs very little to run a journal these days if you do it right. There's no need for readers, or authors, to pay huge sums to middle-men.

Monday, May 07, 2012

Atlanta, Northward

When you accept a postdoctoral position, you know you probably can't stay indefinitely. Psychologically, it means you have to prepare yourself for having to switch jobs, coworkers, offices, etc., and I, as a postdoc, have been ready for this foreseeable consequence with Georgia Tech, where I've enjoyed being part of a strong and lively department the last two years.

Unfortunately, when you take a postdoc in a city like Atlanta, where there isn't as high a concentration of universities and research labs as some other places, you not only have to leave your job, but likely also the city. And even though it was in the back of my mind that I'd probably have to move, I didn't expect to love Atlanta as much as I have come to.

Atlanta's botanical garden

Atlanta has sunny, warm weather, really friendly people, lots of trees, amazing restaurants, nice parks, museums, a symphony, theaters, and everything you could want in a city (except, perhaps, a waterfront). It's also very livable -- the population density is fairly low, so you're not elbow-to-elbow every time you leave your house, and you can live in really nice suburbs right in the middle of the city. Because of the large influx of new residents, it's mostly new and clean, yet the cost of living is quite low. And you can reach great hiking with a 20 minute drive and reach a beautiful national park in 3 hours. It's certainly a place I wouldn't mind living in again at some point in my life.

Smoky Mountains National Park

But while I'm sad to have to leave Atlanta, but I am also excited to begin a new phase in my career. In a couple months, I'll be joining the University of Illinois at Chicago (UIC) as an Assistant Professor in the Math department, actually called "Mathematics, Statistics, and Computer Science." The department is quite strong, but unlike most math departments, UIC's is expressly concerned with Computer Science (so, yes, I'm remaining a computer scientist). Additionally, UIC has another entire standard Computer Science department, where I'll also have a courtesy appointment. Between the two departments, there are quite a few great faculty members working on various aspects of machine learning -- I'm looking forward to joining them. And with Northwestern, UChicago, and TTI-C nearby, Chicago should be quite an exciting place to work.

Chicago, reflected in "the bean"

But because this post started with Atlanta, I should say something about the city of Chicago. While I don't yet know Chicago well, it also looks like a beautiful, livable place, on the shores of a (literally) great lake. The people seem friendly, and there is a lot to do. I just hope that, after being spoiled by Atlanta, I can get used to Chicago's cold winters.

Friday, April 27, 2012

Computing Conceptions

From Georgia Tech, many of us have been closely and concernedly watching our southern neighbors at the University of Florida, where budget cuts and a seemingly hostile dean conspired in an attempt to decimate their well-respected computer science department. Dismantling the department, at a time when computer science research is so critical, would be a laughably bad decision on the part of the university.

I don't claim to fully understand the rationale behind this plan. However, I have a feeling that part of the reason such an idea was even being seriously considered has to do with a couple misconceptions of what computer science research is, and I fear that such misconceptions extend beyond the state of Florida. And I don't just mean that the average person doesn't understand computer science; that's to be expected. I mean that many academics, even scientists, don't understand the basics of what computer science is about and therefore tend to devalue it, especially as an academic discipline.

First, many people seem to assume that computer science is just programming or fixing computers. As a graduate student at Yale, where computer science is a small department, I was often asked by other Ph.D. students why computer science even has a Ph.D. program. They didn't view it as an academic pursuit, but more as a trade skill. I fear that many scientists view computer science as limited to programming or getting computers to work, probably because that's the way most, say, physicists use computers. They have little understanding of the beautiful, deep results and insights that computer science has brought the world. Viewing it as an instrumental non-academic field, people think it would be okay to kill-off computer science research and leave the professors to teach programming (which, admittedly, is an important part of a computer science department's job).

(clip art from here)

something computer scientists do not normally do

The other, very related, misconception, one that was clearly in play at the University of Florida, is that the computer science department was somewhat redundant because the electrical and computer engineering department already has the word "computer" in it. Their reasoning sounded more sophisticated than that, but only superficially. But computer science and electrical engineering are very far in their central concerns. Computer science, for the most part, is divorced from concerns about electricity, physical media, or anything of that sort. Whether you work on operating systems, machine learning, or the theory of computation, you mostly don't really care about the underlying hardware, whereas electrical engineers do. Greg Kuperberg, writing on Scott Aaronson's great blog post on this issue, puts it better than I could:

"Apparently from (Florida engineering dean) Abernathy’s Stanford interview, and from her actions, she simply takes computer science to be a special case of electrical engineering. Ultimately, it’s a rejection of the fundamental concept of Turing universality. In this world view, there is no such thing as an abstract computer, or at best who really cares if there is one; all that really exists is electronic devices.

[...] Yes, in practice modern computers are electronic. However, if someone does research in compilers, much less CS theory, then really nothing at all is said about electricity. To most people in computer science, it’s completely peripheral that computers are electronic. Nor is this just a matter of theoretical vs applied computer science. CS theory may be theoretical, but compiler research isn’t, much less other topics such as user interfaces or digital libraries. Abernathy herself works in materials engineering and has a PhD from Stanford. I’m left wondering at what point she failed to understand, or began to misunderstand or dismiss, the abstract concept of a computer."

(image from here)

something usually not of research interest to computer scientists

It looks like a disaster in Florida has so far been avoided. And with each passing year, more scientists will have, at the very least, taken some basic computer science in college -- it is part of our job to teach the important concepts in our introductory courses. I'm hoping this will improve the perceptions of our field. But in the meanwhile, it has become apparent that we have much more PR to do to.

(image from here)

now we're talking!

Sunday, January 22, 2012

Alan Turing Year

I am excited that 2012, the centenary of Alan Turing’s birth, will see many celebrations of his life and legacy. It is hard to think of a scientist who has had more impact both on science and on our lives, but who still remains as unknown to the public, so I am glad Turing will be getting some much-deserved attention this year.

Alan Turing, photo copyright held by National Portrait Gallery in London

Alan Turing laid out the foundations of the study of computer science (in the process of solving Hilbert’s Entscheidungsproblem – no small feat on its own), contributed fundamentally to our understanding computation in nature, envisioned the future of intelligent machines, and saved millions of lives by helping shorten the length of World War 2.

Yet upon discovering Turing was gay (and convicting him for "indecency"), the British government took away his job, kept many of his contributions secret, and chemically castrated him, driving him to suicide.

After some long-overdue outrage, Gordon Brown issued an apology on Britain's behalf, albeit decades too late. Now there are new petitions, asking for the British Government to grant Turing a pardon. I have nothing against these efforts, but how Turing was treated is to our collective shame, and we should remember that these efforts are to make us, not him, feel better. Turing knew he wasn't doing anything wrong and never needed condescending “pardoning” from politicians, nor does he need them any more after his death. If the pardon is to send an apology to the homosexual community for their mistreatment, I’m sure the British government can think of something a little more direct.

I think a better way of honoring Turing is to make sure people know about his work – like we do of Kepler, Galileo, Newton, Darwin, Einstein, and many of the other great scientists who revolutionized our world-views. Unfortunately there’s a lot of work to be done: even well-meaning articles trying to popularize Turing’s work mainly describe him as an accomplished code-breaker and World War 2 hero. Turing did break codes, most notably the German Enigma machine, but calling Turing a code-breaker without mentioning his scientific impact is akin to calling Isaac Newton a treasurer, given his years as England's Master of the Mint.

So, this year, we academics will celebrate Turing’s great accomplishments by holding conferences, awards, and lectures in his honor. There will be documentaries released about his life. I’m sure an op-ed or two will be written in the newspapers. But I also hope that reporters, lecturers, and especially teachers, will help the the public at large learn about this pioneer of the information age.