Welcome to Season 2 of Beyond the Stacks! We’re now back to our regular schedule, with a new episode on the first day of each month.
On this month’s episode, we hear from Alex Wade, the director of Scholarly Communications at Microsoft Research, where he’s currently focused on Microsoft Academic, involving aspects of knowledge acquisition, knowledge representation, intentionality, dialogue systems, semantic search, and intelligent agents.
Join us as we discuss search algorithms, software development, anthropomorphic dogs, and the roles librarians are fulfilling at major technology companies.
Length – 30:42
The first 30 seconds of this episode are a little fuzzy due to technical difficulties, but the rest of the podcast’s audio is normal.
We hope to see you again on March 1st, when we’ll be featuring a new interview with an expert in video game and software preservation.
See below the fold for a full transcript of the interview.
Derek: Hello, and welcome to Beyond the Stacks: Innovative Careers in Library and Information Science. My name is Derek Murphy, and today I’ll be talking with Alex Wade.
Alex is the director of Scholarly Communications at Microsoft Research, where he’s currently focused on Microsoft Academic, involving aspects of knowledge acquisition, knowledge representation, intentionality, dialogue systems, semantic search, and intelligent agents.
Alex holds a Bachelor’s Degree in Philosophy from UC Berkeley and a Master’s of Librarianship Degree from the University of Washington.
During his career at Microsoft, Alex has managed Microsoft’s internal corporate search and taxonomy management services, has worked on Windows search for multiple Windows OS releases, and has implemented an open access policy governing Microsoft Research’s scholarly output.
Prior to joining Microsoft, Alex worked in the library systems of the University of Washington, the University of Michigan, and the University of California at Berkeley. Hi Alex!
Alex: Hi Derek, Hi.
Derek: It’s great to talk to you. Thanks for joining me!
Alex: Thanks for having me on.
Derek: So I wanted to start out by talking about your educational background, your advanced degrees you’ve gotten that brought you to where you’re at.
Alex: Yeah, as you mentioned in the bio, I have an undergraduate degree from UC Berkeley in Philosophy. That was really studying a wide variety of… Philosophy of Language, of Intentionality, Philosophy of the Mind. A lot of courses that I took with John Searle, Hubert Dreyfus, Donald Davidson, who were sort of leaders in the field in this area. And then more recently, although not that recently I guess, I also have a Master’s in Librarianship from the University of Washington.
Derek: So why did you choose to get your Master’s in Library Science? What brought you there?
Alex: I think as an undergraduate student back at UC Berkeley, I probably had my first job in a library. It was something that, you know, you don’t think too much about these things when you’re a student and just seeking some additional income there, but it was something that I enjoyed. Both my brother and his wife were and are librarians, so I got a lot of some of the inside baseball from them about both academic libraries and law libraries. And then as soon as I got out of my undergraduate career with a massive degree in philosophy, my first official job in the real world was a library clerk at the University of Michigan.
Derek: So that got you kind of looking at that field, not just from your relatives’ point of view, but from your own.
Alex: Yeah, exactly. And that stage of my life, I wasn’t headed for a library degree, it was a job that I could get. I was 21 years old. In my mind at that time, I was headed for a law degree. So after about a year in that position, I actually quit my job at the University of Michigan library and got a job in the law office. And it didn’t take me very long to determine that that’s not really what I wanted to be doing.
Derek: Mhm, yeah, I’ve had this experience too. I kind of jumped around, ended up in the film industry for a little bit and man, yeah, very quickly decided to go and try librarianship based on a library clerk kind of job I had too.
Alex: Exactly. The timing of this as well was sort of interesting. As I backpedaled from that law degree, it was right on the cusp of the World Wide Web, things like WAIS and Gopher. It wasn’t too difficult, at least for me at the time, to see that the field was going to start undergoing some fundamental changes. It was quite exciting to think about what the prospects of that would be, to get that degree at that specific time.
Derek: Yeah, that was definitely an exciting time in that domain. So, how did you envision the MLIS aiding your career goals?
Alex: Well, to be honest with you, at that point in time I’d worked at a couple of large academic library systems, and I liked the idea of staying in an academic environment. So in my mind at the time, I was… In fact, at the time I started my studies for the librarianship degree, I was working as a clerk at the University of Washington Libraries. So in my mind, it was the stepping stone to bigger and better things in the library systems within an academic library setting. I think if you’d asked me at the time, I’d have said, “that’s what I’ll be doing for the rest of my career.”
Derek: Mhm. And, well, I suppose that we know the answer, but… I guess things didn’t shake out quite that way, huh?
Alex: They did for a number of years. As soon as I got my degree I was lucky enough to be offered a tenure track librarian position at the University of Washington. I started first at the Engineering Library there, and then moved into the Library Systems Office. And because of my background in Philosophy, I was also the Philosophy selector for the Philosophy Department for a while. So things were shaping up nicely for me. And I’m not sure where I took the right turn, but I did get offered a position at Microsoft, and things have changed since then.
Derek: Yeah. So, I guess, I am curious how that pivot point happened.
Alex: It wasn’t as huge a transition at the time, because I actually moved over to the Microsoft Library. Microsoft had a main library here at our main campus in Redmond, but also had a few branch libraries, one within Microsoft Research, and a number of other places. The attractive thing for me at the time was that in addition to being a corporate library, sort of fulfilling the information needs of employees, not just with the physical collection but with access to a lot of external news sources and market research reports and things like that, but that the team I joined also had a number of responsibilities for trying to shape the corporate intranet and think more broadly about the information needs of a set of corporate users beyond just marrying them up with this finite set of information resources.
Derek: Going back into talking about your studies, I’m curious, when you were getting your Masters in Library Science, what were some of the coolest things that you were able to work on during that time?
Alex: As I mentioned, during the time that I was in the program, I was also a full time employee of the University of Washington Libraries. That had both positive and negative impacts on my studies. One is that I was oftentimes running between my job and classrooms. But one of the nice things about it was that I was able to take some of the things that were perhaps more theoretical or hypothetical library management questions that arose over the coursework and actually take some of the real world situations that I was encountering and feed it into the curriculum. So it had a very sort of pragmatic effect on me, because I was sort of living it in my job and studying between the times I was working.
But I think probably the most interesting aspect, the thing that I still hold with me is… We had a program for directed field work at the time, and my advisor was Raya Fidel, and I went to her at one point and said, you know, I’d really like to do some directed field work. I’d like to do something in the area of information retrieval and the evaluation of information systems. And she hooked me up with Micheline Beaulieu and Stephen Robertson in the UK, who had been running the Okapi Project, the BM25 ranking algorithm, for a number of years. So I had a fantastic opportunity to move over to the UK for a couple of months, help design and conduct some usability studies on some of the early graphical user interfaces that they were building on top of that system.
Derek: Nice. So, now that you are where you are, in your professional career up until now, like, overall, what are some of the coolest experiences that you were able to be a part of thanks to that library science degree?
Alex: I’ve been blessed to land in the company I’ve landed in. As I mentioned earlier, I came to work initially for the Microsoft Library. So for me, at that time, it was not a huge transition. It was a transition from an academic library to a corporate library, and sort of what that implied, but I was still in a librarian position. But it quickly dawned on me that there were a lot more opportunities at Microsoft.
In fact, if you go back in time and if you talk to Microsoft about how many librarians are employed by the company and where are they, we were probably ten to fifteen people, and we all worked for the corporate library. Over the years of growth, we the librarians, or the librarian community, started seeing additional opportunities. A lot of the people that I worked with in that initial era are now program managers, are dev leads, are test managers, are writers and content producers across the company. So I think it wasn’t just me taking advantage of an opportunity, but it was also, the opportunity for us as a field to demonstrate some of the value we could provide.
Derek: That’s very interesting. So it seems like there are a number of librarians over at Microsoft fulfilling a pretty wide range of roles, huh?
Derek: And is this kind of a new thing, that Microsoft is putting librarians in various different positions?
Alex: You know, I don’t think that anybody at Microsoft would necessarily say that we are actively seeking librarians for particular roles. But rather, what I think it is is that the types of roles and the types of expertise that we as a company have been trying to grow have been, in many cases, ripe for the taking for the library community. It was really one of allowing people with library degrees to imagine themselves in roles that are not with the title of librarian per se, and being able to articulate the skills they have.
One of the other big companies that we have in our back yard right now here in the northwest is Amazon. Amazon hires a huge number of MLIS graduates right now as taxonomists for the Amazon service. I think they do actually have as a desired degree on some of their job descriptions an MLIS degree, but that’s probably not always been the case.
Derek: So it seems like the MLIS degree sets you up with a variety of skills that are actually a natural fit for the technology industry. So what kinds of skills would you say are at the forefront helping librarians working at technology companies?
Alex: I don’t know if I can provide a comprehensive analysis of that, but I can sort of speak to it from my own personal perspective. One of the first projects that I worked on internally here at Microsoft was to develop a taxonomy management system. It was something that was used by our own team as a way of managing a lot of the navigational taxonomies that went into some of the web-based projects we were producing, but it also was the early days of what one would now call a knowledge graph, a set of relationships between concepts that we used as an enhancement to our information retrieval system over the corporate intranet.
And it was not my expertise, but some of the expertise of the people on my team that were able to draw upon the skills of cataloguing and of, more or less, knowledge representation, that allowed us to build a very robust system. And we quickly scaled that system up so it wasn’t just running our own services, but it was also powering a lot of the different portals across our corporate intranet. The sales team had a portal, the finance team had a portal, the IT department had a portal, and they were all built upon a set of services and a set of managed solutions that my team was running for them, to the point where it actually started powering all of the Microsoft.com website, one of the largest websites in the world, in dozens of different languages.
Derek: So it grew in scale beyond what you’d originally expected?
Alex: Exactly. And that was sort of tied very much to understanding information retrieval and being able to translate user needs and user intent into functional requirements for how search engines should behave and should behave beyond just the traditional TF-IDF type of rating or BM25 type ranking of “here’s a set of documents, now it’s up to you to figure out…”
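[Editor’s note: For readers unfamiliar with the ranking functions Alex mentions, here is a minimal Python sketch of BM25 scoring in one common textbook form. The sample documents, the function name `bm25_scores`, and the parameter values `k1=1.5` and `b=0.75` are illustrative assumptions, not anything taken from the Okapi project or from Microsoft’s systems. It shows the “here’s a set of documents” behavior: each document simply gets a number based on term frequency, term rarity, and document length.]

```python
import math
from collections import Counter

# Hypothetical toy collection, invented for illustration.
DOCS = [
    "the library budget meeting",
    "library books and library cards",
    "a song about a dog",
]

def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Score every document against the query with a textbook BM25 formula.

    k1 and b are conventional default-style values (an assumption here).
    """
    tokenized = [d.lower().split() for d in docs]
    n_docs = len(tokenized)
    avg_len = sum(len(tokens) for tokens in tokenized) / n_docs

    # Document frequency: in how many documents does each term appear?
    df = Counter()
    for tokens in tokenized:
        df.update(set(tokens))

    scores = []
    for tokens in tokenized:
        tf = Counter(tokens)
        score = 0.0
        for term in query.lower().split():
            if tf[term] == 0:
                continue  # term absent from this document
            # Rarer terms get a larger weight (smoothed, always positive).
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            # Term frequency saturates and is normalized by document length.
            norm = tf[term] + k1 * (1 - b + b * len(tokens) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores

for doc, score in zip(DOCS, bm25_scores("library", DOCS)):
    print(f"{score:.3f}  {doc}")
```

[The output is only a ranked list of strings that matched. Nothing in the score reflects what the person actually wanted to know, which is the limitation Alex is pointing at.]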
So the next stage of my career at Microsoft really took me into some of the early days of our SharePoint search software development and becoming an early adopter for that and providing feedback to that team, after which I became Program Manager on the Windows team. I’m the one responsible for killing the dog [from Windows XP’s search engine].
Derek: Oh, thank you! (laughs) You know, I wanted to ask about that dog.
Alex: (laughs) Yup, that was based on an old grep search engine that basically took the user’s term and said, “ok, we’re gonna go open up every single file and see if we can find this string and match it.” And what we replaced that with, several versions of Windows ago, was a fast inverted index search over your files, your photos, your music, your email, etc.
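[Editor’s note: The contrast Alex draws here can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not the Windows implementation: the file names, their contents, and the function names are all invented. The first function scans every file at query time, grep-style; the second builds an inverted index once, so each query becomes a single lookup.]

```python
from collections import defaultdict

# Hypothetical toy "files"; names and contents are invented for illustration.
FILES = {
    "notes.txt": "meeting notes about the library budget",
    "todo.txt": "renew library books and email the budget",
    "song.txt": "lyrics about a dog",
}

def scan_search(files, term):
    """Grep-style: look through every file's text at query time."""
    return sorted(name for name, text in files.items() if term in text)

def build_index(files):
    """Done once up front: map each word to the set of files containing it."""
    index = defaultdict(set)
    for name, text in files.items():
        for word in text.lower().split():
            index[word].add(name)
    return index

def index_search(index, term):
    """Query time: one dictionary lookup, no file contents are re-read."""
    return sorted(index.get(term.lower(), set()))

INDEX = build_index(FILES)
print(scan_search(FILES, "budget"))
print(index_search(INDEX, "budget"))
```

[Both searches find the same two files for “budget”, but the scan does work proportional to the total amount of text on every query, while the index pays that cost once when it is built. Note that this toy index matches whole words only, whereas the scan matches any substring.]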
Derek: Thank you for killing the dog. I’ve gotta say, the newer Windows search is a vast improvement.
Alex: My daughter wasn’t happy about it, but there it is.
Derek: Well, you can’t please everyone (laughs).
Alex: So now, for the past almost eight years, I’ve been within Microsoft Research. One of the things that you may have seen evolving in large web scale search, in Google, in Bing, in Yahoo, is moving beyond the notion of traditional information retrieval as string-based search. You type in a few words in the search box, we view those as strings, and we give you back a bunch of web pages; the sort of traditional notion of “ten blue links”. And what that is being supplemented with today is knowledge-based search. Searches over knowledge graphs. Understanding a deeper level of user intent, and trying to match that up with things, with objects, with people, places, and things. So you see this very often in a lot of queries. Somebody types something like “weather Boston.” Humans are really good at understanding what it is that the person wants. They don’t necessarily want a list of webpages that have the string “weather” on it and the string “Boston” on it. What they really want is, they want to know what the temperature is going to be today or tomorrow. And so it’s that level of understanding user intent and then building up a large knowledge infrastructure around it that I’ve been working on, both working with the Bing team so they can scale out their knowledge infrastructure, but also specifically within the domain of scholarly communications and understanding research outputs. What are the topics that people are writing about? What are the most important papers, the most important journals, the most important conferences… is this sort of localized neighborhood of this larger knowledge graph that my team has been working on.
Derek: I didn’t know that you were involved with Bing search.
Alex: Yup. All the teams within Microsoft Research, or many of the teams within Microsoft Research do very direct partnerships. More traditionally, we’d consider these more like tech transfers, where we’d do some research, we’d build prototypes, and we would present them to the product team, saying “Hey, do you like this? You may consider doing something like that.” But more recently, we’ve graduated the sets of relationships that we have with teams, have with the product groups, so that we work very much hand in hand with them. One of the great examples, the big successes over the past year, out of one of my sister teams here within Microsoft Research, is taking a lot of our research in machine translation and actually embedding that group within the Skype team so that you can now do a Skype call between two people who don’t speak the same language. And the machine translation will kick in and do real time speech-to-speech translation. Which is something that a year and a half, two years ago, people would have said is impossible. And because we have this very deep bench of researchers who understand the space, we were able to build this and ship it as part of Skype.
Derek: Wow, is that feature active on Skype today?
Alex: It has been in public beta for a while. I believe that it is now officially launched as part of the Skype client. It may only be in the Skype client for Windows at the moment, I don’t know when it’s coming to other operating systems.
Derek: Ah, I see. Is it like… one person speaks, and the computer translates it, and then a computer voice speaks the translated version, is that it?
Alex: Exactly. Yeah.
Derek: That’s amazing. Some Hitchhiker’s Guide stuff.
Alex: (laughs) Make it smaller and we could turn it into Babel Fish, yes.
Derek: Yup, I was just thinking that. So, you work with a lot of different teams that are working on specific products then?
Alex: For the most part right now, the work that my team does here is working with the Bing team, and a lot of the back end infrastructures, the knowledge graph that generates a lot of the answers on Bing, and that also powers a lot of the experiences on Cortana, our virtual assistant. So, Cortana is something that’s been around for a little over a year as part of the Windows Phone environment, and then a couple of months ago when Windows 10 shipped, Cortana’s now part of Windows 10 and is in the process of being released for both Android and iPhone.
Derek: Alright. So I’m curious about Microsoft’s relation with academic researchers. Could you elaborate on the ways that Microsoft works with academics, and perhaps the ways that your background with academia aids you in helping that?
Alex: Absolutely. The first thing to keep in mind for us as a research organization is that we behave very much like a large computer science department. We have roughly a thousand people, maybe a little bit more now within our research division. And ninety percent of those people have PhDs in computer science or related fields, and have been brought in to Microsoft to continue that research. Notwithstanding what I mentioned earlier about our partnerships with the product teams, there’s still a significant portion of the researchers here within MSR that are really doing core, fundamental CS research, rather than more of the applied things like I mentioned with Skype and Bing.
But for us, academia represents a tremendous amount of opportunity. For one thing, it’s a pipeline, and we have a very strong PhD internship program. I know at least here for our Redmond lab, we have several hundred interns who come in on an annual basis and typically will spend about twelve weeks with us, working on projects, and that’s also true of some of our other labs around the world. There’s also a lot of areas where academic researchers are doing things that extend or complement the areas of expertise that our own researchers have research interests in. So there’s a lot of collaborations and partnerships. We fund individual research projects and institutes on campuses. We spend a lot of time on campuses just sort of doing knowledge sharing and exploring possibilities for collaborations on future work. So for us, the University, the academic environment is a critical piece of our overall infrastructure.
Derek: Awesome. On a totally unrelated note, another thought I just had, I think a lot of Master’s of Library Science students are interested in technology, but may not have a very strong technology background. I’m curious how someone who is in library school now, that doesn’t have, say, a Computer Science degree, or super strong background in computers, how do they develop their skills or knowledge to the point where they could be working in the technology field?
Alex: I mentioned earlier that one of the opportunities that presented itself to me around going to library school in the first place was really the timing of information technology taking off within libraries and around the world, and I saw the opportunity, but also realized for myself, that I did not want to be a developer. That wasn’t a direction that I wanted to go. So I actually went through my master’s degree program with the aim of sufficiently understanding the capabilities of technology so that I could make the right decisions without necessarily needing to understand them at depth. And that’s sort of a hard line to toe, especially now as I work in a technology company, but also within a research organization.
And I have conversations on a daily basis with people who know infinitely more about a very large number of topics than I do. The trick is being able to understand them sufficiently so that you can carry on a conversation and find the applicability of a technology or an approach to a particular problem that you’re trying to solve. So it ends up becoming a communication issue more than anything else.
But I wouldn’t let the sort of lack of experience in high tech, or the high tech industry, persuade people or dissuade people from pursuing careers. I think that there’s obviously a certain baseline proficiency in software that people ought to have, but you should have these things anyway as a process of going through graduate school. And I think a lot of the companies, as I mentioned before, we have people here at Microsoft who are program managers, who don’t necessarily have technical background, people who are taxonomists and editors and content creators, who are all leveraging their library degrees with varying levels of technical proficiency.
Derek: So it sounds like, if you know your niche and you know your area of expertise well within the company, and if you are mentally agile enough to carry on a conversation with someone about something that you might not necessarily be an expert in, but well enough to be able to communicate about it, I guess that’s the key, huh?
Alex: Yup, precisely.
Derek: Well that’s very heartening to me to hear, and I’m sure many others who can’t necessarily, you know, code up a huge piece of software, but know enough to kind of get their hands a little dirty.
Alex: Yeah, exactly. That ability to get your hands dirty and play around with things is a fantastic skill. It should not be undervalued. Even if you’ll never write a line of code that ever gets put into production, the ability to hack things and play with things is a great skill.
Derek: Awesome, well, I think that that about covers it, unless, did you have anything else you wanted to add to the conversation?
Alex: One of the things that I’m fascinated by, and that I don’t think we studied much at all when I was in library school, or if we did I wasn’t paying attention that day, is really the area that I’m working in now, around scholarly communication and the research outputs of academic researchers. What I mean by this is that there’s this process that goes on, the scientific method, whereby research gets done and discoveries get made, and there’s this ancillary cycle of how that information gets disseminated to the rest of the world. Which, you know, we’d probably refer to as scholarly communication. So somebody conducts research, but how is that research put onto paper, how is it reviewed, how is it collected, how is it disseminated, how is it accessed, and how is it preserved? How is the sum total of all human knowledge put into a system so that all of these verbs can be applied to it for all of humanity for all of time? It’s sort of a grandiose way of putting it, but I don’t mean to underestimate the value of doing this.
What’s happened over the past few hundred years of this method is that it’s led to a set of conventions or customs and… Business practices I guess, around things like the publishing industry, and around tenure, and around evaluation of researchers and of research, which has really sort of ossified itself. And in the advent of the information age and information technologies, a lot of people have come along and have asked questions about: “Aren’t there ways that we can make this process more efficient? Aren’t there better means than the .PDF as a digital representation of paper to communicate research?” for example. And that’s led to a lot of conflict and destructive ideas over the past few years.
But at the core, I think some of the ideas there around access to information, specifically the conversations around open access, around open science, data sharing, concepts like post-publication peer review and altmetrics. These are all, I’d say, instances of the community saying “let’s reexamine some of these ossified bits and see if we can imagine a better system, and one that leverages this amazing communication and digital infrastructure that we have right now.”
So it’s sort of within the context of that which we’re building some of these services, but at the same time there’s still some unresolved questions that I think the library community needs to take on, which is… There are large companies like Microsoft with Bing, and Google, and what IBM is doing with Watson, that are all private entities creating solutions here. And what I think is the missed opportunity is for the library community really to think about: how are these things being collected? How are these things being managed and stored? What really is the knowledge architecture that libraries are responsible for maintaining moving forward? Is it just the digital representation of scholarly objects, or is it something bigger than that, as we start talking about things like knowledge spaces and knowledge graphs?
There continue to be more questions that are coming up in this domain than we’re necessarily addressing, and my fear is that the library community, specifically the academic library community, is still continuing to do some of the same things that they’ve done all along, and ignoring some of the bigger questions that are coming up.
Derek: Wow. That is very interesting. I don’t know, do you have any kind of… ideas or prescriptions for next steps for getting there? Like, what can the library community be doing?
Alex: Well, I don’t have an answer to that question. But if you look back at the history of higher education, of universities, and sort of by extension, of libraries, a lot of what they were founded on was the notion of scarce resources. Scholars were scarce, and books were scarce. And so, establishing a university as a place where you could co-locate these people together and these materials together so that a scholar would travel to Cambridge University to visit their library and to learn from the scholars, is sort of what made the university system make sense. Because of that, there’s this notion of creating a collection of resources and, to a great extent, a lot of libraries had very duplicative collections along with some special collections that made their collections unique. What I think is missing somehow is how we create a bridge to this more global infrastructure. Since we do have the Internet, the web, at our disposal, how should libraries reimagine themselves to be contributing to a much larger single collection that can be accessed from anywhere? How should libraries reimagine themselves so they’re spending more of their resources focusing on those bits that do make them unique, those special collections? Where they’re the only ones who can actually supply that into the larger graph and larger community.
So, I don’t have answers, but like when I started library school many years ago, I think there’s still some important and critical questions that need to be addressed here. So I’m quite bullish on the future of our profession.
Derek: Most certainly. Again, I thank you very much for your time. Very cool work you’re doing, and it was cool talking with you, and thank you for killing the dog.
Alex: (laughs) Derek, it was my pleasure.