"From the trenches"

This past Monday, I had a chance to give academic publishers a piece of my mind.

The Professional and Scholarly Publishing division of AAP held a pre-conference meeting that Monday to talk about the way technology, specifically “Web 2.0”, was affecting the way students consume and use academic content. They asked Sayeed Choudhury from the JHU libraries to find a few students who might be able to give them a student’s perspective; he brought me, a history post-doctoral student, and another computer science grad student. The event’s name was “From the trenches”.

We each had a few minutes to talk before we would each be inundated with questions. It was a very active, interested group attending our panel, and I really appreciated that. I made a couple of points, but I want to first highlight one thing that we at FreeCulture.org have been thinking a lot about lately.

That is, of course, Open Access. I believe that the public has a right to read the research it pays for with tax dollars. I further believe that as we start to taste what open access is like, pressure will come from the demand side. At lunch, I talked with an editor of a smallish journal on clinical oncology, and he said that neither he nor the people whose work he was publishing cared much about the access model, so they just did it the way they usually do it: charge for copies.

That’s why I think the demand-side change is so important. When academics realize that they’re cutting readers away from their work, they will demand to be published in an open access model.

I heard a presentation this summer from Science Commons about the work they were doing in analyzing and revisualizing a corpus of neuroscience papers, and I was deeply impressed; I saw natural language processing technology being used to make the progress of science more comprehensible. But most libraries aren’t allowed to let their students do such analysis because of the contracts they sign with publishers. Open Access has the power to do away with these restrictions; as always, it’s not just about getting it free of cost, it’s about freedom.

I started by telling them outright that students typically find publishers getting in the way of access rather than enabling it. I told them it was their job to fight this by providing value. One thing they wanted to know more about was what sorts of tools students use (or want to use) to manage reference information. I highlighted PennTags, an extension to the U Penn library catalog that lets students tag books and articles in the style of del.icio.us. This sort of system – where students can tag anything with the same interface – is what we’re after. In the presentation after ours, an editor from Nature discussed their new collaborative filtering system for Nature articles, and that’s the kind of walled-garden approach that is so boring in the Web 2.0 era.

I told them about the Long Tail. Some of them worried that Web 2.0 becomes a popularity contest, but I pointed out that it’s not the same popularity contest for everyone. The niche markets can add up to more volume than the mainstream, as the famous Amazon example reminds us. Tim Stinson, the history post-doc, made a great point of this too.

To me, this is reminiscent of the same fight: How do we get the public to realize what freedom tastes like so it knows to demand more? Free Software and open source have been wondering this for decades, and are just figuring it out, but there are many more people who can read than can code. Being allowed to read and discuss current science is easy to explain and understand.

I closed by talking about authority. The Educational Testing Service had an “Information and Communication Technology” test last year showing that not even half of college students can correctly determine the objectivity, timeliness, and authority of a document. I took issue with the claim that this was a technology-related skill set; when people hand me pamphlets on street corners, I need the same skills. Anyway, it’s my feeling that college students are only going to get better at this given the huge amounts of data we can access on the Web.

There was one question I want to highlight: Someone asked if we each purchase content online right now. Because the JHU library makes available every academic work I need access to (that isn’t already Open Access), I don’t buy academic work. But I also don’t buy textbooks online. I made it clear that students view DRM’d textbooks as broken, and that we would be willing to part with cash for non-broken digital delivery. Provide value, I told them.

Here’s the full text of what I remember saying. You can also read my notes from after the event for the few questions I remember answering.

First of all, let me thank you all for being here and inviting students to speak. Thanks to Sayeed for the great introduction.

I want to begin by honestly sharing with you the feeling that most of us college students have: publishers get in the way of access to knowledge. What I want you all to consider is how to change this perception.

Here’s an analogy from another field of publishing: Online music stores floundered for years until Apple created iTunes. For years and years you’ve read about college students sharing copyrighted songs. What did Apple do differently? They created an integrated experience, building the ability to pay for music into their music organizing program, and – crucially – they made every song cost the same amount of money.

Consumers with half a second and a buck could now, on impulse, buy the song they wanted and take it with them, even if just on Apple-branded players. Apple provided enough freedom for most users, and they allowed consumers to forget about the publishing companies and their various deals with Apple.

I want to turn to a case study in crossing Web 2.0 with academic publishing: PennTags, a project at the University of Pennsylvania. It integrates with the U Penn library catalog, and wherever you see a resource, you can tag it with words and phrases. The engineers worked out how to make every resource taggable, be it a New York Times page or a PDF journal article provided by Elsevier. The result is at tags.library.upenn.edu: students doing research can forget about publishers and just get to work with the library search engine; as they find things that look interesting, they match them up with projects and labels. When it comes time to write the paper, they get a personalized view of the library’s content. Even better, students can see each others’ tags of documents and get immediate insight about the material they’re browsing.

My mother uses EndNote, but most of my fellow students have never heard of it. They would love to use a web app like PennTags.

When end-users can forget about publishers, they feel empowered and happier. What you need to do is be of value to consumers, the way the record labels making money through iTunes are.


These words and phrases that users use to tag books and articles on PennTags represent a common theme in the Web 2.0 mindset: make services flexible for users. You’ve already heard a bit about Folksonomies today. In 1994, Yahoo! was born to put every important page on the Web into strict categories. People moved on to search engines like the one Google introduced in 1998: Google’s big idea was using web links as votes. If lots of people linked to some web page, it must be important! They used the massive size of the web to rank the pages in it, hoping that some sort of consensus would form that, for example, “white house” would go to the government web site instead of some unrelated commercial site. Today, tagging lets readers be a part of the consensus that emerges around a web page.


I want to move on to a topic that I’ve been a bit involved in lately. A group of Students for Free Culture has been active in this space.

Open Access is a movement that has been gaining buzz lately. At its core, it represents the ideal that much of the world’s scientific work should be available free of charge for all to read. In a medium like the Web where copies cost nothing, electronic journal subscriptions cost libraries ten times as much as print subscriptions. Students like me familiar with the web find this amazing. Organizations like the Public Library of Science fund successful journals like PLoS Biology by getting money on publication; then they distribute copies for free. When so much research is funded by the government, students resent having access to that knowledge restricted behind pay walls.

But it’s more than just about getting papers free of charge. It’s also about freedom. A smart engineer, the likes of whom built PennTags, may come up with a way to visualize papers. Maybe he can discover and show citations, refutations, and supporting articles. But the library signed a contract with you folks banning him from running that analysis on the journals the library pays for. Open Access is about freedom, not just price.

Web 2.0 has built this terminology, the “long tail.” It’s used to describe the new opportunities the free linking on the web creates: there is a vast array of niche interests on the planet, and by letting everyone communicate freely with everyone else, they will find each other. Amazon, for example, will sell more books today that didn’t sell yesterday at *all* than books that did. That means the niche markets together are a bigger market than the bestsellers.

I want to close with a look at authority. I talked about consensus earlier, and Wikipedia is another success story of consensus on the web. Lately there has been some fuss about professors banning students from citing it, but serious students have never been allowed to cite encyclopedias anyway.

Many of you may be familiar with the Educational Testing Service’s study on so-called “technical competence” that was published toward the end of last year. The preliminary results indicated that only 49% of college students could identify correctly a website’s authority, objectivity, and timeliness. I object to the use here of the phrase technical competence: this has nothing to do with technology. These are reading skills problems. They are exacerbated by the breadth of information available online, but if someone on a street corner hands me a pamphlet, I need just the same abilities.

It used to be that the academic library was the largest repository of knowledge students would run into. Today, that’s the web. We’re getting comfortable with tools to manage constant information overload, so we’re going to expect academic versions of the same tools to help us do our research. In fact, we may even want to bring our existing tools in. That’s why I like PennTags so much; students who use social bookmarking sites like Delicious can apply the skills they already have.

We do need to evaluate authority on the Web. If I read something I have trouble believing, I ask a service like Technorati or Google blog search what the rest of the web is writing about it, right now. Savvy students already know how to do this, and higher education is working on teaching these skills to everyone. So we’re going to get better at it. The line between academic and personal use is definitely blurring because, for the first time, we have access to vast amounts of information (whatever you want to say about its quality) outside the library. We’re talking to each other about it, linking to articles in our blogs, and discussing it in the comments.

I started by saying that we feel publishing companies restrict knowledge. The contrast we see is with libraries, which handle making information available. If you want, you can see that as being like the iTunes music store: we can forget about the publisher and get right to the information.

On the other hand, I have seen electronic textbooks that students are not allowed to copy and paste from. The more you lock it down, the less value you provide, and the more students will look elsewhere.
