Role of Wikisource
The year of Wikisource
A couple of weeks ago, I went to an event organized in Paris by the French Government about the "economics of culture". During that event, I mentioned that the French chapter has several ongoing discussions with various museums to set up content partnerships.
Here are two examples of such potential partnerships:
- a small museum with very old and precious documents. The museum has limited room for access and the documents are fragile, so only a few visitors are allowed to look at them. The museum wants to digitize these documents, but has limited technical infrastructure. Opportunity: we host their documents on Wikisource and give them additional visibility through an article on Wikipedia featuring their best manuscripts.
- a large museum that already has a digitization procedure for its documents, as well as a hosting service. However, the digitized versions contain mistakes (errors generated in the process) and the museum simply does not have the staff to correct the numerous documents digitized by its services. Our members can take care of this task.
Wikisource members know all of this very well, and much better than I do. I am just summarizing it quickly for reference.
In Europe, at least in some countries, we face several problems:
- many scholars have a rather bad image of Wikipedia (because it is written by amateurs and anonymous members, plagued by vandals, etc.)
- the other Wikimedia projects are not very well known and would benefit from more "light"
- journalists are bored and need new information (otherwise, they focus on all the bad stories)
- some projects are more difficult to advertise than others, because they compete directly with commercial projects of very good quality (e.g., Wiktionary, Wikinews...)
Besides, my feeling is that contributors, and in particular chapter members, need a project on which they can team up.
I would like to propose that next year be Wikisource year.
And since the planet is very large, I would also propose that, if this is done in large part through chapters, it be an opportunity for some European chapters to work together.
I am not necessarily thinking of anything very complicated. Examples of efforts we could make together:
- leaflets about Wikisource, updated and available in a large number of languages;
- web buttons to advertise the project on the web;
- each time someone gives a conference about Wikipedia, take the opportunity to spend a couple of minutes on Wikisource as well; distribute leaflets;
- summarize our best cases on Wikisource;
- develop stories about these best cases. Illustrate. Feature these stories on chapter websites;
- develop cross-project challenges (e.g., best article with content improved in at least 3 projects);
- chapters may write and distribute a couple of press releases about Wikisource;
- chapters may propose conferences about Wikisource (and speakers available to talk about it);
- develop arguments for museums, etc.
There are many possible measures of success: improvements to Wikisource (number of documents), number of mentions in the press, partnerships established with museums, etc.
What do you think?
A role for Wikisource
Wikisource is really a much larger project than Wikipedia. Consider any public library: The encyclopedia shelf or quick reference section (Wikipedia) is less than one percent of the whole library (Wikisource). After seven years of writing Wikipedia, we are now getting useful results in many languages. Wikisource might take 70 years.
What we can expect during 2009 is some small step forward on this longer path. Taking a single step might sound easy, but it's hard enough to know which direction is forward.
If you can achieve real, practical, pragmatic cooperation that actually results in more free content, even if it is not very much, that is probably the best step forward. But you must be prepared for infighting and rivalry over prestige among public institutions, which can be tough, especially when it comes to competing for funding.
There is a clear risk that the bad image of Wikipedia among scholars is reinforced. Our message that "anybody can contribute" is hard to combine with the prestige-oriented thinking of the institutions where you seek cooperation.
I'd like to recommend an article in the October 2008 issue of the open access journal "First Monday", "Mass book digitization: The deeper story of Google Books and the Open Content Alliance" by Kalev Leetaru, http://www.uic.edu/htbin/cgiwrap/bin/ojs/index.php/fm/article/view/2101/2037
This article is just one in a ton of literature on how to scan (or microfilm) books that has appeared in the last 20 years. But it is interesting because it evaluates two large-scale projects of the last few years, and compares them to each other. Even though "digital libraries" is a new science, it is already full of established truths. Perhaps this is due to the high involvement of public institutions. One such truth is that image compression (with JPEG artifacts) must be avoided at all costs.
Both Google Books and the Open Content Alliance (Internet Archive) break this rule by using consumer-grade digital cameras and JPEG compression, and should thus be considered a waste of time, according to conventional wisdom (or "best current practices"). Still, nobody can avoid being impressed with their results, and so the scientific world needs to revise its understanding of the current state of the art. The author of this article goes to great lengths (in the "Discussion" section) to explain that what these projects do is "access digitization", which is described as something completely different from traditional book scanning:
"Before one can compare the two projects, it is important to first realize that both projects are really only access digitization projects, despite the common assertion of OCA captures as preservation digitization. Neither initiative uses an imaging pipeline or capture environment suitable for true preservation scanning. The OCA project outputs variable–resolution JPEG2000 files built from lossy camera–generated JPEG files. A consumer area array digital camera is used to produce images ..."
Needless to say, neither Project Gutenberg nor Wikisource is mentioned in this article. Their goals are just too different (what? free content?), their achievements not impressive enough. They are not potential future employers of "digital library" scholars. If you help them or cooperate with them, you will only help mankind in an altruistic fashion (what fools!); you will not help your own professional or academic career.
In the article, the Open Content Alliance already plays the role of the fools. They have only (!) digitized 100,000 books, while Google Books has millions. They do not provide the same search capability. And so it goes on. The next time the Internet Archive (OCA) applies for funding or tries to establish cooperation with more institutions, such arguments might be used against them.
What Wikisource really needs to do is to provide an explanation of what it does, and how this goes beyond Google Books' "access digitization". In Europe, this must be set in the perspective of ongoing French, German and EU initiatives (Gallica, Theseus, Quaero, Europeana, ...). When one of those projects applies for funding, it will need to show that it is successful in attracting cooperation partners and that it is a leader among similar projects. We should be prepared for them to take any opportunity to define Wikisource as a loser: an amateurish, clueless project. This is not because they are evil, only because they do what they can to get the funding they need.
Why should museum X or library Y or archive Z cooperate with Wikisource, when it risks being associated with such descriptions of failure? The alternative for that institution might be to cooperate with the successful Google or Gallica. So why is Wikisource superior? This is what we need to explain.