New book: The Digital Potlatch

September 20th, 2011 2 comments

Klallam people at Port Townsend

The Wikipedia Editor Survey 2011, published last April, emphasized the importance of explicit acknowledgement and recognition of effort among Wikipedia editors as an instrumental factor in sustaining and growing its community over the coming years (page 4):

Positive Reinforcement: Acknowledging the effort of editors is important to reverse the editor decline. It is a commonly held view that editors just want to see their articles improve and read by lots of people and they don’t care about the opinion of their peers. This is false. The survey finds that acknowledgement of peers via a nice note or a barnstar (or kitten) is valued even more highly than achieving featured article status. To sustain and grow our community, we need to provide each other with positive feedback, and we should create tools to make it easy to do so.

In fact, this is the central argument of “El Potlatch Digital: Wikipedia y el Triunfo del Procomún y el Conocimiento Compartido” ["The Digital Potlatch: Wikipedia and the Triumph of Commons and Shared Knowledge"], a new book that I have written along with Joaquín Rodríguez, vice-dean of EOI. The book has been published in Spanish by Ediciones Cátedra, and now it should be available in your favourite book shop.

Participation in Internet communities has been a fascinating topic for researchers, practitioners and members of these communities. A previous study by Michlmayr, Robles and González-Barahona showed evidence of lasting volunteer participation in Debian. In this work, they defined the half-life of contributors as “the time required for a certain population of maintainers to fall to half of its initial size”. Their estimate for the half-life in Debian was 7.5 years. In other words, after 7.5 years of project evolution we can still find 50% of the initial Debian maintainers participating in the project. Enough said about the commitment of Debian developers.
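To make the half-life definition concrete, here is a minimal Python sketch that reads it off a cohort's survival counts. The function name `half_life`, the linear interpolation between observations and the sample numbers are all assumptions made for illustration; they are not the data or the method of the Debian study.

```python
from typing import Optional, Sequence, Tuple


def half_life(survival: Sequence[Tuple[float, int]]) -> Optional[float]:
    """Time for a cohort to fall to half of its initial size.

    `survival` holds (elapsed_years, maintainers_still_active) pairs sorted
    by time; the first pair describes the initial cohort. Returns None when
    the cohort never drops to half within the observed period.
    """
    if not survival:
        return None
    start_time, initial_size = survival[0]
    if initial_size <= 0:
        return None
    target = initial_size / 2
    prev_time, prev_size = start_time, initial_size
    for time, size in survival[1:]:
        if size <= target:
            # Interpolate linearly between the two observations that
            # bracket the halfway point (an illustrative assumption).
            fraction = (prev_size - target) / (prev_size - size)
            return prev_time + fraction * (time - prev_time) - start_time
        prev_time, prev_size = time, size
    return None


# Hypothetical cohort of 200 maintainers, invented for this example.
cohort = [(0, 200), (2.5, 160), (5, 130), (7, 104), (8, 96)]
print(half_life(cohort))  # 7.5
```

A cohort that is still above half its initial size at the last observation simply yields `None`, which is the honest answer when the observation window is too short.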

In the case of larger online communities like Wikipedia we need to account for the effects of casual contributors versus more active and experienced editors. In any case, our study on the inequality of contributions to Wikipedia, published in 2008, shows that the balance between casual and very active contributors has remained stable for many years (since 2004). Even more interesting is the fact that this balance did not experience any variation from 2007 onwards, despite the well-known “plateau effect” in the monthly number of edits to the largest Wikipedias starting that year.
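One common way to put a number on this kind of imbalance is the Gini coefficient (0 when every author contributes the same amount, approaching 1 when a handful of authors do almost everything). The sketch below only illustrates that measure: the function name `gini` and the edit counts are hypothetical, and it is not meant to reproduce the analysis in the 2008 study.

```python
from typing import Iterable


def gini(edits_per_author: Iterable[int]) -> float:
    """Gini coefficient of a list of non-negative contribution counts."""
    values = sorted(edits_per_author)
    n = len(values)
    total = sum(values)
    if n == 0 or total == 0:
        return 0.0
    # Standard rank-weighted formula on the sorted values:
    # G = 2 * sum(i * x_i) / (n * sum(x)) - (n + 1) / n, with ranks i = 1..n
    weighted = sum(rank * value for rank, value in enumerate(values, start=1))
    return 2 * weighted / (n * total) - (n + 1) / n


# Hypothetical monthly edit counts, invented for this example.
print(gini([5, 5, 5, 5]))    # 0.0  (perfectly even)
print(gini([0, 0, 0, 10]))   # 0.75 (one author does everything)
```

Computing the coefficient month by month and plotting it over time is one way to check whether the balance between casual and very active contributors stays stable.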

Unfortunately, it is not possible to infer the causes behind these behavioral patterns from observational studies like these. What makes participants stay in online communities? What factors motivate them to contribute? Why do they stop participating? This book is an attempt to shed some light on this, mixing empirical results with qualitative investigation (interviews with editors of the Spanish Wikipedia).

Our conclusion is clear: meritocracy and effort recognition have a central role in the motivation of contributors in collaborative habitats like Wikipedia. This resembles the Potlatch, an example that lets us understand how in certain contexts we need to give away our capital (material or intangible) so that the community can give it back to us as acknowledgment, recognition and renown. As a result, in these collaborative habitats the working capital does not have a monetary but a symbolic nature, in the form of reputation and popularity, and the logic of its accumulation demands unselfishness to create another form of social value. We don’t claim that this example is valid for all kinds of Internet communities, but some of the best-known cases (such as Wikipedia) exemplify the triumph of shared knowledge and Commons over other individualistic strategies.


PS: We believed that it was a great opportunity to publish this book in Spanish, especially with a reputable publisher such as Alianza, given the lack of books about Wikipedia in our native language. However, we would be very happy to have this book also available in English. So if you can help please let us know!

Open data sets in science

May 18th, 2011 No comments

I have a question to challenge all my colleagues working with research data in Computer Science: When was the last time you could replicate a previous study by other author(s)?

For different reasons, over the past few months I have found myself diving into the rich collection of previous research works in several areas: Wikipedia studies, libre software engineering, social media and social network analysis, to name a few. Probably, many of you already know my inborn bias towards quantitative research (but also towards multidisciplinary research methods). So, it may sound totally unsurprising that most of the publications I was reviewing included empirical experiments on different datasets gathered from a wide variety of sources, target systems and virtual communities. As I was scrolling through the pages, I realized, once again, the huge proportion of research work that cannot be replicated in an easy way. Still a sad lesson to be learned, considering that, today, most of us researchers work with digital data. And bits can be duplicated or sent to the other side of the world at negligible cost.

4-digit combination padlock

In my first post I already commented on the curious study conducted by my colleague Gregorio Robles about the replicability of research works published in MSR. For those of you unfamiliar with the MSR series, this is a working conference (formerly a workshop) devoted to the art of “Mining Software Repositories”. It is also co-located with ICSE, the preeminent conference on software engineering, so it attracts the top-notch specialists in this area. One would expect that a scientific conference focused on such an empirical, hands-on activity would encourage (and even demand) the ability to access all datasets and tools used in previous experiments, in order to i) better learn the insights of different methods and practical solutions to problems in this area and ii) make life easier for other researchers willing to build on top of existing methods, tools and results.

Far from this, the conclusions of the replicability study were quite disappointing. Of the 171 papers published in the 6 previous editions of MSR, the most frequent case (64 papers) is that of a study that uses publicly available data sources, but offers access neither to the processed dataset (the results) nor to the tools/scripts needed to perform that study. Even more worrisome is a trend discovered in these publications: as time goes by, the number of papers with publicly available processed datasets has decreased. In other words, the situation is getting worse.
Read more…

Categories: Conferences, Open Movements

Wikipedia is not a place for promotion

January 24th, 2011 12 comments

Last week, one of the most popular questions asked by journalists, bloggers and other people reflecting on the 10 years of Wikipedia was: what are the main challenges for Wikipedia over the next 10 years? In my list of answers, I highlighted conflicts around self-promotion in Wikipedia as one of the topics that will create many issues in due course. Indeed, with more than 400 million unique monthly visitors, according to comScore data, Wikipedia is now the 5th most visited website. That also makes it a major attraction for experts in promotion, PR services, marketing and advertising.

What I didn’t expect (honestly) is that, one week after that, I would be able to find an exemplary case of this kind of issue. I’ve just discovered WikiExperts, a division of OnlineVisibilityExperts, which offers (I quote verbatim):

INCREASE VISIBILITY AND CREDIBILITY of your company, brand, or product by being present in Wikipedia – world’s largest and most used research tool. Wikipedia has more traffic than Twitter, LinkedIn, MySpace, Delicious and almost all other social media. Your social media marketing strategy is incomplete without it.

For anyone familiar with Wikipedia policies, it’s obvious that this service comes into conflict with one of the things Wikipedia is not: Wikipedia is not a soapbox or means of promotion. If we take a closer look at the previous blurb, we can find some questionable points. The statement that Wikipedia is the “world’s largest and most used research tool” is very clever, but somewhat biased. Wikipedia is an encyclopedia. Thus, it can be used (in the appropriate way) to start our own research, pointing us to more authoritative information sources. Wikipedia’s accuracy depends, among other things, on the many reviews from volunteer editors and on support from outside, reliable information sources (a.k.a. [citation needed]). It’s just the first step, not the end of the journey. Another point quickly grabbed my attention. These folks are explicitly considering Wikipedia as a key part of a social media marketing strategy, comparing it to Twitter, MySpace or LinkedIn. Should we really do that? Well, I think the answer is: no, we shouldn’t.

Many people tend to think about social media in terms of audience and outreach. Today, marketing and PR experts constantly follow global trends to identify where (physically or, now more frequently, virtually) we spend most of our time. It would be really tempting to consider Wikipedia as a great platform for promotion. Except for the fact that Wikipedia is an encyclopedia, which imposes some restrictions not shared by other social media. On Twitter, for example, I am free to talk about anything I want to. And this also includes expressing my very personal point of view about any topic. However, in Wikipedia editors must follow certain policies, in particular the five pillars. More precisely, NPOV is incompatible with promotional purposes about topics in which one might have direct interests (such as charging for writing a Wikipedia article on behalf of a certain company, organization or individual).

All the same, we can even find the WikiExperts code of ethics, where they state they adhere to Wikipedia policies, such as avoiding opinionated, biased or unsupported content, not removing negative information, not writing about non-notable topics and not performing activities contrary to Wikipedia principles. Even after reading this, one wonders how this could ever be achieved if you are being paid, explicitly, to improve the public image of your client. We must also note that this is quite different from initiatives such as the Public Policy Initiative or scientists improving Wikipedia entries on RNA biology. These editors don’t have a direct interest in presenting a certain point of view. They just want to improve Wikipedia’s coverage of those topics, for the common good. It is very difficult to assert that you can do the same if you are an interested party in a “social media marketing strategy”.

This is not the first time promotional affairs have been detected in Wikipedia. In his excellent book, Andrew Dalby points out some examples such as Marshall Poe or Boy*d Up (this one apparently misinterpreted). As time goes by and Wikipedia’s popularity continues to increase, I’m afraid that we will see a significant increase in these actions. However, we must remember that Wikipedia is not just like any other social media. If the Wikimedia Foundation is working to keep it going without advertisements, it is for very good reasons (like preserving the NPOV pillar). Any attempt to circumvent these basic policies would just be an attempt to subvert the principles of Wikipedia itself, those that led it to become the flagship reference for open content that it is today.

As a final remark, let me clearly state that I’m not arguing against the business of social media management or PR. On the contrary, companies should care about building a better image in the virtual world, and creating more agile communication channels within their own community, as well as with customers. In fact, Wikipedia has developed some mechanisms for on-line social interaction, but these are wholly aimed at supporting debate on writing encyclopedic articles.

Categories: On-line Communities, Wikipedia

One for all, and all for one: 10 years of Wikipedia

January 15th, 2011 No comments

Today, January 15, 2011, Wikipedia is turning 10. Probably, you have read, listened to or watched the news and reminders about this landmark. Maybe, you have also read about important milestones in Wikipedia history, some of its bizarre facts and traits, as well as good wishes from many people. Finally (as usual), you can also find lists of several hoaxes and pitfalls found in Wikipedia articles over this period.

But even this huge impact in mass media and social networks will eventually fade out. What will happen, then? Well, we will come back to our daily routine: going to work, attending high school or university, driving home, hanging out with friends, going on vacation… Nonetheless, something will continue to make a difference. Wikipedia, the open encyclopedia that anyone can edit, will always be available, whenever we need it. Thanks to thousands of individual donors, it has truly become a fundamental tool of our networked information economy.

The best examples are real stories from real users. Today, I was interviewed with Raystorm (sysop of the Spanish Wikipedia) in La Ventana (Cadena SER), a national radio show. At some point, Gemma Nierga invited her audience to call the program, send tweets and write Facebook updates to share their opinions and thoughts about their daily experience with Wikipedia. It was really illuminating. Guillermo, from A Coruña, confessed: “Wikipedia has established a turning point in our bar gatherings [...] It has simply ruined them in one fell swoop. You only have to look up the answer in Wikipedia, and you are done”. Yep, I can remember many of those: which soccer player was the top scorer last season? In which year did that movie open? Amparo from Madrid is also “delighted”. She is over 65, she keeps on working and, today, “Wikipedia saved my life, twice!!” while working with a German colleague. She had to find the correct German Länder corresponding to several cities mentioned in a report. In just a few minutes, she was done. “It is impossible that I had a book about German States in my office!”, she concluded.

That is Wikipedia, in its purest form. That is why, although we all know that many articles could contain some flaws at a given moment, it receives more than 400 million unique visitors per month, and Wikimedia Foundation projects (summing up all their traffic) are the 5th most visited websites in the world, and the only ones in the top 10 supported by a non-profit organization. That is why we use it at work, in education, writing blogs like this one, hanging out with friends, and in a myriad of other situations.

This is the past, and the present, of one of the flagship projects of the Internet, sustained by open collaboration, producing free content available for everyone, at no cost. Sometimes I smile when I remember how some people, back in 2005, stared at me with a strange, fascinated expression and came out with something like “Wikipedia… seriously? Is that the topic of your thesis?” I am glad that I chose Wikipedia.

With 17 million articles in more than 270 different languages, it is tempting to state that Wikipedia has already reached a well-established position. However, the project must continue to improve its quality and accuracy, and broaden its content, relentlessly, fuelled by the spirit of dynamism, openness, collaboration and free content. Wikipedias with fewer articles will increase their number of entries. We look forward to better participation from countries and regions in the Global South. The editing interface will become easier and more intuitive, to make it accessible to a wider group of potential editors. The list is both challenging and encouraging.

Dartagnan and The Three Musketeers

Wikipedia is made by the people, for the people. Therefore, as a new digital incarnation of the commendable spirit of The Musketeers, Wikipedia depends on our work, and we now depend on its content. It will evolve to answer the needs of our interconnected society. Let’s work together to make Wikipedia a remarkable accomplishment of our open, collaborative, digital world.

Una para todos y todos para una…
Une pour tous, tous pour une…
One for all, all for one…
Una pro omnibus, omnes pro una…
[You can place here the translation in your own language]

Happy birthday, Wikipedia.

Categories: Wikipedia

OpenRespect.org: social guidelines for open communities

November 11th, 2010 No comments

OpenRespect.org

Support OpenRespect.org

Some time ago, I read a paragraph in the book “Producing Open Source Software”, by Karl Fogel, explaining the need to write down conventions and agreements that have become essential for daily life in an open source community. In this way, people joining your community at a later point can quickly grasp its folklore and tacit rules (not only technical rules, but also rules for social interaction).

Since the book is licensed under CC-BY-SA 3.0, I can post the following excerpt to illustrate the above point:

Don’t try to be comprehensive. No document can capture everything people need to know about participating in a project. Many of the conventions a project evolves remain forever unspoken, never mentioned explicitly, yet adhered to by all. Other things are simply too obvious to be mentioned, and would only distract from important but non-obvious material. For example, there’s no point writing guidelines like “Be polite and respectful to others on the mailing lists, and don’t start flame wars,” or “Write clean, readable bug-free code.” Of course these things are desirable, but since there’s no conceivable universe in which they might not be desirable, they are not worth mentioning. If people are being rude on the mailing list, or writing buggy code, they’re not going to stop just because the project guidelines said to.

Well, I completely agree with this point of view. However, over the past years FLOSS has become quite popular among a broader audience. And we have to acknowledge that some of these new participants may not have this very simple, but fundamental perspective in mind, for multiple reasons. There have always been many examples of this kind, since human relationships are complex and frequently not as precise as we would need them to be in the digital world, without direct face-to-face interaction. But there is a general perception of a growing number of cases where good manners and politeness are flagrantly ignored, not only in open source communities, but also in other open communities around free knowledge production.

That’s why this recent post on Jono Bacon’s blog quickly got my attention. Jono is the Ubuntu community manager, and he’s quite respected for his extensive experience in this role. He’s also the author of the authoritative book about Community management, “The Art of Community“. Once a year, he also hosts the Community Leadership Summit. I think these are strong arguments for taking his word for this. I really love this part:

I love to have a good debate, and I am never afraid to shake hands and say “let’s just agree to disagree” or calmly not participate.

In fact, a growing number of participants in debates (not only in virtual communities, but also in live debates, let alone TV shows) think that the ultimate goal is to completely convince the other interlocutors who don’t share their own point of view. However, the most positive side of debates is actually to exchange different points of view. Of course, there are key differences, depending on the topic. Sometimes, you discuss really technical stuff, and there are quite clear arguments in favor of a certain solution (for efficiency reasons, development guidelines, readability, maintainability, compatibility, etc.). But some other times, the arguments just express opinions on a certain issue, and there may be different points of view.

One way or the other, I think that this call for respect in open communities is really timely. And thus I fully support its aim. Please, help to spread the word and preserve the healthy spirit of open collaboration around free knowledge.

Seamless support for open content

November 2nd, 2010 No comments

Over the past 4 years, I’ve been an avid consumer of open content, mainly images and text licensed under CC-BY-SA (my favourite license ever). 90% of the time, I collect it to prepare slides and other learning materials for university courses, training sessions, lectures or conferences (the other 10% is just for fun, since I love photography and I release all my works under CC-BY-SA). I think we still have a long way to go to facilitate the search, creation and reuse of open content. And now, I have a great opportunity to share my experience with other people and learn other points of view.

Mozilla Drumbeat logo

From tomorrow until Friday Nov. 5, I’ll be in Barcelona attending Mozilla Drumbeat Festival 2010. I admit I have high expectations for this unconference/festival, or whatever name you give to an event that will bring together ~400 people around OER and the Web. You can check the program here. The Festival has been designed as a forum to foster participation and quick interaction (maybe it reminds me of our great Open Space sessions in WikiSym, but on a larger scale).

Some of the sessions I plan to attend will cover different perspectives of a very important topic: how people create and reuse open content on the Web. Along these lines, we have for instance a session on “How to encourage content reuse”, another one exploring how to build better platforms to find open content (“Pathways to open content”) and finally, a brainstorming session about “The next big thing in OER”. Thus, I’ve been thinking about these issues, what they have in common and how we can solve any problems that open content creators and users may find. This is my attempt to summarize my thoughts, so far.

From my personal experience, and according to comments from other colleagues, there are 3 main issues impacting open content reuse:

  1. Understanding which license to choose: We have many different licenses to choose from for our content. However, many people still feel OK licensing their work under a Non-Commercial clause. While it’s true that this is a positive step towards openness, I think we also need to remember why licenses including NC clauses are not compliant with the Open Knowledge Definition.
  2. Searching for open content: Today, almost a decade after CC was created, it is still a pain in the neck to find open content on the web. Well, I don’t mean it’s difficult to find any open content or good open content (just visit Wikimedia Commons and let me know what you think). I mean it’s very time-consuming to find the open content you need for a certain situation (exercise: find an image depicting a fire flame, with decent quality, not including a candle, licensed under CC-BY-SA. How much time did it take you?).
  3. Using and storing open content: Finally, you found that great image for your slides. OK, you save it in a local folder, you include the image and link to the original author (if needed), and you include a licensing comment. You’re done. Now, say that 3 months later you need some of the images you already downloaded. You go to your local folder and… you remember neither the author nor the license for most of them (if not all). You need to open the file where you used them to search for that info, or you search the web again (and pray that the search results have not changed over the past 3 months). Sometimes, you end up including a long string in the file name to record this info, but that’s not a very handy way to keep your stuff tidy, right?

What we find here is the absence of standardized, seamless support for embedding critical information in open content files (especially author info and license type). What if your favourite text processor or presentation software already tracked the author and license info for you and included a footnote automatically? What if you could automatically create a table of licenses and authors in LaTeX? And my favourite ones: file managers. How about opening a local folder with Dolphin (or Nautilus, or Gwenview…), right-clicking and selecting “arrange files by author and license type”? They could also present a small note with that info on mouse rollover.

In summary, the root of all these issues (educating your users, finding open content on the web and leveraging the use of open content in academia and other contexts) is the lack of standardized support for embedding the relevant open content info in multimedia files. Pierre Far, who’s leading the session on “Pathways to open content”, suggested a possible solution: XMP. This is an example of a standard way to include information about a file’s contents in the file header. It also supports many different types of multimedia files (including making use of the EXIF headers we photographers love in JPEG files). But there may be others. I don’t mind what we finally choose, as long as everyone agrees to use the same standard.
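Until such embedded metadata is widely supported, the workflow can be approximated by keeping the author and license in a small record next to each file. The Python sketch below does that with plain in-memory records instead of XMP (it does not read or write XMP at all); the `Attribution` class, the helper names and the sample files are hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, Iterable, List, Tuple


@dataclass(frozen=True)
class Attribution:
    """Author and license info for one downloaded file (hypothetical record)."""
    filename: str
    author: str
    license: str


def group_by_author_and_license(
    items: Iterable[Attribution],
) -> Dict[Tuple[str, str], List[str]]:
    """Mimic an 'arrange files by author and license type' view."""
    groups: Dict[Tuple[str, str], List[str]] = defaultdict(list)
    for item in items:
        groups[(item.author, item.license)].append(item.filename)
    # Sort keys and file names so the view is stable between runs.
    return {key: sorted(names) for key, names in sorted(groups.items())}


def credits_footnote(item: Attribution) -> str:
    """Build the footnote a slide or document could include automatically."""
    return f"{item.filename}: by {item.author}, licensed under {item.license}"


# Sample records, invented for this example.
library = [
    Attribution("flame.jpg", "Alice", "CC-BY-SA 3.0"),
    Attribution("bridge.png", "Bob", "CC-BY 2.0"),
    Attribution("ember.jpg", "Alice", "CC-BY-SA 3.0"),
]
for (author, license_name), files in group_by_author_and_license(library).items():
    print(author, license_name, files)
print(credits_footnote(library[1]))
```

The weakness of this approach is exactly the one described above: the records live outside the files, so they get lost as soon as a file is copied somewhere else. That is why a shared embedded standard would be preferable.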

Conclusion: if we aspire to get real support for open content from end users, we must help them by offering seamless support for the daily tasks required in the new workflow (dealing with licenses and author info). With this apparently simple step, we would address all the problems above in one effective move. Time for other people to jump into the discussion, and for standards masters to start thinking about this.

Looking forward to meeting you in Barcelona!

A business model for Creative Commons

August 10th, 2010 1 comment
CC logo

By Yohei Yamashita (CC-BY 2.0)

Creative Commons is a non-profit organization that aims to promote the CC licenses around the world, thus generating positive awareness and impact on the global issue of knowledge sharing. Joi Ito, CEO of Creative Commons, reflects this vision in an interview for TechRadar. This interview got my attention, and triggered this post.

How to build a good business model for non-profit organizations supporting open movements(*)?

(*) [Note: in the absence of a better term, I've been using open movements to encompass all sorts of initiatives articulated around collaborative communities, open to the contribution of any person, pursuing the creation of physical or digital works and knowledge compliant with the definition of free cultural works. For instance, this includes Free/Libre/Open Source Software (FLOSS), as well as creative works (text, images, audio, video, etc.) released under a free license].

Read more…

Categories: Open Movements

Oracle is missing the Summer sun

August 10th, 2010 2 comments

The Illumos announcement, aired last August 3, may be the soap opera of this summer. But it can also be the starting point of a story that will leave a memorable footprint in open source software. And that is not because of the project itself, which is cool. Nor is it because it shows some interesting advantages of open source software business models.

Illumos logo

Above all (and if everything stays the same), we will remember this affair due to the huge opportunity Oracle is missing at this precise moment. Oracle is missing a healthy project, OpenSolaris, with a committed community of users and developers around the world, in one of the most critical market segments at this moment: high performance operating system platforms. What a blunder…

Read more…

Categories: FLOSS, On-line Communities

Wikimania 2010 recap

July 22nd, 2010 1 comment

OK, now it’s time for the Wikimania 2010 summary. I’ve been thinking a lot about the best way to condense my thoughts. I think the best one is this: whenever I attend a conference or meeting and have real difficulty deciding which session to attend (because all of them are terrific), that is a good sign. Well, every minute I spent in Wikimania 2010, I felt like that. “Mmmm, look at this one….but wait! I wanted to attend that one, as well… Oh no! Strategy Plan at the same time I’m giving one of my talks … What the heck!”.

Wikimania 2010 Gdańsk, Poland.

I admit it was pretty easy for this to happen to me, because: a) this was my first Wikimania; b) I gave too many talks (3!), thus missing other interesting slots and c) I wasn’t ready for the really active atmosphere of Wikimania. But let’s go on with some further details, since I have some “mixed feelings” about certain points.

Read more…

WikiSym 2010 summary

July 19th, 2010 8 comments

Finally, I had some time to write about the experiences in WikiSym and Wikimania 2010. Let’s start with the first one.

WikiSym 2010

WikiSym 2010 has been special in many respects. The Symposium and Program Committees were appointed between Dec. 2009 and Jan. 2010. Thus, we had only 6 months to rush through everything (CfP, venue location, logistics, proceedings, etc.). We decided that it was a good idea to search for synergies with another important conference held every year: Wikimania. Gdańsk was a very attractive city, and potential interactions between attendees of both events could be great. In the end, we packed a very interesting week, overlapping both events. However, the challenge was also to test whether both communities would be able to find common points of interest. Besides this, WikiSym 2010 explicitly broadened the scope of the conference, to welcome presentations on Open Collaboration in general, beyond the scope of wiki platforms. So, many things to discover!

Read more…

Categories: Conferences, On-line Communities