en.planet.wikimedia.org

How Wikidata is helping the CU Boulder library

15:11, Monday, 20 2021 September UTC
Chris Long

Wikidata’s overlap with cataloging has increased in recent times, prompting many librarians to transition into using it more. Chris Long, who is the Director of the Resource Description Services Team at the University of Colorado Boulder’s library, has been an avid user of Wikidata since 2019, creating and editing a variety of items.

Though Long’s experience with Wikidata is extensive, he participated in our recent Wikidata Institute course to learn more about how its sources and tools can be implemented in his university’s library. His institution is participating in the Library of Congress’s Program for Cooperative Cataloging (PCC) pilot using Wikidata, so Long spent time editing items related to that initiative.

“As a cataloger and cataloging manager, it is important to keep abreast of emerging cataloging trends. The Library of Congress and PCC are increasingly exploring the efficacy of Wikidata in cataloging work, so learning to use it is important to stay current,” Long says. “Being able to provide Wikidata training for my colleagues affords them the chance to do some hands-on linked data work.”

With Wikidata’s useful tools, Long learned the importance of constructing data models for effective querying. He believes the services Wikidata offers can greatly increase the impact of any project or collection.

“While there are a number of library linked data projects in existence, many are either small-scale ‘proof of concept’ projects, or require a great deal of institutional support to participate,” Long says. “Conversely, Wikidata is a low-barrier way for librarians to actively create usable linked data that can have a large impact.”

Wikidata can enhance any library’s collection, he believes.

“It affords the opportunity to de-silo our library metadata and let it ‘play’ on the Semantic Web, allowing for the discovery of associations among persons and concepts that would otherwise not be possible,” Long says.

Long is currently preparing a Wikidata project for his team involving the University of Colorado Boulder faculty. The project consists of creating Wikidata items for faculty as well as doing revisions on existing ones to try and associate them with the university. Projects like this are common in universities as they allow faculty to be displayed on tools like Scholia.

To take a course like the one Chris took, please visit wikiedu.org/wikidata. Image credits: Gribeco, CC BY-SA 3.0, via Wikimedia Commons; Chris Evin Long, CC BY-SA 4.0, via Wikimedia Commons

Todd Siegel joins Wiki Education’s Advisory Board

15:46, Friday, 17 2021 September UTC
Todd Siegel
Image courtesy Todd Siegel, all rights reserved.

Todd Siegel, product designer, prototyper, and wordsmith, has been appointed to Wiki Education’s Advisory Board, which is focused on growing our network and generating new revenue.

“I’m excited to join Wiki Education’s Advisory Board to help expand its community, and further educate students about the full range of Wikipedia’s powers,” Todd says.

Todd has more than 15 years of experience serving startups as an independent contractor and advisor. He educates a wide range of audiences on rapidly expressing product ideas with prototypes that look and feel real — without needing to code. He gave live prototyping presentations at Xerox PARC, Copenhagen Institute of Interaction Design, and Product School, plus hackathons ranging from Cisco to Cedars Sinai Medical Center.

He played a pivotal role in designing and evangelizing Proto.io, the leading web based tool for prototyping apps without coding, used by more than 400,000 people.

As a wordsmith, Todd co-founded the literary series Word Performances, and performed his poetry in over 20 shows including the Litcrawl literary festival, and was called the ee cummings of San Francisco tech culture.

I’m thrilled to add Todd’s experience, skillset, and connections to the San Francisco community to Wiki Education’s Advisory Board.

Improving Asian American journalists’ biographies

15:25, Thursday, 16 2021 September UTC

Wiki Education recently collaborated with the Wikimedia Foundation and the Asian American Journalists Association (AAJA) to host a 6-week training course that gave AAJA members and others the time and space to learn how to add Asian American and Pacific Islander journalists’ biographies to Wikipedia.

One course participant was Pamela Ng, homepage editor for Fox News Digital and part of the executive leadership program at AAJA. Previously, she worked for the New York Daily News and PIX11 News. She says she joined to improve her research skills and learn more about Wikipedia. She also wanted to address the dearth of coverage on Wikipedia of Asian American journalists.

“I was shocked to learn that only 4 percent of English Wikipedia’s biographies of American journalists are of people of Asian descent. Representation is important and I hope my contributions will encourage others to help make Wikipedia content more diverse,” says Ng.

Ng appreciates the vast number of Asian American journalists she learned about while doing her own research to add content to Wikipedia. As part of the course, Ng created the biography of CeFaan Kim, an ABC News correspondent and reporter for WABC-TV in New York City. With her contributions on a website viewed by millions of people every day, she feels it is especially important that these biographies gain recognition. Ng is currently writing more biographies of Asian American journalists as well as expanding existing pages in an effort to increase Wikipedia’s Asian American representation.

Ng’s hands-on experience in the AAJA Wiki Scholars course increased her confidence in consulting Wikipedia for information, as she’s now familiar with the extensive work that goes into ensuring Wikipedia’s content is high quality and based on reliable sources. She hopes other detractors will soon join the hundreds of millions of people who use Wikipedia, and realize how impactful Wikipedia can be.

“Wikipedia is often viewed as an unreliable source because anybody can contribute to it. If more people learned about what goes into Wikipedia, I think there’d be less hesitancy in using it as a jumping off point for research or other projects,” says Ng.

To take or sponsor a course similar to the one Pamela took, please visit learn.wikiedu.org

NEW YORK — In a divided opinion, the Fourth Circuit dismissed an appeal brought by the Wikimedia Foundation, which challenges the National Security Agency’s mass interception and searching of Americans’ international internet communications. The American Civil Liberties Union, Knight First Amendment Institute at Columbia University, and the law firm Cooley LLP represent the Wikimedia Foundation in the litigation, Wikimedia Foundation v. NSA.

Although the court held that Wikimedia had provided public evidence that its communications with Wikipedia users around the world are subject to NSA surveillance, the court went on to hold that further litigation would expose sensitive information about the government’s spying activities — and that the “state secrets privilege” required dismissal of the suit. The court rejected Wikimedia’s argument that the special procedures Congress enacted in the Foreign Intelligence Surveillance Act (FISA) preempt the state secrets privilege and allow the case to go forward. 

“We are extremely disappointed that the court wrongly credited the government’s sweeping secrecy claims and dismissed our client’s case,” said Patrick Toomey, senior staff attorney with the ACLU’s National Security Project. “Every day, the NSA is siphoning Americans’ communications off the internet backbone and into its spying machines, violating privacy and chilling free expression. Congress has made clear that the courts can and should decide whether this warrantless digital dragnet complies with the Constitution.”

At issue in this lawsuit is the NSA’s “Upstream” surveillance, through which the U.S. government systematically monitors Americans’ private emails, internet messages, and web communications with people overseas. With the help of companies like Verizon and AT&T, the NSA has installed surveillance devices on the high-capacity internet circuits that carry Americans’ communications in and out of the country. It searches that traffic for key terms, called “selectors,” that are associated with hundreds of thousands of targets. In the course of this surveillance, the NSA copies and combs through vast amounts of internet traffic. 

“We respectfully disagree with the Fourth Circuit’s ruling. Now more than ever, it is crucial that people are able to access accurate, well-sourced information, without concern about government surveillance,” said James Buatti, senior legal manager at the Wikimedia Foundation. “In the face of extensive public evidence about NSA surveillance, the court’s reasoning elevates extreme claims of secrecy over the rights of Internet users. We call upon the United States government to rein in these harmful practices, and we will continue to advocate for the privacy and free expression rights of Wikimedia readers, contributors, and staff.” 

Judge Diana Gribbon Motz, who dissented from the court’s state secrets ruling, warned that the majority’s opinion “stands for a sweeping proposition: A suit may be dismissed under the state secrets doctrine, after minimal judicial review, even when the Government premises its only defenses on far-fetched hypotheticals.” 

“For years, the NSA has vacuumed up Americans’ international communications under Upstream surveillance, and to date, not a single challenge to that surveillance has been allowed to go forward,” said Alex Abdo, litigation director of the Knight First Amendment Institute at Columbia University. “The Supreme Court should make clear that NSA surveillance is not beyond the reach of our public courts.”

Wikimedia and its counsel are considering their options for further review in the courts.

For more information about the case: https://www.aclu.org/cases/wikimedia-v-nsa-challenge-upstream-surveillance-under-fisa-amendments-act 

The opinion is available here: https://www.aclu.org/legal-document/wikimedia-v-nsa-opinion-0 

CONTACTS:
Allegra Harpootlian, 303-748-4051, aharpootlian@aclu.org
Lorraine Kenny, 917-532-1623, lorraine.kenny@knightcolumbia.org
Gwadamirai Majange, press@wikimedia.org

Today marks the start of a Heritage Month focused on celebrating the history, culture, and influence of Latinx communities in the United States. 

The official name of the month itself (National Hispanic Heritage Month) is a living example of the power of language — its history and inequities in who controls it, and its impact on the perceptions and identities of people and their communities.

At the Wikimedia Foundation, the nonprofit that operates Wikipedia and its companion free knowledge projects, we know words matter. We are committed to creating an inclusive, equitable living record of history, stories, and contexts. This often includes righting the historical record — and expanding it to include the perspectives of people left out by systems of power and privilege. 

This Latinx Heritage Month — what we have chosen to call this annual celebration — we are expanding this traditionally US-specific commemoration to celebrate the richness of our global Latinx Wikimedia community, while recognizing the work still needed to be done to achieve authentic representation online. 

We invite you to explore the origins of the term Hispanic; consider the legacies of colonization and the impact of language; and to hear firsthand from some of our Latinx Wikimedia contributors around the world on the importance of filling knowledge gaps about Latinx people and topics on Wikipedia and other Wikimedia projects. 

Why “Latinx Heritage Month” 

The term Hispanic commonly applies to countries with a cultural and historical link to Spain. In other words, it applies to countries previously colonized by Spain. In the US, “Hispanic” has become a broad catchall, referring to persons with a historical and cultural relationship with Spain regardless of their race and ethnicity. For these reasons, many have contested the term and flagged its negative connotations and racist undertones.

When it comes to describing their individual identities, recent research from Pew reveals that just over half of “Hispanic” and “Latino” people have no preference between the two terms. In some cases, the labels are used interchangeably. Another more recent identity label to emerge is “Latinx.” Although not widely adopted, it is considered a more gender- and LGBTQI-inclusive term — and what we have chosen to use during our celebration this month. 

Latinx content gaps on Wikimedia projects 

Wikipedia and other Wikimedia projects, sadly, do not currently reflect the world’s diversity. This results in a less rich, complex, and accurate picture of our world, its people, and its knowledge on our projects. 

Preliminary data from a recent Foundation survey of people in the US indicates that Latinx people, especially women, are dramatically underrepresented among Wikipedia contributors and readers in the United States. The data show that just 22% of Latinx women feel represented on Wikipedia, and only 31% of Latinx women in the US use Wikipedia. Data from our annual Community Insights Report also shows Latinx people in the US are severely underrepresented in our communities, representing only 5.2% of Wikimedia contributors. 

When it comes to the content represented on Wikipedia at large, we know from the Oxford Internet Institute that there are more Wikipedia articles written about Antarctica than many countries in Latin America.

Perspectives of Latinx Wikimedia contributors 

Nearly 20 years ago, the New York Times said that one day, the name of this month may change to “Colombian-Dominican-Cuban-Mexican-Puerto Rican-and-Other Heritage Month.” Why? Because the Latinx community is not monolithic. It is richly, beautifully complex, made up of an array of different identities, cultures, and experiences.

Our goal is for Wikimedia projects and contributors to reflect this rich diversity. Wikipedia is a mirror of the world’s biases — to deliver on our commitment to knowledge equity, we must address barriers that prevent people from both accessing and contributing to free knowledge.

To shed light on our efforts to do just that, we interviewed five Latinx Wikimedia contributors on their experiences in our movement, why they are committed to closing knowledge gaps, and what they want people to know about their heritage:

Carmen Alcázar

Carmen Alcázar (User:Wotancito) is a member of Wikimedia Mexico and a recipient of a Wikimedian of the Year 2021 Honourable Mention. In 2015, she started the Editatona project to increase gender diversity on Spanish Wikipedia; it has since grown to host 60 events in Latin America.

Mónica Bonilla

Mónica Bonilla-Parra (User:Mpbonillap) is on the board of Wikimedia Colombia. She is a linguist and researcher who uses Wikimedia projects to preserve and promote the culture and histories of Indigenous communities in Latin America. She coordinates the Wayuu Digital Project of ISUR and Fundacion Karisma, supporting media literacy processes in schools of the Wayuu community in the Colombo-Venezuelan Guajira. 

Carla Toro

Carla Toro Fernández (User:Soylacarli) works with Wikimedia Chile to host editing events on Wikimedia projects to improve content on gender, human rights, culture, science, heritage, and more. She also edits Wikipedia in a volunteer capacity, watching for vandalism and verifying information. On Wikidata, she does data quality control and uses queries to identify content gaps. 

Chola fashion in Gran Poder

User:Carlillasa is a member of the Wikimedistas de Bolivia user group. She writes Wikipedia articles about Bolivia, uploads photos, and gives editing workshops in collaboration with fellow volunteers. One of the first articles she wrote was on her favorite Bolivian novel, Íntimas, by author Adela Zamudio.

Selene Yang, who works on the DEI team at the Wikimedia Foundation, is a co-founder of Geochicas, a group of women who work to close the gender gap in the OpenStreetMap community and to bridge the mapping community with the Wikimedia community. She has also led edit-a-thons for the Art+Feminism initiative with TEDIC, a digital rights defender organization in Paraguay, to produce historiographic reviews on the roles of women in the construction of the modern Paraguayan state and to raise awareness of the importance of Wikipedia for the restoration of collective memory.

Why should people care about filling knowledge gaps about Latinx people and topics on Wikipedia and other Wikimedia projects?

  • “As on Wikipedia, we need all versions of history. What we write, do, share is not complete without the vision of women, of the global south, of postcolonial realities, of dissent. We have to ensure that there are seats for women. We have to commit to them having a good experience in our space.” —Carmen Alcázar
  • “The history of Colombia and Colombians on Wikipedia has been told and narrated from places other than Colombia, a situation that generates many biases in the information, but that we can change by involving more Colombians in the projects, in the communities and in their construction. To the extent that we involve more people, more voices, more languages, we will truly fulfill the mission of the Wikimedia movement: to empower and encourage people around the world to gather and develop neutral educational content under a free content license or in the public domain, and to disseminate it effectively and globally. Ultimately, closing the gaps in content, participation, and representation will strengthen and grow the community of volunteers, who make the community exist, continue, and advance.” —Mónica Bonilla-Parra
  • “I feel this is very important because people access the internet — and particularly Wiki projects — to find information and to know more about a subject. So, what happens when the information is simply not there, or when the information provided is shown from an outsider’s perspective? It’s crucial for content about Latin American issues and people to be written from a local point of view, to avoid stereotypes. Furthermore, in these times when the internet is the place where we preserve our history, the fact that there are information gaps on Latin American topics makes us invisible and keeps us out of history. That is basically what information gaps do these days, they leave you out of history, which is unacceptable.” —Carla Toro Fernández
  • “We are not represented on the Wikimedia projects, which are the window to knowledge on the internet. It’s difficult for the rest of the world to understand 1) how complex and diverse our reality is, and 2) we, ourselves, can understand the diversity of the region that we live in. I believe it’s fundamental to be able to go on Wikipedia and see a photo of your city, a photo of your favorite regional dish, an article about your favorite national author. We need to create content for and by us, to not have to feel like orphans of the internet anymore.” —User:Carlillasa
  • “History is always told by those who have the privilege of narrating it; however, the struggle for the living memory of people, collectives and communities is what becomes invisible through epistemic injustice. This has its foundations in the systems of oppression that emerge in the face of any form of disruption of the established order. Closing the gap in the production of knowledge about Latin America not only breaks down the material and symbolic barriers on access to information and the visibility of our memory, but also empowers, from the recognition of ourselves, those of us who historically have not been able to tell our own story.” —Selene Yang

What is one thing you wish people knew about your community, culture, or history?

  • “There are many annual festivals in my country, but the one that I like the most for its cultural importance and its high importance to the family is the Day of the Dead—imagining that on that day my grandma comes to my house for a coffee with milk and a pan de muerto makes me smile. It’s a bit difficult to understand outside of Mexico, but that’s what Wikipedia is for.” —Carmen Alcázar
  • “In Colombia, there are currently about 68 Indigenous languages that have been affiliated to 13 different linguistic families. Wikipeetia is the Wikipedia in Wayuunaiki, a project that has been built by the Wayuu people, who are located in La Guajira Colombo-Venezolana (the ancestral territory of the Wayuu people).” —Mónica Bonilla-Parra
  • “The truth is that I’d like for them to know about so much! Our history is composed of our many native civilizations, who have diverse cultures, traditions, and histories of their own. There is a Wikipedia category titled Culture of Chile, with articles on Chilean culture, from Chilean tea culture to the article about cantineras, female soldiers who fought in the War of the Pacific.” —Carla Toro Fernández
  • “I would like for the world to know that Bolivia is a very diverse country and that all of its social and cultural representations (from the most popular to the most academic) are worthy of attention and respect. In that sense, I consider the article about Bolivian gastronomy and all the articles that have been created lately about food in Bolivia to be a valuable testament of not just the culinary diversity of my country, but also the cultural processes linked to food from prehispanic times, through the colonial times up until our current globalized reality.” —User:Carlillasa
  • “El Güeguense is one of the first plays in America translated from Nahuatl into Spanish. It satirically represents through music, dances, and dramaturgy the convergence between Indigenous cultures and their relationship with the Spanish conquest. It is the force of comedy and wit in protest against the tragedy of the conquest. Currently the play comes to life during the patron saint festivities of my hometown city of Diriamba, Nicaragua.” —Selene Yang

What motivates you to contribute to Wikimedia projects?

  • “In addition to contributing to a greater common good, what motivates me most is that there is so much more to write. … At every opportunity, the story of an incredible woman whose trajectory has been overturned by the patriarchy jumps onto my edit list, so it renews my energy to keep doing this. I stay motivated even if not everything goes well and the attitudes of other male Wikipedians are not appropriate, although sometimes after organizing events and all that entails, there are still people in 2021 who, despite the explicit and clear rules of the projects, still think of Wikimedia projects that do not correspond to the world we live in.” —Carmen Alcázar
  • “The collective construction of humanity’s knowledge. I am passionate about understanding other ways of learning, teaching and building the world, and that is why I have worked and built projects with invisible communities, not only on the Internet but in society. I am also a fan of languages and technology and in Wikimedia I find a special place where my profession, my passion and my motivation connect.” —Mónica Bonilla-Parra
  • “I work in the field of science, where data is almost always kept behind paywalls that prevent people from accessing this information. The Wiki ecosystem changed this by making knowledge accessible to anyone with an internet connection, putting it at a click’s reach. Another thing that motivates me is the fight against fake news, and since everything in Wikipedia needs to have a reliable source, I feel it is the perfect place where trusted information can be found and used to counter the fake information that is generated around some issues, as was the case this last year with vaccines.” —Carla Toro Fernández
  • “It is important to me that my country, with all of its diversity, is well-represented in Wikimedia. Also, I like in general that articles are well written.”  —User:Carlillasa
  • “Currently I contribute more directly with the OpenStreetMap community through the Geochicas collective; however, our projects are also intertwined with Wikipedia. For example, the Streets of Women initiative seeks to generate a visualization where you can count the nomenclature of city streets according to their gender and if the streets named after a woman have an article in Wikipedia. This initiative has led us to generate meetings, editatonas, and workshops to find those women that the public sphere has left out of history. The most motivating thing about these shared learning processes is to recognize the relevance of the relationships between communities and how we all somehow find ourselves fighting for the same goal, such as greater participation and representation of women both in the world’s largest encyclopedia (Wikipedia) as well as in today’s most important open and collaborative geographic database (OpenStreetMap).” —Selene Yang

Jorge Vargas is Senior Regional Partnerships Manager at the Wikimedia Foundation. You can follow him on Twitter at @jorgeavargas.

Improving Wikipedia’s coverage of OER

16:06, Wednesday, 15 2021 September UTC
Virginia Clinton-Lisell
Image courtesy Virginia Clinton-Lisell, all rights reserved.

For being the world’s largest open educational resource (OER), Wikipedia’s coverage of OER-related topics left something to be desired. That’s why Wiki Education collaborated with the GO-GN Global OER Graduate Network and the Hewlett Foundation to run two Wiki Scholars courses aimed at improving Wikipedia’s coverage of OER, broadly defined.

The call for participants snagged the attention of Virginia Clinton-Lisell, an associate professor of educational foundations and research at the University of North Dakota. Virginia’s one of the primary researchers at her institution’s Open Education Group, so she was a natural fit for the course.

“I think often Wikipedia is scoffed at because ‘anyone can edit and write,'” Virginia says. “But the process of learning how to edit and write is quite involved and there are very clear criteria. It was excellent to be taken step by step through everything and get feedback before making my changes live.”

During our Wiki Scholars courses, participants like Virginia work with the course instructor and training materials to learn about the steps involved in adding new content to Wikipedia. The aim is not only to teach participants how Wikipedia works, but also to give them the time and space to make a tangible impact to Wikipedia and the readers who come to learn about these topics.

Virginia improved the article on open textbooks because it’s the primary area of her research. Thanks to her additions, when someone comes to learn about open textbooks, they’ll see that commercial textbooks make no difference in learning performance compared to open textbooks, yet their costs continue to increase. Perhaps making this information more accessible to the public, including school administrators, will help increase adoption of open textbooks.

During the course, Virginia also added information about North Dakota legislation to the policy section of the article on open educational resources.

“I liked getting to write about North Dakota’s legislation (even though it was a small addition) just because I’m excited about the initiatives the legislators have passed here,” she says. “I really hope that people who use Wikipedia to learn about OER realize that this movement is big and well researched.”

The course served another purpose for Virginia: It inspired her to incorporate Wikipedia editing into the courses she teaches, using Wiki Education’s Wikipedia Student Program. This fall, the students in her introduction to the foundations of education course will further improve Wikipedia’s coverage.

“I realized that having my students edit Wikipedia would be a fantastic way to have them actively be involved as creators of Open Educational Resources,” she says. Having students create OERs — often called Open Educational Practice — is a hallmark of Wiki Education’s programmatic activities, and we’re thrilled when our Wiki Scholars alumni see the value in their own learning experience and choose to pass this on to their students.

Interview conducted by Reema Haque. Hero image credit: MatthewUND, CC BY-SA 3.0, via Wikimedia Commons; image of Virginia courtesy Virginia Clinton-Lisell, all rights reserved.

Wikimedia Projects & AI Tools: Vandalism Detection

10:20, Wednesday, 15 2021 September UTC

There is a machine learning service available to interested Wikimedia projects and communities called ORES. It aims to recognise if an edit, for instance on Wikipedia, is damaging or done in good faith. Of course, false predictions cannot be avoided and thus remain a major risk. Here’s how we try to handle it.  

ORES: A system designed to help detect vandalism

ORES (Objective Revision Evaluation Service) is a web service and API that provides machine learning as a service for Wikimedia projects and is designed to help automate critical wiki-work – for example, vandalism detection and removal. In practice it aims to help human editors, in this case patrollers (volunteers who review others’ edits), identify potentially damaging edits. Importantly, the decision whether an edit is kept or reverted isn’t made by the algorithm; it always remains with the human patroller.
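To make the idea of a scoring API concrete, here is a minimal Python sketch of how a patrolling tool might read a "damaging" score and decide whether to show the edit to a human. The URL layout and the JSON shape are assumptions based on the publicly documented ORES v3 interface, the sample response and its numbers are hand-written for illustration, and the helper names (score_url, damaging_probability, needs_review) are hypothetical.

```python
import json

# Assumed base URL and path layout of the ORES v3 scoring API.
ORES_BASE = "https://ores.wikimedia.org/v3/scores"


def score_url(wiki, rev_id, model="damaging"):
    """Build the URL that asks one model to score one revision."""
    return f"{ORES_BASE}/{wiki}/{rev_id}/{model}"


def damaging_probability(response, wiki, rev_id):
    """Read the probability that the edit is damaging from a response dict."""
    score = response[wiki]["scores"][str(rev_id)]["damaging"]["score"]
    return score["probability"]["true"]


def needs_review(response, wiki, rev_id, threshold=0.5):
    """Flag the edit for a human patroller; nothing is reverted automatically."""
    return damaging_probability(response, wiki, rev_id) >= threshold


# Hand-written sample response; the probabilities are made up.
SAMPLE = json.loads("""
{"enwiki": {"scores": {"21312312": {"damaging": {"score": {
    "prediction": false,
    "probability": {"false": 0.92, "true": 0.08}}}}}}}
""")

if __name__ == "__main__":
    print(score_url("enwiki", 21312312))
    print(needs_review(SAMPLE, "enwiki", 21312312))
```

The threshold only controls which edits are surfaced for review. As the post stresses, the keep-or-revert decision stays with the patroller.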

In order to make a prediction about edits, ORES looks at the edit history across Wikipedia and calculates two general types of scores – “edit quality” and “article quality”. In this post, we will focus on the former for the sake of simplicity. 

“Edit Quality” Scores

One of the most critical concerns about Wikimedia’s open projects is the review of potentially damaging contributions. There’s also a need to identify good-faith contributors – who are inadvertently causing damage – and offer them support. 

In its most basic functionality, the ORES machine looks at the history of edits on Wikipedia and assumes that most damaging edits have been reverted rather quickly by human patrollers, whereas good faith edits stay untouched longer. Based on this, ORES gathers statistical data on edits, which it then groups into “features” – things like “curse words added”, “length of edit”, “citation provided”, or “repeated words”. 
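As a rough illustration of what turning an edit into "features" can look like, the Python sketch below measures a few of the characteristics named above from the text before and after an edit. It is not the feature code ORES uses: the function names, the feature names, the tiny curse word list, and the use of a "<ref" marker to spot citations are all assumptions made for this example.

```python
import re
from collections import Counter

# Hypothetical word list; a real system would need a curated list per language.
CURSE_WORDS = {"damn", "crap"}


def tokenize(text):
    """Split text into lowercase words."""
    return re.findall(r"[a-z']+", text.lower())


def extract_features(old_text, new_text):
    """Measure simple features of the change from old_text to new_text."""
    # Counter subtraction keeps only words that occur more often after the edit.
    added = Counter(tokenize(new_text)) - Counter(tokenize(old_text))
    return {
        "words_added": sum(added.values()),
        "curse_words_added": sum(n for w, n in added.items() if w in CURSE_WORDS),
        "length_change": len(new_text) - len(old_text),
        # Assumption: a new "<ref" tag means a citation was provided.
        "citation_added": new_text.count("<ref") > old_text.count("<ref"),
        "max_word_repeats": max(added.values(), default=0),
    }


if __name__ == "__main__":
    before = "Paris is the capital of France."
    after = before + " damn damn damn"
    print(extract_features(before, after))
```

Each value is just a number or a flag. Whether it signals damage is something a model has to learn from which edits were reverted, and that is where the assumptions described above come in.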

These features, the system assumes, help assess whether an edit is made in good faith or damaging. ORES makes a prediction and presents it to a human for further consideration. It is important to emphasise again and again what such machines do: predictions based on assumptions. The results might change considerably if we were to use different features, which takes us to the next important functionality: feature injection.

Features: The numbers that let machines recognise correlations

The flow of a diff to features to a machine prediction is shown visually. Author: EpochFail, License: CC-BY-SA via Wikimedia Commons
A Wikipedia edit is analysed by the machine using features to predict its quality.

In the pictured example, features of an edit are measured (e.g. “words added” and “curse words added”), because under some circumstances, they correlate with damaging edits. A machine learning algorithm can use features like these to identify the patterns that correlate with vandalism and other types of damage in order to do something useful—like help with counter-vandalism work.

Features are how machines see the world. If some characteristic of an edit suggests that the edit is vandalism, but that characteristic is not captured in any of the features that are measured and provided to a machine learning algorithm, then the algorithm cannot learn any patterns related to it. 

Feature Injection

Features depend largely on human input and data available. They are thus, just like data and human history, prone to biases. This is why it is crucial that we are open and transparent about how a result was determined and which features were decisive. Wikimedia offers a functionality that allows us to ask ORES which features it used when it made a prediction about an edit. For instance, we can check which features it used when assessing revision number 21312312 of the article French Renaissance on English Wikipedia.

We can go even a step further. ORES lets us toy around with the assessment above by adding and removing features as we see fit. We can either use authentic data from Wikipedia or add synthetic data we came up with in order to see how that would influence the result. What if the article was properly referenced? What if the vocabulary contained more sophisticated terms? We can test those things on an actual article. 
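The "what if" questions above can be sketched with a toy model. The weights, the bias and the function names below are invented for illustration and have nothing to do with the real ORES models; the point is only to show how overriding one measured feature with a synthetic value changes a prediction.

```python
import math

# Hypothetical weights for a toy linear model; these are NOT ORES's weights.
WEIGHTS = {"curse_words_added": 2.0, "words_added": 0.01, "citation_added": -1.5}
BIAS = -2.0


def predict_damaging(features):
    """Return a probability-like score between 0 and 1 for 'damaging'."""
    z = BIAS + sum(w * float(features.get(name, 0)) for name, w in WEIGHTS.items())
    return 1.0 / (1.0 + math.exp(-z))


def predict_with_injection(measured, injected):
    """Score an edit after replacing some measured features with injected ones."""
    # Injected values take precedence; the measured dict is left untouched.
    return predict_damaging({**measured, **injected})


measured = {"curse_words_added": 1, "words_added": 2, "citation_added": False}
baseline = predict_damaging(measured)
# What if the same edit had come with a citation?
what_if = predict_with_injection(measured, {"citation_added": True})
print(f"baseline={baseline:.3f} what_if={what_if:.3f}")
```

In this toy setup the injected citation lowers the score, which is the kind of effect one would want to be able to inspect in a real system.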

Feature Injection for Everyone

To be ethical and human centred, machine learning systems must be open and offer people a way to test them, to figure out how they work and react. Not because of the things we know, but because of the things we don’t always immediately realise. We have a gender bias on Wikipedia. Users who identify as female have reported that their edits were seen more critically than when they didn’t report their gender. Biographies about women are fewer and shorter. This alone is enough for a machine learning tool to pick up biases and magnify them and it might be very hard for us to recognise this. Similar biases exist with race, language, age and many other categories. And that is precisely why we need systems that are transparent. Ideally they will be open to everyone, and ideally not only on Wikipedia. No, this won’t solve the issue of bias, but it might limit it. And it will give civil society and researchers a tool to counter negative developments.

A miniseries on machine learning tools: Machine Learning and Artificial Intelligence technologies have the potential to benefit free knowledge and improve access to trustworthy information. But they also come with significant risks. Wikimedia is building tools and services around these technologies with the main goal of helping volunteer editors in their work on free knowledge projects. But we strive to be as human centred and open as possible in this process. This is a miniseries of blog posts that will present tools that Wikimedia develops and uses, the unexpected and sometimes undesired results and how we try to mitigate them.

14 September 2021, San Francisco, California — Today, the Board of Trustees of the Wikimedia Foundation announced the appointment of Maryana Iskander as the organization’s new CEO. She is a globally recognized social entrepreneur and an expert in building cross-sector partnerships that combine innovative technology with community-led solutions to close opportunity gaps.  

As CEO of Wikimedia Foundation, the global nonprofit organization that supports Wikipedia and 12 other free knowledge projects, Maryana will champion the organization’s goal to ensure that people everywhere can access and share knowledge freely. She will formally begin on January 5, 2022 and report to the Foundation’s Board of Trustees. 

Since 2013, Maryana has served as CEO of Harambee Youth Employment Accelerator, a South African non-profit social enterprise focused on building African solutions for the global challenge of youth unemployment. Under her leadership, Harambee received the prestigious Skoll Award for Social Entrepreneurship in 2019 for its model, which now supports over 1.5 million youth with access to learning and earning opportunities. Throughout her career, Maryana has sought to break down barriers and improve access to information and opportunity.

“Maryana’s approach to leadership is based around collaboration and community,” said Nataliia Tymkiv, Acting Chair of the Board of Trustees of the Wikimedia Foundation. “She has deep appreciation for the role that volunteer-led communities can play in addressing social challenges. Throughout her career, she has driven tangible impact on issues from healthcare to unemployment. We believe that she will be a powerful champion to grow the Wikimedia movement and increase global access to free knowledge.”

“Today, societies across the world are confronted by systemic challenges that require the best of human-led and technology-enabled solutions. This remarkable global movement demonstrates how powerful that combination can be in ensuring that every human can freely share in the sum of all knowledge,” said Maryana Iskander. “I am honored to support this inspiring vision and build a more equitable future for knowledge together.”   

The Wikimedia Foundation currently has over 500 employees around the world, with an annual budget of over $100 million. The Foundation operates the technology infrastructure that enables more than 18 billion visits to Wikipedia monthly and advocates for policies globally that protect and advance access to information. In her role as CEO, Maryana will take on the urgent task of expanding access to, and participation in, free knowledge globally, as the threats of widespread misinformation and online censorship grow more dire.  

Wikimedia also supports a movement of over 280,000 volunteer contributors who edit Wikipedia and its sister sites every month. Maryana will collaborate closely with the volunteer movement to make progress towards a shared vision for the future of Wikimedia which prioritizes knowledge equity, helping to close knowledge gaps in content on Wikimedia projects and increasing the diversity of contributors to the sites by reducing barriers to knowledge that prevent women and marginalized communities from equitable participation. 

Maryana also brings experience from leadership roles in the public, private, and social sectors. She spent more than half a decade as the Chief Operating Officer of Planned Parenthood Federation of America, a volunteer-led social movement focused on access to healthcare. Maryana was also the Advisor to the President of Rice University, an associate at global consulting firm McKinsey & Company, and a law clerk on the United States Court of Appeals for the Seventh Circuit. 

Born in Cairo, Egypt, Maryana was educated in the United States and the United Kingdom. She holds a B.A. magna cum laude from Rice University, an M.Sc. from Oxford University as a Rhodes Scholar, and a J.D. from Yale Law School, where she received a Distinguished Alumna Award. Maryana is also a Truman Scholar, a Henry Crown Fellow, and a member of the Aspen Global Leadership Network. She serves on the board of World Education Services.

About the Wikimedia Foundation

The Wikimedia Foundation is the nonprofit organization that operates Wikipedia and the other Wikimedia free knowledge projects. Our vision is a world in which every single human can freely share in the sum of all knowledge. We believe that everyone has the potential to contribute something to our shared knowledge, and that everyone should be able to access that knowledge freely. We host Wikipedia and the Wikimedia projects, build software experiences for reading, contributing, and sharing Wikimedia content, support the volunteer communities and partners who make Wikimedia possible, and advocate for policies that enable Wikimedia and free knowledge to thrive. 

The Wikimedia Foundation is a charitable, not-for-profit organization that relies on donations. We receive donations from millions of individuals around the world, with an average donation of about $15. We also receive donations through institutional grants and gifts. The Wikimedia Foundation is a United States 501(c)(3) tax-exempt organization with offices in San Francisco, California, USA.

A little-known naturalist from Chikkaballapur

11:22, Tuesday, 14 2021 September UTC
Being an administrative centre with a mild climate, Bangalore has historically had a fair share of colonial natural history collectors and naturalists. We know a fair bit about the botanists who walked this region and a bit about hunters of larger game but rather little about those who studied insects. A few years ago I became aware of the Campbell brothers from Ireland (but of Scottish origin). It took some time to put together the Wikipedia entries on them, which is where more straightforward biographical details may be found.

After a trip to the Nandi Hills [to examine a large number of heritage Eucalyptus trees (nearly 200 years old) that the Horticulture Department had decided to cut down to the stump, supposedly because falling branches were seen by the Archaeological Survey of India as a threat to heritage buildings nearby], some of us decided to visit Chikballapur to examine the place of work of Dr Thomas Vincent Campbell (1863-16 December 1930). "T.V.", as he was known to his friends, was a missionary doctor with the London Missionary Society and had worked briefly at Jammalamadugu, where his older brother William Howard Campbell (20 September 1859 - 18 February 1910) had worked as a missionary. Another brother back in Derry, David Callender Campbell (1860-1926), was also a keen observer of moths and a botanist. In their younger days in Derry, they and their siblings had put together a "family" museum of natural history that was said to be among the best in the region! William was the oldest of nine siblings and appears to have been the sturdiest, considering that he was a champion rugby player at Edinburgh University. He moved to Cuddapah in 1884 and he may well have been the first person to see Jerdon's courser in life - Jerdon, Hume, and others appear to have dealt only with specimens obtained from local hunters. William collected moths, many of which appear to have gone to Lord Rothschild, and nearly 60 taxa were described on their basis by Hampson. In 1909, he was to become director of the United Theological College, Bangalore, but ill health (sprue) forced him to return to Europe and he died in 1910 in Italy. His Cuddapah-born son Sir David Callender Campbell (1891 – 1963) became a prominent Northern Ireland politician. William's life is covered in some detail by Alan Knox while examining the only known egg of Jerdon's courser. A biography (a bit hagiographic though) of William in Telugu also exists.

Information on T.V.'s life, on the other hand, was hard to find, though we knew of his insect specimens. He was in contact with E.A. Butler, who specialized in the life histories of insects, and T.V. seems to have taken after him: he not only collected bugs (i.e. Hemiptera) but also made notes on them, which were used by Distant in the Fauna of British India. Several insects that T.V. collected have never been seen again. T.V. moved to Chikaballapur and worked at the Ralph Wardlaw Thompson Memorial Hospital, which is now known simply as the CSI Hospital and is largely in disrepair. The hospital in its heyday was among the few in the region and treated a large number of patients. After suffering from tuberculosis, he also established a TB sanatorium at Madanapalli. Campbell treated nearly a thousand cases of cataract and was awarded a Kaisar-i-Hind medal for his work in 1908. Campbell appears to have made a very large collection of insects from Cuddapah, Chikballapur, and from the Ooty area (where he would have spent summers). Many of these are now in the Natural History Museum in London and a good number are type specimens (i.e., the specimens on the basis of which new species were described). Professor C.A. Viraktamath, entomologist and specialist on the leafhoppers, has for many years searched for a supposedly wingless Gunhilda noctua which was collected from the Nilgiris. Based on T.V.'s connections, I believe the place to look for it would be somewhere in the vicinity of the church in Ketti. Considering the massive alteration in habitats, there is a slight chance that the species has gone extinct but it is doubtful that it was so narrow in its distribution.
 
W.H. Campbell

 
Dr T.V. Campbell
T.V.'s former home in Chikaballapur

Dr TV attending to patients in Chikaballapur, c. 1912

A lane inside the hospital premises named after T.V.

Foundation stone of the hospital

The Wardlaw Thompson Hospital c. 1914

Gunhilda noctua - a monotypic genus never seen
since T.V. found them for W.L. Distant to describe in 1918
from The Fauna of British India. Rhynchota Vol.II

The Wikipedia entries can be found at T.V. Campbell and W.H. Campbell. Many people helped in the development of these articles. Roy Vickery kindly obtained a hard-to-find obituary of T.V., Alan Knox sent me some additional sources on W.H.C., and Susan Daniel, librarian at the United Theological College, was extremely helpful. Arun Nandvar drove and S. Subramanya joined our little adventure in Chikaballapur. Dr Eric Lott made enquiries with the SOAS and LMS archives but found little. My entomologist friends and mentors, Prashanth Mohanraj and Yeshwanth H.M., shared their enthusiasm in discovering more about T.V.

Asian American Journalists on Wikipedia

15:54, Monday, 13 2021 September UTC

Heather J. Sharkey has been working with undergraduate and graduate students on Wikipedia projects since 2019, with the goal of promoting public-facing scholarship. She is a professor in the Department of Near Eastern Languages and Civilizations at the University of Pennsylvania.

Dr. Heather Sharkey
Image by CallMeBarcode, CC BY-SA 4.0 via Wikimedia Commons.

The Asian American Journalists Association (AAJA) partnered with Wiki Education to host a Wiki Scholars training course, funded by the Wikimedia Foundation, in July 2021. The goal was to increase representation of journalists of Asian origin by equipping participants with skills to improve existing articles or write new ones. Though neither a journalist nor a person of Asian heritage, I was privileged to join the AAJA group when a spot opened up.  I used the opportunity to write four new articles about Asian American journalists, including Nancy Yoshihara, a longtime reporter for The Los Angeles Times, who was one of the AAJA’s founders.  In the process, I strengthened editing and coding skills that I expect to apply at the University of Pennsylvania, where I teach modern and contemporary Middle Eastern history, and where I have been incorporating writing for Wikipedia into my courses during the past two years.

The AAJA’s mission is “advancing diversity in newsrooms [to] ensure fair and accurate coverage of communities of color.” Established in California in 1981, the AAJA has grown to welcome journalists of Asian and Pacific Islander heritage and supporters of other backgrounds within North America and the world. It understands Asia widely: everything from the eastern Mediterranean region (western Asia, including part of the Middle East) to South, Central, East, and Southeast Asia, and into the Pacific arena.

The AAJA originally had a strong American focus. Its founders were responding to a history of popular American anti-Asian sentiment which went back to the nineteenth century and gained expression through laws like the Chinese Exclusion Act of 1882. Conscious of this past, the AAJA’s website continues to affirm its goal of promoting “equitable and accurate coverage of Asian Americans and Pacific Islanders (AAPIs) and AAPI issues,” largely by encouraging AAPI students to enter media careers and by offering mentorship to journalists of Asian and Pacific Islander origin or heritage.

To prepare for this Wiki Education course, I read about the AAJA – and that was when I realized that its co-founder Nancy Yoshihara lacked a page on Wikipedia. Given Wikipedia’s well-known gender gap, the absence did not entirely surprise me, and I was determined to address it. In looking for sources about Yoshihara’s career, I found an interview that she gave on C-SPAN in 1997 in conjunction with an AAJA meeting in Boston that featured a panel on “The Price of Asian Political Involvement.” Yoshihara, then president of the AAJA’s Los Angeles chapter, cited the 1996 election in Washington State of Gary Locke (b. 1950), who became the first Asian-American governor in the continental United States. She also cited concerns about disturbing portrayals of Asians and Asian Americans in American mass media, which in some cases entailed propagation of Charlie-Chan- and martial-arts-style stereotypes and allegations of political manipulation through campaign donations. Nearly twenty-five years have passed since Yoshihara discussed these phenomena in her C-SPAN interview. And yet, the recent upsurge in anti-Asian hate crimes in the United States – including the March 2021 Atlanta spa shootings – points to the persistence of the problem of American xenophobia towards people of Asian background and the continuing relevance of the AAJA’s efforts to promote inclusion and understanding via reporting.

Another journalist about whom I wrote for Wikipedia is Arun Venugopal, who grew up in Texas as the son of parents who immigrated from India. In print media and on radio, Venugopal has addressed issues facing Asian American and other communities. He has discussed, for example, popular discourses about Asian Americans as a “model minority” and how such ideas have contributed to broader patterns of racism and xenophobia towards immigrants and people of color in the United States.

By participating in this Wiki Education course, I realized that while the AAJA may have been an American organization upon its foundation, its scope has steadily widened. Now stretching far beyond California, and counting more than 1,500 members, the organization is increasingly international. This point became clear in the weekly meetings that Wiki Education’s Will Kent led by Zoom for AAJA program participants, who tuned in from places ranging from Denver to Delhi and Seoul.

Developing these articles for this AAJA Wiki Scholars course also alerted me to a particular challenge about writing journalists’ biographies: journalists tend to write about others, not themselves, which can make it hard to find basic information about them.  Perhaps their instinct for security and confidentiality – for their sources, as for themselves – explains this discretion.  Despite my best efforts, for example, I could not find a birth year for either Yoshihara or Venugopal.  Fortunately, since Wikipedia articles are always works-in-progress, future researchers may find the information and fill these gaps later on.  These lessons about sourcing and revision are ones that I will pass on to my students at Penn.

Hero image of Penn campus: Kevin83002, Public domain, via Wikimedia Commons

Production Excellence #35: August 2021

13:23, Monday, 13 2021 September UTC

How’d we do in our striving for operational excellence last month? Read on to find out!

Incidents

Zero documented incidents last month. Isn't that something!

Learn about past incidents at Incident status on Wikitech. Remember to review and schedule Incident Follow-up tasks in Phabricator, which are preventive measures and other action items to learn from.

Image from Incident graphs.


Trends

In August we resolved 18 of the 156 reports that carried over from previous months, and reported 46 new failures in production. Of the new ones, 17 remain unresolved as of writing and will carry over to next month.

The number of new error reports in August was fairly high at 46, compared to 31 reports in July, and 26 reports in June.

The backlog of "Old" issues saw no progress this past month and remained constant at 146 open error reports.

💡 Did you know:

You can zoom in to your team's error reports by using the appropriate "Filter" link in the sidebar of our shared workboard.

Take a look at the workboard and look for tasks that could use your help.

View Workboard


Progress

Last few months in review:

Jan 2021 (50 issues) 3 left.
Feb 2021 (20 issues) 6 > 5 left.
Mar 2021 (48 issues) 13 > 10 left.
Apr 2021 (42 issues) 18 > 17 left.
May 2021 (54 issues) 22 > 20 left.
Jun 2021 (26 issues) 11 > 10 left.
Jul 2021 (31 issues) 16 > 12 left.
Aug 2021 (46 issues) + 17 new unresolved issues.

Tally:

156 issues open, as of Excellence #34 (July 2021).
-18 issues closed, of the previously open issues.
+17 new issues that survived August 2021.
155 issues open, as of today (3 Sep 2021).
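The tally is simple bookkeeping, and a few lines of Python can double-check it. This is only a sanity check on the numbers printed in this post; the variable names are made up for the sketch.

```python
# Figures copied from the tally above.
open_before = 156      # open as of Excellence #34 (July 2021)
closed_old = 18        # previously open issues closed in August
new_unresolved = 17    # August issues that are still open

open_now = open_before - closed_old + new_unresolved

# (before, after) counts for the months in the review list whose backlog shrank.
carried_over = {
    "Feb 2021": (6, 5),
    "Mar 2021": (13, 10),
    "Apr 2021": (18, 17),
    "May 2021": (22, 20),
    "Jun 2021": (11, 10),
    "Jul 2021": (16, 12),
}
closed_in_listed_months = sum(before - after for before, after in carried_over.values())

print(open_now, closed_in_listed_months)
```

The first number matches the 155 open issues reported above. The second is smaller than 18, so some of the closed reports must come from months that are not shown in the review list.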

For more month-over-month numbers refer to the spreadsheet.


Thanks!

Thank you to everyone who helped by reporting, investigating, or resolving problems in Wikimedia production. Thanks!

Until next time,

– Timo Tijhof

Tech News issue #37, 2021 (September 13, 2021)

00:00, Monday, 13 2021 September UTC

weeklyOSM 581

09:28, Sunday, 12 2021 September UTC

31/08/2021-06/09/2021


Mapping individual parking spaces [1] © Lejun | map data © OpenStreetMap contributors

Mapping campaigns

  • Радченко Алексей invited (ru) us to participate in the OpenStreetMap markup contest (ru) > en running from 10 to 30 September. There are simple instructions and dozens of prizes to be won. Join in.
  • The OSM-US project to map playgrounds in Philadelphia (we reported earlier) was completed on 1 September.

Mapping

  • Jeroen Hoek noticed a page on the wiki relating to a proposal to regularise the tag china_population. The tag seems to be designed to change the way some larger cities in China are rendered.
  • The authorities of the French commune of Torsac have recently completed a project to name streets and assign house numbers. Wilfried, a commune representative, asked (fr), on the OSM-FR forum, why these addresses and street names do not yet appear in OSM. The answer turned out to be somewhat complex, as the long thread in the forum shows. However, the issue was resolved by Christian Quest.
  • Andrés (User AngocA) from Bogotá explained (es) > en in a diary entry how to analyse Notes and how, if there are many Notes open in the area of interest, you can develop a strategy to close them.
  • [1] User Lejun has written up some interesting tips for mapping individual parking spaces.
  • Monika Tota asked (pl) members of the OpenStreetMap Polska group how they tag properties that are somewhat subjective, like smoothness=good vs intermediate or wheelchair=yes vs limited. People responded with many methodologies they use, pointing out that reality often resists clear-cut classification and that this question is related to the issue of cartographic generalisation.
  • Voting is open until Sunday 19 September for the proposal club=cadet, intended to be used to map the locations where various Youth Cadet groups meet, together with details of each group.

Community

  • User Assange expressed their dismay at mapping changes in China and Taiwan which they perceive to be politically driven. For example, is this a vacation school or concentration camp?
  • GeOsm has launched its open-source, globally distributed map database service for countries around the world.
  • JaLooooNz wrote a diary entry about the use of service=driveway and the ‘need’ for a service=residential_driveway.
  • OpenStreetMap Belgium has chosen Constantine Tumwine from Tanzania as Mapper of the Month and interviewed him.
  • Noé discovered (fr) that a town in Ivory Coast with a population of over 30,000 bears his first name, Noé. It wasn’t mapped on #OpenStreetMap, so he took care of it.
  • Feye Andal shared that Youthmappers in the Philippines grew from four to nine local chapters in just one year.

Humanitarian OSM

  • HOT published their Annual Report for 2020 (covering the period from July 2020 to June 2021). Note the report is large and may load slowly.
  • The Open Mapping Hub – Asia Pacific, organised by HOT, is launching a newsletter to share highlights of open mapping projects, communities, and opportunities across the region and invites you to subscribe.

Maps

  • SmokyMountains.com has released a 2021 Fall foliage prediction map showing the progressive change of leaves’ colour through the USA.
  • Tif.hair used (fr) > en data from the French national statistical institute, INSEE, in order to map 4,000 hairdressers in France with, mostly bad, puns in their names. While using OpenStreetMap as a baselayer, clicking on a shop marker links to Google Maps.
  • The Ordnance Survey, Britain’s national mapping agency, withdrew support for OS Open Space on 31 August. OpenSpace was launched in 2008 using OpenLayers. OSM’s founder, Steve Coast, worked as a consultant to the OS on the initial project. Replacement services continue to be available via the new hub, but a number of existing applications and websites no longer work.

Open Data

  • After a small pilot project in 2017, the Church of England announced their intention to map graves in all 19,000 churchyards in their care. The mapping, funded by Historic England, the National Lottery Heritage Fund and Caring for God’s Acre, uses LiDAR and is expected to take seven years. It is hoped that the resulting data will be open.

Software

Programming

  • Christian Quest wrote (fr) > en about his summer work on optimising performance for the new French tile server.

Releases

  • Bryan Housel announced release 1.1.7 of RapiD, including bugfixes and validation improvements.
  • A September release (2021.09.01-6-android) of Organic Maps on Android is now available on Play Store.

Did you know …

  • … the OpenParkingMap? Forked by jakecoppinger from zlant’s Parking Lanes Viewer, this version has Australian parking signs.
  • … that OSMCha is a tool for checking OSM changesets for data quality? The tool was developed by Wille Marcel in 2015, and Wille remains the maintainer.
  • … that you can find user names or the user id associated with a partial user name using Who’s That? Particularly helpful if you can’t remember how a user name is capitalised.

OSM in the media

  • Natfoot provided an audio-visual version of weeklyOSM 580 on YouTube.

Other “geo” things

  • The Guardian previewed the new book Atlas of the Invisible by UCL geographer James Cheshire and graphics designer Oliver Uberti. The book contains many visualisations of geographic data relating to climate change.
  • Episode 84 of the Geomob podcast featured Dave Gee, creator of hand-drawn Doodle Maps.
  • OpenCage’s latest #geoweirdness thread is focused on Canada.

Upcoming Events

Where | What | When
 | OSM Africa Monthly Mapathon: Map Malawi | 2021-09-04 – 2021-10-04
Bogotá Distrito Capital | Agreguemos y editemos rutas de transporte en OpenStreetMap | 2021-09-11
Zürich | Mapping-Party/132. OSM-Treffen Zürich | 2021-09-11
Arlon | Réunion des contributeurs OpenStreetMap, Arlon | 2021-09-13
臺北市 | OpenStreetMap x Wikidata 月聚會 #32 | 2021-09-13
Hamburg | Hamburger Mappertreffen | 2021-09-14
 | PHXGeo Meetup (Phoenix, AZ, US) | 2021-09-15
 | The ISPRS SC Webinar Series: Collaborative Humanitarian Mapping with PoliMappers and UN Mappers | 2021-09-16
Karlsruhe | Karlsruhe Hack Weekend | 2021-09-17 – 2021-09-19
Nantes | Journées européennes du patrimoine 2021, Nantes | 2021-09-18
Grenoble | Atelier OpenStreetMap – retrouvailles et initiation ! | 2021-09-20
Lyon | Rencontre mensuelle Lyon | 2021-09-21
Bonn | 143. Treffen des OSM-Stammtisches Bonn | 2021-09-21
Berlin | OSM-Verkehrswende #27 (Online) | 2021-09-21
Lüneburg | Lüneburger Mappertreffen (online) | 2021-09-21
 | DRK Missing Maps Online Mapathon | 2021-09-23
 | [Online] OpenStreetMap Foundation board of Directors – public meeting | 2021-09-24
Düsseldorf | Düsseldorfer OSM-Treffen (online) | 2021-09-24
Amsterdam | OSM Nederland maandelijkse bijeenkomst (online) | 2021-09-25
 | FOSS4G 2021 Buenos Aires – Online Edition | 2021-09-27 – 2021-10-02
Bremen | Bremer Mappertreffen (Online) | 2021-09-27
Grenoble | Mapathon Missing Maps – Cartographier des cartes humanitaires sur un mode collaboratif et libre. | 2021-09-28
Bruxelles – Brussel | Virtual OpenStreetMap Belgium meeting | 2021-09-28

Note:
If you would like to see your event here, please put it into the OSM calendar. Only data that is there will appear in weeklyOSM.

This weeklyOSM was produced by Lejun, Nordpfeil, PierZen, SK53, Supaplex, TheSwavu, YoViajo, derFred.

This Month in GLAM: August 2021

00:31, Sunday, 12 2021 September UTC

Improving Wikipedia’s coverage of 9/11

18:46, Friday, 10 2021 September UTC

Tomorrow is the 20th anniversary of the September 11 attacks — and as many people reflect on the milestone, some will turn to Wikipedia to read about this moment in history and the widespread impacts of it. The attacks occurred in Wikipedia’s first year of existence, and played an important role in shaping the culture of the nascent encyclopedia project. A recent article in Slate by Stephen Harrison provides a nice overview of Wikipedia’s coverage and explores how Wikipedia and the War on Terror “grew up together”. But as the 20th anniversary approaches, Wikipedia’s articles related to the attacks and their aftermath don’t get the sort of editing attention they once did, and it shows. The Guantanamo military commission article, for example, had a banner informing readers that its “factual accuracy may be compromised due to out-of-date information” — a banner that someone added to the page in November 2010.

For the last two months, Wiki Education, in collaboration with our partners at ReThink Media, has been addressing content gaps within Wikipedia’s articles related to September 11, the War on Terror, and related topics. We’ve been leading a ReThink Media Wiki Scholars course, where we brought together a group of peace and security studies experts to identify content gaps in Wikipedia’s coverage. We taught them to edit Wikipedia and navigate policy, something that’s especially important when working in an area where strong feelings persist.

One of Wikipedia’s most active WikiProjects, or collectives of editors tackling a particular topic area, is WikiProject Military History. Articles related to the military often have extensive coverage of the specifics of war — but this approach has led to gaps in the context of humanitarian implications. During the course, we had several conversations about whether Wikipedia articles should include this kind of information, or whether the goal was to primarily provide accounts of campaigns and operations. We came to the consensus that Wikipedia’s goal is to provide an overview of all relevant information, which necessarily includes the humanitarian impacts of war. As a result of this, the participants updated information in the article on the War on terror, including adding a previously absent section on civilian casualties in various countries and war zones.

Other articles improved by the group include the September 11 attacks article, in which a contributor added subsections to the “domestic response” section about discrimination and racial profiling of Arab Americans and interfaith efforts to educate people about the Muslim faith. Another tackled the Post-9/11 article, adding a section about discriminatory backlash. And the Islamophobia in the United States article now has a section on Islamophobia in places of worship, thanks to a participant in the course.

A previously short article about Holy Shrine Defenders got an overhaul from another participant, resulting in a significant expansion. Information related to the Guantanamo Bay detention camp and the United States v. Khalid Sheikh Mohammed case also saw significant edits from course participants. One Wiki Scholar updated and rewrote the Guantanamo military commission article, finally allowing the removal of that 11-year-old warning banner.

But sometimes smaller changes can have a big impact. In the lead section of the September 11 attacks article, al Qaeda is described as “Wahhabi”. One participant removed that term because it was inaccurate. Their edit was reverted by a Wikipedian because the statement was sourced, and the discussion on the article’s talk page didn’t come to a resolution. In our class session, the Wiki Scholar asked how best to proceed. Looking at the sources, it was fairly obvious that two were weak, but one came from an academic source, which meant it wasn’t the sort of thing that could be dismissed out of hand. But then a course participant who had the book on their own bookshelf referenced the cited page and found the relevant quote: “Because Osama bin Laden and most of the hijackers are Saudi nationals, it was assumed that al-Qaeda is an expression of Wahhabism. That is not the case.” Once the precise quote was supplied, the editors engaging on the talk page were able to reach consensus quickly.

Real world events overtook our course as people had to miss sessions to do press interviews after the Fall of Kabul, and many of them were personally impacted as they worried about the safety of colleagues and friends who were trying to escape Afghanistan. But despite that, they continued to work to improve Wikipedia, understanding that improvements like these were critical in the weeks leading up to the 20th anniversary of the September 11 attacks, when readership of these articles is skyrocketing. In a few short weeks, the articles our subject-matter experts improved have received more than 1.4 million page views – and we expect that number to rise even more tomorrow and in the coming weeks. That means millions of people searching for neutral, fact-based information around this anniversary now get a more nuanced picture of the impacts the attacks have had over the past 20 years.

For as Slate’s Stephen Harrison writes, “As we approach the 20th anniversary of Sept. 11, Facebook users are likely to see 9/11 tributes selected by an algorithmic assessment of that user’s content preferences, part of the personalized, polarized social media experience. On the other hand, every English Wikipedia user who visits the current page for the September 11 attacks this week will see the same article regardless of their demographic profile.”

Interested in partnering with Wiki Education to improve Wikipedia’s coverage of a subject area? Visit partner.wikiedu.org. Image credit: Carol M. Highsmith, Public domain, via Wikimedia Commons

How writing for Wikipedia helps journalists

16:16, Friday, 10 2021 September UTC

Yiwen Lu is a student journalist based in Chicago and the Communications Director of the Asian American Journalists Association (AAJA). She was a participant in Wiki Education’s recent AAJA Wiki Scholars course, which was made possible by the Wikimedia Foundation. In the course, AAJA’s members worked together to increase Wikipedia’s coverage of Asian American and Pacific Islander journalists’ biographies. 

“I use Wikipedia a lot personally and professionally, and it struck me that minority journalists are underrepresented in Wikipedia biographies. That inspired me to take this opportunity and learn more about how people edit Wikipedia articles that end up being a full encyclopedia,” Lu says.

Lu created an entry for Jiayang Fan, who is a staff writer for The New Yorker. She compiled many of Fan’s interviews and other sources to be able to write a comprehensive biography.

Taking the AAJA Wiki Scholars course gave Lu a fresh perspective on the process of contributing to Wikipedia. Already an avid Wikipedia reader, she went from being a consumer to a writer confident in creating an article from scratch.

“Previously as a user/reader of Wikipedias, I would only pay attention to the actual Wikipedia page, but the course taught me about different sections in the writing process, various pages in addition to the main Wikipedia, as well as many fun resources,” Lu says. “It introduced me to the behind-the-scenes parts of Wikipedia. So after taking the course, I am comfortable with creating an article from scratch, adding citations, as well as improving existing articles through using talk pages, for example.” 

As journalists emphasize the importance of reporting accurate information from reliable sources, the AAJA Wiki Scholars course taught Lu that Wikipedians share similar values. A major feature of Wikipedia is including citations in every article, a tool journalists can use to pinpoint sources for their own work.

“Beyond learning about the topic itself through contributing, I think learning how citations are made in Wikipedia would be helpful for journalists when we are looking into an issue, as it helps us to trace back to the original sources,” Lu says. “I personally found the process of creating a Wikipedia entry helpful for me to read about multiple sources on the topic.”

Throughout the course, Lu enjoyed engaging in genuine conversations with other Wiki Scholars while picking up on how new Wikipedia articles come to life.

“Looking at the back end of things – the talk pages and the edit history have been really fun to look at. Those helped me visualize how one builds a Wikipedia from scratch,” Lu says. “Reading user pages of individual users has also been a really fun part; I would never imagine that Wikipedia can also serve this additional social function to bring the community together.” 

With Wikipedia being a website filled with publicly accessible information, Lu sees many benefits in teaching others how to contribute to Wikipedia, as well as the importance of publishing information in an effort to give credit where it’s due.

“As something that is openly accessible to everyone on the internet, Wikipedia is in a really unique and important position to educate people, so having diverse perspectives makes sure that we are telling the stories that need to be heard,” Lu says.

Lu hopes her contribution to Wikipedia provides representation for other aspiring journalists, especially Asian Americans. Her experiences starting out in her journalism career reflect why she took the initiative to enroll in the AAJA Wiki Scholars course.

“Personally, when I started to get interested in journalism during college, there were few AAPI journalists around me, and there was no role model for me to look at. As a result, I have been lost for the first couple of months of navigating this career,” Lu says. “The more I report and write, the more I realized that there were few coverages of the AAPI community because there are not many AAPI journalists. A lot of the time, journalists don’t look like the community they cover. Therefore, for the sake of both helping aspiring journalists who hope to get into the industry as well as helping the community find the right person to tell their stories, I found it important to participate in the AAJA Wiki Scholars course and contribute to the representation of minority journalists – and not even journalists, but people of color in all professions.”

To take a course like Yiwen’s, please visit learn.wikiedu.org. Image courtesy Yiwen Lu, all rights reserved.

Outreachy report #24: August 2021

00:00, Friday, 10 2021 September UTC

Highlights

  • We said goodbye to many of our wonderful May 2021 cohort interns
  • We had to deal with an increasing amount of extensions, one including a CoC incident
  • We processed their final feedback
  • We found wonderful initial application reviewers!

May 2021 cohort

Interns (now alums) who didn’t have extensions finished their internships this month. In the last conversation we all had, many of them expressed their appreciation for our bi-weekly chats.

Sharing the accomplishments of an amateur scientist

15:33, Thursday, 09 2021 September UTC
Britt Forsberg

Britt Forsberg has had an extensive amount of experience with science. Currently, she coordinates training and service opportunities for the Minnesota Master Naturalist program where she prepares volunteers for service in conservation and connects them to stewardship, research, and education volunteer opportunities.

When Forsberg learned about the Wiki Scientists course through 500 Women Scientists, she was eager to take up this opportunity to increase representation of women in STEMM on Wikipedia.

Forsberg edited the page for Miriam Rothschild, a British natural scientist who contributed to zoology, entomology, and botany. She selected this scientist because, she says, she wanted to acknowledge so-called ‘amateur’ scientists, who she believes deserve the same recognition as those with traditional academic credentials.

“Even though she didn’t have the academic credentials that many people find necessary, she was incredibly knowledgeable and made huge contributions to entomology,” Forsberg said.  “I also found her previous page disappointing in the ways it called attention to her lack of educational background and her husband’s remarriage after their divorce instead of her scientific achievements.”

Forsberg says that if it had not been for the Wiki Scientists course, she would not have had the chance to properly dive into researching Rothschild. Thanks to her dedicated work, the article drew many page views within a short time of being published on Wikipedia.

“I was amazed at the number of page views our articles had just in the small time we worked in the cohort so I think it’s clear that Wikipedia is a major player and that people pay attention to what is posted there.  It’s very important that Wikipedia users can see themselves somewhere in Wikipedia,” Forsberg says.

What made Forsberg’s time in the Wiki Scientists course memorable was the chance to work and connect with others towards similar goals. She hopes others in academia who still dismiss Wikipedia as a starting place for research will soon see its purpose and place in the information landscape.

“I think some people in academia can dismiss Wikipedia as a source but participating in the course would show them what a rich resource it is,” she says.

Forsberg looks forward to spreading the word and knowledge about Wiki Education’s useful services and how contributions like this impact others.

“We’re starting to look at how we can use Wikipedia in our program,” she says. “We’re trying to represent more diverse perspectives in our field and while we could manage that information, having our participants work in Wikipedia means that their information will find a much larger audience. It also solves some technical problems for us in that we don’t have to maintain a website, host server space, etc., and other Wikipedia users can help us stay on top of things like plagiarism. Everyone benefits this way; our program, our participants, and Wikipedia users across the world.”

To take a course like Britt’s, please visit learn.wikiedu.org. Image credits: Open Media Ltd., CC BY-SA 3.0, via Wikimedia Commons; Mountainairy, CC BY-SA 4.0, via Wikimedia Commons.

By Miriam Redi, Fabian Kaelin, Tiziano Piccardi

Colossal octopus by Pierre Denys de Montfort, Public Domain, and EMS VCS 3 by Standard Deviant, CC BY-SA 2.0

It’s often said that an image is worth a thousand words, but for the millions of images and billions of words on Wikipedia, this idiom doesn’t always apply. Images are essential for knowledge diffusion and communication, but less than 50% of Wikipedia articles are illustrated at all! Moreover, images on articles are not stand-alone pieces of knowledge: they often require large captions to be properly contextualized and to support meaning construction. 

More than 300M people in the world have visual impairments, and billions of people in the Global South with limited internet access would benefit from text-only documents. These groups rely entirely on the descriptive text to help contextualize images in Wikipedia articles. But only 46% of images in English Wikipedia come with a caption text, and only 10% have some form of alt-text, with 3% having an alt-caption that is appropriate for accessibility purposes. This lack of contextual information not only limits the accessibility of visual and textual content on Wikipedia, but it also affects the way in which images can be retrieved and reused across the web.

Several Wikimedia teams and volunteers have successfully deployed algorithms and tools to help editors fix the problem of lack of visual content on Wikipedia articles. While very useful, these methods have limited coverage. An average of only 15% of articles find good candidate image matches. 

Existing automated solutions for image captioning are also difficult to incorporate in editors’ workflows: the most advanced computer vision-based image to text generation methods aren’t suitable for the complex, granular semantics of Wikipedia images and are not generally available for languages other than English.

The Wikipedia Image/Caption Matching Competition on Kaggle

As part of our initiatives to address Wikipedia’s knowledge gaps, we are organizing the “Wikipedia Image/Caption Matching Competition.” We are inviting communities of volunteers, developers, data scientists, and machine learning enthusiasts to help us solve the hardest problems in the image space. 

The “Wikipedia Image/Caption Matching Competition” is designed to foster the development of systems that can automatically associate images with their corresponding image captions and article titles. The Research Team at the Wikimedia Foundation will be hosting the competition through Kaggle starting September 9th, 2021. This competition was made possible thanks to collaborations with Google Research, EPFL, Naver Labs Europe, and Hugging Face, who massively helped with the data preparation and the competition design. Given the highly novel, open, and exploratory nature of the challenge proposed, the first edition of the competition comes in a “playground” format.

Participation is completely online and open to anyone with access to the internet. In this competition, participants will be provided with content from Wikipedia articles in 100+ language editions. They will be asked to build systems that automatically retrieve the text (an image caption, or an article title) closest to a query image. The best models will account for the semantic granularity of Wikipedia images and operate across multiple languages. 
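
The retrieval task described above can be pictured as a nearest-neighbour search. The following is only a minimal sketch, not the competition's reference approach: it assumes some model has already mapped the query image and every candidate text into vectors in a shared space. The toy three-dimensional vectors and the helper names `cosine_similarity` and `rank_candidates` are hypothetical and serve illustration only.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

def rank_candidates(image_vec, candidates):
    """Return candidate texts sorted from most to least similar to the image.

    candidates maps each text to its (hypothetical) embedding vector.
    """
    scored = [(cosine_similarity(image_vec, vec), text)
              for text, vec in candidates.items()]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [text for _, text in scored]

# Toy data: made-up vectors, for illustration only.
query_image = [0.9, 0.1, 0.0]
candidate_texts = {
    "Spacetime curvature": [1.0, 0.0, 0.0],
    "Colossal octopus": [0.0, 1.0, 0.0],
    "EMS VCS 3 synthesizer": [0.0, 0.0, 1.0],
}
ranking = rank_candidates(query_image, candidate_texts)
print(ranking[0])  # Spacetime curvature
```

A real system would replace the toy vectors with learned image and text representations and would have to cope with captions in many languages; the ranking step itself could stay this simple.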

The collaborative nature of the platform helps lower barriers to entry and encourages broad participation. Kaggle is hosting all data needed to get started with the task, example notebooks, a forum for participants to share and collaborate, and submitted models in open-sourced formats. With this competition, we hope to provide a fun and exciting opportunity for people around the world to grow their technical skills while contributing to one of the largest online collaborative communities and the most widely used free online encyclopedia. 

A large dataset of Wikipedia image files and features

Space-time distortion made by Earth, GNU Free Documentation License

bn: সাধারণ আপেক্ষিকতা তত্ত্ব অণুয়ায়ী সময় এবং কাল এর বক্রতা একটি দ্বি-মাত্রিক চিত্রের সাহায্যে উপস্থাপন করা হয়েছে।
ja: 一般相対性理論によって記述される、2次元空間と時間の作る曲面。地球の質量によって空間が歪むとして記述して、重力を特殊相対性理論に取り入れる。実際の空間は3次元であることに注意すべし。
ko: 일반상대성이론에서 묘사된 시공의 곡률을 2차원으로 표현한 그림.
it: Una celebre illustrazione divulgativa della curvatura dello spaziotempo dovuta alla presenza di massa, rappresentata in questo caso dalla Terra.
en: Two-dimensional projection of a three-dimensional analogy of spacetime curvature described in general relativity
ckb: دەرھاوێشتەیەکی دووڕەھەندی لە چەمانەوەی کاتـجێ لە بۆشایییەکی سێڕەھەندیدا، کە لە تیۆریی ڕێژەیی ئاینشتایندا دێتە بەر باس.
my: နှိုင်းရသီအိုရီအရ သုံးဖက်မြင် အာကာသအချိန် ကွေးညွတ်ပုံအား နှစ်ဘက်အမြင်ဖြင့် ဖော်ပြထားပုံ

Participants will work with one of the largest multimodal datasets ever released for public usage. The core training data is taken from the Wikipedia Image-Text (WIT) Dataset, a large curated set of more than 37 million image-text associations extracted from Wikipedia articles in 108 languages that was recently released by Google Research.

The WIT dataset offers extremely valuable data about the pieces of text associated with Wikipedia images. However, due to licensing and data volume issues, the Google dataset only provides the image name and corresponding URL for download and not the raw image files.

Getting easy access to the image files is crucial for participants to successfully develop competitive models. Therefore, today, the Wikimedia Research team is releasing its first large image dataset. It contains more than six million image files from Wikipedia articles in 100+ languages, which correspond to almost all captioned images in the WIT dataset (see footnote 1). Image files are provided at a 300-px resolution, a size that is suitable for most of the learning frameworks used to classify and analyze images. The total size of the dataset released stands around 200GB, partitioned into 200 files of around 1GB.

With this large release of visual data, we aim to help the competition participants—as well as researchers and practitioners who are interested in working with Wikipedia images—find and download the large number of image files associated with the challenge, in a compact form.

While making the image files publicly available is a first step towards making Wikipedia images accessible to larger audiences for research purposes, the sheer size of the raw pixels makes the dataset less usable in lower-resource settings. To improve the usability of our image data, we are releasing an additional dataset, containing an even more compact version of the six million images associated with the competition. We compute and make publicly available the images’ ResNet-50 embeddings. We describe each image with a 2048-dimensional signature extracted from the second-to-last layer of a ResNet-50 neural network trained with ImageNet data. These embeddings contain rich information about the image content and layout, in a compact form. Images and their embeddings are stored on Kaggle and on our Wikimedia servers.

Here is some sample PySpark code to read image files and embeddings:

# File format:
## Pixels columns: image_url, b64_bytes, metadata_url
### b64_bytes are the image bytes as a base64-encoded string
## Embedding columns: image_url, embedding
### embedding: a comma-separated list of 2048 float values

import base64
from io import BytesIO

from PIL import Image
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.getOrCreate()

# resnet_embeddings_training and image_pixels_training are expected to hold
# the path prefixes of the downloaded embedding and pixel files

# embeddings

@F.udf(returnType='array<float>')
def parse_embedding(emb_str):
   # split the comma-separated string into a list of floats
   return [float(e) for e in emb_str.split(',')]

# read the tab-separated files and parse the embedding array
first_emb = (spark.read
   .csv(path=resnet_embeddings_training+'*.csv.gz', sep="\t")
   .select(F.col('_c0').alias('image_url'), parse_embedding('_c1').alias('embedding'))
   .take(1)[0]
)
print(len(first_emb.embedding))
# 2048


# pixels
first_image = (spark
   .read.csv(path=image_pixels_training+'*.csv.gz', sep="\t")
   .select(F.col('_c0').alias('image_url'), F.col('_c1').alias('b64_bytes'), F.col('_c2').alias('metadata_url'))
   .take(1)[0]
)

# parse image bytes
pil_image = Image.open(BytesIO(base64.b64decode(first_image.b64_bytes)))
print(pil_image.size)
# (300, 159)
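
For readers without a Spark environment, the same file layout can in principle be read with the Python standard library alone. The sketch below is an illustration under assumptions rather than part of the release: it assumes the files are gzipped and tab-separated with the column order listed above, it builds a tiny in-memory sample (a made-up URL and four values instead of 2048) in place of a real download, and the helper names `read_embedding_rows` and `decode_image_bytes` are hypothetical.

```python
import base64
import csv
import gzip
import io

def read_embedding_rows(fileobj):
    """Yield (image_url, embedding) pairs from a gzipped, tab-separated file object."""
    with gzip.open(fileobj, mode="rt", encoding="utf-8", newline="") as text:
        for image_url, emb_str in csv.reader(text, delimiter="\t"):
            yield image_url, [float(e) for e in emb_str.split(",")]

def decode_image_bytes(b64_bytes):
    """Turn the base64 string from the pixels file back into raw image bytes."""
    return base64.b64decode(b64_bytes)

# Build a tiny stand-in for one embeddings file (illustrative data only).
buffer = io.BytesIO()
with gzip.open(buffer, mode="wt", encoding="utf-8", newline="") as out:
    csv.writer(out, delimiter="\t").writerow(
        ["https://example.org/image.jpg", "0.5,1.0,0.0,2.25"]
    )
buffer.seek(0)

rows = list(read_embedding_rows(buffer))
print(rows[0][1])  # [0.5, 1.0, 0.0, 2.25]

# Round-trip some placeholder bytes the way the b64_bytes column would be decoded.
raw = decode_image_bytes(base64.b64encode(b"not a real image").decode("ascii"))
print(raw)  # b'not a real image'
```

The decoded bytes of a real row could then be passed to an image library, as the PySpark sample does with PIL.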

This is an initial step towards making most of the image files publicly available and usable on Commons in a compact form. We are looking forward to releasing an even larger image dataset for research purposes in the near future!

We encourage everyone to download our data and participate in the competition. This is a novel, exciting, and complex scientific challenge. With your contribution, you will be advancing the scientific knowledge on multimodal and multilingual machine learning. At the same time, you will be providing open, reusable systems that could help thousands of editors improve the visual content of the largest online encyclopedia. 

Acknowledgements

We would like to thank everyone who contributed to this amazing project, starting with our WMF colleagues: Leila Zia, head of Research, for believing in this project and for overseeing every stage of the process, Stephen La Porte and Samuel Guebo who supported the legal and security aspects of the data release, Ai-Jou (Aiko) Chou for the amazing data engineering work, Fiona Romeo for the data about alt text quality, and Emily Lescak and Sarah R. Rodlund for helping with the release of this post.

Huge thanks to the Google WIT authors (Krishna Srinivasan, Karthik Raman, Jiecao Chen, Michael Bendersky, and Marc Najork) for creating and sharing the database, and for collaborating closely with us on this competition, and to the Kaggle team (Addison Howard, Walter Reade, Sohier Dane) who worked tirelessly to make the competition happen.

All this would not have been possible without the valuable suggestions and brainstorming sessions with an amazing team of researchers from different institutions: thank you Yannis Kalantidis, Diane Larlus, and Stephane Clinchant from Naver Labs Europe; Yacine Jernite from Hugging Face; and Lucie Kaffee from the University of Copenhagen, for your excitement and dedication to this project!

Footnotes

  1. We are publishing all images having a non-null “reference description” in the WIT dataset. For privacy reasons, we are not publishing images where a person is the primary subject, i.e., where a person’s face covers more than 10% of the image surface. To identify faces and their bounding boxes, we use the RetinaFace detector. In addition, to avoid the inclusion of inappropriate images or images that violate copyright constraints, we have removed all images that are candidates for deletion on Commons from the dataset.
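
The 10% rule in the footnote can be illustrated with a small calculation. This is a hedged sketch, not the team's actual pipeline: it assumes a face detector has already returned bounding boxes as `(x1, y1, x2, y2)` pixel coordinates, and it applies the threshold to each face separately. The function names and the example boxes are hypothetical.

```python
def face_area_fraction(box, image_width, image_height):
    """Fraction of the image covered by one box given as (x1, y1, x2, y2) in pixels."""
    x1, y1, x2, y2 = box
    # Clip the box to the image bounds so partial boxes are not over-counted.
    x1, x2 = max(0, x1), min(image_width, x2)
    y1, y2 = max(0, y1), min(image_height, y2)
    width = max(0, x2 - x1)
    height = max(0, y2 - y1)
    return (width * height) / (image_width * image_height)

def person_is_primary_subject(face_boxes, image_width, image_height, threshold=0.10):
    """True if any single face covers more than `threshold` of the image surface."""
    return any(
        face_area_fraction(box, image_width, image_height) > threshold
        for box in face_boxes
    )

# A 300 x 159 image: an 80 x 80 face covers 6400 / 47700 of it, about 13%.
large_face = person_is_primary_subject([(100, 30, 180, 110)], 300, 159)
# A 30 x 30 face covers 900 / 47700 of it, about 2%.
small_face = person_is_primary_subject([(10, 10, 40, 40)], 300, 159)
print(large_face, small_face)  # True False
```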

About this post

Featured image credit: Wikipedia20 Knowledge.svg, Wikimedia Foundation, CC0 1.0

8 September 2021 — Today, the Wikimedia Foundation, the global nonprofit organization that supports Wikipedia and other free knowledge projects, announced six inaugural grants as part of the newly launched Knowledge Equity Fund, an effort to close knowledge gaps and address racial inequities in its projects. The first round of grants will be given to six global nonprofit organizations: Arab Reporters for Investigative Journalism (ARIJ), the Borealis Philanthropy Racial Equity in Journalism Fund, Howard University School of Law and the Institute for Intellectual Property and Social Justice (IIPSJ), InternetLab, STEM en Route to Change (SeRCH) Foundation, and the Media Foundation for West Africa (MFWA).

“As a movement dedicated to the sum of all knowledge, we must take a more active role in breaking down the barriers to knowledge that have disproportionately impacted communities of color throughout history,” said Lisa Gruwell, Chief Advancement Officer at the Wikimedia Foundation and an advisor on the Equity Fund Committee. “Racism has skewed the historical record and continues to deny communities of color access to knowledge as a human right. Through the Equity Fund, we are thrilled to support organizations working directly to address these inequities, so that the work of free knowledge can finally reflect the world’s rich diversity.”

The Equity Fund is a $4.5 million fund created by the Wikimedia Foundation to advance more equitable, inclusive representation in Wikimedia projects, including Wikipedia. Through the fund, the Foundation will build a robust ecosystem of institutional partners working at the intersection of free knowledge and racial justice. The Equity Fund extends the Foundation’s explicit goal to support communities that have been left out by structures of power and privilege. It was conceptualized in June 2020, in the wake of global protests about police brutality and racial injustice in the United States. 

The first grant recipients of the Equity Fund are:

  • Arab Reporters for Investigative Journalism (ARIJ), Jordan ($250,000): To provide a one-year investment to expand the investigative journalism ecosystem in 16 countries in the Middle East. With our support, ARIJ will expand the training and support they provide for Arab journalists around racial equity and accessibility, and advocate for increased coverage of marginalized communities throughout the region.
  • Borealis Philanthropy’s Racial Equity in Journalism Fund, United States ($250,000): To provide a one-year investment to support US-based journalism organizations led by and for people of color, helping expand news and public affairs coverage in communities of color. Through the Racial Equity in Journalism Fund, we will seek to increase media coverage and, subsequently, source citations for Wikimedia projects about issues and leaders that impact diverse communities.
  • Howard University School of Law and the Institute for Intellectual Property and Social Justice (IIPSJ), United States ($260,000): To create a two-year Wikimedia Race and Knowledge Equity Fellowship to produce white papers and academic research exploring how free knowledge can be used to advance racial equity and socio-economic empowerment throughout the intellectual property landscape. This Fellowship would also develop recommendations to address gaps in the free knowledge ecosystem that exacerbate systemic racism and block progress to advance racial equity.  
  • InternetLab, Brazil ($200,000): To create a two-year Wikimedia Race and Knowledge Equity Fellowship to research the impact of systemic racism and digital access for African descendants in Brazil, explore the most pressing barriers to the participation of Black people in knowledge online, and identify how racial inequality is reflected in the availability of online content in Portuguese and in Brazil.  The Fellowship will work to identify how national and local policies create barriers related to online knowledge, and potential policy solutions to address intellectual property, access, and education among others.
  • Media Foundation for West Africa (MFWA), Ghana ($150,000): To provide a one-year investment to support MFWA’s work providing journalist training and advocacy for journalist rights. With this grant, MFWA will expand their work to cover racial equity through funding for investigative journalism, promoting and protecting freedom of expression and digital rights in the region.  
  • STEM en Route to Change (SeRCH) Foundation, United States ($250,000): To provide a two-year investment to the SeRCH Foundation to support the expansion of their signature program, #VanguardSTEM, which amplifies the voices of Black, Indigenous, women of color and non-binary people of color in STEM fields. The SeRCH Foundation will leverage cultural production, including multimedia storytelling, to advance non-traditional forms of knowledge creation, to build freely licensed and open rich media content about STEM leaders of color, and address inequitable representation throughout scientific fields.

Racial equity is directly tied to our movement’s focus on knowledge equity, part of our long-term strategy for 2030. Knowledge equity is defined as supporting the knowledge and communities that have been excluded by historical structures of power and privilege. Many of the barriers that prevent people from accessing and contributing to knowledge are rooted in systems of racial oppression. Due to colonization and slavery, knowledge from Black and Indigenous communities, along with other historically marginalized groups, has been systematically excluded and erased from the historical canon. The Equity Fund will directly address the barriers to free knowledge experienced by Black, Indigenous, and communities of color around the world. Investments from the Equity Fund will address one or more of five focus areas: 

  • Supporting scholarship & advocacy focused on free knowledge and racial equity; 
  • Expanding media and journalism efforts focused on people of color around the world; 
  • Addressing unequal internet access; 
  • Improving digital literacy skills, the lack of which impedes access to knowledge;
  • Investing in non-traditional records of knowledge such as oral histories. 

Grant recipients are chosen based on their past record of impact, their alignment to Wikimedia’s vision of access to knowledge, and their potential to benefit free knowledge. Following this first round of grantees, the Equity Fund will continue to look for additional grantees that align to our goals of addressing racial inequities in free knowledge through subsequent rounds of funding. The next round will likely take place in the next year.

About the Wikimedia Foundation

The Wikimedia Foundation is the nonprofit organization that operates Wikipedia and the other Wikimedia free knowledge projects. Our vision is a world in which every single human can freely share in the sum of all knowledge. We believe that everyone has the potential to contribute something to our shared knowledge, and that everyone should be able to access that knowledge freely. We host Wikipedia and the Wikimedia projects, build software experiences for reading, contributing, and sharing Wikimedia content, support the volunteer communities and partners who make Wikimedia possible, and advocate for policies that enable Wikimedia and free knowledge to thrive. 

The Wikimedia Foundation is a charitable, not-for-profit organization that relies on donations. We receive donations from millions of individuals around the world, with an average donation of about $15. We also receive donations through institutional grants and gifts. The Wikimedia Foundation is a United States 501(c)(3) tax-exempt organization with offices in San Francisco, California, USA.

Adding inclusive historical biographies to Wikipedia

18:27, Tuesday, 07 2021 September UTC
Serene Williams
Serene Williams at the Pauli Murray Center for History and Social Justice.

Serene Williams is a full-time high school teacher and an independent public historian who documents the women’s suffrage movement. For 20 years, she has been teaching U.S. history, women’s history, and political science courses at both the high school and college level.

Williams recently participated in Wiki Education’s LGBTQ+ Wiki Scholars Course, making it the third course she has taken with Wiki Education. She found out about Wiki Education through attending the National Women’s Studies Association (NWSA) conferences. Wiki Education has partnered with NWSA since 2014, working together with their members to expand Wikipedia’s coverage of women’s, gender, and sexuality studies, and the NWSA Wikipedia initiative has paved the way for dozens of their members to make information accessible to the public through Wikipedia.

Inspired by the documentary Pride on Hulu, Williams created a new page for Madeleine Tress, a government employee in the 1950s who lost her job because of the lavender scare. Creating the page was a way for Williams to learn Tress’s story and share it with her students.

“My students are often curious to know more about the lavender scare and are always surprised it is not commonly taught in U.S. history classes so I decided to create a new page for Madeleine Tress so I could educate myself about her story and share it with my students,” Williams says.

Williams also created a shorter page on Cleo Bonner of the Daughters of Bilitis and improved on the coverage of Rev. Dr. Pauli Murray, a women’s rights legal scholar.

Accurate representation of information on Wikipedia is important to Williams because it provides substantial help for students doing research. In order to give students the most accurate information that’s representative of history, it’s crucial that we continue the work to reduce the many gaps in coverage on Wikipedia about historically excluded groups. Contributions like those Williams and her peers in the LGBTQ+ Wiki Scholars course made are a big step toward making information on the internet more equitable and accessible to students and anyone with internet access.

“As a U.S. history and AP government teacher, I think it is so important that students have accurate and inclusive information when they are researching historic topics,” Williams says. “After teaching for so many years I consistently see students consulting Wikipedia first when they are researching a new topic so I feel it is essential Wikipedia articles include significant coverage of LGBTQ+ activists and organizations. I have also been troubled to learn the vast majority of biographies on Wikipedia are about men so I try to consistently work on pages about women’s history.”

Throughout the course, Williams improved her Wikipedia editing skills thanks to the course instructor’s guidance and the other help resources we provide to novice editors through our Wiki Scholars courses.

“Will Kent was an excellent instructor in my LGBTQ+ course and I learned a great deal from his weekly sessions,” Williams says. “Although I already knew how to edit and create Wikipedia pages I was less familiar with Wikipedia standards when it comes to writing about gender identity. Will shared numerous helpful resources about this that will help me in the future when I create and edit new Wikipedia pages.”

Williams continues to teach students and high school faculty how to edit Wikipedia and has witnessed hundreds of them doing outstanding work. She strongly encourages other K-12 educators to teach their students about the benefits of using Wikipedia during the learning process.

“I love the feeling of contributing to the public’s knowledge on important yet lesser-known topics in history. It is incredibly gratifying to write content and post images on Wikipedia and see them immediately get posted for public consumption,” Williams says.

To take a course similar to this one, please visit learn.wikiedu.org. Image credits: Phil Roeder from Des Moines, IA, USA, CC BY 2.0, via Wikimedia Commons; Serenewilliams, CC BY-SA 4.0, via Wikimedia Commons.

Tech News issue #36, 2021 (September 6, 2021)

00:00, Monday, 06 2021 September UTC

weeklyOSM 580

10:17, Sunday, 05 2021 September UTC

24/08/2021-30/08/2021


prettymaps [1] © Marcelo Prates | map data © OpenStreetMap contributors

Mapping

  • The speed limit on most Parisian streets has been lowered to 30 km/h (18 mph). Christian Quest tweeted (fr) that this has resulted in a major update to OSM that should propagate through to route planners over the next few days.
  • Kontur tweeted a reminder that in the wake of Hurricane Henri, mappers could help by mapping densely populated yet poorly mapped areas (e.g., parts of Connecticut). See the Disaster Ninja app for more details.
  • Jan (user: Lübeck) suggested (de) > en adding three additional damage:*=* tags to areas destroyed in the heavy rainfall-induced flooding on 14 July in western Germany.
  • User messpert pondered the most appropriate tagging for historical surface mining sites.
  • A request for comments has been made for currency:crypto:*=yes,no, a proposal to extend the currency key to support cryptocurrencies.
  • The proposal man_made=video_wall, for mapping large digital screens, was approved with 9 votes for, 1 vote against and 2 abstentions.

Community

  • User PlayzinhoAgro wrote about the lack of popularity of OpenStreetMap compared to other market solutions, based on a small survey (pt) > en of the Brazilian market.

Imports

  • Vladislav Kugelevich has added information to the OSM wiki about Latvian data sources that are potentially suitable for import into OSM. He plans to import some of this data but is still studying the software and the import guidelines.

OpenStreetMap Foundation

  • The LCCWG moderation subcommittee held two online public meetings on 2 and 3 September about revisions to the current Etiquette Guidelines, which are now open for public comment until Wednesday 8 September.
  • This year’s annual general meeting of the OSMF will take place online on Saturday 11 December 2021 at 16:00 UTC. Voting for the board election will start one week before the annual general meeting and this year there will be at least four positions available. To be eligible to vote you must have been a member or associate member for 90 days prior to the date on which the meeting is held, that is, paid up by Saturday 11 September.
  • The OWG is planning to host vector tiles generated by Tilemaker as an experiment. The goal is to get something others can work with to see where more work needs to be done.
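
The voting eligibility date mentioned above is plain date arithmetic. A minimal sketch in Python, assuming the 90 days are counted back from the meeting date itself and that membership has to exist before that cut-off day begins (the function and constant names are illustrative, not taken from the OSMF):

```python
from datetime import date, timedelta

# Dates taken from the announcement above.
AGM_DATE = date(2021, 12, 11)
REQUIRED_DAYS = 90  # membership period required before the meeting


def membership_cutoff(meeting: date, required_days: int = REQUIRED_DAYS) -> date:
    """Return the day that lies exactly `required_days` before the meeting."""
    return meeting - timedelta(days=required_days)


def latest_join_date(meeting: date, required_days: int = REQUIRED_DAYS) -> date:
    """Assumption: membership must already exist before the cut-off day starts,
    so the last safe day to be paid up is the day before the cut-off."""
    return membership_cutoff(meeting, required_days) - timedelta(days=1)


cutoff = membership_cutoff(AGM_DATE)
last_day = latest_join_date(AGM_DATE)
print(cutoff.isoformat())    # 2021-09-12
print(last_day.isoformat())  # 2021-09-11
```

Under that assumption the result agrees with the Saturday 11 September date given above; the authoritative rule remains the one published by the OSMF.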

Education

  • User Lübeck was seeking IBIS-Budget Hotels for a trip. They asked (de) > en for assistance in writing an Overpass query to find all the variations in naming the hotels have in OSM.
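
A query of the kind asked about above might look roughly like the following. This is only a sketch, not the answer given in the forum thread: the regular expression, the `tourism=hotel` filter and both helper names are assumptions chosen for illustration. The first function builds an Overpass QL query with a case-insensitive name match; the second tallies the distinct spellings found in an already downloaded JSON response.

```python
from collections import Counter


def build_query(name_regex: str = "ibis.?budget", timeout: int = 60) -> str:
    """Build an Overpass QL query matching hotel names case-insensitively.

    `nwr` selects nodes, ways and relations; the trailing `,i` on the
    regular-expression filter makes the match case-insensitive.
    """
    return (
        f"[out:json][timeout:{timeout}];\n"
        f'nwr["tourism"="hotel"]["name"~"{name_regex}",i];\n'
        "out tags center;"
    )


def name_variants(response: dict) -> Counter:
    """Count each distinct `name` value in a parsed Overpass JSON response."""
    return Counter(
        element["tags"]["name"]
        for element in response.get("elements", [])
        if "name" in element.get("tags", {})
    )


print(build_query())
```

The query text can be pasted into Overpass Turbo; passing the parsed JSON result to `name_variants` then lists every spelling together with how often it occurs.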

OSM research

  • Afraid of the number 13? DeBigCs looked at the shortfall of houses numbered 13 in Dublin. Andrew Davidson followed up with an analysis that showed a much greater fear of 13 in the USA.
  • It looks like frzbrmnd2 has written to all/many mappers who have mapped in Iran. He is using the data for his dissertation where he is investigating the relationship between participants’ local knowledge and the quality of data collected.

Humanitarian OSM

  • On 27 August, the Open Mapping Hub in Eastern and Southern Africa was launched. The Hub is dedicated to increasing the use of OpenStreetMap tools to power and influence local and regional interventions. Find more information by following them on Twitter or Facebook.

Software

  • [1] Marcelo Prates has created software used to draw pretty maps from OpenStreetMap data. The software, Pretty Maps, is based on the OSMnx, Matplotlib and Shapely libraries.
  • Pierre Béland published a Compare Map Before / After for the Nippes (Haiti) August earthquake, where post-disaster drone footage (2 cm) for five localities can be compared with the CNIGS Haitian Geospatial Agency orthophotos from 2014–2015 (20 cm). Other drone-collected images should be added as they become available. CNIGS official road network, OSM, altitude and relief layers are also available.

Did you know …

  • … the three most important tags for describing tracks? tracktype is used for a rough classification, surface to specify the material, and smoothness to specify the condition.
  • … Cartography Playground? A simple and interactive website for explaining cartographic algorithms, problems and other matters. It is aimed at students of cartography who want to refresh and deepen their knowledge.
  • … that the animal rights organisation, PETA, tried to have the name of Fishkill, a town in New York State, changed to Fishsave? -kill or -kil is a common place name element in the area deriving from a Dutch word meaning creek or stream.

OSM in the media

  • Czech mappers have finished the mapping of post boxes in the Czech Republic. Miroslav Suchý gave an interview to Lupa.cz (cz) > en about it and OpenStreetMap.
  • Nat Foot reads out weeklyOSM 578.

Other “geo” things

  • Got some spare reading time? Allan Mustard recommends The Address Book by Deirdre Mask. The book looks at the complex and sometimes hidden stories behind street names and their power to name, to hide, to decide who counts, who doesn’t – and why.
  • Ian Dickson talked with Ben Hinze about ‘Ambient Maps’, a company mapping noise and air pollution in Australian and New Zealand major cities and regions. By combining a number of inputs and models they can produce predictions of air traffic noise, road traffic noise, rail noise, air quality and future road noise impacts.
  • Alan McConchie tweeted a link to Kenneth Field’s new book Thematic Mapping: 101 Inspiring Ways to Visualise Empirical Data. Using 101 maps, graphs, charts, and plots of the 2016 United States presidential election data, the book explores the rich diversity of thematic mapping and the visual representation of data.
  • As a test, Anthony Stephens likes to ask ‘anybody under 30 years old who lives in the city’ to name the suburb directly to the north of where they live. Anthony is the owner of ‘The Map Shop’ in Adelaide (Australia) and in an ABC article he discusses the future of paper maps in a world of satnav.

Upcoming Events

| Where | What | When |
| Fortaleza | Encontro de usuários OSM do Ceará, Brasil. | 2021-09-04 |
| Bogotá Distrito Capital | Resolvamos notas de Colombia creadas en OpenStreetMap | 2021-09-04 |
| 京田辺市 | 京都!街歩き!マッピングパーティ:第26回 一休寺 | 2021-09-04 |
|  | OSM Africa Monthly Mapathon: Map Malawi | 2021-09-04 – 2021-10-04 |
| Greater London | Missing Maps London Mapathon | 2021-09-07 |
|  | DRK Missing Maps Mapathon – JOSM Einführung | 2021-09-07 |
| Landau an der Isar | Virtuelles Niederbayern-Treffen | 2021-09-07 |
| Stuttgart | Stuttgarter Stammtisch (Online) | 2021-09-07 |
|  | OpenStreetMap Michigan Meetup | 2021-09-09 |
| Decatur County | OSM US Mappy Hour | 2021-09-09 |
| Berlin | 159. Berlin-Brandenburg OpenStreetMap Stammtisch | 2021-09-09 |
| Nordrhein-Westfalen | OSM-Treffen Bochum (September) | 2021-09-09 |
| München | Münchner OSM-Treffen | 2021-09-09 |
| Bezirk St. Johann im Pongau | 2. Virtueller OpenStreetMap Stammtisch Österreich | 2021-09-09 |
| Bogotá Distrito Capital | Agreguemos y editemos rutas de transporte en OpenStreetMap | 2021-09-11 |
| Arlon | Réunion des contributeurs OpenStreetMap, Arlon | 2021-09-13 |
| 臺北市 | OpenStreetMap x Wikidata 月聚會 #32 | 2021-09-13 |
| Hamburg | Hamburger Mappertreffen | 2021-09-14 |
| Karlsruhe | Karlsruhe Hack Weekend | 2021-09-17 – 2021-09-19 |
| Nantes | Journées européennes du patrimoine 2021, Nantes | 2021-09-18 |
| Bonn | 143. Treffen des OSM-Stammtisches Bonn | 2021-09-21 |
| Berlin | OSM-Verkehrswende #27 (Online) | 2021-09-21 |
| Lüneburg | Lüneburger Mappertreffen (online) | 2021-09-21 |
|  | DRK Missing Maps Online Mapathon | 2021-09-23 |
|  | [Online] OpenStreetMap Foundation board of Directors – public meeting | 2021-09-24 |
| Düsseldorf | Düsseldorfer OSM-Treffen (online) | 2021-09-24 |
| Amsterdam | OSM Nederland maandelijkse bijeenkomst (online) | 2021-09-25 |

Note:
If you would like to see your event here, please put it into the OSM calendar. Only data that is there will appear in weeklyOSM.

This weeklyOSM was produced by Lejun, Nordpfeil, SK53, Sammyhawkrad, TheSwavu, YoViajo, derFred.

Status Report for August 2021

21:49, Friday, 03 2021 September UTC

Howdy all! I hope everyone is riding out this delta covid surge reasonably well.

This will likely be my last Pywikibot report. My code reviews are stuck and working on Pywikibot is remarkably lonely. Pywikibot is neat, but it’s difficult to stay interested when my contributions dawdle on a shelf. C’est la vie.

Roman Colosseum

In lieu of code I’m binging Death Throes of the Republic. Rome’s collapse began with the senate’s murder of the reformer Tiberius Gracchus. Breaking the norm against political violence just once spiraled out of control into a tit-for-tat revenge cycle that must have horrified its original perpetrators.

Rome offers a troubling warning of what the January 6th lynch mob could have begun. Folks, let’s not play with fire.


Type Hints

Prior to my Rome binge I doubled down on Pywikibot’s type hints. Poor Xqt. I kinda buried him in code reviews…

A peek backstage at Wikipedia: Irwin DeVries

15:26, Friday, 03 2021 September UTC

Irwin DeVries is an online learning and technology instructor in a Master’s program at Royal Roads University. He specializes in open education, instructional design, curriculum development, and learning technologies, along with associated research and publication. His experience working with wikis goes back several years, to when he participated in the open design and development of an online course in collaboration with the international OERu network, which is based in New Zealand. DeVries received an invitation from the Global OER Graduate Network to participate in Wiki Education’s OER Wiki Scholars course, in which scholars researching open educational resources (OER) would work together to improve related content on Wikipedia. DeVries wanted to use this chance to connect with colleagues as well as find ways his graduate classes can utilize Wikipedia as part of a digital pedagogy.

Before joining the group, DeVries thought Wikipedia and its structure hadn’t changed much since his earlier involvement with the wiki. But as he got hands-on experience creating an article about the history of the Open Learning Institute of British Columbia, where he felt there was an important gap in the historical record of open education, he began noticing each distinct element of Wikipedia and the efforts that go into the content there.

“It’s a bit like buying a new car; you suddenly notice all the same models out there when you’re driving, which you’d never seen before,” says DeVries. “When I go to Wikipedia – and I’m now surprised how often I do – I see the structure, referencing, graphics, and other elements that I took for granted before. It’s like you’ve had a peek backstage. I also want to fix things when I see them, and at the same time appreciate the tremendous work and collaboration that go into creating them.”

Now that DeVries has become a contributor to OER information on Wikipedia, he has reflected on the opportunities Wikipedia offers to invite new and diverse knowledge production.

“When we’re contributing we’re either filling in a gap in an existing set of categories, or opening up a new category that then invites others to broaden it out. It’s infinite,” says DeVries.

In fact, the contributions of DeVries and his colleagues in the OER Wiki Scholars course increased the recognition of OER and its benefits. Thanks to Wikipedia’s accessibility, the public can learn about the wide range of topics related to OER, even helping instructors join the movement by utilizing OER in their own classrooms or helping researchers see the value of publishing open materials.

“It promotes a larger and highly visible OER. However, it also makes visible that OER and their supporting communities still mirror social inequities and require intentional work to address representational and voice imbalances from a social justice perspective,” says DeVries.

DeVries is fascinated by how quickly anyone can make changes on Wikipedia and by how the community keeps those changes accurate.

“It was particularly gratifying both to link to related articles, and to edit other articles and link them back to the one I developed as part of the process,” says DeVries. “It’s quite amazing to step into an encyclopedia and fix things in the larger ecosystem without going through a bureaucracy. Obviously there’s a lot of trust that contributors will operate in good faith, but also where there is irresponsible behaviour there is a community that deals with it.”

DeVries encourages instructors to transform their students from consumers of information into active editors of articles through Wikipedia assignments, which they can do by joining Wiki Education’s Wikipedia Student Program. This way students can practice their research skills in an engaging learning environment while also publishing information that can have a global impact.

“This changes the learners’ role from only consuming information written by others to also creating, revising, collaborating and sharing it in some meaningful way in a communal setting,” says DeVries. “In the process they learn more about the idea of a commons, content licensing, and more profoundly the contested nature and ongoing development of knowledge.”

When considering how Wikipedia can continue to strive toward accurate and inclusive representation of knowledge, DeVries noted that there should be improvement toward recognition of contributions by and about women and the BIPOC community. In his classroom, he plans on emphasizing the importance of representation when incorporating Wikipedia assignments, using the knowledge he gained from the Wiki Scholars course.

To take a course similar to Irwin’s, please visit learn.wikiedu.org. Image credit: Worldneedpeace, CC BY 3.0, via Wikimedia Commons.

Digging deeper into Quarry

19:23, Thursday, 02 2021 September UTC

By Andrew Bogott and Joaquin Oltra Hernandez

As you may know from previous blog posts, in the Cloud Services team, we devote the majority of our time to maintaining existing services and infrastructure— working on security upgrades, paying down technical debt, and fixing bugs. Occasionally we get to add new features, but those features are often the result of months of invisible infrastructure work. The actual user-facing announcements are few and far between.

Lately, it feels like we may finally be catching up! Our team has grown, and over the last year, we have made some drastic tech-debt paydowns that have resulted in a much more sustainable tech stack. This year, in our annual planning meetings, we had the unusual experience of deciding what new things we wanted to do next, instead of just focusing on the things that we’ve known we needed to do for years.  The list of possible new features and use-cases is vast, and many of the things we really want to do will take months or years to implement. We need to invest a fair bit of time in road mapping, surveying, and talking to users to figure out what tasks are next and what changes are worth it. 

We have two areas of work that we already know are important and that have long, beautiful lists of triaged bugs and feature requests just waiting for developer attention: Quarry and PAWS.

File:Quarry-logo.svg, Husky, CC BY 4.0
File:PAWS.svg, User:Barquill0, CC BY-SA 3.0

Both projects were started more than 5 years ago as an effort to democratize programming by applying Wikipedia’s principles. They are widely used by developers at all levels, from beginners to experts, and make powerful programming tools available for everyone.

File:Quarry queries run per month 2014-08 to 2021-08.svg, JHernandez (WMF), CC BY 4.0

Quarry and PAWS have been almost entirely volunteer-maintained for years. Both were created by Yuvipanda, with contributions from many others, and both have been largely maintained by volunteers like Chico Venancio and Framawiki. The maintainers have done a fantastic job keeping the services up and running, and they deserve more support.

Quarry won the coin-toss with PAWS, so we are spending this quarter working our way down a list of long-standing Quarry tasks. A couple of new features, “Add a stop button to halt the query” and “Validate and autocomplete database names in the database input field”, have already made it to release. That road was a bit bumpy; we are now working on tidying up the CI and deployment pipeline, upgrading dependencies, and moving the back-end database to Trove so that future patches can roll out smoothly.

We’re optimistic that, by the new year, Quarry will be less rickety in every way, with fewer hanging queries, easier workflows, and, with luck, better collaboration tools. Just as importantly, everyone on the team will get a good long look at the insides of Quarry so the next time it misbehaves, we will face the problem with a great deal more context and interest.

If you want to hack on a straightforward UI and collaborate with a team that is actually paying attention to your work, now is the time! Quarry already has a good local test/dev environment. If you are a Python developer, you should claim a bug today! If you are a frequent Quarry user, keep an eye out for new features or bugs; your feedback is always welcome.

About this post

File:Threlkeld Quarry Steam Galas Countless Excavators.jpg, ARG_Flickr, CC BY 2.0

Wiki Loves Monuments in 2021

11:28, Tuesday, 31 2021 August UTC

It is time for Wiki Loves Monuments again, starting on 1 September! Wiki Loves Monuments is an annual photo competition celebrating built cultural heritage. It is organized by volunteers around the world, and up to the top ten photographs from each country are selected for an international finale. These monuments may be regular sights for some people, but thanks to Wiki Loves Monuments photographers, they will be better documented on Wikipedia and more accessible to everyone around the world, free of cost, forever. Photos submitted through WLM illustrate the more than 1.3 million monuments on Wikipedia and help more people around the world to learn about the history and national heritage of all participating countries.


WLM is for everyone! If you ever wondered how to start giving back to the wealth of knowledge on Wikipedia that all of us use on a daily basis, this is a great way to start. Everybody can join the competition by submitting a photograph of a nationally registered monument on Wikimedia Commons, following the instructions for each country. You can participate in as many national competitions as you wish. The national and international winning entries in WLM normally enjoy exposure by making national and international headlines.


Wiki Loves Monuments is built on three simple criteria. First, all photos are freely licensed, like all other contributions to Wikipedia and Wikimedia Commons. Giving the public permission to share these photos ensures that the results can remain widely available forever. Second, all photos must contain an identified monument, e.g., a building or art of historic significance – we want to know what heritage is in the photo so that we can actually use it. Each country maintains a list of registered historic sites that are eligible for the competition. Third, the photo must be uploaded in September or October, depending on the country. You are always welcome to contribute your photography to Wikimedia Commons, but photos uploaded outside the competition period for your country may not be considered for the competition. If you would like more details on Wiki Loves Monuments in your country, you can visit wikilovesmonuments.org/participate.


You can expect an international announcement on the winners in January 2022. Good luck!

A buggy history

10:24, Monday, 30 2021 August UTC
—I suppose you are an entomologist?—I said with a note of interrogation.
—Not quite so ambitious as that, sir. I should like to put my eyes on the individual entitled to that name! A society may call itself an Entomological Society, but the man who arrogates such a broad title as that to himself, in the present state of science, is a pretender, sir, a dilettante, an impostor! No man can be truly called an entomologist, sir; the subject is too vast for any single human intelligence to grasp.
The Poet at the Breakfast Table (1872) by Oliver Wendell Holmes, Sr. 
A collection of biographies
with surprising gaps (ex. A.D. Imms)
The history of Indian interest in insects has been approached by many writers and there are several bits and pieces available in journals and various insights distributed across books. There are numerous ways of looking at how people viewed insects over time. One of these is a collection of biographies, some of which are uncited verbatim accounts from obituaries (and not even within quotation marks). This collation is by B.R. Subba Rao, who also provides a few historical threads to tie together the biographies. Keeping Indian expectations in view, both Subba Rao and the agricultural entomologist M.A. Husain play to the crowd in their early histories. Husain wrote in pre-Independence times, when there was a need for Indians to assert themselves before their colonial masters. They begin with mentions of insects in ancient Indian texts and, as can be expected, there are mentions of honey, shellac, bees, ants, and a few nuisance insects. Husain takes the fact that the term Satpada षट्पद or six-legs existed in the 1st century Amarakosa to make the claim that Indians were far ahead of their time because Latreille's Hexapoda, the supposed analogy, was proposed only in 1825. Such one-up-manship (or quests for past superiority in the face of current backwardness?) misses the fact that science is not just about terms but also about structures, and one can only assume that these authors failed to find the development of such structures in the ancient texts that they examined. Cedric Dover, with his part-Indian and British ancestry, interestingly mentions the Sanskrit literature but notes that he is not competent enough to examine the subject carefully. The identification of species in old texts also leaves one wondering about the accuracy of translations. For instance, K.N. Dave translates a verse from the Atharva-veda and suggests an early date for knowledge on shellac. This interpretation looks dubious and, sure enough, Dave has been critiqued by an entomologist, Mahdihassan.
One organism mentioned in ancient texts is the indragopa (Indra's cowherd), which supposedly appears after the rains. Some Sanskrit scholars have, remarkably enough, identified it, with a confidence that no coccidologist ever had, as the cochineal insect (the species Dactylopius coccus is South American!), while others identify it as a lac insect, a firefly(!) or as Trombidium (red velvet mites) - the last for matching the blood-red colour mentioned in a text attributed to Susrutha. To be fair, ambiguities in translation are not limited to those dealing with Indian writing. Dikairon (Δικαιρον), supposedly a highly-valued and potent poison from India, was mentioned in the work Indika by Ctesias, 398 - 397 BC. One writer said it was the droppings of a bird. Valentine Ball thought it was derived from a scarab beetle. Jeffrey Lockwood claimed that it came from the rove beetles Paederus sp. And finally a Spanish scholar states that all this was a gross misunderstanding and that Dikairon was not a poison, and - believe it or not - was a masticated mix of betel leaves, arecanut, and lime!
 
One gets a far more reliable idea of ancient knowledge and traditions from practitioners, forest dwellers, the traditional honey-harvesting tribes, and similar people that have been gathering materials such as shellac and beeswax. Unfortunately, many of these traditions and their practitioners are threatened by modern laws, economics, and cultural prejudice. These practitioners are being driven out of the forests where they live, and their knowledge was hardly ever captured in writing. The writers of the ancient Sanskrit texts were probably associated with temple-towns and other semi-urban clusters and it seems like the knowledge of forest dwellers was never considered merit-worthy by the book writing class of that period.

A more meaningful overview of entomology may be gained by reading and synthesizing a large number of historical bits, and there are a growing number of such pieces. A 1973 book published by the Annual Reviews Inc. should be of some interest. I have appended a selection of sources that are useful in piecing together a historic view of entomology in India. It helps however to have a broad skeleton on which to attach these bits and minutiae. Here, there are truly verbose and terminology-filled systems developed by historians of science (for example, see ANT). I prefer an approach that is free of a jargon overload or the need to cite French intellectuals. The growth of entomology can be examined along three lines: cataloguing, the collection of artefacts and the assignment of names; communication and vocabulary-building, social actions involving the formation of groups of interested people who work together building common structure with the aid of fixing records in journals often managed beyond individual lifetimes by scholarly societies; and pattern-finding, a stage when hypotheses are made and predictions tested. I like to think that anyone learning entomology also goes through these activities, often in this sequence. Professionalization makes it easier for people to get to the later stages. This process is aided by having comprehensive texts, keys, identification guides and manuals, systems of collections and curators. The skills involved in the production - ways to prepare specimens, observe, illustrate, or describe - are often not captured by the books themselves and that is where institutions play (or ought to play) an important role.

Cataloguing

The cataloguing phase of knowledge gathering, especially of the (larger and more conspicuous) insect species of India grew rapidly thanks to the craze for natural history cabinets of the wealthy (made socially meritorious by the idea that appreciating the works of the Creator was as good as attending church)  in Britain and Europe and their ability to tap into networks of collectors working within the colonial enterprise. The cataloguing phase can be divided into the non-scientific cabinet-of-curiosity style especially followed before Darwin and the more scientific forms. The idea that insects could be preserved by drying and kept for reference by pinning, [See Barnard 2018] the system of binomial names, the idea of designating type specimens that could be inspected by anyone describing new species, the system of priority in assigning names were some of the innovations and cultural rules created to aid cataloguing. These rules were enforced by scholarly societies, their members (which would later lead to such things as codes of nomenclature suggested by rule makers like Strickland, now dealt with by committees that oversee the  ICZN Code) and their journals. It would be wrong to assume that the cataloguing phase is purely historic and no longer needed. It is a phase that is constantly involved in the creation of new knowledge. Labels, catalogues, and referencing whether in science or librarianship are essential for all subsequent work to be discovered and are essential to science based on building on the work of others, climbing the shoulders of giants to see further. Cataloguing was probably what the physicists derided as "stamp-collecting".

Communication and vocabulary building

The other phase involves social activities, the creation of specialist language, groups, and "culture". The methods and tools adopted by specialists also help in producing associations and the identification of boundaries that could spawn new associations. The formation of groups of people based on interests is something that ethnographers and sociologists have examined in the context of science. Textbooks, taxonomic monographs, and major syntheses also help in building community - they make it possible for new entrants to rapidly move on to joining the earlier formed groups of experts. Whereas some of the early learned societies were spawned by people with wealth and leisure, some of the later societies have had other economic forces in their support.

Like species, interest groups too specialize and split to cover more specific niches, such as those that deal with applied areas such as agriculture, medicine, veterinary science and forensics. There can also be interest in behaviour and evolution, which, though having applications, often do not find economic support.

Pattern finding
Eleanor Ormerod, an unexpected influence
in the rise of economic entomology in India

The pattern-finding phase, when reached, allows a field to become professional - with paid services offered by practitioners. It is the phase in which science flexes its muscle and specialists gain social status and are able to make livelihoods out of their interest. Lefroy (1904) cites economic entomology in India as beginning with E.C. Cotes [Cotes' career in entomology was cut short by his marriage to the famous Canadian journalist Sara Duncan in 1889 and he shifted to writing] in the Indian Museum in 1888. But he surprisingly does not mention any earlier attempts, and one finds that Edward Balfour, that encyclopaedic surgeon of Madras, collated a list of insect pests in 1887 and drew inspiration from Eleanor Ormerod, who hints at the idea of getting government support, noting that it would cost very little given that she herself worked with no remuneration to provide a service for agriculture in England. Her letters were also forwarded to the Secretary of State for India and it is quite possible that Cotes' appointment was a direct result.

As can be imagined, economics, society, and the way science is supported - royal patronage, family, state, "free markets", crowd-sourcing, or mixes of these - impact the way an individual or a field progresses. Entomology was among the first fields of zoology that managed to gain economic value with the possibility of paid employment. David Lack, who later became an influential ornithologist, was wisely guided by his father to pursue entomology as it was the only field of zoology with jobs. Lack however found his apprenticeship (in Germany, 1929!) involving pinning specimens "extremely boring".

Indian reflections on the history of entomology

Kunhikannan died at the rather young age of 47
A rather interesting analysis of Indian science is made by the first native Indian entomologist, with the official title of "entomologist" in the state of Mysore - K. Kunhikannan. Kunhikannan was deputed to pursue a Ph.D. at Stanford (for some unknown reason two pre-Independence Indian entomologists trained at Stanford rather than in England - see postscript) through his superior Leslie Coleman. At Stanford, Kunhikannan gave a talk on Science in India. He noted in that 1923 talk:
In the field of natural sciences the Hindus did not make any progress. The classifications of animals and plants are very crude. It seems to me possible that this singular lack of interest in this branch of knowledge was due to the love of animal life. It is difficult for Westerners to realise how deep it is among Indians. The observant traveller will come across people trailing sugar as they walk along streets so that ants may have a supply, and there are priests in certain sects who veil that face while reading sacred books that they may avoid drawing in with their breath and killing any small unwary insects. [Note: Salim Ali expressed a similar view ]
He then examines science sponsored by state institutions, by universities and then by individuals. About the last he writes:
Though I deal with it last it is the first in importance. Under it has to be included all the work done by individuals who are not in Government employment or who being government servants devote their leisure hours to science. A number of missionaries come under this category. They have done considerable work mainly in the natural sciences. There are also medical men who devote their leisure hours to science. The discovery of the transmission of malaria was made not during the course of Government work. These men have not received much encouragement for research or reward for research, but they deserve the highest praise. European officials in other walks of life have made signal contributions to science. The fascinating volumes of E. H. Aitken and Douglas Dewar are the result of observations made in the field of natural history in the course of official duties. Men like these have formed themselves into an association, and a journal is published by the Bombay Natural History Association [sic], in which valuable observations are recorded from time to time. That publication has been running for over a quarter of a century, and its volumes are a mine of interesting information with regard to the natural history of India.
This then is a brief survey of the work done in India. As you will see it is very little, regard being had to the extent of the country and the size of her population. I have tried to explain why Indians' contribution is as yet so little, how education has been defective and how opportunities have been few. Men do not go after scientific research when reward is so little and facilities so few. But there are those who will say that science must be pursued for its own sake. That view is narrow and does not take into account the origin and course of scientific research. Men began to pursue science for the sake of material progress. The Arab alchemists started chemistry in the hope of discovering a method of making gold. So it has been all along and even now in the 20th century the cry is often heard that scientific research is pursued with too little regard for its immediate usefulness to man. The passion for science for its own sake has developed largely as a result of the enormous growth of each of the sciences beyond the grasp of individual minds so that a division between pure and applied science has become necessary. The charge therefore that Indians have failed to pursue science for its own sake is not justified. Science flourishes where the application of its results makes possible the advancement of the individual and the community as a whole. It requires a leisured class free from anxieties of obtaining livelihood or capable of appreciating the value of scientific work. Such a class does not exist in India. The leisured classes in India are not yet educated sufficiently to honour scientific men.
It is interesting that leisure is noted as important for scientific advance. Edward Balfour also commented that Indians were "too close to subsistence to reflect accurately on their environment!" (apparently in The Vydian and the Hakim, what do they know of medicine? (1875), which unfortunately is not available online).

Kunhikannan may be among the few Indian scientists who dabbled in cultural history and political theorizing. He wrote two rather interesting books, The West (1927) and A Civilization at Bay (1931, posthumously published), which defended Indian cultural norms while also suggesting areas for reform. While reading these works one has to remind oneself that he was working under Europeans and may not have been able to discuss such topics with many Indians. An anonymous writer who penned a prefatory memoir of his life in his posthumously published book notes that he was reserved and had only a small number of people to talk to outside of his professional work. Kunhikannan came from the Thiyya community, which initially preferred English rule to that of natives but changed its mind in later times. Kunhikannan's beliefs also appear to follow the same trend.

Entomologists meeting at Pusa in 1919
Third row: C.C. Ghosh (assistant entomologist), Ram Saran ("field man"), Gupta, P.V. Isaac, Y. Ramachandra Rao, Afzal Husain, Ojha, A. Haq
Second row: M. Zaharuddin, C.S. Misra, D. Naoroji, Harchand Singh, G.R. Dutt (Personal Assistant to the Imperial Entomologist), E.S. David (Entomological Assistant, United Provinces), K. Kunhi Kannan, Ramrao S. Kasergode (Assistant Professor of Entomology, Poona), J.L.Khare (lecturer in entomology, Nagpur), T.N. Jhaveri (assistant entomologist, Bombay), V.G.Deshpande, R. Madhavan Pillai (Entomological Assistant, Travancore), Patel, Ahmad Mujtaba (head fieldman), P.C. Sen
First row: Capt. Froilano de Mello, W Robertson-Brown (agricultural officer, NWFP), S. Higginbotham, C.M. Inglis, C.F.C. Beeson, Dr Lewis Henry Gough (entomologist in Egypt), Bainbrigge Fletcher, Bentley, Senior-White, T.V. Rama Krishna Ayyar, C.M. Hutchinson, Andrews, H.L.Dutt


Entomologists meeting at Pusa in 1923
Fifth row (standing) Mukerjee, G.D.Ojha, Bashir, Torabaz Khan, D.P. Singh
Fourth row (standing) M.O.T. Iyengar (a malariologist), R.N. Singh, S. Sultan Ahmad, G.D. Misra, Sharma, Ahmad Mujtaba, Mohammad Shaffi
Third row (standing) Rao Sahib Y Rama Chandra Rao, D Naoroji, G.R.Dutt, Rai Bahadur C.S. Misra, SCJ Bennett (bacteriologist, Muktesar), P.V. Isaac, T.M. Timoney, Harchand Singh, S.K.Sen
Second row (seated) Mr M. Afzal Husain, Major RWG Hingston, Dr C F C Beeson, T. Bainbrigge Fletcher, P.B. Richards, J.T. Edwards, Major J.A. Sinton
First row (seated) Rai Sahib PN Das, B B Bose, Ram Saran, R.V. Pillai, M.B. Menon, V.R. Phadke (veterinary college, Bombay)
 

Note: As usual, these notes are spin-offs from researching and writing Wikipedia entries. It is remarkable that even some people in high offices, such as P.V. Isaac, the last Imperial Entomologist, and grandfather of noted writer Arundhati Roy, are largely unknown (except as the near-fictional Pappachi in Roy's God of Small Things)

Further reading
An index to entomologists who worked in India or described a significant number of species from India - with links to Wikipedia (where possible - the gap in coverage of entomologists in general is large)
(woefully incomplete - feel free to let me know of additional candidates)

Carl Linnaeus - Johan Christian Fabricius - Edward Donovan - John Gerard Koenig - John Obadiah Westwood - Frederick William Hope - George Alexander James Rothney - Thomas de Grey Walsingham - Henry John Elwes - Victor Motschulsky - Charles Swinhoe - John William Yerbury - Edward Yerbury Watson - Peter Cameron - Charles George Nurse - H.C. Tytler - Arthur Henry Eyre Mosse - W.H. Evans - Frederic Moore - John Henry Leech - Charles Augustus de Niceville - Thomas Nelson Annandale - R.C. Wroughton - T.R.D. Bell - Francis Buchanan-Hamilton - James Wood-Mason - Frederic Charles Fraser - R.W. Hingston - Auguste Forel - James Davidson - E.H. Aitken - O.C. Ollenbach - Frank Hannyngton - Martin Ephraim Mosley - Hamilton J. Druce - Thomas Vincent Campbell - Gilbert Edward James Nixon - Malcolm Cameron - G.F. Hampson - Martin Jacoby - W.F. Kirby - W.L. Distant - C.T. Bingham - G.J. Arrow - Claude Morley - Malcolm Burr - Samarendra Maulik - Guy Marshall - C. Brooke Worth - Kumar Krishna - M.O.T. Iyengar - K. Kunhikannan - Cedric Dover

PS: Thanks to Prof C.A. Viraktamath, I became aware of a new book: Gunathilagaraj, K.; Chitra, N.; Kuttalam, S.; Ramaraju, K. (2018). Dr. T.V. Ramakrishna Ayyar: The Entomologist. Coimbatore: Tamil Nadu Agricultural University. It suggests that TVRA went to Stanford at the suggestion of Kunhikannan.

    Tech News issue #35, 2021 (August 30, 2021)

    00:00, Monday, 30 2021 August UTC

    weeklyOSM 579

    09:06, Sunday, 29 2021 August UTC

    17/08/2021-23/08/2021


    How old is that building in Penza (Russia)? [1] map data © OpenStreetMap contributors | © 2021 how-old-is-this.house, data sources: Росреестр, Минкультуры, «МинЖКХ», Викигид, Викиданные, Викимапия

    Mapping campaigns

    Mapping

    • Anim Mouse noticed a changeset that spans four continents and contains only a single change. Although it was pointed out in the discussion that a changeset should contain as small an area as possible, sometimes it is not possible when the element itself is very large, in this case France, including its overseas territories and départements.
    • Bora Can wrote (tr) about his efforts to clear up divergent district boundaries in Turkey.
    • Russ Garrett summarised his thoughts on how various recently released government open datasets might be used to improve address mapping in the UK.

    Community

    • State of the Map Africa 2021 is happening online from 19 to 21 November. Listen to the July episode of the Geospatially podcast, featuring Geoffrey Kateregga, who discussed SotM Africa and how it will bring OSM communities in Africa together.
    • Manoj, Ark, and Awantika took a look at the history of OSM Kerala and its continued efforts to strengthen the use of OSM in Kerala (India) for disaster response and community development.
    • One has learnt that Christoph Hormann’s ‘couple of words’ tend to be a bit longer than the phrase implies. His theme discussed this time is the implications of the draft OSMF Etiquette Guidelines. Although a DeepL translation to English is provided, he emphasises that this will lack nuances of the original German text: an issue which he feels affects any contributor needing to use English when it is not their native language.

    OpenStreetMap Foundation

    • weeklyOSM strives to notify you, our readers, about upcoming OSMF board meetings. Unfortunately, at present announcements of upcoming meetings are being made within a week of the meeting, so it is not possible for us to announce them accordingly. If you are interested in the public board meetings, you should follow the OSMF mailing list, subscribe to the Board’s iCalendar feed, or keep an eye out for them in our Upcoming Events section.
    • Former OSMF Board member Peter Barth expressed his concern about the attribution of OSM on maps provided through Mapbox services.

    Events

    • The Community Working Group (Humanitarian Open Mapping) organised by HOT invites local OSM community organisers, leaders and members (new and old) to come together and share their tips, tricks and challenges related to starting and sustaining local OSM communities. Two Zoom meetings will be held on Friday 3 September at 06:00 UTC and 16:00 UTC. Register if you wish to participate.

    Humanitarian OSM

    • HeiGIT is working on routing that incorporates current data from disaster areas, such as areas subject to flooding or destroyed streets. To help with the current flood disaster in Germany and neighbouring countries, HeiGIT has taken up-to-date flood data derived from satellite images by the European Copernicus Emergency Mapping Service (EMS) and integrated them into the routing services developed by HeiGIT.
    • In mid-August HOT, in collaboration with local communities in the respective regions, activated two disaster response campaigns: Haiti Earthquake 2021 and Mediterranean Fires 2021. You can check the updates here and see other active responses.
    • The Humanitarian OpenStreetMap Team (HOT) Summit 2021 Call for Proposals has been extended. The new deadline is Tuesday 31 August.
    • The finals of the 2021 HOT Staff OSM FIGHT Heavyweight Championship were held at a recent HOT all staff meeting.

    Maps

    • [1] Ever wondered how old that building in Penza (Russia) is? Then this is the website for you. In addition to OSM, data from Wikivoyage, Wikidata and Wikimapia, among others, have been used.
    • Nicolelaine is still having trouble getting their uMap of world post boxes to work, due to Overpass timeouts. Any help would be appreciated.
    • PlayzinhoAgro shared (pt) the results of a survey of online map use and knowledge of OpenStreetMap in Brazil.

    Software

    • Recently we covered the KGeoTag application. There is a similar application called geotagging, a tool for manipulating the geographical information stored in JPEG images (Exif metadata). The geographical information could be obtained from GPX or manually by drag-and-drop. The application is packaged as an EXE or as a flatpak. Feel free to contribute translations or code.
    • Trufi’s OSM-powered bike app will put bikes at the centre of multi-modal journeys like no other app. The app will launch in Hamburg at the ITS World Conference.

    Programming

    • The final reports for projects that were part of the 2021 Google Summer of Code are in:
      • Vuong Ho’s opening hours tag evaluator
      • Zohaib Ansari’s stand-alone library to display multiple versions of any selected OSM element and allow comparison between different versions
      • Antonin Jolivat’s Nominatim QA reports extraction tool.

    Did you know …

    • … the Panorama of the River Thames? This is a website which shows panoramic views of parts of the River Thames through Greater London, both from watercolours of 1829 and photographs from the present day.
    • … that you can use OSM Smart Menu in Google Chrome or Mozilla Firefox to open links related to OpenStreetMap, based on parameters from the current page? It helps OpenStreetMap contributors to easily switch between different maps and analysis tools.
    • … that LearnOSM is dedicated to helping people learn how to map in OpenStreetMap? The guides start from the basics of what OSM is and how to get mapping, through to advanced topics such as osm2pgsql and manipulating data with Osmosis.
    • … that you can check out who is the busiest on MapRoulette by viewing the leaderboard?

    OSM in the media

    • Woxx carried a couple of articles on OpenStreetMap. The first (de) is an introduction to OpenStreetMap and features an interview with OSMF Board member Guillaume Rischard. The second (de) describes to readers a number of ways that they can also participate in OpenStreetMap.

    Other “geo” things

    • Médecins Sans Frontières (MSF) is seeking a Geographical Information Systems Advisor, preferably based in Amsterdam, Berlin, London, or Nairobi. Other locations where MSF has offices will also be considered.

    Upcoming Events

    Where | What | When | Country
    潍坊市 | OpenStreetMap(中国大陆)第一次线下聚会 | 2021-08-20 – 2021-08-30 |
    Autrans-Méaudre en Vercors | WikiCamp L’Escandille | 2021-08-28 – 2021-08-29 |
    Bogotá Distrito Capital | Resolvamos notas de Colombia creadas en OpenStreetMap | 2021-08-28 |
     | OSMUS Slack Summer mapping party | 2021-08-28 – 2021-08-29 |
    Hlavní město Praha | Missing Maps CZ Mapathon 2021 #5 | 2021-08-31 | cz
    San Jose | South Bay Map Night | 2021-09-03 |
    Kurmin Musa | Community Webinar: Local OSM community building: Tips, tricks and challenges | 2021-09-03 | ng
    Fortaleza | Encontro de usuários OSM do Ceará, Brasil. | 2021-09-04 |
    Bogotá Distrito Capital | Resolvamos notas de Colombia creadas en OpenStreetMap | 2021-09-04 |
    京田辺市 | 京都!街歩き!マッピングパーティ:第26回 一休寺 | 2021-09-04 | jp
     | OSM Africa Monthly Mapathon: Map Malawi | 2021-09-04 – 2021-10-04 |
    Greater London | Missing Maps London Mapathon | 2021-09-07 |
    Landau an der Isar | Virtuelles Niederbayern-Treffen | 2021-09-07 |
    Stuttgart | Stuttgarter Stammtisch (Online) | 2021-09-07 |
     | OpenStreetMap Michigan Meetup | 2021-09-09 |
    Nordrhein-Westfalen | OSM-Treffen Bochum (September) | 2021-09-09 |
    Arlon | Réunion des contributeurs OpenStreetMap, Arlon | 2021-09-13 |
    Hamburg | Hamburger Mappertreffen | 2021-09-14 |
    Karlsruhe | Karlsruhe Hack Weekend | 2021-09-17 – 2021-09-19 |

    Note:
    If you would like to see your event here, please add it to the OSM calendar. Only data that is there will appear in weeklyOSM.

    This weeklyOSM was produced by Nordpfeil, SK53, Sammyhawkrad, TheSwavu, derFred, s8321414.

    Planet Wikimedia is made from the blogs by Wikimedia contributors. The opinions it contains are those of the contributor.
