There are several proposals for how to do categorization in Wikimedia sites (and, by extension, for other MediaWiki sites). This page discusses some possible implementations. See Categorization requirements for some abstract requirements for categorization in MediaWiki.
Ideas on improvements
I think that categories in Wikipedia are getting rather confusing, and they get more confusing when you mix them with the lists. I see many categories like Brazilian Political figures or Dutch painters of the XIX century.
A simple way to clear those up and make everything more useful would be to add an intersection feature to categories. For example, I could put Van Gogh in many different categories, like famous XX century people, Painters, Dutch guys and Self mutilated crazy nutballs. Then when I go to Category:Painters there would be automatically generated links like
- famous XX century people
- Dutch guys
Then when I click the first I would go to a page, something like
which would show all articles that are in both painters AND XX century. This way, you can categorize Wikipedia articles in a way that:
- people could do more things: someone new to Wikipedia could discover any relation between articles
- computers can understand better: I can program a computer to retrieve all information about political leaders from Bulgaria even if no one created that topic.
- Wikipedians' work would be minimized. When you put Che Guevara in Argentine and Communist and political leader and revolutionary you are in fact making many new categories. He will be side by side with Marx on one page, and with Maradona on another. You don't need to make a category:argentine_revolutionaries | category:argentinians | category:comunist_revolutionaries | category:argentine_famous_asthmatics and so on.
Well, that's the idea. Comments? --Alexandre Van de Sande 18:58, 21 Aug 2004 (UTC)
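The intersection idea above can be sketched in a few lines of Python. This is only an illustration, not how MediaWiki stores categories: the sample data and the function names `build_index`, `related_categories` and `intersection` are all made up for the example. It shows how the "automatically generated links" on a category page and the intersection page itself could both be derived from plain per-article category lists.

```python
from collections import defaultdict

# Hypothetical sample data: article title -> set of categories it was put in.
ARTICLE_CATEGORIES = {
    "Che Guevara": {"Argentines", "Communists", "Political leaders", "Revolutionaries"},
    "Karl Marx": {"Communists"},
    "Diego Maradona": {"Argentines"},
}


def build_index(article_categories):
    """Invert the mapping: category -> set of article titles."""
    index = defaultdict(set)
    for article, categories in article_categories.items():
        for category in categories:
            index[category].add(article)
    return index


def related_categories(index, article_categories, category):
    """Other categories shared by at least one article in `category`.

    These would be the automatically generated links on the category page.
    """
    related = set()
    for article in index.get(category, set()):
        related |= article_categories[article]
    related.discard(category)
    return sorted(related)


def intersection(index, *categories):
    """Articles that belong to every one of the given categories."""
    sets = [index.get(category, set()) for category in categories]
    return sorted(set.intersection(*sets)) if sets else []


index = build_index(ARTICLE_CATEGORIES)
print(related_categories(index, ARTICLE_CATEGORIES, "Argentines"))
print(intersection(index, "Argentines", "Revolutionaries"))
```

Nobody has to create an "Argentine revolutionaries" category here: it falls out of the two simple categories at query time.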
- I agree. I think this feature is badly needed, both for the Wikipedias and especially for the Wiktionaries. Suppose you're writing a poem in Spanish and need a word that the metre of your poem requires to be trisyllabic, paroxytone and rhyming in "azo", and that additionally must not have to do with blows (many Spanish words rhyming in "azo" refer to blows, but suppose a word dealing with blows doesn't fit in your poem for thematic or stylistic reasons, so you need a word that _doesn't_ belong in the semantic category of words referring to blows). You could do such a search very easily if there were a search engine for categories where one could enter the categories to be matched (positive search) and the categories to be shunned (negative search), and that would generate a customized listing of all the entries that belong to all of the required positive categories and to none of the specified negative categories. So, for the given example, you would go to the Category Search form on the Spanish Wiktionary, select the categories "Español", "Trisílaba", "Paroxítona" and "Rima:Azo" in the Positive Search field and the category "Golpes" in the Negative Search field, and get a list of all the entries in the database matching those criteria. Without this handy feature, you would have to perform a manual search under one of the categories, say under "Rima:Azo", checking each entry in that category one by one to see whether it also belongs to the other required categories and does not belong to the category to be avoided.
I'm not sure, but I think it wouldn't be very difficult to implement such a search engine for categories, which would, on the one hand, allow for more powerful and flexible searching than just browsing through the listings in individual categories, and on the other, drastically reduce the number of needed categories, since subcategories of the kind you have mentioned wouldn't be needed: the intersection of "20th Century", "Painter" and "French" would be performed by the search engine instead of having to be manually produced by creating a myriad of intersection categories such as "20th Century French Painters" or "Spanish Paroxytone Trisyllabic Words Rhyming In -Azo And Not Having To Do With Blows". Uaxuctum 17:58, 8 Dec 2004 (UTC)
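A minimal sketch of the positive/negative category search described above, in Python. The word list, its category assignments and the function name `category_search` are assumptions made for illustration only; a real implementation would query the wiki's database rather than an in-memory dictionary.

```python
# Hypothetical sample data: Spanish Wiktionary entry -> set of categories.
WORDS = {
    "abrazo": {"Español", "Trisílaba", "Paroxítona", "Rima:Azo"},
    "pedazo": {"Español", "Trisílaba", "Paroxítona", "Rima:Azo"},
    "regazo": {"Español", "Trisílaba", "Paroxítona", "Rima:Azo"},
    "codazo": {"Español", "Trisílaba", "Paroxítona", "Rima:Azo", "Golpes"},
    "portazo": {"Español", "Trisílaba", "Paroxítona", "Rima:Azo", "Golpes"},
    "brazo": {"Español", "Bisílaba", "Paroxítona", "Rima:Azo"},
}


def category_search(entries, positive, negative=()):
    """Return entries in ALL positive categories and in NONE of the negative ones."""
    required = set(positive)
    excluded = set(negative)
    return sorted(
        title
        for title, categories in entries.items()
        if required <= categories and not (excluded & categories)
    )


matches = category_search(
    WORDS,
    positive=["Español", "Trisílaba", "Paroxítona", "Rima:Azo"],
    negative=["Golpes"],
)
print(matches)
```

The positive side is a subset test and the negative side is an empty-intersection test, so no combined category ever has to exist as a page.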
This comment is 13 years old and this is still the best way to categorize stuff. On the Internet, what is the point of having a hierarchical organisation? The core of web use is transversal linking. Intersecting stuff like [Bridge][Wooden construction][Country:UK] or [Pedestrian bridge][Town:Paris] or [Architecture:Castel][Region:France-Burgundy] would be much more efficient for searching and much easier for page categorization, as you would not have to search through an unknown hierarchy to establish categories. Some predefined characteristics should be set, especially for geographical stuff, such as [Continent:xxx] [Sea:zz] [Town:ttt] [Region:fff], which would greatly help searching and categorization. Other predefined sets can be made, e.g. for people with [Person:Politic/Singer/Musician/Scientist/Writer/Painter/Sportmen/BusinessMen/etc.], knowing that one person could belong to multiple categories. Each type of article may have its own predefined values: [Architecture:Building/House/Mansion/Castel/Fortress/...]. That would also greatly help internationalisation/localization, which is basically nonexistent for categories. Such a categorization rework may have a significant impact on the computing resources required, so the required computing power should be evaluated, but the present system is definitely not appropriate. PRZ (talk) 14:45, 5 June 2018 (UTC)
On the same topic, see the page https://en.wikipedia.org/wiki/Wikipedia:Category_intersection PRZ (talk) 09:58, 9 June 2018 (UTC)
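As a rough illustration of the bracketed tags suggested above, the following Python sketch parses strings such as [Pedestrian bridge][Town:Paris] into plain tags and field-value pairs, then finds pages carrying every tag in a query. The bracket syntax is taken from the comment; the page names and the functions `parse_tags` and `find_pages` are hypothetical and not part of any existing MediaWiki feature.

```python
import re

# One tag is whatever sits between a pair of square brackets.
TAG_PATTERN = re.compile(r"\[([^\[\]]+)\]")


def parse_tags(text):
    """Parse '[Bridge][Country:UK]' into {(None, 'Bridge'), ('Country', 'UK')}."""
    tags = set()
    for raw in TAG_PATTERN.findall(text):
        field, separator, value = raw.partition(":")
        if separator:
            tags.add((field.strip(), value.strip()))  # predefined field with a value
        else:
            tags.add((None, raw.strip()))  # plain tag without a field
    return tags


def find_pages(pages, query):
    """Titles of pages whose tags include every tag in the query string."""
    wanted = parse_tags(query)
    return sorted(title for title, tags in pages.items() if wanted <= tags)


# Hypothetical pages, tagged with the examples from the comment above.
PAGES = {
    "Hypothetical page A": parse_tags("[Bridge][Wooden construction][Country:UK]"),
    "Hypothetical page B": parse_tags("[Pedestrian bridge][Town:Paris]"),
}

print(find_pages(PAGES, "[Country:UK][Bridge]"))
```

Because the tags are unordered and flat, the editor does not need to know any hierarchy to tag a page, and the order of tags in a query does not matter.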
Magnus Manske wrote an automated categorization tool using a "Category:" pseudo-namespace. It was tested on test.wikipedia.org and some bugs were ironed out.
It's currently mostly commented out, due to some merge problems in the MediaWiki code.
- Update: I fixed it, but didn't do extensive testing. --Magnus Manske 10:46, 2 Mar 2004 (UTC)
Some similar proposals:
Categorization with field-value pairs
(Magnus Manske proposed Technical categories in Wikipedia, which is pretty close to the same idea, except with colons (":") instead of equals signs ("=").)
Require all new articles to be placed in at least one category, and allow them to be placed in multiple categories. Categories can be displayed in a drop-down list for the editor to choose from, and new ones can be created during the article creation process. Categories can also be arranged hierarchically to help keep the lists to a reasonable size. This idea was developed to help organize wikibate (wiki-debate), as a way of avoiding duplicate articles due to slightly different wording of a proposition ("Wikipedia is better than Britannica" vs. "Wikipedia is the best encyclopedia"). AdamRetchless 02:42, 15 Mar 2005 (UTC)
IEG proposal on category systems in WMF wikis
I have submitted a proposal for an Individual Engagement Grant for the first phase of a project looking at the category systems in Wikimedia wikis. In this first phase I will research the nature of the English Wikipedia's category system, as the first step in designing ways to optimize category systems throughout WMF wikis. In later phases, I plan to
- Research how readers and editors utilize the category system in the English Wikipedia.
- Investigate the category systems in other language Wikipedias and in other WMF projects.
- Explore the value and feasibility of using Wikidata as the basis for the category system across WMF wikis. If deemed appropriate by the community, work with the community to develop and implement this.
- Utilize user-centered design methodologies to prototype various enhancements to the category system to improve the user experience. If deemed appropriate by the community, work with the community to develop and implement such enhancements.