This page holds discussions relating to the future design of the UnityWiki implementation; the UnityWiki page holds more general information, including feature requests, and UnityWikiBugs is the place to go if something is going wrong.
<<< One problem with using anchors as headers: maybe we need something like ##[This Syntax]. -- RobHague >>>
Modularity And Plugins
UnityWiki is a significant improvement over PikiPiki in terms of features. However, with each additional feature, the code becomes more and more labyrinthine. Along this road lies MoinMoin, and avoiding that sort of complexity was the main reason for choosing PikiPiki in the first place.
<<< I don't actually have too much of a problem with MoinMoin - it's a very useful piece of software. However, it's also very complex, in my opinion far more than it needs to be, and I'm convinced we can do better by coming up with a decent architecture early on. -- RobHague >>>
Hence, the thrust of the present reworking of UnityWiki is to introduce a more modular framework. Ideally, as much functionality as possible should be removed from the "core" script and put into modules. NooParser, the redesigned Wiki notation parser, is the first step towards this.
The basis of the framework, as used in NooParser, is that an extension defines one or more Python classes that subclass particular base classes defined by the framework. When run, the script imports all of these classes into the top-level namespace (or possibly some other namespace, such as "extensions"), and then goes through that namespace searching for subclasses of the known classes and performing any action necessary to register them. Hence, the extension module doesn't have to provide any manifest of its contents.
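As a rough sketch of this registration step (the names `register_extensions`, `registry`, and `BoldNode` are illustrative, not part of the actual code):

```python
import inspect

class WikiNode:
    """Stand-in for a framework base class that extensions subclass."""
    pass

def register_extensions(namespace, base_classes, registry):
    """Scan a namespace for subclasses of the known base classes and
    register them; extension modules need no explicit manifest."""
    for name, obj in namespace.items():
        if not inspect.isclass(obj):
            continue
        for base in base_classes:
            if issubclass(obj, base) and obj is not base:
                registry.setdefault(base.__name__, []).append(obj)

# A hypothetical extension module might simply have defined:
class BoldNode(WikiNode):
    pass

registry = {}
register_extensions(globals(), [WikiNode], registry)
# BoldNode is now registered without ever being listed anywhere.
```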
The following is a provisional list of the extension classes defined by the framework:
WikiNode: A page is represented as a tree of instances of WikiNode subclasses, representing markup and text (e.g., "bold", "table cell"). Each subclass defines, in a class attribute, a regexp matching all tokens that it uses as markup, and a method to parse the stream of tokens following the first such token encountered (which will probably return before the stream is exhausted). A subclass, MacroNode, is designed to make it easy to define new macros.
ActionExtension: Defines an action that can be performed on a page (the main default action is "edit"). If, for example, you wanted to produce an RSS feed for RecentChanges, it'd be an ActionExtension.
EditorExtension: Provides an extra button on the editor page, for, e.g., a spell-checker. Not sure how this should work.
PageExtension: Provides hooks to allow extension-writers to veto, or react to, changes in a page's content, and possibly modify the way pages are loaded or saved (e.g., putting them in a database, or updating a cache of metadata)
PageFurnitureExtension: Provides "head" and "foot" methods to insert HTML into the result before and after the page proper. These will be called in a FIFO way to allow correct nesting. <<< Can anyone think of a better name? Could this functionality be subsumed into the vague-as-yet PageExtension? -- RobHague>>>
<<<How about LookAndFeelExtension or ThemeExtension?
Also, perhaps instead of PageExtension, call it EventExtension, since the description of PageExtension seems to be "things which trigger on people writing to the wiki". -- SimonBooth >>>
<<< Thinking about it, the thing previously described as PageExtension might be better labelled PageStoreExtension, as in the main it appears to be monkeying around with what goes to or comes from whatever is storing the pages. The header/footer/css plugin could be RenderExtension, as that covers things that add content as opposed to changing appearance. -- RobHague >>>
<<< I'm not sure about PageStoreExtension - Alden once wanted to hook something to page reads, not just writes. In fact, that would probably be a good place to put the RecentChanges, so that one could Build a Better RecentChanges when versioning was added. -- SimonBooth >>>
<<< I was using Store as a noun, not a verb. -- RobHague (split) >>>
<<< I know.:-) -- SimonBooth >>>
<<<They'd be extending some nebulous thingy called the Page Store, where the pages were stored - currently a filing system, but it could be a database, or Gnutella, or whatever. Hence it would have hooks for when pages were read (which is how vetos would work - "You're not allowed to see this page"). -- RobHague (split) >>>
<<< But suppose we introduced, say, user preferences, not stored as pages, or allowed image uploads; and you wanted to hang things off those being changed.. -- SimonBooth >>>
<<< Possibly a general StorageExtension would be better; we can either have separate methods for binary data, or say that things aren't necessarily wiki text, or have a way for other extensions to poke storage to say that something has changed. On the subject of user preferences, though, we get them for free with attributes and a vetoer that forbids editing of other people's homepages. -- RobHague >>>
<<< Having RecentChanges as a method of this is possibly a good idea, but I'm not sure that it wouldn't be sufficient for people to just redefine the appropriate MacroNode. This brings up some issues of extension ordering that would need to be sorted out, but seems like the simplest approach. -- RobHague >>>
<<< What's the difference between a MacroNode and a WikiNode? -- SimonBooth >>>
<<< Convenience; it handles things like parameter parsing so you just have to implement one method. -- RobHague >>>
<<< ...and, of course, this is the type of extension that you'd tie a versioning system to. -- RobHague >>>
Each class overrides various methods. Instances of the classes are created if and when needed (at some unspecified point between module import and a method being called); if, for example, a particular ActionExtension is not needed for the current request, the class will not be instantiated, but it will still be imported.
<<< Any suggestions about this would be gratefully received. -- RobHague >>>
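One way to get this import-everything-but-instantiate-lazily behaviour might be the following sketch (`LazyRegistry` and `EditAction` are made-up names, not part of the actual design):

```python
class LazyRegistry:
    """Holds extension classes; an instance is created only the first
    time something actually asks for it, so extensions unused by the
    current request are imported but never instantiated."""
    def __init__(self):
        self._classes = {}
        self._instances = {}

    def add(self, cls):
        self._classes[cls.__name__] = cls

    def get(self, name):
        # Instantiate on first use, then cache the instance.
        if name not in self._instances:
            self._instances[name] = self._classes[name]()
        return self._instances[name]

# A made-up ActionExtension-style class, for illustration:
class EditAction:
    def run(self, page):
        return "editing " + page

reg = LazyRegistry()
reg.add(EditAction)   # registered, but not yet instantiated
```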
<<< Taking SisterSites as something I would like to make a module for, it's clear enough that the names list would be an ActionExtension, but what would the link-to-another-wiki come under? -- SimonBooth>>>
<<< You'd define a WikiNode subclass that defined the syntax for such links, and the replacement HTML. Depending on how it worked, you may cache values such as foreign lists-of-pages in the file system somewhere for efficiency. In any case, both the WikiNode and ActionExtension could go into the same file. -- RobHague >>>
<<< Sorry, overuse of the word 'link'. The link-to-another-wiki appears automatically if the page exists elsewhere (either in the margin (which could be a WikiNode, I suppose, but it would have to override the default link node), or on a holding page, or (as I've implemented it) in the header). There's no special syntax, it's just something the wiki does. Either we'd want a HeaderExtension or some way of replacing the default ActionExtension, but neither of those seem like great solutions. -- SimonBooth >>>
<<< Ah. In that case, some sort of PageFurniture plugin to define page header/footer information is probably in order. This would also be useful for, say, tables of contents that listed all of the anchors in a page, and more flexible customization of the HTML before and after the page, allowing you to alter the layout (add a margin frame or whatever). -- RobHague >>>
The (proposed) sequence of loading things is this:
unity.cgi locates and loads UnityWiki.py
UnityWiki.py imports UnityConfig, which combines _UnityDefaults and _Unity_Site into a single namespace (but, unlike the current arrangement, doesn't import * from it)
It then loads all .py files from the directory extensions (in alphabetical order), omitting any names listed in UnityConfig.omit_extensions (initially empty), into the "Extensions" namespace (this may have to be handled in another file, Extensions.py). This directory comes out of CVS and contains the system extensions (links, tables, other basic stuff)
It then loads everything from siteextensions in a similar manner. This directory does not come from CVS, but a parallel directory, optionalextensions, might. The latter would contain things like the SQL bridge. Combined with symlinks, this allows you to get some, but not all, of your site extensions from the distribution.
It then goes through the symbol table of Extensions, finding any significant classes (currently, only WikiNode subclasses), and registering them with the appropriate handlers.
Finally, it gets on with handling the request.
<<< This would involve splitting the functionality in WikiFormat.py between UnityWiki.py, Extensions.py, and a new module or two to go in extensions. WikiFormat.py itself would remain, to contain the definitions of WikiNode and a minimum set of other node types (ones that are mentioned explicitly in the parser, like TextNode, BlanklineNode and indentNode). Anyone see any problems with this scheme? -- RobHague >>>
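The extension-loading steps above might look roughly like this (a sketch under the assumptions in the list; `load_extensions` is an illustrative name, standing in for whatever ends up in Extensions.py):

```python
import os

def load_extensions(directory, omit=()):
    """Execute each .py file in `directory` in alphabetical order,
    skipping any base names listed in `omit` (cf. the proposed
    UnityConfig.omit_extensions), and collect their symbols into a
    single shared namespace dict standing in for the Extensions
    namespace."""
    namespace = {}
    for fname in sorted(os.listdir(directory)):
        base, ext = os.path.splitext(fname)
        if ext != ".py" or base in omit:
            continue
        with open(os.path.join(directory, fname)) as f:
            exec(compile(f.read(), fname, "exec"), namespace)
    return namespace
```

The same function could then be called a second time on the site-extensions directory, so site files loaded later can shadow system ones of the same name.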
XML On-Disk Representation
This is quite esoteric, and probably only matters to people who actually have installed copies of UnityWiki, either mirrors or independent sites. The proposal (which I've discussed with Ben a little) is that users would still edit pages in Wiki notation, but they'd be tree-based internally, and stored as XML on disk. The advantages of this would be:
- Simpler, more uniform code
- Easier data format for external processors
- Fine-grained diffs that reflect the logical structure of the page
However, there are also disadvantages:
- Wiki source is no longer canonical
- In some cases, the source you get back may not be the source you put in (it might be possible to avoid this)
- For other external processors (e.g., line-based ones such as diff(1)), XML may be more difficult to process
<<< Thoughts? -- RobHague >>>
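For illustration, serialising a parse tree as XML is straightforward with the standard library (a sketch only; the element names and page format here are invented, as the actual format is undecided):

```python
import xml.etree.ElementTree as ET

def page_to_xml(title, nodes):
    """Turn a flat list of (node type, text) pairs - standing in for
    a real parse tree - into an XML string for on-disk storage."""
    page = ET.Element("page", title=title)
    for kind, text in nodes:
        child = ET.SubElement(page, kind)
        child.text = text
    return ET.tostring(page, encoding="unicode")

xml = page_to_xml("FrontPage", [("bold", "Welcome"), ("text", " to the wiki")])
```

An external processor could then read pages with any XML parser rather than reimplementing the Wiki notation, which is the second advantage listed above.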
Database Integration
<<<I thought it'd be nice to centralise information about NPCs, so I've added support on my own unity wiki for embedded SQL SELECT commands - so we could just have the NPCs page be $``$``$ SELECT * FROM npcs ORDER BY surname $``$``$, and have something like $``$``$ SELECT firstname,surname,description FROM npcs WHERE description LIKE '%Jenkin University%' $``$``$. What's the likelihood of having access to PostgreSQL on the box Unity runs on?
Also, what do people think is a good way of handling non-select SQL commands? -- SimonBooth >>>
<<<Tying to a database is going a bit far (and is sort of outside the scope of the original intentions of PikiPiki, which were to not require anything else...). If we were going to be tied to a database, we might as well store all our pages in it as well. -- BenChalmers; continues in next comment>>>
<<< I may be trying to solve something that isn't a problem; I just worry that the big lists we have, of the same things presented differently in three places, might become a little hard to maintain as Unity grows. -- SimonBooth >>>
<<< I'm not sure it isn't a problem, more that I think the issue is that a wiki isn't the way to solve the problem (or at least the normal sort of wiki isn't the way to solve the problem). I think the approach to solving this might be some sort of plugin architecture (so SQL would be in python code, and we would just be including new tags)... Ideally the database ought to be generated by the wiki in some way (attributes on pages or grabbing names from one place frex.)... -- BenChalmers >>>
<<< I was thinking of having some way to refer to a table on another page in a SQL stylee - as we don't need efficiency, it should be fairly easy to implement without any external requirements. And I will get round to doing some Wiki coding some day soon, honest. -- RobHague>>>
<<< This is a good idea -- BenChalmers >>>
<<< On another note, I was planning to shunt macros off into a separate file that only gets imported when a macro is used. This would further import a SiteMacros file if it existed, in a similar way to the site config file. This might be a better place to put stuff that has external dependencies, such as the SQL hook. -- RobHague >>>
<<< May I suggest you have a look at the plugin architecture in Blosxom, not because you should copy it, but because it is a small application which does a job similar to that of the wiki and has become immensely powerful thanks to its plugins. -- BenChalmers >>>
<<< It occurs to me that I need to add a bit of context to the previous comment. Firstly, macros will have parameters, and hence you can put a query in there. However, the syntax as Simon has it (which would seem to be a good one) isn't a macro, but this won't be a long-term problem. Once I've got my rewritten parser working (the basics are done, and it handles half of the node types, including the difficult ones), then it'll be equally straightforward to add new syntax, either to the main code or on a site-by-site basis.
(Speaking of which, Simon, I'm planning to move the PageFormatter class to a separate file as a precursor to rewriting it with my new parser architecture. However, I guess this would cause you problems if you have significant diffs from the version in CVS. Hence, you might want to commit the SQL stuff to CVS (which I can't see a problem with, as long as it doesn't break anything else when it's not used). Alternatively, you could just migrate the diffs (which would probably not be hard - in the first instance, I'm just going to cut and paste the whole class into a new file verbatim). I'll hold off moving things 'til you let me know. Oh, and if such a move is likely to cause anyone else problems, shout.) -- RobHague >>>
<<< I've fixed the problem with the tables (I'd missed one of the prints); if you don't have the SQL module installed, it should just leave the text alone. I'll check it into CVS, then. -- SimonBooth >>>
<<<FWIW you tend to find MySQL provided by more hosting companies than Postgres, so whatever database access you provide really ought to be abstracted. -- BenChalmers >>>
<<< Python pretty much does that for you. Making it work with MySQL should be a one-line edit (line 35, for those in the New Paradigm); you would lose the column headings, but it should still run happily.
It's at http://homepage.ntlworld.com/linnorm/unitysql.diff - though I seem to have slightly broken tables. Oops. -- SimonBooth >>>
<<< fixed. -- SimonBooth >>>
<<< Grr - I broke preformatted text, too.. I wasn't outputting newlines any more. CVS is fixed. -- SimonBooth >>>
<<< The SRCF provides MySQL, and might do PostgreSQL as well. However, I agree with Ben about the external dependencies thing - while I'm thinking of running an SQL database on my laptop anyway, it'd be nice not to have this as a requirement. -- RobHague >>>
<<< Yes. I do have a requirement that Unity's UnityWiki runs under Windows 2k and XP. This doesn't preclude MySQL, but does mean I'll be unhappy about the time I spend having to get it to work --BenChalmers >>>
<<< I have a suggestion as to how to achieve this - $$ bar $$ fum $$ baz $$ should work as |``| bar |``| fum |``| baz |``| if there's no SQL module, but as INSERT INTO foo VALUES ('bar', 'fum', 'baz') where foo is the last sql table you used (like, say, one you create in the line above the table) if you do. Then with some nifty SQL above and below, you'll get the correct table on the page it's defined whether or not you have sql installed. The derivative tables won't appear, but that's probably unavoidable, and acceptable for offline backups -- SimonBooth >>>
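A sketch of that fallback behaviour (the `handle_row` helper and the `||`-style row output are illustrative, not the real syntax handling):

```python
def handle_row(line, table=None, have_sql=False):
    """Sketch of the suggested fallback: a '$$ a $$ b $$' line becomes
    an SQL INSERT when a database module is available, or a plain wiki
    table row when it is not."""
    cells = [c.strip() for c in line.strip("$ ").split("$$")]
    if have_sql and table:
        # `table` is the last SQL table used, per the suggestion above.
        values = ", ".join("'%s'" % c for c in cells)
        return "INSERT INTO %s VALUES (%s)" % (table, values)
    # No SQL module: degrade gracefully to an ordinary table row.
    return "|| " + " || ".join(cells) + " ||"
```

(Real code would of course need to escape the cell values before building the INSERT, rather than interpolating them directly.)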
<<< But not useful for offline browsing (which is the key use of backups... bringing them along to the game on laptops - at least until Steven gets wireless & broadband...). I still think tying Unity's UnityWiki to a database is a BadIdea. I still think in general the problems Wikis try to solve and the problems databases try to solve are different; if you want to solve both, you want a new solution (which possibly doesn't exist yet). In any event, the idea that altering one page could cause another page somewhere else in the wiki to break goes completely against the grain IMHO. Also, any integration should be properly and modularly put into the wiki, not hacked on to the end of the existing code (in other words, we need reengineering of the existing code before we think about features like this) -- BenChalmers >>>
<<< Fair enough. I do think that things like the npcs page require either a database or something similar. If this is because the problem of maintaining the sort of information which Unity generates isn't one which wikis try to solve... well, that'd certainly explain why PikiPiki doesn't support tables.
Anyway, I agree on the last point. I'll have the code for the "$$" tables ready to go as soon as the modular system is in place. -- SimonBooth >>>
<<< I wrote a long answer here, but it rambled and didn't really say anything I hadn't said before. I think we need to have this debate in person so that we can come up with a good solution that satisfies both of us, rather than you implementing things your way without actually providing an answer to my worries -- BenChalmers >>>
<<< Another thought - rather than a database, could we have an XML file storing the info and use XQL rather than SQL queries? That way we could edit the XML file (which could be far freer in format than a traditional database) and write little macros to parse the output for embedding in pages -- BenChalmers >>>
<<< Assuming we decide that a DB in the way you suggest is the way to go and we do it at the same time as we make the wiki -> xml -> html processing move, we could add an additional change xml -> preprocessed_xml before exporting files for offline use. In preprocessed_xml we could have all the queries already resolved, removing the need for a database in those situations -- BenChalmers >>>
<<< I'm not sure of the distinction between editing the XML and having derived data on another page, and editing a Wiki page and having derived data on another page. However, as has been said, it'd all be better done as plugins when the code is more modularized. To begin with, I've created the NooParser CVS branch, and I'll add a preliminary version of the modular parser code when I've got it tidied up (probably some time tomorrow). -- RobHague >>>
<<< The distinction was in order to make a distinction (so people know what they are editing is the database, so the whole database lives in the same place, and in order to put the database name outside of the wiki pages' namespace). Still, my newer theory (described at last night's game) of using XML namespaces to add attributes to pages and using categories as the equivalent of tables, so we get something like
from (category(npc) && category(unitysociety) && attribute('dead')==false) get attribute('name')
as a way to find particular items of data, is better, I think - it fits with the wiki way of doing things more, since you only need to alter the page relevant to the entry and everything else comes for free. --BenChalmers >>>
<<< I think that the most important point made in the discussion last night was that all of the data about an object, say "Jack Proctor", would be on the page JackProctor. In that case, tables such as the one on the NonPlayerCharacters page are simply merges of all of the tables (in the Wiki sense - in fact, I'd suggest the term "relations" to mean database tables, to avoid confusion) on pages in a particular category. We'd need to add table headers, and probably table labels, but I think it would suffice for all of the aggregate tables we use. Do we need anything more complicated? Actually, re-reading your suggestion, I think they amount to basically the same thing, bar presentation. A third way would be to have name-value pairs on pages, with a syntax such as lines of the form:
%Status: Dead
These would be fairly easy to search for, and it seems to fit with the rest of the Wiki. -- RobHague >>>
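Collecting such name-value lines could indeed be very simple; a sketch (`page_attributes` and the regexp are made up for illustration):

```python
import re

# Match lines of the form '%Name: Value'.
ATTR_RE = re.compile(r"^%(\w+):\s*(.*)$")

def page_attributes(source):
    """Collect '%Name: Value' lines from page source into a dict."""
    attrs = {}
    for line in source.splitlines():
        m = ATTR_RE.match(line)
        if m:
            attrs[m.group(1)] = m.group(2)
    return attrs

src = "JackProctor is an NPC.\n%Status: Dead\n%Category: NonPlayerCharacter\n"
```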
<<<
Yeah - I quite like that. It might mean we need to have lots of tiny pages to do relationships between characters and books they have read, though (for example). Not sure about the solution to this at the moment. Categories (and the ability to search on them) would be nice: you could have the category "ReadByLisa", which you put the book into. For some reason the equivalent idea of using %ReadByLisa: True doesn't appeal. You could also allow multiple categories:
%Read: Unaustelefonable Kulten% %Read: The Pages in Yellow% %Read: The Sussex Telephone Directory% %Read: The Blue Box of Eibon%
But that would make the SQL equivalent more complicated. Plus categories seem the nice way of doing that sort of relation to me... no idea why.--BenChalmers>>>
<<< I was thinking of the latter, i.e. having a "Read" attribute for each book that has actually been read on the character's page (or, equivalently, a "ReadBy" attribute for each reader on the book page). The alternative, having a "WhosReadWhat" page corresponding to the relation (both in the database and maths/functional programming sense) of books and readers, would require something more than an unordered set of name/value pairs (or some really, really horrible abuse of them). Personally, I think sticking with attributes, and having something like "PAGES IN CategoryPlayerCharacter WITH Read AS 'People Of The Monospace'", is preferable. This might be simplified by making category just another attribute: "PAGES WITH Read=='People Of The Monospace' AND Category==PlayerCharacter". -- RobHague >>>
<<< OK; my current thinking on this is, briefly: Each page object (in Python) has an attribute "meta", which is a dictionary that holds metadata in the form of name-value pairs. Some of this data could be used internally, or by extensions. One extension would simply add a syntax for adding mappings to the dictionary according to the page source. Another extension could implement queries, possibly using a metadata cache, or alternatively directly. I think that having page metadata that's both user-manipulable and available to extensions would be very powerful; for example, adding "%subscribe: rob@rho.org.uk" would provide a very easy hook into e-mail notification. -- RobHague >>>
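A minimal sketch of querying such per-page metadata dictionaries (the `query_pages` helper and the example pages are invented for illustration, treating Category as just another attribute as suggested above):

```python
def query_pages(pages, **conditions):
    """`pages` maps page names to their metadata dicts; return the
    names (sorted) whose metadata matches every keyword condition."""
    return sorted(
        name for name, meta in pages.items()
        if all(meta.get(k) == v for k, v in conditions.items())
    )

pages = {
    "JackProctor": {"Category": "NonPlayerCharacter", "Status": "Dead"},
    "LisaSmith": {"Category": "PlayerCharacter",
                  "Read": "People Of The Monospace"},
}
```

A metadata cache would just be this `pages` mapping persisted somewhere, rebuilt when a page is saved.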
Authentication
Currently, the Wiki relies on HTTP Basic authentication, which is pretty appalling (it sends the username and password, in the clear, with every request).
<<< We could improve things a little by moving to HTTP Digest. Alternatively, I could knock something up based on cookies. Anyone have any preference either way? -- RobHague >>>
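For what it's worth, a cookie scheme could sign the username with a server-side secret, so the password travels only once, at login. A sketch (the function names and the secret are illustrative, not a worked-out design):

```python
import hmac, hashlib

SECRET = b"change-me-per-site"   # hypothetical server-side secret

def make_cookie(username):
    """Issue a cookie of the form 'username:signature'."""
    sig = hmac.new(SECRET, username.encode(), hashlib.sha256).hexdigest()
    return "%s:%s" % (username, sig)

def check_cookie(cookie):
    """Return the username if the signature checks out, else None."""
    username, _, sig = cookie.partition(":")
    expected = hmac.new(SECRET, username.encode(), hashlib.sha256).hexdigest()
    return username if hmac.compare_digest(sig, expected) else None
```

A real version would also want an expiry time inside the signed data, but even this much keeps the password itself out of every request.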
