Migrating Wiki to GitHub Pages

Hi everyone,

Despite the low level of action, I’ve been spending a lot of time
thinking about Avogadro and the future. There are quite a few exciting
things, although it may take a while before they fully appear.

One thing I can talk about is pushing for tutorials and manuals.

The University of Pittsburgh is putting money and student-power towards
improving tutorials and manuals for Avogadro. We’re converting from a
proprietary program to Avogadro over the next two years. So we’ll be
putting up workflows, learning exercises, etc.

One thing I’m considering is moving away from MediaWiki. It’s insecure and
spammy. The whole point was to let people easily edit the pages, and that
doesn’t work now.

I’m open to suggestions, but I’d like to propose GitHub Pages. These are
generated from a repo, so it’s easy to add a pull request to modify them,
translate, etc. As static pages, they’re easier to package and distribute
too.

One catch is converting from MediaWiki to Jekyll, so if someone can help
with that, e.g., parsing the XML dump, please let me know.

Thoughts? Better alternatives?

Thanks,
Geoff

It’s worth noting that if you want to keep the wiki functionality but
move it into GitHub, you can also clone the repo wiki and alter it
that way:

    git clone https://github.com/user/reponame.wiki.git

Also, if you can get the MediaWiki markup out of the XML, you can
convert that to GitHub-flavoured Markdown (as well as a bunch of other
formats) with pandoc, which would work for either the GitHub wiki or
the Jekyll-based pages. Something like a combination of pandoc and the
CPAN MediaWiki::DumpFile::Pages package might do the job, though it’d
probably require some checking and clean-up; the translation probably
won’t be perfect.

-Ian
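
A minimal sketch of the "get the MediaWiki markup out of the XML" step, written in Python with only the standard library rather than the CPAN package Ian mentions. It assumes the dump follows the usual MediaWiki export layout (page elements holding a title and a revision with a text element). The function names and the sample data are hypothetical and only for illustration.

```python
import xml.etree.ElementTree as ET


def local_name(tag):
    """Drop the '{namespace}' prefix that ElementTree puts on tags."""
    return tag.rsplit("}", 1)[-1]


def extract_pages(xml_text):
    """Return a list of (title, wikitext) pairs from a MediaWiki XML export."""
    root = ET.fromstring(xml_text)
    pages = []
    for page in root.iter():
        if local_name(page.tag) != "page":
            continue
        title, text = None, ""
        for node in page.iter():
            name = local_name(node.tag)
            if name == "title":
                title = node.text
            elif name == "text":
                # If a page holds several revisions, the last one in file
                # order is the one that is kept.
                text = node.text or ""
        if title is not None:
            pages.append((title, text))
    return pages


# Hypothetical one-page dump, just to exercise the function.
SAMPLE = """<mediawiki xmlns="http://www.mediawiki.org/xml/export-0.10/">
  <page>
    <title>Tutorials:Getting started</title>
    <revision><text>== Drawing ==
Pick the '''Draw''' tool.</text></revision>
  </page>
</mediawiki>"""

if __name__ == "__main__":
    for title, text in extract_pages(SAMPLE):
        print(title, len(text))
```

Each extracted page is still wiki markup at this point, so it would need a separate conversion pass (for example with pandoc) and manual clean-up afterwards.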

On 16 October 2014 21:33, Geoffrey Hutchison geoff.hutchison@gmail.com wrote:

[...]

Avogadro-devel mailing list
Avogadro-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/avogadro-devel

On Thu, Oct 16, 2014 at 4:33 PM, Geoffrey Hutchison geoff.hutchison@gmail.com wrote:

[...]

This sounds good to me. I think being able to issue pull requests
would be useful, along with easier packaging and offline editing.

Marcus

> It’s worth noting that if you want to keep the wiki functionality but
> move it into GitHub,

Indeed, I think this would be a nice way to separate out developer documentation and user documentation. The dev documentation can be put into the GitHub repository wiki.

> Also, if you can get the MediaWiki markup out of the XML, you can
> convert that to GitHub-flavoured Markdown (as well as a bunch of other
> formats) with pandoc, which would work for either the GitHub wiki or
> the Jekyll-based pages. Something like a combination of pandoc and the
> CPAN MediaWiki::DumpFile::Pages package might do the job, though it’d
> probably require some checking and clean-up; the translation probably
> won’t be perfect.

Agreed. If anyone is willing to take a stab, I’d be very grateful. There are a few packages claiming to do this, but they seem very brittle, e.g.
https://github.com/clioweb/mediawiki-jekyll
http://medialab.di.unipi.it/wiki/Wikipedia_Extractor

The latter one (in Perl) definitely works (e.g., from the XML dump) but needs to be hacked to separate things into separate page files.

In any case, I’m glad there’s some positive reception for this idea. It might be shelved for a bit, and I’m sure it’ll need clean-up (e.g., migrating images).

Worst case scenario, once Pitt gives me the OK to hire some undergrads, I’ll let them loose on this.

-Geoff
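
The "separate things into separate page files" step could be sketched as follows, assuming the pages are already available as (title, wikitext) pairs. The slug scheme, the `.mediawiki` extension, and both function names are assumptions made for illustration; they are not taken from either of the tools linked above.

```python
import re
import tempfile
from pathlib import Path


def page_filename(title, ext=".mediawiki"):
    """Turn a wiki page title into a lowercase, filesystem-friendly name."""
    slug = re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-")
    return (slug or "untitled") + ext


def write_pages(pages, out_dir):
    """Write each (title, wikitext) pair to its own file; return the paths."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    written = []
    for title, text in pages:
        # Two titles that reduce to the same slug would overwrite each
        # other; a real migration would need to detect that.
        path = out / page_filename(title)
        path.write_text(text, encoding="utf-8")
        written.append(path)
    return written


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        for path in write_pages([("Main Page", "Welcome to '''Avogadro'''.")], tmp):
            print(path.name)
```

Image files are not handled here; they would have to be downloaded and relinked separately.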

OK, a first pass is here:

https://github.com/AvogadroChem/AvogadroChem.github.io
e.g.
https://github.com/AvogadroChem/AvogadroChem.github.io/blob/master/features.md

This includes all the images I could grab, although I’m sure there are errors. It’s not a functioning site yet, but if someone has worked with Jekyll or GitHub Pages before, please send me an e-mail so we can get something up and running.

Also, thanks to those who wrote off-list about educational use. I might start a separate e-mail list for that, but I think we can coordinate to get some really nice exercises up for everyone to use.

-Geoff

To reiterate what Ian said, I think that pandoc + maybe a shell script will
get you most of the way there. If you have the XML, you can try (untested):

    pandoc -f mediawiki -t markdown file.xml

(-f is “from”, -t is “to”.) I also just tried parsing a URL directly and it’s
not bad. This works decently well:

    pandoc -f html -t markdown http://avogadro.cc/wiki/Tutorials:Getting_started

Regarding Jekyll / Liquid, I don’t think the AvogadroChem.github.io
approach is the way to go. There’s a little wiki button to the right of any
GitHub repo (e.g., d3’s: https://github.com/mbostock/d3). My vote is to
start with an official Avogadro git repo (https://github.com/cryos/avogadro
maybe?), git clone cryos/avogadro.wiki.git, dump in the pandoc Markdown, and
then review the contents.

Regards,
Pat
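
The "pandoc + maybe a shell script" idea can also be sketched as a small Python wrapper around the same `-f mediawiki -t markdown` invocation. One assumption worth stating: as far as I know, pandoc's mediawiki reader expects wiki markup rather than the XML export wrapper, so this sketch is meant to run on per-page wikitext files, not on the raw dump. The function names and file names are hypothetical; `-o` is pandoc's output-file option.

```python
import shutil
import subprocess


def pandoc_command(src, dst, source_format="mediawiki", target_format="markdown"):
    """Build the pandoc argument list for converting one file."""
    return ["pandoc", "-f", source_format, "-t", target_format,
            "-o", str(dst), str(src)]


def convert(src, dst):
    """Run pandoc on one file; raises if pandoc is missing or fails."""
    if shutil.which("pandoc") is None:
        raise RuntimeError("pandoc was not found on PATH")
    subprocess.run(pandoc_command(src, dst), check=True)


if __name__ == "__main__":
    # Only prints the command, so this runs even without pandoc installed.
    print(" ".join(pandoc_command("features.mediawiki", "features.md")))
```

Building the argument list separately from running it makes it easy to inspect or log the exact command before converting a whole directory of pages.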

On Fri, Oct 17, 2014 at 12:30 PM, Geoffrey Hutchison <
geoff.hutchison@gmail.com> wrote:

[...]

I turned on wiki and issues for the repo if that helps. The GitHub
wikis are backed by git, and have editing facilities built in, so that
might be the easiest (I haven’t used the GitHub wikis much, but
I’m willing to give it a try).

Marcus

On Fri, Oct 17, 2014 at 1:47 PM, Patrick Fuller patrickfuller@gmail.com wrote:

[...]

> I turned on wiki and issues for the repo if that helps. The GitHub wikis
> are backed by git, and have editing facilities built in

As I said before, I think we should consider separating developer-oriented docs and user-centric docs. The MediaWiki lumped them together.

My proposal is to split the current wiki content:

  • Developer-centric docs -> GitHub Wiki
  • User-centric docs -> Static pages @ http://avogadro.cc

The problem with GitHub wiki for a project is that it clearly “lives” at GitHub. You can’t re-skin it, add your logo, or serve from a custom domain.

That said, I think it would be great to have the “Compiling under Visual Studio” guides, “Working with Git” and similar developer-oriented docs here. The API overview is another good candidate.
https://github.com/cryos/avogadro/wiki

For user-centric documentation (tutorials, educational exercises, etc.), I’m merely suggesting GitHub Pages. I’m open to other possibilities, but…

  • Keep it simple (e.g., easy to generate static HTML to distribute)
  • Make it easy to contribute
  • No spam
  • Needs to be themed, so we can at least hang an Avogadro logo
  • Ideally responsive web design so people can view easily on mobile or tablet

I think GitHub Pages (and thus Jekyll) could qualify. I don’t do much web development, so I’m very open to suggestions and discussion. A wiki would be great, except that MediaWiki is fairly slow to respond and easily spammed.

Let’s continue the discussion… Other ideas to fix the user documentation?

-Geoff

P.S. I already dumped the Pandoc markdown, so it can go wherever. For now, it’s here:
https://github.com/AvogadroChem/AvogadroChem.github.io

The only other option I can think of is to create a gh-pages branch in the
avogadro git repo, as these automatically get rendered and served by
GitHub. That being said, I think your approach is better.

The web dev aspect should be pretty straightforward. Jekyll takes Markdown
files and runs them through templates, and its default blog template
should get you most of the way. It’s about an hour of tutorials, and I’d be
willing to provide some detail if needed.
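
For Jekyll to run a Markdown file through a template, the file needs a YAML front matter block at the top. A minimal sketch of adding one to each converted page is shown here; the function name is hypothetical, and the `default` layout is an assumption that only works if the site actually defines a layout with that name.

```python
def add_front_matter(markdown, title, layout="default"):
    """Prefix a Markdown page with Jekyll YAML front matter."""
    if markdown.startswith("---\n"):
        # Treat the page as already having front matter and leave it alone.
        return markdown
    # Escape backslashes and double quotes so the title stays valid YAML.
    safe_title = title.replace("\\", "\\\\").replace('"', '\\"')
    header = f'---\nlayout: {layout}\ntitle: "{safe_title}"\n---\n\n'
    return header + markdown


if __name__ == "__main__":
    print(add_front_matter("# Features\n\nAvogadro can draw molecules.\n", "Features"))
```

Quoting the title keeps titles that contain a colon, such as "Tutorials:Getting started", from being misread as a nested YAML mapping.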

On Fri, Oct 17, 2014 at 2:50 PM, Geoffrey Hutchison <
geoff.hutchison@gmail.com> wrote:

[...]