The ITSD Wiki is an old Perl application that, as far as I can see, is no longer supported. The pages are written in a non-standard markup, which is a disincentive to copying information to or from the Kwiki, and perhaps to contributing to it at all. I’ve looked around for converters. All I could find is the code below, which deals with a few of the formatting issues. It doesn’t, however, deal with links.
use strict;
my $text=qq|
-This is a h1 title
''This is italized'' and here, ''''this is bold'''' on this line
''''''this is bold italic'''''' on this line
----
*list1
**list2
***list3
#unordered list1
##unordered list2
###unordered list3
[''''name''''][''color''][''''''age'''''']
[bob][red][32]
[fred][orange][55]
|;
my $html=wiki2html($text);
print $html;
exit;
#############
sub wiki2html{
    my $str=shift || return;
    #http://wiki.beyondunreal.com/wiki/Wiki_Markup
    #---- <hr>
    #At the beginning, - <h1>, -- <h2>, --- <h3>
    #Unordered lists *, **
    #Ordered lists #, ##
    #tables [col1][col2]
    #''sdfs'' <i>, ''''sdfs'''' <b>, ''''''sddfs'''''' <b><i>
    my @lines=split(/[\r\n]/,$str);
    my $linecnt=@lines;
    my $dot='&#' . '9679' . ';';
    my $space='&' . 'nbsp' . ';';
    my %Marker=(
        ul => 0,
        ol => 0,
        table => 0,
        );
    for(my $x=0;$x<$linecnt;$x++){
        $lines[$x]=strip($lines[$x]);
        my $original=$lines[$x];
        my $break=1;
        #---- <hr>
        $lines[$x]=~s/\-{4,4}/\<hr size\=\"1\"\>/sig;
        #At the beginning, - <h1>, -- <h2>, --- <h3>
        if($lines[$x]=~/^(\-{1,3})(.+)/s){
            my $dashes=$1;
            my $txt=$2;
            my $count = $dashes =~ tr/\-//;
            $lines[$x]=qq|<h$count>$txt</h$count>|;
            $break=0;
            }
        #Unordered lists *, **
        if($lines[$x]=~/^([\*\.]+)(.+)/s){
            my $stars=$1;
            my $txt=$2;
            my $count = $stars =~ tr/[\*\.]//;
            #$txt .= qq|<!-- ul: $Marker{ul} ,$count -->|;
            if($count > $Marker{ul}){
                $lines[$x]="<ul><li>" . $txt;
                $Marker{ul}=$count;
                }
            elsif($count == $Marker{ul}){
                $lines[$x]="<li>" . $txt;
                }
            else{
                $lines[$x] = "</ul>";
                if(length($txt)){$lines[$x] .= "<li>" . $txt ;}
                $Marker{ul}=$count;
                }
            $break=0;
            }
        elsif($Marker{ul}){
            for(my $u=0;$u<$Marker{ul};$u++){
                $lines[$x] = "</ul>" . $lines[$x];
                }
            $Marker{ul}=0;
            }
        #Ordered lists #, ##
        if($lines[$x]=~/^([\#]+)(.+)/s){
            my $stars=$1;
            my $txt=$2;
            my $count = $stars =~ tr/[\#]//;
            #$txt .= qq|<!-- ul: $Marker{ul} ,$count -->|;
            if($count > $Marker{ol}){
                $lines[$x]="<ol><li>" . $txt;
                $Marker{ol}=$count;
                }
            elsif($count == $Marker{ol}){
                $lines[$x]="<li>" . $txt;
                }
            else{
                $lines[$x] = "</ol>";
                if(length($txt)){$lines[$x] .= "<li>" . $txt ;}
                $Marker{ol}=$count;
                }
            $break=0;
            }
        elsif($Marker{ol}){
            for(my $u=0;$u<$Marker{ol};$u++){
                $lines[$x] = "</ol>" . $lines[$x];
                }
            $Marker{ol}=0;
            }
        if(length($lines[$x])==0){$break=0;}
        #''sdfs'' <i>, ''''sdfs'''' <b>, ''''''sddfs'''''' <b><i>
        $lines[$x]=~s/\'{6,6}(.+?)\'{6,6}/<b><i>$1<\/i><\/b>/sig;
        $lines[$x]=~s/\'{4,4}(.+?)\'{4,4}/<b>$1<\/b>/sig;
        $lines[$x]=~s/\'{2,2}(.+?)\'{2,2}/<i>$1<\/i>/sig;
        #links [[ ]]
        if($lines[$x]=~/\[\[(.+?)\]\]/s){
            $lines[$x]=~s/\[\[(.+?)\]\]/\<a href=\"$1\"\ target="_new">$1<\/a>/sg;
            }
        #tables [col1][col2]
        my @cols=();
        while($lines[$x]=~/\[(.+?)\]/sg){
            push(@cols,$1);
            }
        my $colcnt=@cols;
        if($colcnt){
            $break=0;
            $lines[$x]='';
            if(!$Marker{table}){
                $lines[$x] .= qq|<table cellspacing="0" cellpadding="2" border="1" style="border-collapse:collapse">\n|;
                $Marker{table}=1;
                }
            $lines[$x] .= "<tr>";
            foreach my $col (@cols){$lines[$x] .= qq|<td>$col</td>|;}
            $lines[$x] .= "</tr>";
            }
        elsif($Marker{table}){
            $lines[$x] = "</table>";
            $Marker{table}=0;
            }
        #$lines[$x]=qq|<!--colcnt:$colcnt, table:$Marker{table}, Original: $original -->\n| . $lines[$x];
        if($break){$lines[$x] .= "<br>";}
        }
    if($Marker{ul}){
        for(my $u=0;$u<$Marker{ul};$u++){
            push(@lines,"</ul>");
            }
        $Marker{ul}=0;
        }
    if($Marker{ol}){
        for(my $u=0;$u<$Marker{ol};$u++){
            push(@lines,"</ol>");
            }
        $Marker{ol}=0;
        }
    if($Marker{table}){
        push(@lines,"</table>");
        $Marker{table}=0;
        }
    return join("\r\n",@lines);
    }
###############
sub strip{
    #usage: $str=strip($str);
    #info: strips off beginning and endings returns, newlines, tabs, and spaces
    my $str=shift;
    if(length($str)==0){return;}
    $str=~s/^[\r\n\s\t]+//s;
    $str=~s/[\r\n\s\t]+$//s;
    return $str;
    }
Patrick Gosling says
One of the more surprising features of modern browsers is their ability to preserve markup information (links, tabulation, etc.) in copy-and-paste.
A plausible approach for a small number of pages is just to “select-all” and “copy” in the Kwiki and “paste” in the other wiki.
For larger sets of pages, I wonder how automatable that might be (there are GUI automation tools out there).
Tim Love says
There are about 500 pages. I think it’s easier to iterate through the pages at the database level, and for most of the pages I think the converter above will do.
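Iterating at the database level might look roughly like the sketch below. It assumes each page is stored as one plain-text file in a single directory and that the file name is the page name. `convert_pages` and its arguments are hypothetical names, and the converter is passed in as a code reference so that a routine such as `wiki2html` can be plugged in.

```perl
use strict;
use warnings;
use File::Spec;

# Convert every page file in $src_dir with $convert and write NAME.html
# into $out_dir. Returns the names of the pages that were converted.
# Assumption: one plain-text file per page, file name == page name.
sub convert_pages {
    my ($src_dir, $out_dir, $convert) = @_;

    opendir(my $dh, $src_dir) or die "cannot read $src_dir: $!";
    # keep plain files only; this also skips "." and ".."
    my @pages = sort grep { -f File::Spec->catfile($src_dir, $_) } readdir($dh);
    closedir($dh);

    mkdir $out_dir unless -d $out_dir;

    foreach my $page (@pages) {
        my $in = File::Spec->catfile($src_dir, $page);
        open(my $fh, '<', $in) or die "cannot open $in: $!";
        my $text = do { local $/; <$fh> };    # slurp the whole page
        close($fh);

        my $html = $convert->($text);
        $html = '' unless defined $html;      # converter may return nothing for an empty page

        my $out = File::Spec->catfile($out_dir, "$page.html");
        open(my $oh, '>', $out) or die "cannot write $out: $!";
        print {$oh} $html;
        close($oh);
    }
    return @pages;
}
```

A call such as `convert_pages('database', 'converted', \&wiki2html)` would then write one `.html` file per page; both directory names are placeholders.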
I hadn’t considered resorting to page-scraping. I could wget the pages. The HTML might be awful though. Anyway I’ll have to do something about the internal links.
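For the internal links, one option is to rewrite them in the HTML after conversion or scraping. The sketch below assumes internal links appear as `index.cgi?PageName`, optionally prefixed by the old wiki's root URL, and that the new wiki serves each page at a base URL followed by the page name. `rewrite_internal_links` and the new base URL are made up for illustration.

```perl
use strict;
use warnings;

# Rewrite href="index.cgi?PageName" (relative, or prefixed with $old_root)
# to href="$new_base" . "PageName". Links to other sites, and links with
# extra query parameters (&...), are left untouched.
sub rewrite_internal_links {
    my ($html, $old_root, $new_base) = @_;
    $html =~ s{(href=["'])(?:\Q$old_root\E)?index\.cgi\?([^"'&#]+)(["'])}
              {$1$new_base$2$3}g;
    return $html;
}

# Example with a hypothetical target wiki:
my $old_root = 'http://www.eng.cam.ac.uk/app/kwiki/';
my $new_base = 'https://newwiki.example/wiki/';
print rewrite_internal_links(
    '<a href="index.cgi?HomePage">Home</a>', $old_root, $new_base), "\n";
```

Only the `href` value changes, so the same routine could be run over the output of a converter or over pages fetched with wget.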
Stephen Shorrock says
A while ago I wrote a script to download the KWIKI and produce an offline backup of it. It uses several Perl libraries to download all the files referenced in the KWIKI’s index page.
If it is of use, it is attached below:
use WWW::Mechanize;
use Config::General;
use Getopt::Long;
my $mech = WWW::Mechanize->new( autocheck => 1 );
my $kwiki_root="http://www.eng.cam.ac.uk/app/kwiki/";
my $kwiki_index=$kwiki_root."index.cgi";
my $tgt_dir="./DOWNLOADS/";
$mech->agent('Mozilla/5.0');
my %config=(tgt_dir=>"./DOWNLOADS/",verbose=>0);
my @required_args=qw/username password pages/;
map { $config{$_} = "" } @required_args;
#search for config files:
my %config_conf;
if (-e ".kwiki_spider.conf"){
%config_conf = (new Config::General(".kwiki_spider.conf"))->getall;
}
map { $config{$_} = $config_conf{$_} } keys(%config_conf);
GetOptions("password|p=s"=>\$config{password},"username|u=s"=>\$config{username},"pages|t=s"=>\$config{pages},"verbose|v"=>\$config{verbose},"directory|d=s"=>\$config{tgt_dir});
#split into an array and reference - unless value is equal to 'all'
unless ($config{pages} eq "all"){
my @arr = split(/,/,$config{pages});
$config{pages}=\@arr;
}
map { usage() if (!defined($config{$_}) || $config{$_} eq "") } @required_args;
sub usage(){
print "Usage: $0 -p -u -t [-d ] [-verbose]\nOr supply config file .kwiki_spider.conf\n\n";
exit;
};
#map { print $_." = ".$config{$_}."\n" } keys(%config);
#does our target directory exist?
mkdir($config{tgt_dir}) || die("unable to make target directory: $config{tgt_dir}") unless (-e $config{tgt_dir});
$mech->get( $kwiki_index );
#we should now have the login page
#goto form
$mech->form_number(1);
#supply the fields:
$mech->field('uname',$config{username});
$mech->field('credential_pin',$config{password});
$mech->click_button(name=>'submit');
#next search blank to retrieve all files
#goto form
$mech->form_number(1);
$mech->field('search_term','');
$mech->submit();
#we should now have a list of all pages:
#within each td class=page_name is a href linking to a page
#store an array of these and retrieve each one
#do this via regex rather than follow links as there are also links to user pages
foreach ( split('\n',$mech->content()) ){
push @links, $kwiki_root."$1" if (m/page_name.*href="(.*)"/);
}
foreach (@links){
#take the page name from the query string; undef if the link has none
my ($filename) = m/\?(.*)$/;
next unless $filename;
next unless (grep(/$filename/,@{$config{pages}}) || $config{pages} eq "all");
print STDOUT "$0 retrieving $_\n" if ($config{verbose} == 1);
$mech->get( $_ );
open(TMPFILE,">".$config{tgt_dir}."/$filename.html") || die("unable to open file: ".$config{tgt_dir}."/$filename.html\n");
my $content = $mech->content();
$/='';
$content =~ s/index.cgi\?([^'"]*)(['"])/.\/$1.html$2/g;
#$content =~ s/index.cgi\?//g;
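#assumed intent: keep only the page body and re-wrap it in a minimal document
$content =~ s/.*<body[^>]*>//si;
$content =~ s/<\/body>.*//si;
print TMPFILE "<html><body>";
print TMPFILE $content;
print TMPFILE "</body></html>";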
close(TMPFILE);
}
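As an illustration of the config-file route, a `.kwiki_spider.conf` in the working directory could look like the fragment below. The key names are the ones the script reads from `%config`; the values are placeholders, and `pages` may be `all` or a comma-separated list of page names.

```
# .kwiki_spider.conf (placeholder values)
username = your_login
password = your_password
pages    = all
tgt_dir  = ./DOWNLOADS/
verbose  = 1
```

Options given on the command line are applied after the file is read, so they take precedence over these values.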