Migrating from Drupal 6 to Nikola


(I recorded an episode for Hacker Public Radio about this subject, which will be 'on air' on 2014-09-30. HPR1507)

As promised, some details about how I migrated my blog from Drupal 6 to Nikola.

First of all some details about the Drupal site I migrated from.

  • I did not have node revisions on my Drupal site.
  • My Drupal site had one 'vocabulary', which was used to assign tags to each post.
  • I did not use page aliases, so every post had a URL like johanv.org/node/195.

Also worth mentioning is that I used pandoc 1.12.3.3 to convert the HTML of my nodes to reStructuredText, the format that Nikola uses by default. If you have another version of pandoc installed, you may need to tweak the script I used.

I created a new subdomain for my blog: blog.johanv.org. I used a script that creates a blog post for every published Drupal node on my Drupal site.

This script was created with a lot of trial and error. It can probably use some improvements, so I invite you to look at it and to send me pull requests.

#!/bin/bash

# migrate articles and stories from drupal 6 to nikola
# Copyright 2014 Johan Vervloet
# You can use and distribute this script under the terms of the
# GNU General Public License version 3 or later.

# Notes:
#  * my drupal site has no multiple revisions of posts.
#  * I had one vocabulary that I used for tagging posts.
#  * you need to have pandoc installed for this script to work.


# please change the two variables below
# according to your needs

# mysql command to connect to your drupal database
MYSQL_CMD="mysql -N -s -u root johanv6"
# directory to save the files
OUT_DIR="/tmp/out"

mkdir -p "$OUT_DIR"

nodes=$(echo "
      SELECT nid
      FROM node
      WHERE status > 0
      " | $MYSQL_CMD);

for nid in $nodes
do
      out_file=$OUT_DIR/$nid.rst

      details=$(echo "
              SELECT FROM_UNIXTIME(created),title
              FROM node
              WHERE nid=$nid
              " | $MYSQL_CMD | sed 's/\t/;/g');

      created=$(echo "$details" | cut -f 1 -d\;)
      # take everything after the first separator, so a ';' in the title survives
      title=$(echo "$details" | cut -f 2- -d\;)

      tags=$(echo "
              SELECT GROUP_CONCAT(td.name)
              FROM term_node tn JOIN term_data td ON tn.tid=td.tid
              WHERE tn.nid=$nid
              " | $MYSQL_CMD);

      cat > "$out_file" << EOF
.. title: $title
.. slug: node-$nid
.. date: $created
.. tags: $tags
.. link:
.. description:
.. type: text

EOF


      echo "SELECT body FROM node_revisions WHERE nid=$nid" | \
      $MYSQL_CMD | \
# convert node from html to rst
      pandoc --from=html --to=rst | \
# some trial and error for newlines
      sed 's/\\\\n/\n/g' | \
# convert references to other posts
      perl -p -000 -e 's;`((\s|[^<])*)</node/([0-9]*)>`__;:doc:`\1<node-\3>`;g' | \
# lots of trial and error to convert inline code
        perl -p -000 -e 's/``([^`]*\\n[^`]*)``/\n\n::\n\n\1\n\n/g' | \
      sed 's/\\n/\n  /g' | sed 's/\\t/\t/g' | sed 's/\\ / /g' | \
# convert video-links to youtube-links
# I did the conversion of \_ to _ manually
      sed 's/\[video:.*[/=]\([^/=]*\)\]/.. youtube:: \1/g' >> "$out_file"

# some output to show progress
      echo -n .

done

The script queries the database of the Drupal instance, to do the following for each node:

  • get the timestamp, title and tags for the node.
  • create a rst document with this metadata.
  • do a lot of manipulations on the node content, and add the content to the rst document.

Those manipulations are the following:

  • conversion from html to restructured text (rst)
  • fixing line ending issues
  • conversion of references to other blog posts
  • handling issues with blocks with literal code in the blog posts
  • conversion of links to youtube videos
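The last manipulation is the easiest one to look at in isolation. As a rough illustration only, here is the same regular expression as the script's final sed command, written in Python. The function name and the sample video IDs are made up for this example; the script itself uses sed.

```python
import re

# Same pattern as the script's last sed expression: keep whatever follows
# the last '/' or '=' inside a Drupal [video:...] tag.
VIDEO_TAG = re.compile(r"\[video:.*[/=]([^/=]*)\]")


def convert_video_tags(text: str) -> str:
    """Replace [video:URL] tags with a '.. youtube:: ID' directive."""
    return VIDEO_TAG.sub(r".. youtube:: \1", text)


if __name__ == "__main__":
    # 'abc123' is a hypothetical video ID, used only for this demo.
    print(convert_video_tags("[video:http://www.youtube.com/watch?v=abc123]"))
```

Like the sed version, this is greedy: two video tags on one line would be merged into one, so it assumes each tag sits on its own line.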

I won't read out the whole script, since that wouldn't make interesting radio, but I will put a link in the show notes. It's an ugly script: you'll have to edit the first lines, which describe how to connect to the database (put your credentials in my.cnf) and where the output files should go. (By default they go in /tmp/out.)

You probably have to tweak the script to adapt it to your needs, but hey, you have a starting point.

The script converts each node e.g. johanv.org/node/195 to a blog post blog.johanv.org/posts/node-195.html. This way I could easily convert hyperlinks to other posts to the corresponding html page of the new blog.
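To make that mapping concrete, here is a minimal Python sketch of it. It is not part of the migration script; the function name is invented for this example, and the target URL shape (/posts/node-<nid>.html) is simply taken from the example above.

```python
import re
from typing import Optional

# Assumption: every migrated post lives at /posts/node-<nid>.html on the new blog.
NEW_POSTS_BASE = "http://blog.johanv.org/posts"


def new_post_url(old_url: str) -> Optional[str]:
    """Map an old Drupal node URL to the matching post on the new blog."""
    match = re.search(r"(?:^|/)node/(\d+)/?$", old_url)
    if match is None:
        return None  # not a node URL, so there is nothing to map
    return f"{NEW_POSTS_BASE}/node-{match.group(1)}.html"


if __name__ == "__main__":
    print(new_post_url("johanv.org/node/195"))
```

Because the node id is the only thing that carries over, no lookup table is needed.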

At the location of my old blog, I put an .htaccess file that redirects all requests for /node/number to the correct page on the new blog:

RewriteEngine On
RewriteCond %{HTTP_HOST} !^blog\.
RewriteRule ^(.*)node/(.*)$ http://blog.johanv.org/posts/node-$2 [R=301,L]
RewriteRule ^(.*)$ http://blog.johanv.org/$1 [R=301,L]

With the combination of the script and the .htaccess file, 90% of the migration was very easy. But - as always - the remaining 10% needs some manual work. One example is converting the YouTube links containing underscores. Those underscores were prefixed with a backslash, which wasn't correct. Because there weren't too many of those errors, I fixed them manually.

Another thing you should do manually, is migrating attachments and images to your new site. Let's hope you don't have too many of them. And if so, you can probably write a script as well.

I used to have a blog in Drupal 6


At last I migrated my blog. It used to be a Drupal 6 blog, and I wasn't sure whether I wanted to migrate to Drupal 7 or Drupal 8. And eventually, I didn't migrate to either of those.

Now I am using Nikola, a static site generator. I got the idea from guitarman, who did an HPR episode about Nikola.

I see a lot of great features:

  • I can store my content in a git repository.
  • I can use my favourite text editor to edit my blog posts.
  • I can save blog posts when I'm not connected to the internet.
  • I don't need to install security updates for Drupal and Drupal modules.

Don't get me wrong, I think Drupal is a great product, especially when content on a website has to be provided by people who don't like to use git. Or when you need to have some forms on your site. Or when the server has to do some intelligent work. But my personal blog has other use cases, like displaying the texts I wrote. So I think a compiled site might not be such a bad solution.

This site uses Disqus for managing comments, which is of course evil. But hey, if you don't want to use Disqus, just send me a pull request with your comment :-D

I more or less migrated all articles of my Drupal site to this new instance. (I will provide more details on that in a later post.) But there is still work to do, as you might already have found out.

This is work in progress. I will keep you informed on the progress.

The set of prime numbers is infinite


(This article is more or less a transcript of a show I sent to Hacker Public Radio.)

In this short article I want to talk about prime numbers. In particular: about the fact that there exist infinitely many prime numbers. This was proven more than 2000 years ago, but I noticed that a lot of my friends who don't have a mathematical background aren't aware of this fact.

Yet it is rather easy to prove. So that is what I'll be doing in this article. If you are afraid of math, don't worry, it won't take more than 10 minutes.

First of all I am going to define a prime number. I won't go into technical details, but a positive integer is a prime number if it has exactly 2 positive divisors: 1 and the number itself.

For the proof that the sequence of prime numbers is infinite, I am going to cheat a little. I am going to use the fundamental theorem of arithmetic. This theorem states that every integer greater than 1 is either a prime number itself, or it can be written as the product of prime numbers. This product of prime numbers is unique, apart from the order of the factors.

An example. Take the number 42. 42 can be written as a product of prime numbers: 2 × 3 × 7. Apart from the order of the factors 2, 3 and 7, there is no other way to write 42 as a product of prime numbers. And this is true for every integer greater than 1.

This seems a trivial thing, but in fact it is not. Nevertheless, to keep this discussion on topic, I will assume that the fundamental theorem of arithmetic is valid.

Now. The proof that there are infinitely many prime numbers.

We will show that for any finite set of prime numbers, there exists at least one prime number not contained in this set. If I can prove this, it follows that the set of all prime numbers must be infinite.

So we take an arbitrary set of n prime numbers, and we call those prime numbers p_1, p_2, p_3, and so on. The last one is called p_n.

Now we construct a new number, let's say q. We construct q by multiplying all those prime numbers, and adding one.

Is p_1 a divisor of q? When you divide q by p_1, the quotient equals p_2 times p_3 times p_4 and so on times p_n. The remainder is 1. This follows from how we constructed q. So p_1 is not a divisor of q.

The same is true for p_2, p_3, and so on. None of our n prime numbers is a divisor of q.
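Written out in symbols, purely as a restatement of the argument so far: for every index i from 1 to n,

```latex
q = p_1 p_2 \cdots p_n + 1
  = p_i \cdot \Bigl(\prod_{j \neq i} p_j\Bigr) + 1,
\qquad\text{so}\qquad q \bmod p_i = 1 \neq 0 .
```

The remainder really is 1, because every prime p_i is at least 2.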

What if we apply the fundamental theorem of arithmetic to q? It says that q is either a prime number itself, or can be written as a product of prime numbers. So let's do that. None of those prime numbers is contained in our original set of n prime numbers, because the prime numbers in our product are divisors of q, and p_1, p_2 and so on are not. So there exists at least one other prime number, not in our finite set, which is a divisor of q.

There we are. We just proved that if you take any finite set of prime numbers, there is always at least one prime number not contained in this set. This means the set of prime numbers is infinite.
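If you want to see the construction at work, here is a small Python sketch. It is only an illustration, and the function names are invented for this example. It builds q from a given finite set of primes and returns the smallest prime factor of q, which by the argument above cannot be in the set. Note that q itself does not have to be prime: 2 × 3 × 5 × 7 × 11 × 13 + 1 = 30031 = 59 × 509.

```python
from math import prod
from typing import Sequence


def smallest_prime_factor(n: int) -> int:
    """Return the smallest prime factor of n (n >= 2), using trial division."""
    divisor = 2
    while divisor * divisor <= n:
        if n % divisor == 0:
            return divisor
        divisor += 1
    return n  # no divisor up to sqrt(n), so n itself is prime


def prime_outside(primes: Sequence[int]) -> int:
    """Given a finite collection of primes, return a prime that is not in it."""
    q = prod(primes) + 1
    return smallest_prime_factor(q)


if __name__ == "__main__":
    print(prime_outside([2, 3, 5]))               # q = 31, which is prime
    print(prime_outside([2, 3, 5, 7, 11, 13]))    # q = 30031 = 59 * 509
```

The set you start from does not have to be "the first n primes": for the set {3, 5}, q is 16 and the new prime is 2.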

I hope you enjoyed this proof. It is not impossible that I made a mistake, because I haven't done a lot of math in the last 10 years. If you have any comments, please let me know.

I found my bicycle GPS system


A while ago I had to pass through Lier first, before cycling to work in Antwerp. I don't know the cycling routes between Lier and Antwerp very well, and since the Google Maps navigation app has supported bicycle routes for a while now, I had the idea of letting GPS guide me.

Bad idea. Google sent me along idyllic spots such as the ring road around Lier, and along the N10 all the way from Lier to Wilrijk. Then I got to follow the Binnensingel to Berchem, and from there I was guided more or less decently to where I had to be.

This story dates from the winter. Today I looked up the bicycle route once more, and it is already a bit better:

googles route

From Mortsel on, Google now sends me along the railway to Berchem. But to get to Mortsel, I still have to take the Antwerpsesteenweg. Or the Hagenbroeksesteenweg, but that is just as bad. There had to be something better.

And now I have found that better alternative. It is called YOURS, and you can try it out at yournavigation.org. YOURS uses the maps of OpenStreetMap. OpenStreetMap is comparable to Wikipedia, but it collects road maps.

The YOURS navigation system led me alongside the railway to Boechout. Then along quiet roads to Mortsel. And from there I followed the railway on to Antwerp. One and a half kilometres more pedalling, admittedly. But without the constant roar of the morning rush hour, which makes it a lot more pleasant.

yours route

If you have an Android phone, your phone can show you the way. To do so, install Osmand, an application you can download from Google Play.

When you start the application for the first time, it offers to download some map data. I did that for Belgium, which seemed useful to me. You still have to configure Osmand to use YOURS, which you do as follows:

  • Settings
  • Navigation
  • Bicycle
  • Route calculation
  • YOURS
  • I also turned on 'voice guidance', because I can't see my phone while I'm cycling. I use earphones.

You can use the navigation system as follows, for example:

  • Search
  • (select the 'little house')
  • Choose a city, a street, and an intersecting street
  • (with the star you can add the location to your favourites)
  • Click 'Directions' (or the traffic sign with the arrow)
  • Then 'Show only' to display the route, or 'Follow' to start the navigation

It is advisable to check beforehand how the route runs. If you ride on the spoken directions alone, it is sometimes not entirely clear which way you have to go. A few pitfalls:

  • 'Slight right turn' sometimes has to be interpreted as 'just follow the path, which bends a little', sometimes as 'turn right into a street', and sometimes as 'ride into a small, almost invisible cycle path'.
  • 'Follow the road' sometimes means: 'Ride straight on into the small cycle path, while the main road bends away.'
  • If a cycle path briefly turns right at a junction, you sometimes get the instruction 'turn left' when you simply have to go straight on at the junction.

But none of that is insurmountable. If you stray from the route, you are asked to turn back, or the route is recalculated.

Printing to an HP LaserJet CP1025 on a Linux print server with CUPS


Almost 2 years ago, I bought an HP LaserJet CP1025. It was rather cheap. And it has crappy Linux support. Damn.

Every time I install a new Linux distribution, I have forgotten how to use this printer. So I might as well document it here.

If you want to attach your printer to your PC using the USB cable, it's not that difficult. You can install the hplip package, which is available in the Debian and Fedora repositories, and probably in those of other distributions as well. When you run hp-setup, you get an ugly GUI tool to configure your printer, which downloads the correct freedom-hating driver. But it works. (You can also run it in the console: hp-setup -i)

Now, where I live, I don't have room for a printer in the living room. So the printer is in the attic, attached to a Raspberry Pi print server. But for that to work, I need a client-side printer driver. And that is a problem. The hplip tools don't want to believe me that there is an HP LaserJet CP1025 on the network. And I could not find out how to configure a CUPS network printer with the hplip drivers.

What you need is the 'foo2zjs' printer driver. Which might be in your package repository. Which might even be already installed on your system. But don't use the one from the repositories! It does not work. I don't know why either.

So uninstall foo2zjs from your system, and compile it yourself. It is not hard to do. Go to the foo2zjs website, and scroll down to 'Download and install'. You only need to download the firmware for your printer model (in this case 1025).

Be careful now when you add your printer. The driver name is: HP LaserJet Pro CP1025nw Foomatic/foo2zjs-z3 (en). Because of the 'Pro' in the name, it is in a completely different place in the alphabetical list than the other (nonfunctional) CP1025 drivers, which don't mention 'Pro'.

Tax-on-web with Fedora 19 beta


Do you want to get your eID card reader working on Fedora 19 beta, so you can file your tax return with Tax-On-Web? With a bit of luck it works in seven steps:

  1. Install the middleware. I downloaded the one for Fedora 16. You get a number of warnings; just ignore them.
  2. Install the Firefox add-on.
  3. In about:config, set security.ssl.allow_unrestricted_renego_everywhere__temporarily_available_pref to false.
  4. Restart Firefox.
  5. Insert your eID into the card reader.
  6. Run the test.
  7. Note: a popup appears somewhere, asking you to type your PIN code. In the worst case it appears 'underneath' the browser, and everything seems to hang.

Wireless networking on a Samsung Series 5 laptop with Linux Mint


Last weekend, I installed Linux Mint 15 (Olivia) on a colleague's Samsung notebook. (I ran lshw, and it showed me '535U34C, Samsung SENS'.)

At my place, everything seemed to work OK (WPA2 and all), but when I tried it at work, the wireless connection dropped every x seconds, and needed to reauthenticate.

So I guess the problem depends on the type of wireless router.

I did some googling, and often I read that I had to pass the nohwcrypt=1 option to the ath9k kernel module. Which did not work. So I cluelessly tried some more suggestions, and finally I found a set of options that do work: nohwcrypt=1 blink=1 btcoex_enable=1 enable_diversity=1.

To try this out, you can do the following:

rmmod ath9k
modprobe ath9k nohwcrypt=1 blink=1 btcoex_enable=1 enable_diversity=1

(Maybe you have to log out and log on again after this command).

If it works, you can persist the settings by adding the following line to /etc/modprobe.d/ath9k.conf:

options ath9k nohwcrypt=1 blink=1 btcoex_enable=1 enable_diversity=1

That worked for me. So I hope this information is useful for someone else as well.

Many thanks to Sergei Winitzki, who posted the comment that saved my day.

Sending e-mail from mutt using an Exchange server


My favourite mail client is mutt. At work we have an Exchange mail server, and I wanted to use mutt for my work e-mail. Today it works. (I am using Fedora 19 beta)

Our mail server supports IMAP access, so reading mail is easy. The difficult part is sending e-mails.

In a first attempt, I tried this in the mutt configuration:

set smtp_url="smtp://mylogin@our.local.mailserver:587"
set smtp_pass="mySecretPassword"

I think this has worked for me somewhere in the past, but now I get the error message 'No authenticators available.' So I had to try something else.

Now I am using msmtp to send my e-mails. I installed it from the repositories:

yum install msmtp

I created a configuration file: ~/.msmtprc:

account myaccount
host webmail.our.domain
from my.email@our.domain
auth ntlm
tls on
tls_trust_file /etc/ssl/certs/ca-bundle.crt
user mylogin
ntlmdomain MYDOMAIN
password "mySecretPassword"
port 587
account default : myaccount

In the mutt configuration, I removed the SMTP settings and added:

set sendmail="/usr/bin/msmtp"

And guess what. It just worked (TM) :-)

Deploying a precompiled .NET application


How do I install a precompiled .NET application on an existing IIS server?

I made a screencast about this, in sloppy colloquial Dutch and with poor audio quality. Donations of decent headsets are appreciated ;-)

Word processors are overrated


(This is the transcript of my second submission for Hacker Public Radio.)

Word processors are overrated. Too often they are used instead of better alternatives. For example: to write a report, or to describe a workflow or a vision, a lot of people just grab Microsoft Word. Which is a bad idea. Should you use LibreOffice Writer then? OpenOffice? Maybe Google Docs? They are not much better.

If the focus of your text is on its content, if the structure of your text is important, if the way the text is laid out is less important than the consistency of the lay-out, or if you want to collaborate with other people, you should not use a typical mainstream word processor.

Problems with mainstream word processors

Page layout errors

There are some major issues, and the first one is that you will probably end up with page layout errors.

If the way your text is laid out is unimportant, you should focus on the actual information, not on making your text look good. With a mainstream word processor, you often end up with formatting inconsistencies: incorrectly indented list bullets, wrong fonts in a text after a copy-paste operation, or inconsistencies in the formatting of section titles. Especially when you have to collaborate with other people, the result will be ugly. And if some contributors use a different word processor than you are using, all bets are off.

When multiple people work together on a document using a mainstream word processor, and if those people don't really care about page layout, they will probably create an ugly document. And that's a shame. It does not have to be this way.

The underestimated learning curve

Another problem with typical word processors is that the learning curve is underestimated. Experienced users of a word processor will argue that you can avoid all those layout problems if you use the software the right way. But that means that all the people working with you on the same document should know how to use your word processor. Some of them might have to invest in training. And even then, it is easy to make mistakes.

A typical word processor has a WYSIWYG interface, which seems to be very easy to use. Even a 3-year-old child can produce a text. But any advanced user of a certain word processor will agree that there are many ways to use it in a wrong way.

It's not beautiful

The last problem I want to cover might be personal, but many texts that are created with software like Word are, to put it mildly, not beautiful. It is not difficult to produce ugly texts with Word-like systems, and it happens a lot. Some users need to be protected against the Comic Sanses of this world.

Expect more

If you don't care about your page layout, you should not spend time laying out your pages.

You have a computer. Your computer should take care of the looks of your document, so that you can concentrate on what actually matters: the content.

So we have to look for alternatives to Word. Not LibreOffice, not Google Docs, not AbiWord; they have the same problems. We should be looking for something completely different. Like, for example, LaTeX.

LaTeX

LaTeX is very good if you need mathematical formulas in your text. If you have to write a mathematical text (on your own or together with someone else), you obviously choose LaTeX anyway, because there is just nothing else. But if your text is not about mathematics, and you have to work with someone else, LaTeX is usually not an option. The majority of people are easily scared, because a LaTeX source document is rather hard to read.

Plain text

What else can we use? Plain text? It is an option as well, but the possibilities to format text are really limited. Plain text is good for quickly sending an e-mail, but as soon as you need some formatting, it just won't work.

HTML

Maybe HTML is an option. HTML is a whole lot richer than plain text. But it also comes with some disadvantages. The source code is still quite difficult to read. It requires some work to get a nice printout (without e.g. the headers and footers from a browser). And I personally find the HTML tags annoying to type.

Markdown

So now I come to the point I want to make: Markdown is a great alternative for writing texts. I won't pretend that it is the perfect solution, but it has some nice features, it has a decent user base, and the learning curve is quite low.

A Markdown file is a plain text document. Meta-information about the structure is added using symbols like the asterisk or the hash symbol. This way, the source text stays very readable, and you can easily see the structure.

Text documents can be opened by virtually everyone.

And because the possibilities to structure the text are limited, the possibilities to make mistakes are limited as well.
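To illustrate how visible that structure is, here is a small Python sketch that pulls the outline out of a Markdown text just by looking for lines that start with hash symbols. It is a deliberately naive illustration: the function name is made up, it only knows the hash-style headings, and it does not try to skip code blocks.

```python
def outline(markdown_text):
    """Return (level, title) pairs for the hash-style headings in a Markdown text."""
    headings = []
    for line in markdown_text.splitlines():
        if not line.startswith("#"):
            continue
        # The heading level is the number of leading hash symbols.
        level = len(line) - len(line.lstrip("#"))
        title = line[level:].strip()
        # Require a space after the hashes, so '#hashtag' is not a heading.
        if level <= 6 and line[level:level + 1] == " " and title:
            headings.append((level, title))
    return headings


if __name__ == "__main__":
    sample = "# Report\n\nSome *important* text.\n\n## Details\n\n* a list item\n"
    for level, title in outline(sample):
        print("  " * (level - 1) + title)
```

A dozen lines are enough here precisely because the markup is so limited.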

A Markdown document is a text file, but there are a lot of tools that render Markdown to a formatted text. A lot of blogs and forums accept Markdown as input format. And so does GitHub. If you work on a text with someone who understands the workings of git, GitHub renders your text, and you can easily look up the history of your files.

Next to those web applications, there are also some native applications, which show a live preview of the text you are typing, like e.g. ReText for Linux, and Markdownpad for Windows.

If you are comfortable using the command line, then you can use pandoc to convert your markdown documents to LaTeX (for pretty output), to Word (for conservative readers), to HTML and some Wiki formats. There is also a command line tool (which is called mcider) that converts a Markdown document to a html slideshow, but at this moment you will probably have to do some hacking to finetune the layout of your resulting slides.

Limitations of markdown

Markdown is not ideal; it has some limitations. For example: there is no clear Markdown standard. Putting tables or images in your document is not always supported. Support for footnotes is often non-existent. And so on.

On that level, I think that dokuwiki does a better job. But unfortunately: the dokuwiki syntax is less used than Markdown. In fact, I don't think it is used anywhere except on dokuwiki itself.

Another disadvantage is that most people do not know Markdown. And even worse: Windows doesn't know Markdown (or doesn't want to). As said, Markdown documents are just plain text files, but they typically get the .md extension. And if you try to open such a document in Windows, then you get the message that the file format is not recognized. So if you work on a document with a Windows user who does not know the difference between plain text and binary file formats, you had probably better use the .txt extension for your file name. And if you do not use Windows yourself, make sure that your Windows colleagues get a text file with Windows line endings, otherwise Notepad is confused :-)
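If you have to do this regularly, a few lines of Python can do both things at once. This is only a sketch under some assumptions: the function name is invented, the file is assumed to be UTF-8, and the copy is written next to the original with a .txt extension.

```python
from pathlib import Path


def to_windows_text(markdown_path: str) -> Path:
    """Write a copy of a Markdown file as .txt with Windows (CRLF) line endings."""
    source = Path(markdown_path)
    text = source.read_bytes().decode("utf-8")  # assumption: the file is UTF-8
    # First normalise every line ending to LF, then convert them all to CRLF.
    normalised = text.replace("\r\n", "\n").replace("\r", "\n")
    target = source.with_suffix(".txt")
    target.write_bytes(normalised.replace("\n", "\r\n").encode("utf-8"))
    return target


if __name__ == "__main__":
    import sys
    print(to_windows_text(sys.argv[1]))
```

Working on bytes avoids any automatic newline translation by Python itself, so the result is the same on Linux and on Windows.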

Comments

This text is also available on GitHub. In Markdown format, of course. :-) You can post comments (issues, or even pull requests) over there.

Contents © 2014 Johan Vervloet - Powered by Nikola Creative Commons License BY-SA