Wikidata:Pywikibot - Python 3 Tutorial/Setting up Shop

The easiest way to follow along with this tutorial is to run Linux, either directly or in a virtual machine. Don't be alarmed by this. Setting up a virtual machine is easy, and there are many videos that explain the process down to the individual mouse click.

The great advantage for the tutorial is that we can skip a lot of conditional sentences and duplicated code, because everyone will work in a similar environment. It is also an environment that will make your life easier, as you will soon discover.

If you are having trouble with this first step, you can start a thread on the discussion page.

Installing dependencies

Now that you have some version of Linux running, we have to make sure that all the dependencies required for the tutorial are installed. Open a terminal window and check whether Python 3 is installed:

$ python3 --version
Python 3.4.2

Another dependency that might be missing is the requests library. Install it with either your system package manager or pip:

Using Debian packages:
$ sudo apt-get install python3-requests
Using pip:
$ python3 -m pip install requests
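
To find out whether the install is needed at all, a short check from Python can help. This is only a minimal sketch: the helper name is_installed is made up for illustration, and it uses nothing but the standard library.

```python
import importlib.util


def is_installed(module_name):
    """Return True if the named top-level module can be found for import."""
    return importlib.util.find_spec(module_name) is not None


if is_installed("requests"):
    print("requests is available")
else:
    print("requests is missing; install it with one of the commands above")
```

Save it to a file and run it with python3, so that it checks the same interpreter the tutorial uses.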


As long as the Python version reported by python3 --version is above 3.3, you should have no problems following the tutorial. If you ran python --version instead, you may see the Python 2 version installed on the system, but the tutorial only uses Python 3.
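
The same version check can be done from inside Python. The sketch below assumes that "above 3.3" means "3.4 or newer"; the names MINIMUM and is_supported are illustrative only.

```python
import sys

# Assumption: the tutorial's "above 3.3" is read as "3.4 or newer".
MINIMUM = (3, 4)


def is_supported(version_info, minimum=MINIMUM):
    """Return True if the (major, minor, ...) tuple meets the minimum."""
    return tuple(version_info[:2]) >= minimum


print("Running Python", ".".join(str(part) for part in sys.version_info[:3]))
print("Supported for this tutorial:", is_supported(sys.version_info))
```

Comparing tuples instead of strings avoids the trap where "3.10" sorts before "3.4" as text.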

Next we will check if git is installed. Git is a version control system that will help you to keep your Pywikibot files up to date.

$ git --version
git version 2.4.3
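
If you prefer to run this check from a script, something like the following may do. It is a sketch using only the standard library, and the function name git_version is an assumption made for this example.

```python
import shutil
import subprocess


def git_version():
    """Return the output of `git --version`, or None if git is unavailable."""
    executable = shutil.which("git")
    if executable is None:
        # git is not on the PATH at all.
        return None
    try:
        output = subprocess.check_output(
            [executable, "--version"], universal_newlines=True
        )
    except (OSError, subprocess.CalledProcessError):
        # git was found but could not be run successfully.
        return None
    return output.strip()


print(git_version() or "git does not appear to be installed")
```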

The terminal usually starts in your home directory. If you want a separate folder to hold the Pywikibot files, you can create a directory and change into it with these two commands:

$ mkdir pywiki
$ cd pywiki

After that we will download the newest version of Pywikibot, called 'pywikibot core', using this command ([Ctrl]+[Shift]+[V] pastes into the terminal):

git clone --recursive https://gerrit.wikimedia.org/r/pywikibot/core.git

Or append a folder name to the command to name the folder something other than core:

git clone --recursive https://gerrit.wikimedia.org/r/pywikibot/core.git pywikibot
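
For readers who would rather script the download, the two command variants above can be assembled in Python. This is a sketch, not part of Pywikibot: clone_command and clone are hypothetical helper names, and the URL is the one from the commands above.

```python
import subprocess

REPO_URL = "https://gerrit.wikimedia.org/r/pywikibot/core.git"


def clone_command(target=None):
    """Build the git clone argument list, optionally with a folder name."""
    command = ["git", "clone", "--recursive", REPO_URL]
    if target:
        # A trailing argument tells git which folder to clone into.
        command.append(target)
    return command


def clone(target=None):
    """Run the clone; raises CalledProcessError if git reports a failure."""
    subprocess.check_call(clone_command(target))


# Only print the command here; call clone("pywikibot") to actually download.
print(" ".join(clone_command("pywikibot")))
```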

From inside the new folder, you can use tree -d to look at the folder structure of Pywikibot. It should look roughly like this:

$ tree -d

.
├── docs
│   └── api_ref
│       └── tests
├── logs
├── pywikibot
│   ├── comms
│   │   └── __pycache__
│   ├── compat
│   ├── data
│   │   └── __pycache__
│   ├── families
│   │   └── __pycache__
│   ├── __pycache__
│   ├── tools
│   │   └── __pycache__
│   └── userinterfaces
│       └── __pycache__
├── scripts
│   ├── archive
│   ├── i18n
│   │   ├── add_text
│   │   ├── archivebot
│   │   ├── basic
│   │   ├── blockpageschecker
│   │   ├── ...
│   └── maintenance
└── tests
    ├── data
    │   ├── djvu
    │   ├── images
    │   └── xml
    ├── i18n
    │   └── test
    ├── pages
    └── pwb
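
If the tree command is not installed, a rough equivalent can be written with Python's standard library. The sketch below prints a flat list of relative directory paths rather than a drawn tree; list_directories is a name chosen for this example.

```python
import os


def list_directories(root):
    """Return sorted relative paths of every non-hidden directory below root."""
    found = []
    for current, dirnames, _filenames in os.walk(root):
        # Prune hidden folders (such as .git) in place so os.walk skips them,
        # which mirrors what `tree -d` shows by default.
        dirnames[:] = [name for name in dirnames if not name.startswith(".")]
        for name in dirnames:
            found.append(os.path.relpath(os.path.join(current, name), root))
    return sorted(found)


for path in list_directories("."):
    print(path)
```

Run it from the base directory of the checkout to get a listing comparable to the one above.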

Configuration

Now that we have Python 3 and Pywikibot installed, we still need to configure Pywikibot. First make sure the terminal is in the base directory (core, or whatever name you chose for the folder), then list the files in that folder.

$ cd core
$ ls

You will see the following files and folders:

ChangeLog                LICENSE                    scripts
CREDITS                  pwb.py                     setup.py
dev-requirements.txt     pywikibot                  tests
docs                     README-conversion.txt      tox.ini
ez_setup.py              README.rst                 user-config.py.sample
generate_family_file.py  requests-requirements.txt
generate_user_files.py   requirements.txt
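
A quick way to confirm that the terminal really is in the base directory is to look for the two scripts used in the following steps. This is a small sketch; REQUIRED_FILES and missing_files are names made up here, and the file names come from the listing above.

```python
import os

# Files the following steps rely on, taken from the listing above.
REQUIRED_FILES = ("pwb.py", "generate_user_files.py")


def missing_files(base_dir, required=REQUIRED_FILES):
    """Return the required file names that are absent from base_dir."""
    return [
        name for name in required
        if not os.path.isfile(os.path.join(base_dir, name))
    ]


missing = missing_files(".")
if missing:
    print("Not in the Pywikibot base directory? Missing:", ", ".join(missing))
else:
    print("This looks like the Pywikibot base directory.")
```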

Now we run the user configuration:

$ python3 generate_user_files.py

This will ask the following questions. Give the answer shown at the very end of each line:

Select family of sites we are working on, just enter the number or name (default: wikipedia): wikidata

The language code of the site we're working on: wikidata

Username on wikidata:wikidata: YOURUSERNAME

Do you want to add any other projects? ([y]es, [N]o): n
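
These answers are stored as plain Python assignments in a file named user-config.py. The sketch below shows the kind of lines such a file typically contains and how they can be read back. Treat the exact contents as an assumption: the generated file also holds comments and may include other settings depending on the Pywikibot version. The read_settings helper and the stand-in usernames mapping exist only so the example runs on its own.

```python
from collections import defaultdict

# Assumed contents; the real generated file also holds comments and may
# contain further settings depending on the Pywikibot version.
CONFIG_TEXT = """
family = 'wikidata'
mylang = 'wikidata'
usernames['wikidata']['wikidata'] = 'YOURUSERNAME'
"""


def read_settings(text):
    """Execute the config text and return the resulting namespace."""
    # Stand-in for the `usernames` mapping that Pywikibot supplies itself.
    namespace = {"usernames": defaultdict(dict)}
    exec(text, namespace)
    return namespace


settings = read_settings(CONFIG_TEXT)
print(settings["family"], settings["mylang"])
print(dict(settings["usernames"]))
```

The stand-in mapping is needed because the config lines assign into usernames without defining it first; Pywikibot provides that name when it loads the file.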

After that we will find a new user-config.py file in the folder. You can open it and have a look, but understanding all the settings is not important at this point. Let us now check whether we can log in to Wikidata. For this tutorial it is not necessary to have a separate bot account; you can just use your normal Wikimedia account:

$ python3 pwb.py login
Password for user YOURUSERNAME on wikidata:wikidata (no characters will be shown): TYPEYOURPASSWORD
Logging in to wikidata:wikidata as YOURUSERNAME.
Logged in on wikidata:wikidata as YOURUSERNAME.
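
The login can also be checked from your own Python code. The helper below is a sketch that only assumes a site object with login() and user() methods, as Pywikibot's site objects provide; describe_login is a name invented for this example.

```python
def describe_login(site):
    """Log in on the given site object and report the resulting user."""
    site.login()
    user = site.user()
    if user is None:
        return "Not logged in."
    return "Logged in as {}.".format(user)


# With Pywikibot importable (for example when started via `python3 pwb.py`):
#     import pywikibot
#     print(describe_login(pywikibot.Site("wikidata", "wikidata")))
```

Keeping the Pywikibot import out of the helper means it can be tried with any stand-in object before it touches the real wiki.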

And with that you have successfully set up Pywikibot, and you can continue with the next chapter, where we will harvest our first data.