Thursday, December 31, 2009

Installing Python 2.6

Install the prerequisites

http://www.talino.org/tutorials/install-python-261-without-trashing-ubuntu/

Download the source from

create a directory
/home/user/myusr/python/2.6

decompress the tar, cd into the extracted directory and run
./configure --prefix=/home/user/myusr/python/2.6

make
make test
make install
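
The steps above can be collected into one shell function. This is only a sketch under assumptions: the names build_python, run and DRY_RUN are made up for illustration, and the tarball and source directory are whatever you downloaded and unpacked. With DRY_RUN set, the commands are printed instead of executed.

```shell
# Sketch: build Python from a source tarball into a private prefix.
# build_python, run and DRY_RUN are hypothetical names, not part of Python.
run() {
    # Print the command in dry-run mode, otherwise execute it.
    if [ -n "${DRY_RUN:-}" ]; then
        echo "$*"
    else
        "$@"
    fi
}

build_python() (
    tarball=$1   # path to the downloaded source tarball
    prefix=$2    # install prefix, e.g. /home/user/myusr/python/2.6
    srcdir=$3    # directory the tarball unpacks into

    # Each step runs only if the previous one succeeded, so a failing
    # "make test" stops the install (stricter than typing them by hand).
    run mkdir -p "$prefix" &&
    run tar xzf "$tarball" &&
    run cd "$srcdir" &&
    run ./configure --prefix="$prefix" &&
    run make &&
    run make test &&
    run make install
)
```

Because the function body runs in a subshell, the cd does not change the directory of the calling shell.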

Tuesday, October 6, 2009

checking version

In RedHat Linux you can check the OS version by
cat /etc/redhat-release

to check the machine architecture
uname -m
if it is 32bit you will see i386 (or i686)
if it is 64bit you will see x86_64
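
If you need this check in a script, the output of uname -m can be mapped to a word size. A minimal sketch: arch_bits is a hypothetical helper name, and the list of machine names is not exhaustive.

```shell
# Sketch: classify a machine name as printed by `uname -m`.
# With no argument, the current machine is checked.
arch_bits() {
    case "${1:-$(uname -m)}" in
        x86_64|amd64)        echo 64 ;;
        i386|i486|i586|i686) echo 32 ;;
        *)                   echo unknown ;;
    esac
}
```

Calling arch_bits without arguments reports on the machine it runs on; passing a name is handy for testing.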

Monday, September 28, 2009

Printers! Printers! Printers!

A very good colleague of mine once said, "If you want to be happy, never print"...
Installing a network printer today reminded me of the trouble. Yes, there is a lot of
information available on the web, both from printer manufacturers and from other parties, about
how to get this simple matter done. But it often turns out that there are unpredictable difficulties along the way. Without further ado, I will write down the exact steps that helped me achieve this seemingly simple task.
OS: Windows XP
  1. First download the printer driver from the manufacturer. I had to download the driver for the Canon LBP 5800. Extract the file.
  2. Now comes the tricky part: installing a network port. In the Printers and Faxes window, select Add Printer and choose a printer connected to a local port. Leave the option to check for connected printers unselected, because the printer is not actually connected to your computer but sits somewhere on the network. Yes, this sounds a bit strange: why would you install a network printer "locally" in the first place? The quick answer is that you are going to map a network IP to a local port in a short while. Selecting the check box does no harm; it only costs a few extra seconds searching for printers that are not connected to your computer anyway.
  3. Then select create a new port and select Standard TCP/IP Port. Here you must enter the IP address of your printer. Ask the network admin if you do not already know it.
  4. Then proceed to select a driver. Often the driver for your network printer is not in the list that XP shows by default. In that case select Have Disk and browse to the folder where you extracted the driver files.
  5. Now you will be able to print a test page. If everything is fine you will get your first printed page. Congrats! If nothing happens, or you do not get this far, then good luck to you!

Saturday, September 26, 2009

classias notes

There is an excellent piece of software for training and applying a range of ML algorithms, including logistic regression with L1 or L2 regularization, Pegasos SVM (L1/L2), and the averaged perceptron. It is called Classias and is by Naoaki Okazaki.
It is amazingly fast and can handle large datasets! It works directly on compressed formats such as bz2 and tar.gz.

Here are some quick how-to notes.
  • running binary logistic regression 

    classias-train -tb -a lbfgs.logistic -m MODEL_FILE TRAINING_FILE

    -tb sets the task type (-t) to binary (b).
    -a specifies the learning algorithm, here L-BFGS-optimized logistic regression (note: logistic regression is NOT a regression model but a classification algorithm). Currently L1/L2 regularization is supported only with lbfgs.
    -m specifies the model file (MODEL_FILE above).
    The final entry (TRAINING_FILE above) is the actual training file, with one instance per line in the format
    label fid:fval ...

  • cross validation, regularization and help
    To perform cross validation use the -g5 -x options (5 means 5-fold cross validation; any integer can be used there). If you have your held-out data in a separate file, you can specify both the training and held-out data files using another set of options [see the documentation of Classias].

    If you want to enable L1 regularization use -pc1=1. This sets the parameter c1 (the L1 regularization coefficient) to 1 for the algorithm specified by -a. The default value is zero for L1 regularization and 1 for L2. If you want L1 regularization only, you must also set L2 to zero, i.e. -pc2=0. Otherwise you will end up using both L1 and L2 regularization!

    For example, if you set both regularization coefficients to 1, you end up with more features in the final trained model than with L1 regularization alone, but still far fewer than with L2 regularization alone. For the RCV1 dataset, I got 40628 features using only L2 (accuracy 0.95), whereas the values were 491 (@0.95) for L1 only and 1597 (@0.94) using both.

    General help of classias-train can be seen by doing,
    classias-train --help
    and to see what parameters are available for a specific algorithm (e.g. lbfgs.logistic) do the following,
    classias-train -a lbfgs.logistic -H
    -H shows the parameter help for the chosen algorithm; -h is the normal help.
    Putting it all together the following command trains a binary logistic regression model with L1 regularization and also performs 5-fold cross validation.
     classias-train -tb -a lbfgs.logistic -pc1=1 -pc2=0  -m rcv1.binary.model -g5 -x rcv1_train.binary

  • Multi-class classification
  • Tagging  (prediction)
    Read test instances from stdin and output the class labels, weights (-w), and, in the case of logistic regression models, probabilities (-p). Specify the model file with -m. You can compute accuracies with the -t option. To suppress labels etc. when testing, use the quiet option (-q).

    cat rcv1_test_binary | classias-tag -m rcv1.binary.model -p
    cat rcv1_test_binary | classias-tag -m rcv1.binary.model -w
    cat rcv1_test_binary | classias-tag -m rcv1.binary.model -tq
    If your data is in bz2 then use bzcat instead of cat.
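
Two small helpers that go with the notes above, as a sketch rather than anything shipped with Classias. check_format only tests the "label fid:fval ..." shape described in these notes (the real parser may accept more), and reader_for picks cat or bzcat from the file name. Both function names, and the .bz2 file name in the usage comment, are made up for illustration.

```shell
# Sketch: sanity-check a data file and pick a reader for it.
# check_format and reader_for are hypothetical helper names.

check_format() {
    # Exit status 0 if every non-empty line of file $1 looks like
    #   label fid:fval [fid:fval ...]
    # and 1 otherwise.
    awk '
        NF == 0 { next }                 # skip blank lines
        NF < 2  { bad = 1 }              # a label with no features
        {
            for (i = 2; i <= NF; i++)
                if ($i !~ /^[^:]+:[^:]+$/) bad = 1
        }
        END { exit (bad ? 1 : 0) }
    ' "$1"
}

reader_for() {
    # Print the command that writes file $1 to stdout uncompressed.
    case "$1" in
        *.bz2) echo bzcat ;;
        *)     echo cat ;;
    esac
}

# Usage (needs classias-tag and a trained model):
#   f=rcv1_test_binary.bz2
#   $(reader_for "$f") "$f" | classias-tag -m rcv1.binary.model -tq
```

Running check_format on the training file before a long training run is a cheap way to catch a malformed line early.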

Machine Learning Datasets

Some useful datasets to benchmark ML algorithms are here.

classification dataset
RCV1/V2

Sunday, September 20, 2009

tokyo dystopia

This is an excellent full text search engine.
It can be downloaded from here.
http://1978th.net/tokyodystopia/

There are other useful tools by the same author here.
http://1978th.net/
The installation procedure can be a bit tricky.
First install tokyo cabinet.
http://1978th.net/tokyocabinet/
Then install the prerequisites for tokyo dystopia,
such as zlib (otherwise you will get an error when you run configure).
Then install dystopia.
To index a file:
dystmgr importtsv indexfolder datafile
datafile should be TSV (tab-separated values), with an ID and the text on each line, e.g.
1	This is a fun program.

To search do:
dystmgr search -pv indexfolder "program"
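
If your text is one record per line without IDs, a TSV file in the shape shown above can be produced by numbering the lines. A minimal sketch: to_tsv is a hypothetical helper name, and using the line number as the ID is an assumption based on the example record above.

```shell
# Sketch: turn plain lines on stdin into "ID<TAB>text" records on stdout.
# to_tsv is a hypothetical helper; IDs are simply the line numbers.
to_tsv() {
    awk '{ printf "%d\t%s\n", NR, $0 }'
}

# Usage (needs dystmgr installed; sentences.txt is a made-up file name):
#   to_tsv < sentences.txt > datafile
#   dystmgr importtsv indexfolder datafile
```

Text that itself contains tab characters would need cleaning first, since the tab is the field separator.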



Tuesday, September 15, 2009

windows was unable to find a certificate to connect to this network

My laptop connects to the home wireless network successfully, but a pop-up with the above
title appears in the system tray (Windows XP). The problem is that I had enabled
IEEE 802.1X authentication but did not have an 802.1X (EAP) environment. The solution is very simple:
select the corresponding wireless network on the laptop and, on the Properties->Authentication tab,
disable this authentication. Works like a charm!

Saturday, July 11, 2009

basic authentication

If you want to provide basic authentication to a folder on your web server, do the following.

1. create a password file
htpasswd -c mypasswdfile username

Now enter a new password. If the file already exists, leave out "-c" (it would overwrite the file);
htpasswd then simply adds the new username and password to the existing file.

2. Setting access.
cd to the folder that you want to protect and create a .htaccess file.
Add the following lines there.

AuthType Basic
AuthName "authentication name"
AuthUserFile location_of_the_mypasswd_file
AuthGroupFile ifany
Require valid-user

Require can be set to valid-user (any user in the password file) or to specific users, e.g. Require user username.

That's all, folks!
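
Step 2 can also be scripted. The sketch below writes a minimal .htaccess into a directory; write_htaccess is a hypothetical helper name and the realm text "Restricted area" is just a placeholder. It assumes the server allows authentication directives in .htaccess files (AllowOverride AuthConfig), and it leaves out AuthGroupFile because no groups are used.

```shell
# Sketch: create a basic-auth .htaccess in a directory.
# write_htaccess is a hypothetical helper, not an Apache tool.
write_htaccess() {
    # $1: directory to protect
    # $2: absolute path of the password file made with htpasswd
    cat > "$1/.htaccess" <<EOF
AuthType Basic
AuthName "Restricted area"
AuthUserFile $2
Require valid-user
EOF
}

# Usage (paths are examples only):
#   htpasswd -c /home/user/mypasswdfile username
#   write_htaccess /var/www/private /home/user/mypasswdfile
```

Keeping the password file outside the web-served directory avoids it being downloadable.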

Sunday, June 28, 2009

firefox proxy

To connect to a remote computer and use it as a proxy do the following.

Open Tools->Options->Advanced->Network->Settings and select manual proxy configuration
SOCKS Host 127.0.0.1 Port 8080
select SOCKS v5

Now open a tunnel in Cygwin+Poderosa by
ssh -C2qTnN -D 8080 username@server

Alternatively, you can do the same using PuTTY.
For this, set up a PuTTY connection with details like host, auto-login name etc.
Then load your private key for the server into Pageant.
Now to create the tunnel in PuTTY,
Connection->SSH->Tunnels
In the source port type 8080
select Auto and Dynamic and hit Add.
This will add D8080
Now save those settings and open the connection.
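
For reference, the bundled ssh flags used earlier (-C2qTnN) can be spelled out one by one. The sketch below only prints the command; socks_tunnel_cmd is a hypothetical helper name, and the flag meanings are those of the OpenSSH client.

```shell
# Sketch: print the tunnel command with every flag written separately.
# socks_tunnel_cmd is a hypothetical helper; it does not open a connection.
#   -C  compress the traffic
#   -2  use SSH protocol version 2
#   -q  quiet mode
#   -T  do not allocate a pseudo-terminal
#   -n  redirect stdin from /dev/null
#   -N  do not run a remote command (forwarding only)
#   -D  dynamic (SOCKS) forwarding on the given local port
socks_tunnel_cmd() {
    # $1: user, $2: server, $3: local SOCKS port (defaults to 8080)
    echo "ssh -C -2 -q -T -n -N -D ${3:-8080} $1@$2"
}

# To check the tunnel without a browser (needs curl and an open tunnel):
#   curl --socks5-hostname 127.0.0.1:8080 http://example.com/
```

The port passed here must match the SOCKS port entered in the Firefox settings.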

Sunday, June 21, 2009

Useful Links

http://www-nlp.stanford.edu/~mgalley/
Lexical Chains tool is available here.

Convert source code to pure HTML. Useful when pasting code on a blog.
http://pygments.org/

Wednesday, April 29, 2009

feisty upgrade

Feisty is no longer supported, other than through old-releases. Use the following lines in /etc/apt/sources.list:

deb http://old-releases.ubuntu.com/ubuntu/ feisty main restricted universe multiverse
deb http://old-releases.ubuntu.com/ubuntu/ feisty-updates main restricted universe multiverse
deb http://old-releases.ubuntu.com/ubuntu/ feisty-security main restricted universe multiverse
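
Instead of typing the entries by hand, an existing sources.list can be rewritten. A minimal sketch, assuming the current entries point at archive.ubuntu.com (or a country mirror such as us.archive.ubuntu.com) or at security.ubuntu.com; to_old_releases is a hypothetical helper name.

```shell
# Sketch: rewrite Ubuntu archive hosts to old-releases.ubuntu.com.
# Reads a sources.list on stdin and writes the result to stdout.
to_old_releases() {
    sed -e 's#http://[a-z.]*archive\.ubuntu\.com/ubuntu#http://old-releases.ubuntu.com/ubuntu#' \
        -e 's#http://security\.ubuntu\.com/ubuntu#http://old-releases.ubuntu.com/ubuntu#'
}

# Usage (review the output before replacing the real file):
#   to_old_releases < /etc/apt/sources.list > sources.list.new
```

Writing to a new file first makes it easy to compare the two lists before swapping them.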

Continuously monitor GPU usage

 For NVIDIA GPUs do the following: nvidia-smi -l 1