Thursday, December 11, 2014

boost-python

Here is a minimal working example of Boost.Python, taken from the official web site.

First, create a file named hello.cpp with the following contents.

#include <boost/python.hpp>

char const* greet()
{
   return "hello, world";
}


BOOST_PYTHON_MODULE(hello)
{
    using namespace boost::python;
    def("greet", greet);
}

Now compile it into a shared library (.so file) as follows.

g++ -g -shared -fPIC -I/usr/include/python2.7 hello.cpp -lpython2.7 -lboost_python -o hello.so

Open IPython in the directory that contains hello.so and import hello.
Now call hello.greet().


Monday, September 22, 2014

Distributed vs. Distributional Representations

Distributed Representations = Learnt using deep learning methods. Distributed over the parameter space. Dense, lower-dimensional embeddings.

Distributional Representations = Counted from a corpus, following the classical distributional hypothesis. Sparse, high-dimensional.
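
To make the "counted from a corpus" side concrete, here is a minimal sketch of a distributional representation built from co-occurrence counts. The function name cooccurrence_vectors, the window size of 2, and the two-sentence toy corpus are all assumptions made for illustration. A distributed representation would instead be learnt by training a model, which is not shown here.

```python
from collections import Counter, defaultdict


def cooccurrence_vectors(sentences, window=2):
    """Count, for each word, how often other words occur within `window` tokens."""
    vectors = defaultdict(Counter)
    for sentence in sentences:
        tokens = sentence.lower().split()
        for i, word in enumerate(tokens):
            lo = max(0, i - window)
            hi = min(len(tokens), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    vectors[word][tokens[j]] += 1
    return vectors


# Toy corpus (made up for this sketch).
corpus = [
    "the cat sat on the mat",
    "the dog sat on the rug",
]

vectors = cooccurrence_vectors(corpus)
print(vectors["cat"])
print(vectors["dog"])
```

Each word is represented by a sparse vector with one dimension per context word. In this toy corpus "cat" and "dog" end up with identical count vectors because they occur in the same contexts, which is the distributional hypothesis in miniature.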

Saturday, September 20, 2014

My referencing workflow

I use BibDesk for reference management. After a major academic event such as a conference, I access the proceedings website from within BibDesk. The ACL Anthology indexes papers for NLP conferences, IEEE Xplore can be used for IEEE conferences, the ACM Digital Library works for ACM conferences, and Google Scholar works for anything else. BibTeX entries can be imported into BibDesk directly, given a URL to the paper. That is it. I then assign each paper a To READ tag (keyword) or add it to a Priority Reading list. As I read the papers, I assign keywords and annotations. I use Skim for reading and annotating PDFs on the Mac.

I send PDFs that I find online to CiteULike, which can extract most of the attributes of a scientific paper, such as its title and authors, from the PDF. I then use BibDesk to access CiteULike and import these references.

Mendeley is also a good tool if you want to find related papers.

This is my workflow.

Sunday, September 14, 2014

Remove duplicate entries from launch services

Run the following command as explained here

/System/Library/Frameworks/CoreServices.framework/Frameworks/LaunchServices.framework/Support/lsregister -kill -r -domain local -domain system -domain user

Thursday, July 31, 2014

Debugging Python using ipdb

We can install ipdb (interactive Python debugger) via pip:
$ pip install ipdb
To insert a hard-coded breakpoint in the code, call ipdb.set_trace(). You can insert as many breakpoints as you like. To run the code:
$ python mycode.py
When a breakpoint is hit, you will be dropped into the ipdb prompt, which is similar to the IPython prompt. Use the "c" (continue) command to resume execution until the next breakpoint or until the program ends.

For example,


import ipdb

def f(x):
    print(x)
    y = 2 * x
    ipdb.set_trace()  # first breakpoint: inspect x and y here
    print(y)
    z = y * 3
    ipdb.set_trace()  # second breakpoint: inspect z here
    print(z)

if __name__ == "__main__":
    f(10)
    print("all done")
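
Hard-coded breakpoints are easy to forget in the code. One possible variation, sketched below, is to guard the call with a flag so that the breakpoint only fires when asked for. The debug parameter is an assumption added for this sketch and is not part of ipdb. ipdb is imported lazily so that it is only required when debugging.

```python
def f(x, debug=False):
    """Return 3 * (2 * x); stop in ipdb before the last step when debug is True."""
    y = 2 * x
    if debug:
        import ipdb  # imported lazily: only needed when debugging
        ipdb.set_trace()
    return y * 3


print(f(10))  # 60
```

Calling f(10, debug=True) drops into the ipdb prompt just before the return statement, and "c" continues as usual.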

Wednesday, July 30, 2014

Bitbucket with SSH

If you want to access Bitbucket via SSH instead of HTTPS, upload your SSH RSA public key to Bitbucket. Then use a path of the following form when cloning, for example:

git clone ssh://git@bitbucket.org/Bollegala/SupEmb.git

Thursday, June 19, 2014

pbibtex Japanese processing

If you are using pbibtex in TeXShop and have a Japanese bib file encoded in SJIS, you need to specify the kanji code as sjis so that the Japanese BibTeX entries are formatted properly. This can be done either on the command line as
$ pbibtex projname -kanji=sjis
or in the TeXShop Preferences --> Engine dialog, by setting the default BibTeX engine to
pbibtex -kanji=sjis

Friday, April 18, 2014

Python nose testing

Nose provides additional testing functionality that is not provided by the unittest framework. Unlike unittest, which is a standard Python module, nose has to be installed separately (for example with easy_install nose), as described here.


Name test modules in the form test_mytest.py and test methods in the form test_method. If you do so, nosetests will be able to discover and execute those tests automatically.

Use assert_equal(A, B) (from nose.tools) to test whether A is equal to B.
To compare numpy arrays, use numpy.testing.assert_array_equal instead.
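
As a rough sketch of the naming convention, a test module could look like the following. The file name test_mytest.py, the function double, and the class TestDouble are made up for illustration. The sketch uses unittest.TestCase from the standard library, which nosetests also collects, so the assertions are spelled self.assertEqual rather than nose's assert_equal.

```python
# test_mytest.py (hypothetical file name following the test_*.py convention)
import unittest


def double(x):
    """Hypothetical function under test."""
    return 2 * x


class TestDouble(unittest.TestCase):
    def test_double_int(self):
        self.assertEqual(double(10), 20)

    def test_double_list(self):
        # Multiplying a list by 2 repeats its elements.
        self.assertEqual(double([1, 2]), [1, 2, 1, 2])
```

Because both the module and the methods start with test_, the test runner can find them without any extra registration.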

To run the unit tests, call nosetests as follows.

To display the names of the test functions:
nosetests -v test.py

To display the output of print statements as well:
nosetests -v -s test.py

To execute a specific test method of a class in a test module:
nosetests -v -s test.py:ClassName.test_method

Friday, February 21, 2014

Synchronized decorator for Python

The following decorator provides synchronization for Python code.
Reference


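import functools


def synchronized(lock):
    """ Synchronization decorator. """

    def wrap(f):
        @functools.wraps(f)  # keep the wrapped function's name and docstring
        def newFunction(*args, **kw):
            # Hold the lock for the duration of the call; it is released
            # even if f raises an exception.
            with lock:
                return f(*args, **kw)
        return newFunction
    return wrap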

if __name__ == '__main__':
    from threading import Thread, Lock
    import time

    myLock = Lock()

    class MyThread(Thread):
        def __init__(self, n):
            Thread.__init__(self)
            self.n = n

        @synchronized(myLock)
        def run(self):
            """ Print out some stuff.

            The method sleeps for a second each iteration.  If another thread
            were running, it would execute then.
            But since the only active threads are all synchronized on the same
            lock, no other thread will run.
            """

            for i in range(5):
                print('Thread %d: Start %d...' % (self.n, i), end=' ')
                time.sleep(1)
                print('...stop [%d].' % self.n)

    threads = [MyThread(i) for i in range(10)]
    for t in threads:
        t.start()

    for t in threads:
        t.join()
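
A more typical use of such a decorator is protecting shared state. The sketch below restates the decorator so that it runs on its own, and uses it to guard a counter that several threads increment. The names counter, counter_lock, increment and worker, and the thread and iteration counts, are assumptions chosen for this example.

```python
import functools
import threading


def synchronized(lock):
    """Run the decorated function while holding `lock`."""
    def wrap(f):
        @functools.wraps(f)
        def new_function(*args, **kw):
            with lock:  # released even if f raises
                return f(*args, **kw)
        return new_function
    return wrap


counter_lock = threading.Lock()
counter = {"value": 0}


@synchronized(counter_lock)
def increment():
    # Read-modify-write on shared state: safe only because of the lock.
    current = counter["value"]
    counter["value"] = current + 1


def worker(n):
    for _ in range(n):
        increment()


threads = [threading.Thread(target=worker, args=(1000,)) for _ in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(counter["value"])  # 8000
```

Since every call to increment holds the same lock, no update is lost and the final value is always 8 * 1000.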

Filter forwarded messages in GMail

If you want to filter messages that have been forwarded to your GMail account by one of your other email services, create a filter with the Has Words field set to

deliveredto:your_email_service_mail_accountname_that_forwarded_message

You can use such a filter, for example, to archive forwarded messages so that they do not appear in your Inbox. This is useful if you want to keep a record of all your emails in your GMail account but do not want to see duplicates in an email client that displays emails received in multiple accounts.

Continuously monitor GPU usage

For NVIDIA GPUs, do the following (the -l 1 option refreshes the output every second): nvidia-smi -l 1
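
If the readings are needed inside a script rather than on screen, one option is to poll nvidia-smi from Python. This is only a sketch: the helper names build_command, parse_line, sample_once and monitor, and the choice of query fields, are assumptions made for this example. It relies on the --query-gpu and --format=csv,noheader,nounits options of nvidia-smi and keeps every value as a string, since some GPUs report fields as not supported.

```python
import subprocess
import time

# Fields to request from nvidia-smi (chosen for this sketch).
QUERY_FIELDS = ["utilization.gpu", "memory.used", "memory.total"]


def build_command():
    """Build the nvidia-smi command line for one CSV reading per GPU."""
    return [
        "nvidia-smi",
        "--query-gpu=" + ",".join(QUERY_FIELDS),
        "--format=csv,noheader,nounits",
    ]


def parse_line(line):
    """Turn one comma-separated output line into a dict of field -> string value."""
    values = [value.strip() for value in line.split(",")]
    return dict(zip(QUERY_FIELDS, values))


def sample_once():
    """Run nvidia-smi once and return one dict per GPU."""
    output = subprocess.check_output(build_command()).decode("utf-8")
    return [parse_line(line) for line in output.splitlines() if line.strip()]


def monitor(interval=1.0):
    """Print one reading per GPU every `interval` seconds until interrupted."""
    while True:
        for index, reading in enumerate(sample_once()):
            print(index, reading)
        time.sleep(interval)
```

Calling monitor() on a machine with nvidia-smi installed prints a reading every second, and Ctrl-C stops it.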