Thursday, February 25, 2010

Interesting WWW 2010 papers

  • EXPLORING WEB SCALE LANGUAGE MODELS FOR SEARCH QUERY PROCESSING
    Jian Huang, Jiangbo Miao, Xiaolong Li, Jianfeng Gao and Kuansan Wang
  • CROSS-DOMAIN SENTIMENT CLASSIFICATION VIA SPECTRAL FEATURE ALIGNMENT
    Sinno Pan, Xiaochuan Ni, Jian-Tao Sun, Qiang Yang and Zheng Chen
  • BUILDING TAXONOMY OF WEB SEARCH INTENTS FOR NAME ENTITY QUERIES
    Xiaoxin Yin and Sarthak Shah
  • MULTI-MODALITY IN ONE-CLASS CLASSIFICATION
    Boris Chidlovskii and Matthijs Hovelynck
  • A SCALABLE MACHINE LEARNING APPROACH FOR SEMI-STRUCTURED NAMED ENTITY RECOGNITION
    Utku Irmak and Reiner Kraft
  • FACETED EXPLORATION OF IMAGE SEARCH RESULTS
    Roelof van Zwol and Börkur Sigurbjörnsson
  • TOWARDS NATURAL QUESTION GUIDED SEARCH
    Alexander Kotov and ChengXiang Zhai
  • A LARGE SCALE ACTIVE LEARNING SYSTEM FOR TOPICAL CATEGORIZATION ON THE WEB
    Suju Rajan, Dragomir Yankov, Scott Gaffney and Adwait Ratnaparkhi
  • DIVERSIFYING WEB SEARCH RESULTS
    Davood Rafiei, Krishna Bharat and Anand Shukla
  • A GENERAL FRAMEWORK FOR EXPLORING CATEGORY INFORMATION FOR QUESTION RETRIEVAL IN COMMUNITY QUESTION ANSWER ARCHIVES
    Xin Cao, Gao Cong, Bin Cui and Christian Jensen
  • RANKING SPECIALIZATION FOR WEB SEARCH: A DIVIDE-AND-CONQUER APPROACH BY USING TOPICAL RANKSVM
    Jiang Bian, Xin Li, Fan Li, Zhaohui Zheng and Hongyuan Zha
  • GENERALIZED DISTANCES BETWEEN RANKINGS
    Ravi Kumar and Sergei Vassilvitskii
  • THE ANATOMY OF A LARGE-SCALE SOCIAL SEARCH ENGINE
    Damon Horowitz and Sepandar Kamvar
  • USE TWITTER DATA FOR RECENCY RANKING IMPROVEMENT IN WEB SEARCH
    Anlei Dong, Ruiqiang Zhang, Pranam Kolari, Bai Jing, Yi Chang, Fernando Diaz, Zhaohui Zheng and Hongyuan Zha
  • A CHARACTERIZATION OF ONLINE SEARCH BEHAVIOR
    Ravi Kumar and Andrew Tomkins
  • WHAT IS TWITTER, A SOCIAL NETWORK OR A NEWS MEDIA?
    Haewoon Kwak, Changhyun Lee, Hosung Park and Sue Moon

Wednesday, February 24, 2010

sqlite in Python

SQLite is a very useful serverless database engine that can be accessed from Python through the standard sqlite3 module.

By the way, you can combine multiple insert statements in a single transaction to improve performance,
as described here.

import sqlite3

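def createDB():
    # Create the database file (if needed) and the boys table.
    db = sqlite3.connect('people.db')
    db.execute('create table boys(name, age)')
    db.commit()
    db.close()

def writeDB():
    # Insert one row; the ? placeholders let sqlite3 quote the values.
    db = sqlite3.connect('people.db')
    db.execute('insert into boys(name, age) values (?, ?)', ('danushka', 30))
    db.commit()
    db.close()

def readDB():
    # Fetch all rows as a list of (name, age) tuples and print it.
    db = sqlite3.connect('people.db')
    L = db.execute('select name, age from boys').fetchall()
    print(L)
    db.close()

if __name__ == "__main__":
    # Run createDB() and writeDB() once before the first readDB().
    #createDB()
    #writeDB()
    readDB()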
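
The single-transaction tip mentioned earlier could look roughly like the sketch below. It is only an illustration, not the method from the linked page: the helper name insert_many, the in-memory database and the extra sample rows are all made up for the example.

```python
import sqlite3

def insert_many(db, rows):
    # Used as a context manager, the connection commits once when the
    # block succeeds and rolls back if any insert raises an exception.
    with db:
        db.executemany('insert into boys(name, age) values (?, ?)', rows)

# ':memory:' keeps the example self-contained (assumption; use a file
# such as 'people.db' for real data).
db = sqlite3.connect(':memory:')
db.execute('create table boys(name, age)')

# Made-up sample rows, all written in one transaction.
insert_many(db, [('danushka', 30), ('taro', 25), ('ken', 41)])

count = db.execute('select count(*) from boys').fetchone()[0]
print(count)
```

With many rows this avoids one commit per insert, which is usually where the time goes.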

Snow Leopard NumPy, SciPy, matplotlib

Installing these packages on Snow Leopard can be quite difficult.

An easy solution is to use the script mentioned here.
There is a very good description here if you want to do it step by step.
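
Whichever route you take, a quick way to check the result is to try importing each package and print its version. This is a minimal sketch; the function name installed_versions is hypothetical, and a package that fails to import is simply reported as missing.

```python
import importlib

def installed_versions(names=('numpy', 'scipy', 'matplotlib')):
    # Map each package name to its version string, or None if the
    # import fails (i.e. the package is not installed correctly).
    versions = {}
    for name in names:
        try:
            module = importlib.import_module(name)
            versions[name] = getattr(module, '__version__', 'unknown')
        except ImportError:
            versions[name] = None
    return versions

report = installed_versions()
for name, version in sorted(report.items()):
    print(name, version if version is not None else 'not installed')
```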

Continuously monitor GPU usage

For NVIDIA GPUs, run the following, which refreshes the output every second:

nvidia-smi -l 1
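
If you want the numbers inside a Python script instead, something like the sketch below may work. It assumes nvidia-smi is on the PATH and that every queried field comes back as a plain number (some GPUs report "[N/A]" for a field, which this sketch does not handle). The names parse_line and monitor are made up for the example.

```python
import subprocess

# Fields requested through nvidia-smi's --query-gpu option.
QUERY = 'utilization.gpu,memory.used,memory.total'

def parse_line(line):
    # Turn a CSV line such as '35, 1024, 8192' into a tuple of ints:
    # (GPU utilization in %, memory used in MiB, total memory in MiB).
    util, used, total = [int(field.strip()) for field in line.split(',')]
    return util, used, total

def monitor(interval=1):
    # Let nvidia-smi print one CSV line per GPU every `interval`
    # seconds and report each line until interrupted.
    cmd = ['nvidia-smi', '--query-gpu=' + QUERY,
           '--format=csv,noheader,nounits', '-l', str(interval)]
    proc = subprocess.Popen(cmd, stdout=subprocess.PIPE,
                            universal_newlines=True)
    try:
        for line in proc.stdout:
            util, used, total = parse_line(line)
            print('GPU %d%%, memory %d/%d MiB' % (util, used, total))
    finally:
        proc.terminate()
```

Call monitor() to start watching and press Ctrl-C to stop.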