How to get Google PageRank in Java

This post is intended to help webmasters with a Java background. It is not related to day-to-day Java programming or interviews, but a little extra knowledge always feels good, right?

Google page rank

Google exposes a server URL that the Google Toolbar uses to display the PageRank of the web page currently shown in the browser. You can query the same URL from Java to fetch that value programmatically.

The URL looks like this:

http://toolbarqueries.google.com/tbr?client=navclient-auto&features=Rank&ch=SITECHECKSUM&q=info:

The checksum (SITECHECKSUM in the URL above) is calculated using the Jenkins hash function. Let's see how it works.

Jenkins hash algorithm implementation

package com.howtodoinjava.google;

public class JenkinsHash {
	 
	// max value to limit it to 4 bytes
	private static final long MAX_VALUE = 0xFFFFFFFFL;
 
	// internal variables used in the various calculations
	long a;
	long b;
	long c;
 
	/**
	 * Convert a byte into a long value without making it negative.
	 */
	private long byteToLong(byte b) {
		long val = b & 0x7F;
		if ((b & 0x80) != 0) {
			val += 128;
		}
		return val;
	}
 
	/**
	 * Do addition and turn into 4 bytes.
	 */
	private long add(long val, long add) {
		return (val + add) & MAX_VALUE;
	}
 
	/**
	 * Do subtraction and turn into 4 bytes.
	 */
	private long subtract(long val, long subtract) {
		return (val - subtract) & MAX_VALUE;
	}
 
	/**
	 * XOR val with xor and cut the result down to 4 bytes.
	 */
	private long xor(long val, long xor) {
		return (val ^ xor) & MAX_VALUE;
	}
 
	/**
	 * Left shift val by shift bits. Cut down to 4 bytes.
	 */
	private long leftShift(long val, int shift) {
		return (val << shift) & MAX_VALUE;
	}
 
	/**
	 * Convert 4 bytes from the buffer at offset into a long value.
	 */
	private long fourByteToLong(byte[] bytes, int offset) {
	    return (byteToLong(bytes[offset + 0])
		+ (byteToLong(bytes[offset + 1]) << 8)
		+ (byteToLong(bytes[offset + 2]) << 16) + (byteToLong(bytes[offset + 3]) << 24));
	}
 
	/**
	 * Mix up the values in the hash function.
	 */
	private void hashMix() {
		a = subtract(a, b);
		a = subtract(a, c);
		a = xor(a, c >> 13);
		b = subtract(b, c);
		b = subtract(b, a);
		b = xor(b, leftShift(a, 8));
		c = subtract(c, a);
		c = subtract(c, b);
		c = xor(c, (b >> 13));
		a = subtract(a, b);
		a = subtract(a, c);
		a = xor(a, (c >> 12));
		b = subtract(b, c);
		b = subtract(b, a);
		b = xor(b, leftShift(a, 16));
		c = subtract(c, a);
		c = subtract(c, b);
		c = xor(c, (b >> 5));
		a = subtract(a, b);
		a = subtract(a, c);
		a = xor(a, (c >> 3));
		b = subtract(b, c);
		b = subtract(b, a);
		b = xor(b, leftShift(a, 10));
		c = subtract(c, a);
		c = subtract(c, b);
		c = xor(c, (b >> 15));
	}
 
	/**
	 * Hash a variable-length key into a 32-bit value. Every bit of the key
	 * affects every bit of the return value. Every 1-bit and 2-bit delta
	 * achieves avalanche. The best hash table sizes are powers of 2.
	 * 
	 * @param buffer
	 *            Byte array that we are hashing on.
	 * @param initialValue
	 *            Initial value of the hash if we are continuing from a previous
	 *            run. 0 if none.
	 * @return Hash value for the buffer.
	 */
	public long hash(byte[] buffer, long initialValue) {
		int len, pos;
 
		// set up the internal state
		// the golden ratio; an arbitrary value
		a = 0x09e3779b9L;
		// the golden ratio; an arbitrary value
		b = 0x09e3779b9L;
		// fixed starting value used for this checksum; note that the
		// initialValue parameter is deliberately ignored here
		// c = initialValue;
		c = 0x0E6359A60L;
 
		// handle most of the key
		pos = 0;
		for (len = buffer.length; len >= 12; len -= 12) {
			a = add(a, fourByteToLong(buffer, pos));
			b = add(b, fourByteToLong(buffer, pos + 4));
			c = add(c, fourByteToLong(buffer, pos + 8));
			hashMix();
			pos += 12;
		}
 
		// add the key length, keeping c within 4 bytes like every other step
		c = add(c, buffer.length);
 
		// all the case statements fall through to the next on purpose
		switch (len) {
		case 11:
			c = add(c, leftShift(byteToLong(buffer[pos + 10]), 24));
		case 10:
			c = add(c, leftShift(byteToLong(buffer[pos + 9]), 16));
		case 9:
			c = add(c, leftShift(byteToLong(buffer[pos + 8]), 8));
			// the first byte of c is reserved for the length
		case 8:
			b = add(b, leftShift(byteToLong(buffer[pos + 7]), 24));
		case 7:
			b = add(b, leftShift(byteToLong(buffer[pos + 6]), 16));
		case 6:
			b = add(b, leftShift(byteToLong(buffer[pos + 5]), 8));
		case 5:
			b = add(b, byteToLong(buffer[pos + 4]));
		case 4:
			a = add(a, leftShift(byteToLong(buffer[pos + 3]), 24));
		case 3:
			a = add(a, leftShift(byteToLong(buffer[pos + 2]), 16));
		case 2:
			a = add(a, leftShift(byteToLong(buffer[pos + 1]), 8));
		case 1:
			a = add(a, byteToLong(buffer[pos + 0]));
			// case 0: nothing left to add
		}
		hashMix();
 
		return c;
	}
 
	/**
	 * See hash(byte[] buffer, long initialValue)
	 * 
	 * @param buffer
	 *            Byte array that we are hashing on.
	 * @return Hash value for the buffer.
	 */
	public long hash(byte[] buffer) {
		return hash(buffer, 0);
	}
}
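
The helper methods above store every intermediate value in a long and mask it with 0xFFFFFFFFL, which is one way to imitate unsigned 32-bit arithmetic in Java. The following is a minimal sketch of that idea in isolation. The class and method names (UnsignedMaskDemo, maskedSubtract, unsignedByte) are made up for illustration and are not part of JenkinsHash.

```java
class UnsignedMaskDemo {

    // Same mask as MAX_VALUE in JenkinsHash: keeps only the low 32 bits.
    static final long MASK = 0xFFFFFFFFL;

    // Masked subtraction, mirroring what JenkinsHash.subtract does.
    static long maskedSubtract(long val, long subtract) {
        return (val - subtract) & MASK;
    }

    // Unsigned value of a byte (0..255); a shorter equivalent of byteToLong.
    static long unsignedByte(byte b) {
        return b & 0xFF;
    }

    public static void main(String[] args) {
        // 1 - 2 wraps around to 2^32 - 1 instead of becoming negative.
        System.out.println(maskedSubtract(1L, 2L));         // 4294967295

        // Plain int arithmetic produces the same 32-bit pattern.
        System.out.println(Integer.toUnsignedLong(1 - 2));  // 4294967295

        // (byte) 0x80 is -128 as a signed byte, but 128 once masked.
        System.out.println(unsignedByte((byte) 0x80));      // 128
    }
}
```

This is why the results of add, subtract, xor and leftShift never leave the 0 to 4294967295 range, even though Java has no unsigned 32-bit integer type.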

Now it's time to get the PageRank of a web page. Let's write the code:

GooglePageRankUtil.java

package com.howtodoinjava.google;

import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.net.URL;
import java.net.URLConnection;

public class GooglePageRankUtil {

	public static void main(String[] args) 
	{
		GooglePageRankUtil pagerankTool = new GooglePageRankUtil();
		System.out.println(pagerankTool.getPageRank("http://howtodoinjava.com/2012/10/22/singleton-design-pattern-in-java/"));
	}

	public int getPageRank(String pageAddress) {

		String result = "";

		JenkinsHash jenkinsHash = new JenkinsHash();
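		// use an explicit charset so the checksum does not depend on the platform default
		long hash = jenkinsHash.hash(("info:" + pageAddress).getBytes(java.nio.charset.StandardCharsets.UTF_8));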

		// Prepend a 6 to the hash value.
		String url = "http://toolbarqueries.google.com/tbr?client=navclient-auto&hl=en&"
				+ "ch=6"
				+ hash
				+ "&ie=UTF-8&oe=UTF-8&features=Rank&q=info:"
				+ pageAddress;

		System.out.println("Requesting URL : " + url);

		try {
			URLConnection conn = new URL(url).openConnection();

			// try-with-resources closes the reader and the underlying stream
			try (BufferedReader br = new BufferedReader(new InputStreamReader(
					conn.getInputStream()))) {

				String input;
				while ((input = br.readLine()) != null) {

					// If the response is Rank_1:1:2 then PR = 2
					System.out.println(input);

					result = input.substring(input.lastIndexOf(":") + 1);
				}
			}

		} catch (Exception e) {
			System.out.println(e.getMessage());
		}

		try {
			// an empty or non-numeric response is reported as rank 0
			return Integer.parseInt(result.trim());
		} catch (NumberFormatException e) {
			return 0;
		}
	}
}


Output:

Requesting URL : http://toolbarqueries.google.com/tbr?client=navclient-auto&hl=en&ch=61932893626&ie=UTF-8&oe=UTF-8&features=Rank&q=info:http://howtodoinjava.com/2012/10/22/singleton-design-pattern-in-java/
Rank_1:1:2
2
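
The rank is the number after the last colon in a response line such as Rank_1:1:2. As a rough sketch, that extraction can be isolated in a small helper that also tolerates empty or unexpected responses. The names RankResponseParser and parseRank are hypothetical, and returning 0 for unreadable input is an assumption made here for convenience.

```java
class RankResponseParser {

    // Returns the number after the last ':' in a line such as "Rank_1:1:2",
    // or 0 when the line is null, empty, or does not end in a number.
    static int parseRank(String line) {
        if (line == null) {
            return 0;
        }
        // lastIndexOf returns -1 when there is no ':', so the whole line is used.
        String tail = line.substring(line.lastIndexOf(':') + 1).trim();
        try {
            return Integer.parseInt(tail);
        } catch (NumberFormatException e) {
            return 0;
        }
    }

    public static void main(String[] args) {
        System.out.println(parseRank("Rank_1:1:2"));          // 2
        System.out.println(parseRank(""));                    // 0
        System.out.println(parseRank("<html>error</html>"));  // 0
    }
}
```

Keeping the parsing separate from the network call makes it easy to check against sample response lines without contacting the server.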

References:

http://en.wikipedia.org/wiki/Jenkins_hash_function
http://www.codeproject.com/Articles/20038/Request-Google-s-Page-rank-Programmatically

Happy Learning !!

One thought on “How to get google page rank in java”

  1. Hi, I think this is not a real implementation. It gets the PageRank value of a site by calling Google's web service.
