Twitter Search API: Get tweets and the tweet count of a hashtag using the Java Twitter client Twitter4j

I searched many resources on the internet for a way to obtain the tweet count for a particular hashtag. As far as I can tell, Twitter exposes no API that returns count data for a hashtag. There is, however, a search tweets API that returns tweet data (including retweets) for a particular hashtag. By default this API returns 15 tweets per request; the count parameter can raise this to a maximum of 100. Because of Twitter's real-time nature and the volume of data constantly being added to timelines, standard paging approaches are not always effective. The link below explains how to work with timelines.

https://dev.twitter.com/rest/public/timelines

To page through results efficiently, Twitter advises using the max_id and since_id parameters. Twitter always responds with the most recent tweets first, and max_id and since_id bound each request by tweet ID rather than by page number. Because the bounds are relative to tweet IDs, new tweets arriving between requests do not shift the results, which avoids fetching the same tweets twice.
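To make the parameters concrete, here is a minimal sketch of how the search request could look on the wire. The endpoint URL and the class and method names are my assumptions (they are not taken from the text above), and a real request also needs an OAuth Authorization header, which this sketch leaves out.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

final class SearchRequestSketch {
    // Assumed endpoint of the v1.1 search API; not stated in the article.
    static final String SEARCH_URL = "https://api.twitter.com/1.1/search/tweets.json";

    /** Builds the request URL; max_id and since_id are left out when the value is <= 0. */
    static String buildUrl(String hashtag, int count, long maxId, long sinceId) {
        Map<String, String> params = new LinkedHashMap<>();
        params.put("q", hashtag);
        params.put("count", String.valueOf(count));
        if (maxId > 0) params.put("max_id", String.valueOf(maxId));
        if (sinceId > 0) params.put("since_id", String.valueOf(sinceId));

        StringJoiner query = new StringJoiner("&");
        for (Map.Entry<String, String> e : params.entrySet()) {
            // URL-encode values so that '#' becomes %23 (requires Java 10 or later).
            query.add(e.getKey() + "=" + URLEncoder.encode(e.getValue(), StandardCharsets.UTF_8));
        }
        return SEARCH_URL + "?" + query;
    }

    public static void main(String[] args) {
        // First request: only the hashtag and the page size.
        System.out.println(buildUrl("#anoopKumarRai", 100, 0, 0));
        // Follow-up request: lowest ID seen so far was 1000, so pass 1000 - 1.
        System.out.println(buildUrl("#anoopKumarRai", 100, 999L, 0));
    }
}
```

The first call prints a URL ending in `?q=%23anoopKumarRai&count=100`; the second appends `&max_id=999`.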

Below are the logical steps required to capture the tweets for a hashtag at a given moment:

  • Call the Twitter search API with the hashtag as the keyword. Twitter responds with at most 15 recent tweets by default; if count is set to 100 in the request, it returns at most 100 recent tweets.
  • Now suppose the hashtag has thousands of tweets. This is where max_id comes in: it lets us implement our own cursor. From the first 100 recent tweets, capture the lowest tweet ID (which should belong to the last tweet in the response) and use it for max_id; also capture the highest tweet ID (which should belong to the first tweet) and keep it as since_id for later. The next request is the same as the first one, with the max_id parameter added. With max_id in place, Twitter responds with tweets whose IDs are lower than or equal to max_id. Caution: because max_id is inclusive, pass the lowest ID minus 1, otherwise the last tweet is returned again.
  • Repeat the above request in a loop, updating max_id each time, until no tweets are left.
  • It may seem that the job is done at this point, but Twitter is real-time, so many new tweets may have been posted with your hashtag while the previous steps were running. This is where the since_id captured in the first step comes in. Make a new request with since_id set (max_id is not needed on this first request). Twitter responds with the most recent tweets that are newer than since_id; from this response, again capture max_id (required, for paging) and since_id (optional, depending on your requirements). Loop through this request until no results are left (because you have reached the first tweet ID you got in the very first request).
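The steps above can be sketched without any network access. In the following simulation the `search` method is a hypothetical in-memory stand-in for the search API (newest IDs first, max_id inclusive, since_id exclusive), and `drain` is the cursor loop; both names are mine and are only meant to illustrate the bookkeeping.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeSet;

final class CursorSketch {
    /** In-memory stand-in for the search API: newest IDs first, at most count per call. */
    static List<Long> search(TreeSet<Long> store, int count, long maxId, long sinceId) {
        List<Long> page = new ArrayList<>();
        for (long id : store.descendingSet()) {
            if (id > maxId) continue;    // max_id is inclusive: keep IDs <= maxId
            if (id <= sinceId) break;    // since_id is exclusive: keep IDs > sinceId
            page.add(id);
            if (page.size() == count) break;
        }
        return page;
    }

    /** Pages backwards with max_id until a page comes back empty; returns every ID seen. */
    static List<Long> drain(TreeSet<Long> store, int count, long sinceId) {
        List<Long> seen = new ArrayList<>();
        long maxId = Long.MAX_VALUE;     // no upper bound on the first request
        while (true) {
            List<Long> page = search(store, count, maxId, sinceId);
            if (page.isEmpty()) break;
            seen.addAll(page);
            maxId = page.get(page.size() - 1) - 1;   // lowest ID of the page, minus one
        }
        return seen;
    }

    public static void main(String[] args) {
        TreeSet<Long> store = new TreeSet<>();
        for (long id = 1; id <= 250; id++) store.add(id);

        List<Long> firstPass = drain(store, 100, 0);
        long sinceId = firstPass.get(0);             // highest ID of the first pass

        for (long id = 251; id <= 260; id++) store.add(id);   // tweets arriving meanwhile
        List<Long> secondPass = drain(store, 100, sinceId);

        System.out.println("First pass: " + firstPass.size());    // First pass: 250
        System.out.println("Second pass: " + secondPass.size());  // Second pass: 10
    }
}
```

Subtracting 1 from the lowest ID is what keeps the two passes free of duplicates; without it, each page would start with the last tweet of the previous one.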

Now let's implement this logic as a Java program. We will write a Java Twitter client using the Twitter4j library. Twitter4j is a wrapper around the Twitter APIs that takes care of the low-level tasks such as OAuth security, making HTTP calls and parsing JSON responses. Before doing any of this, you need to create an app on Twitter using the account that will trend your hashtag.

https://dev.twitter.com/apps/new

After creating the app on Twitter you will get an OAuth consumer key and consumer secret; also generate an OAuth access token and access token secret. Below is ready-made code that consumes the Twitter search API through Twitter4j, using the max_id and since_id parameters. If you have Fiddler installed, you can also watch the HTTP request/response cycle in real time. Just add twitter4j-core-4.0.4.jar (which you can download from the Twitter4j website) to your application classpath.

package com.anoop.twitter;

import twitter4j.Query;
import twitter4j.QueryResult;
import twitter4j.Status;
import twitter4j.Twitter;
import twitter4j.TwitterException;
import twitter4j.TwitterFactory;
import twitter4j.conf.ConfigurationBuilder;

public class TwitterAPIconsumer_Twitter4J {
	static final String hashTag="#anoopKumarRai";
	static final int count = 100;
	static long sinceId = 0;
	static long numberOfTweets = 0;

	public static void main(String[] args){

		ConfigurationBuilder cb = new ConfigurationBuilder();
		cb.setDebugEnabled(true)
		  .setOAuthConsumerKey("*************************")
		  .setOAuthConsumerSecret("*************************")
		  .setOAuthAccessToken("*************************")
		  .setOAuthAccessTokenSecret("*************************");
		TwitterFactory tf = new TwitterFactory(cb.build());
		Twitter twitter = tf.getInstance();

		// Phase 1: fetch the tweets that exist right now, paging backwards with max_id.
		// getTweets() records the newest tweet ID in sinceId; store it in a database at this point.
		Query queryMax = new Query(hashTag);
		queryMax.setCount(count);
		getTweets(queryMax, twitter, "maxId");

		// Phase 2: fetch the tweets that were posted while phase 1 was running.
		// Load sinceId from the database, fetch the newer tweets, and store the updated sinceId.
		do {
			Query querySince = new Query(hashTag);
			querySince.setCount(count);
			querySince.setSinceId(sinceId);
			getTweets(querySince, twitter, "sinceId");
		} while (checkIfSinceTweetsAreAvailable(twitter));
	}

	// Returns true if tweets newer than sinceId exist, meaning another phase 2 pass is needed.
	private static boolean checkIfSinceTweetsAreAvailable(Twitter twitter) {
		Query query = new Query(hashTag);
		query.setCount(count);
		query.setSinceId(sinceId);
		try {
			QueryResult result = twitter.search(query);
			if (result.getTweets() == null || result.getTweets().isEmpty()) {
				return false;
			}
		} catch (TwitterException te) {
			System.out.println("Couldn't connect: " + te);
			System.exit(-1);
		} catch (Exception e) {
			System.out.println("Something went wrong: " + e);
			System.exit(-1);
		}
		return true;
	}

	// Runs the query repeatedly, moving max_id below the oldest tweet seen,
	// until a page comes back empty. The mode argument is informational only.
	private static void getTweets(Query query, Twitter twitter, String mode) {
		boolean getTweets = true;
		long maxId = 0;
		long whileCount = 0;

		while (getTweets) {
			try {
				QueryResult result = twitter.search(query);
				if (result.getTweets() == null || result.getTweets().isEmpty()) {
					getTweets = false;
				} else {
					System.out.println("***********************************************");
					System.out.println("Gathered " + result.getTweets().size() + " tweets");
					int forCount = 0;
					for (Status status : result.getTweets()) {
						// The first tweet of the first page is expected to have the highest ID.
						if (whileCount == 0 && forCount == 0) {
							sinceId = status.getId(); // Store sinceId in database
							System.out.println("sinceId= " + sinceId);
						}
						System.out.println("Id= " + status.getId());
						System.out.println("@" + status.getUser().getScreenName() + " : " + status.getUser().getName() + "--------" + status.getText());
						// The last tweet of the page is expected to have the lowest ID.
						if (forCount == result.getTweets().size() - 1) {
							maxId = status.getId();
							System.out.println("maxId= " + maxId);
						}
						System.out.println("");
						forCount++;
					}
					numberOfTweets = numberOfTweets + result.getTweets().size();
					// max_id is inclusive, so subtract 1 to avoid fetching the last tweet again.
					query.setMaxId(maxId - 1);
				}
			} catch (TwitterException te) {
				System.out.println("Couldn't connect: " + te);
				System.exit(-1);
			} catch (Exception e) {
				System.out.println("Something went wrong: " + e);
				System.exit(-1);
			}
			whileCount++;
		}
		System.out.println("Total tweets count=======" + numberOfTweets);
	}
}
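
The comments in the listing say to store sinceId in a database, but the listing itself only keeps it in a static field. As a rough sketch of that missing piece, the class below persists the value in a plain text file instead of a database. The class name `SinceIdStore` and the file-based storage are my own assumptions for illustration; a real application would likely use whatever database it already has.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

final class SinceIdStore {
    private final Path file;

    SinceIdStore(Path file) {
        this.file = file;
    }

    /** Returns the saved since_id, or 0 (the listing's initial value) when nothing is saved. */
    long load() {
        try {
            if (!Files.exists(file)) return 0L;
            String text = new String(Files.readAllBytes(file), StandardCharsets.UTF_8).trim();
            return text.isEmpty() ? 0L : Long.parseLong(text);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Overwrites the saved value with the given since_id. */
    void save(long sinceId) {
        try {
            Files.write(file, Long.toString(sinceId).getBytes(StandardCharsets.UTF_8));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Hypothetical location; any writable path works.
        Path file = Paths.get(System.getProperty("java.io.tmpdir"), "since-id-" + System.nanoTime() + ".txt");
        SinceIdStore store = new SinceIdStore(file);
        System.out.println(store.load());   // nothing saved yet
        store.save(123456789L);
        System.out.println(store.load());   // value read back from the file
        file.toFile().delete();
    }
}
```

In the listing, `store.save(sinceId)` would go where the comment says "Store sinceId in database", and `store.load()` would supply the value passed to `setSinceId` before the second phase.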
