Confused Coders is a place where we share lessons and thoughts with you. Feel free to fire your doubts straight at us and we will try our best to come back to you with clarifications. We also have a few PDFs which might be helpful for your interview preparations.

     Book shelf: Feel free to download and share. Cheers \m/


Have Fun !

Create / Update a Drill storage plugin without the Drill browser UI – via the REST API and curl

Drill has great REST support, and we can leverage the REST interface to create/update Drill storage plugins via curl requests. Request to create a new storage plugin –

curl -X POST -H "Content-Type: application/json" -d '{"name": "newplugin", "config": {"type": "cassandra", "host": "localhost", "port": 9042, "enabled": false}}' http://localhost:8047/storage/newplugin.json

This creates a new plugin named newplugin for us. The plugin can then be enabled with this GET request –

curl -X GET http://localhost:8047/storage/newplugin/enable/true

We can now use the plugin as we normally would in Drill. More details on all the other API support can be fetched from the code here – Hope this small post was helpful. Cheers \m/


The following blog is a simple insight into a very helpful and widely used logging framework. Bingo!!! You are right – Log4j. Log4j is designed to be highly configurable at runtime using simple configuration files. Through its main components, viz. loggers, appenders and layouts, it provides a very extensive logging process covering aspects like priority levels, logging destinations (console, database, file…) etc. What are the responsibilities of these core objects? Well, the names of the objects actually give you a very intuitive definition themselves – Loggers are responsible for capturing logging information. Appenders are responsible for publishing logging information to specific destinations. Layouts are responsible for formatting the […]
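As a quick illustration of how the three components fit together, here is a minimal Log4j 1.x sketch that wires a Logger to a ConsoleAppender with a PatternLayout programmatically (the class name Log4jDemo and the pattern string are just examples, and this assumes the log4j 1.x jar is on the classpath; most real setups would do the same wiring in a log4j.properties file):

```java
import org.apache.log4j.ConsoleAppender;
import org.apache.log4j.Level;
import org.apache.log4j.Logger;
import org.apache.log4j.PatternLayout;

public class Log4jDemo {
    public static void main(String[] args) {
        // Layout: decides how each log event is formatted
        PatternLayout layout = new PatternLayout("%d{ISO8601} [%p] %c - %m%n");
        // Appender: publishes formatted events to a destination (the console here)
        ConsoleAppender console = new ConsoleAppender(layout);

        // Logger: captures logging calls at or above its configured level
        Logger logger = Logger.getLogger(Log4jDemo.class);
        logger.addAppender(console);
        logger.setLevel(Level.INFO);

        logger.info("application started"); // published (INFO >= INFO)
        logger.debug("noisy details");      // dropped  (DEBUG < INFO)
    }
}
```

The programmatic form is just to make the Logger/Appender/Layout split explicit; swapping ConsoleAppender for a FileAppender changes the destination without touching the logging calls.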

SQL on Cassandra : Querying Cassandra via Apache Drill

In this crisp post I will talk about Drill’s Cassandra storage plugin, which enables us to query Cassandra via Apache Drill. That also means we can issue ANSI SQL queries on Cassandra, something Cassandra does not inherently support. All the code: Patch: There are a couple of steps needed to set up the Cassandra storage plugin before we can start playing with Cassandra and Drill. Download the patch and save it in a file. (Here: DRILL-92-CassandraStorage.patch)
1. Get Drill: Let’s get the Drill source $> git clone
2. Get the Cassandra storage patch: Download the patch file from
3. Apply the patch on top of […]

Dynamic Sorting Utility

Hi Friends!!! Recently I came across a requirement wherein I was supposed to sort a custom class based on multiple parameters. Problem statement: You are supposed to sort a Student class with parameters rollNo, name, age and weight. Now you want to sort it on name; if names match, sort it on rollNo; if rollNo matches, sort it on age, and so on. To add spice to the problem: the priority order may vary. If the priority of ordering was fixed and pre-defined, implementing this wouldn’t be a very difficult task. What I needed was a functionality wherein your code can dynamically take in […]
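One way to sketch that dynamic multi-key ordering is with the Java 8 Comparator API. The Student fields come from the post; the map-based field registry and the byPriority helper are my own illustrative names, not the post’s actual implementation:

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Map;

public class DynamicSorter {
    static class Student {
        final int rollNo; final String name; final int age; final double weight;
        Student(int rollNo, String name, int age, double weight) {
            this.rollNo = rollNo; this.name = name; this.age = age; this.weight = weight;
        }
    }

    // Registry mapping each sortable field name to its comparator
    static final Map<String, Comparator<Student>> FIELDS = Map.of(
            "rollNo", Comparator.comparingInt((Student s) -> s.rollNo),
            "name",   Comparator.comparing((Student s) -> s.name),
            "age",    Comparator.comparingInt((Student s) -> s.age),
            "weight", Comparator.comparingDouble((Student s) -> s.weight));

    // Chain the comparators in whatever priority order arrives at runtime
    static Comparator<Student> byPriority(List<String> priority) {
        return priority.stream()
                .map(FIELDS::get)
                .reduce(Comparator::thenComparing)
                .orElseThrow(() -> new IllegalArgumentException("empty priority list"));
    }

    public static void main(String[] args) {
        List<Student> roster = new ArrayList<>(List.of(
                new Student(2, "Bob", 20, 60.0),
                new Student(3, "Alice", 21, 65.0),
                new Student(1, "Alice", 22, 55.0)));
        // Sort by name first, breaking ties on rollNo
        roster.sort(byPriority(List.of("name", "rollNo")));
        for (Student s : roster) {
            System.out.println(s.rollNo + " " + s.name); // prints 1 Alice, 3 Alice, 2 Bob
        }
    }
}
```

Because the priority list is plain data, it can come from a config file or user input, which is exactly the "priority order may vary" requirement.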

Installing Solr on Ubuntu

Here is a quick and dirty post on installing Solr on your box. Hope it’s helpful.

Download Solr
Get a new Solr copy. I got my copy from Download a version you are interested in, preferably the latest.
– extract out solr
– copy the /examples contents to /opt/solr
– check for another solr dir inside the examples dir; rename it to solr_home (not required, but avoids confusion with the parent dir)
– Add an alias (for ease):
SOLR_HOME=/opt/solr/solr_home
alias startsolr="cd /opt/solr; java -Dsolr.solr.home=$SOLR_HOME -jar start.jar"

Start Solr
Use command:
$> cd /opt/solr
$> java -Dsolr.solr.home=$SOLR_HOME -jar start.jar
or the alias directly:
$> startsolr

Solr Admin
Check the Solr Admin web interface at localhost:8983/solr […]

How to run pig latin scripts on apache drill

This is initial work on supporting Pig scripts on Drill. It extends PigServer to parse the Pig Latin script and to get a Pig logical plan corresponding to the script. It then converts the Pig logical plan to a Drill logical plan. The code is not complete and supports a limited number of Pig operators like LOAD, STORE, FILTER, UNION, JOIN, DISTINCT, LIMIT etc. It serves as a starting point for the concept. Architecture diagram: Code: Review Board: Operators supported: LOAD, STORE, FILTER, UNION, JOIN, DISTINCT, LIMIT. Future work: FOREACH and GROUP are not supported yet. Test cases: org.apache.drill.exec.pigparser.TestPigLatinOperators. Pig scripts can be tested on Drill’s web interface as well (localhost:8047/query). […]

Mahout IncompatibleClassChangeError Exception

The error pops up while using Mahout collaborative filtering on Hadoop 2:

Exception in thread "main" java.lang.IncompatibleClassChangeError: Found interface org.apache.hadoop.mapreduce.JobContext, but class was expected
    at org.apache.mahout.common.HadoopUtil.getCustomJobName(
    at org.apache.mahout.common.AbstractJob.prepareJob(
    ...
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(
    at java.lang.reflect.Method.invoke(
    at org.apache.hadoop.util.RunJar.main(

Fix: Specify the Hadoop 2 version explicitly when building:

mvn clean install -DskipTests=true -Dhadoop2.version=2.0.0

Hive Hangs unexpectedly and ends up with : Error in acquireLock..

Error in acquireLock…
FAILED: Error in acquiring locks: Locks on the underlying objects cannot be acquired.

Instant patchy workaround:
SET;
Unlock the table:
unlock table my_table;

Some other tricks that work, as suggested on the Cloudera forums – Hue sometimes leaves locks on tables:
HUE_SER=`ls -alrt /var/run/cloudera-scm-agent/process | grep HUE | tail -1 | awk '{print $9}'`
HUE_CONF_DIR=/var/run/cloudera-scm-agent/process/${HUE_SER}
ls $HUE_CONF_DIR
/opt/cloudera/parcels/CDH/share/hue/build/env/bin/hue close_queries > tmp.log 2>> tmp.log

Search is still on!!

How to convert a MongoDB JSON object to a CSV file

Quickly scribbled a function to get a plain CSV out of a MongoDB JSON object. Invoke the script as you would any shell script:

sh > social_data_tmp.csv

The script has all the required mongo code, something like –

mongo << EOF
function printUserDetails(user){
    if (user == undefined){
        return;
    }
    print(user._id+','+','+ user.birthday+','+
        ((user.homeTown == undefined) ? '' : user.homeTown._id)+','+
        cleanString((user.homeTown == undefined) ? '' :','+
        ((user.location == undefined) ? '' : user.location._id)+','+
        cleanString((user.location == undefined) ? '' :','+
        getNames(user.likes));
}
db.facebookUserData.find().forEach(function(user){
    printUserDetails(user);
});
EOF
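The script above leans on a cleanString helper whose body was lost in formatting; its job is presumably to make a raw value safe to drop into a comma-separated line. A minimal sketch of that idea (in Java rather than the mongo-shell JavaScript of the original, and with RFC 4180-style quoting as my assumption about its intent):

```java
public class CsvUtil {
    // Make a raw value safe for one CSV field: wrap in double quotes and
    // double any embedded quotes when the value contains a delimiter (RFC 4180)
    static String cleanString(String field) {
        if (field == null) {
            return "";
        }
        if (field.contains(",") || field.contains("\"") || field.contains("\n")) {
            return "\"" + field.replace("\"", "\"\"") + "\"";
        }
        return field;
    }

    public static void main(String[] args) {
        System.out.println(cleanString("Pune, India")); // quoted, comma preserved
        System.out.println(cleanString("plain"));       // unchanged
    }
}
```

Without some escaping like this, any hometown or location name containing a comma would shift every later column in the CSV row.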

Apache Drill – REST Support

This came as a pleasant surprise to me today when I found that Apache Drill now also has an embedded Jetty-Jersey based REST service interface exposed for tracking the status of the Drillbit along with the status of submitted queries. The interface can be checked out here once the Drillbit is running: http://localhost:8047/status